Why 75% might be an overcount and 1 an undercount, but maybe not

Wow, it’s dusty around here. I couldn’t figure out where else to make this point, so came back here to share a quick thought.

Image Source: Wonkblog, http://wapo.st/1qI4oxJ Data Source: Public Religion Research Institute, http://bit.ly/1paM2tR

This story’s been circulating on social media today. The basic punchline is how few non-white friends most whites have (the title comes from the estimated 75% of whites’ friendship networks that have no non-whites and the estimated 1 average black friend in whites’ networks). It then interprets a lot of the potential implications of this conclusion for recent reactions to/interpretations of events in Ferguson, MO. It’s not those implications that I want to take issue with here. In fact, I have few qualms with that part of the story. That’s in no small part because decades of homophily research wouldn’t question the general thrust of their finding.

However, the method used here is overly simplistic, and shouldn’t be used to estimate these sorts of questions. Basically what they did is take the “important matters” network name generator and elicit the first 7 people respondents nominated. There’s been a lot of important methodological ink spilled on that data collection strategy, but that’s actually not the issue I have here either. (Let’s assume they’ve dealt with the data collection aspects well, which is potentially a problematic assumption itself, but I don’t think it’s the main limitation of the report on which these estimates are based.) With those responses in hand, what the researchers appear to have done is basically compute the racial composition of those truncated personal networks, then extrapolate those proportions up to presumed actual network size, or at least 100-person projections thereof (i.e., percentages).
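To make that concrete, here is a minimal sketch of the direct extrapolation as I read it. This is my guess at the arithmetic, not the report’s actual code; the function name and the example respondent are hypothetical.

```python
from collections import Counter

def composition_per_100(alter_races):
    """Scale the racial mix of one truncated name list to a 100-person network.

    alter_races: race of each person the respondent named (at most 7 names).
    Returns a dict mapping race -> projected count out of 100.
    """
    if not alter_races:
        return {}  # no names given, so nothing to extrapolate from
    counts = Counter(alter_races)
    total = len(alter_races)
    return {race: 100 * n / total for race, n in counts.items()}

# Hypothetical respondent who named four discussion partners.
print(composition_per_100(["white", "white", "black", "white"]))
```

Note how coarse this is: a respondent who names three people can only ever land on 0, 33.3, 66.7, or 100 per 100 for any one group, so short lists push projections toward the extremes.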

Here’s the thing: truncated friendship lists like that (i.e., just eliciting the first 7 “important matters” partners) have severe problems in estimating actual proportions of events that have highly skewed distributions. This is why a series of strategies collectively known as the “Network Scale Up Method” were developed. In practice, this isn’t the most common use of the NSUM (which is more often used to estimate the size of hard-to-enumerate populations). But this is something the approach is able to handle quite nicely. What the NSUM basically does is recognize that various dimensions of overdispersed traits can be elicited at once. The estimation requires that some of those dimensions have known distributions in the population (e.g., how many people there are of particular races, ages, etc. in the population, not among the elicited names), so that what respondents report can be compared against them. This allows one to “scale up” from the elicitations on these numerous dimensions to estimate the “size” of someone’s personal network. These corrections could then be used (instead of the direct extrapolation of proportions) to estimate the number of friends of particular characteristics within particular folks’ personal networks of estimated (rather than arbitrarily fixed) size.
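For the curious, the simplest version of the scale-up idea can be sketched in a few lines. This is only the basic ratio estimator, assuming respondents report how many people they know in several groups whose population sizes are known; the function names and all numbers below are made up for illustration, and the fuller NSUM variants add corrections this sketch ignores.

```python
def estimate_network_size(known_counts, known_group_sizes, population_size):
    """Basic scale-up estimate of one respondent's personal network size.

    known_counts[k]: how many people the respondent reports knowing in group k
    known_group_sizes[k]: how many people are in group k in the population
    population_size: total size of the population
    """
    total_known = sum(known_counts)
    total_in_groups = sum(known_group_sizes)
    return population_size * total_known / total_in_groups

def share_of_network(count_in_target_group, estimated_size):
    """Share of the estimated network that falls in a target group."""
    return count_in_target_group / estimated_size

# Hypothetical inputs: three groups of known size in a population of 1,000,000.
size = estimate_network_size([2, 6, 2], [10_000, 30_000, 10_000], 1_000_000)
print(size)                        # estimated personal network size
print(share_of_network(20, size))  # share if 20 alters are in the target group
```

The point of the design is that the denominator is an estimated network size for each respondent, rather than the length of a list capped at 7 names.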

I don’t know enough about current homophily statistics (paging Matt Brashears, David Schaefer, or Matt Salganik) to suggest whether this approach would give substantially different point estimates than those arrived at in the report above. But I can tell you with certainty that it would give you different error estimates (particularly the shape of them) than would the direct extrapolation used. Ok, I’ve soap-boxed enough, so I’ll end with the YouTube clip of the Chris Rock bit that the Wonkblog version of this story kicked off with.


4 Responses to Why 75% might be an overcount and 1 an undercount, but maybe not

  1. Ján Tiliki says:

    Actually, no, they did not “take the ‘important matters’ network name generator and elicit the first 7 people respondents nominated.” As Christopher Ingraham stated in the methodological note at the bottom of the Wonkblog post, they “asked respondents to name UP TO seven people with whom they regularly discussed important matters” [my emphasis].

    Only 17% of their respondents actually provided the maximum of seven names. More than a quarter provided one or zero names, and an absolute majority of respondents (58%) provided three or fewer. The average number of “people with whom [respondents] discussed matters important” to them over the previous 6 months was only 3.45.

    And it turns out that, of these 3.45 core discussion alters, an absolute majority were spouses, partners, or immediate kin. The average number of “friends” per respondent was a mere *1.45*. It’s likely that a substantial percentage of the respondents did not name a single friend in response to the question.

    The basic punchline is that the PRRI CEO and the Washington Post “data journalist” who “previously worked at the Brookings Institution and the Pew Research Center” are making claims about grapefruit on the basis of an analysis of a grape.

    It’s not so much that “the method used here is overly simplistic” as that there’s been a stupendous methodology failure, compounded by an epic journalism failure.

    Also, if Jones and Ingraham had spent a few minutes on Google before going public under the auspices of the Atlantic Monthly and the Washington Post, they most likely would have happened upon the results of a Reuters/Ipsos poll conducted a few weeks before the PRRI’s 2013 American Values Survey that asked, not about important matters discussants or “friends,” but specifically about *close* friends of a different race or ethnicity. According to this poll, about 40% of white Americans don’t have any close non-white friends; nearly 60% of white Americans do have one or more non-white close friends. And again, this is CLOSE FRIENDS, not people in a person’s “social networks” (Jones) or Ingraham’s “100-friend scenario.”

    Retractions and apologies are in order.


  2. jimi adams says:

    Sorry. As a person who works in this area, I relied on a shorthand I maybe shouldn’t have. Your note of the “UP TO” usage is completely consistent with the way I read the stories/report, how the “important matters” question is typically deployed, and how my interpretation intended it (even if it maybe failed to explicitly describe it). I think it can be both a methodological flaw AND a failure of journalistic interpretation. You’re right that if they elicited names via a different prompt, results almost definitely would have differed. But it’s also true that their scaling up of THIS question has severe flaws in its own right.

  3. Ján Tiliki says:

    (Actually in a hurry I cut and pasted the wrong number for average “friends” per respondent; it’s even lower: 1.27.)
