how many, indeed?

October 23, 2012

From class to news to research question. So, this morning in class I taught an article using the network scale-up method. It’s a great technique that’s been used to explore a number of interesting questions (e.g., war casualties, and HIV/AIDS).

I came back from that class to this article pointing to a debate on voter ID laws, and I couldn’t help but think that there has to be a meaningful way to throw this method at this question to estimate plausible bounds for the actual potential impact of these laws. Furthermore, it seems especially important because people without IDs are likely quite hard to accurately enumerate on their own (as are those who’ve engaged in voter fraud).
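For readers who haven't met the method: in its simplest form, the scale-up estimator asks each survey respondent how many people they know in the hard-to-count group, divides the total by the respondents' combined personal network sizes, and multiplies by the total population. Here is a minimal sketch of that arithmetic. The function name and every number below are made up for illustration, not drawn from any voter ID data, and real applications add corrections (for example, for ties that respondents don't know about) that are left out here.

```python
def scale_up_estimate(known_in_group, network_sizes, population):
    """Basic network scale-up estimate of a hidden population's size.

    known_in_group[i]: how many group members respondent i reports knowing
    network_sizes[i]:  respondent i's estimated personal network size
    population:        size of the total population being studied
    """
    if len(known_in_group) != len(network_sizes):
        raise ValueError("need one network size per respondent")
    total_ties = sum(network_sizes)
    if total_ties == 0:
        raise ValueError("network sizes must not all be zero")
    # Share of all reported ties that point into the group, scaled up.
    return population * sum(known_in_group) / total_ties


# Hypothetical survey of five respondents in a population of one million.
reports = [0, 1, 0, 2, 1]
sizes = [200, 300, 150, 400, 250]
estimate = scale_up_estimate(reports, sizes, 1_000_000)
print(round(estimate))
```

Plausible bounds would then come from resampling respondents or varying the network-size estimates, which this sketch does not attempt.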

So, has this study already been published and i just missed it? Else, does someone have the data we’d need for that? I’m hoping it’s a solved question, as i assume it’s something that would have been better to know a few months ago than a few weeks from now. Anywho, just puzzling over a salient question that linked together some events from my day.

(Cross-posted)


Help / Discussion lists for R packages

May 17, 2011

If you want to learn a methodology, there may be an email list you should be on.  The two big network analysis packages in R, Statnet and igraph, each have one (sign up: Statnet, igraph, Mixed Models).  If you join them, you can ask questions when you get stuck.  But you may end up learning even more from other people’s questions.  Jorge M Rocha prompted Carter Butts to write a mini-essay on exponential random graph models, which I received permission to repost.  Dave Hunter also adds some thoughts at the bottom.

Read the rest of this entry »


a (more positive?) nod to Christakis & Fowler

October 7, 2010

Yes, i am aware of my elongated absence from this blog. And i have to plead…well i don’t know what my excuse is, so i’ll just say “howdy all” instead.

A recent article from PLoS One by Christakis and Fowler seems to be getting much less publicity than did their series of papers from the Framingham Heart Study. We’ve talked briefly about some contentions with that work here before.* The thing is that, by my reading, this newest paper is much more compelling and interesting than even the sum of their previous networks-based research, imo.

The new paper is an elegant finding – in essence that we would be better equipped for predicting flu epidemics if our estimates were based on surveillance of the nominated friends of a random sample than if they were based on tracking the random sample itself. It is firmly rooted in previous social networks research and a core idea/finding therein – Feld’s 1991 (gated) “Why your friends have more friends than you do.” And perhaps more importantly, it is very clearly and simply potentially useful.
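The intuition behind sampling nominated friends can be checked on a toy network. The sketch below is not the paper's analysis. It builds a hypothetical network in which newcomers tend to attach to already well-connected people (an assumption made only to get uneven degrees), then compares the average number of ties per person with the average number of ties held by a randomly nominated friend.

```python
import random


def toy_network(n, seed=0):
    """Hypothetical network: each newcomer forms one tie to an existing
    person, chosen with probability proportional to current degree."""
    rng = random.Random(seed)
    neighbors = {0: {1}, 1: {0}}
    endpoints = [0, 1]  # each person appears once per tie they hold
    for new in range(2, n):
        old = rng.choice(endpoints)
        neighbors[new] = {old}
        neighbors[old].add(new)
        endpoints += [new, old]
    return neighbors


def mean_degree(neighbors):
    """Average number of ties per person."""
    return sum(len(friends) for friends in neighbors.values()) / len(neighbors)


def mean_friend_degree(neighbors):
    """Expected degree of one friend nominated uniformly at random
    by a person who is themselves chosen uniformly at random."""
    per_person = [
        sum(len(neighbors[f]) for f in friends) / len(friends)
        for friends in neighbors.values()
    ]
    return sum(per_person) / len(per_person)


net = toy_network(500)
print("random person:   ", mean_degree(net))
print("nominated friend:", mean_friend_degree(net))
```

The nominated friends come out better connected on average, which is why they would tend to sit closer to the center of a network and encounter a contagion sooner than the random sample that named them.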

_________
*Incidentally, while i have been somewhat critical of their FHS work, i ended up trying out their Connected book for my current Intro Sociology class. i’ll have to get back to you on how effective a book it is for those purposes.


Social Network Packages Poll

May 6, 2010

Gabriel Rossman is running it here.


Scaling Social Science

April 6, 2010

A friend at Cloudera recently invited me to write a post for their corporate blog about how social scientists are using large scale computation.
I’ve been using Hadoop and MapReduce to study some really large datasets this year. I think it’s going to become more and more important and open the world of scientific computing to social scientists. I’m happy to evangelize for it.

One of the ideas that didn’t make its way into the final version is that even though the tools and data are becoming more widely available to laypeople, asking good social science questions, and answering them correctly, is still hard. It’s comparatively easy to ask the wrong question, use the wrong data, draw the wrong inference, and so on, especially if the wrongness is subtle. As an example, I think the OkCupid blog is interesting, but it’s not social science.

Social science has long been concerned with sampling methods precisely because it’s dangerously easy to incorrectly extrapolate findings from a non-representative sample to an entire population. Drawing conclusions from internet-based interactions can be problematic because the sample frame doesn’t match the population of interest. Even though I learned to make a cigar box guitar from Make Magazine, I don’t assume I know that much about acoustic engineering. Likewise, recreational data analysis is fun, illuminating and perhaps suggestive of how our social world works, but one ought not conclude that correlations or trends tell the whole, correct story. However, if exploring and experimenting with data can spark an interest in quantitative analysis of our social world, then I think it’s all for the better.

Link: http://www.cloudera.com/blog/2010/04/scaling-social-science-with-hadoop


Network Analysis Bleg for Help

March 16, 2010

So I’ve been working with the National Longitudinal Study of Adolescent Health (Add Health) for a while but I’ve only recently begun looking at the raw friendship nomination data.  I’m hoping that someone can give me some practical advice.

My first question is this: would you recommend using the network or igraph package?

I’m working in R, and I want to create some measures of centrality.    I wasn’t planning on doing ERG models or anything else complicated at the moment, just simple stuff.   If you want to recommend a different programming environment I’m happy to hear you make your case.


facebook has a soul?*

February 10, 2010

i genuinely hope to get back to more (semi-)regular blogging here soon. But, in the meantime, in case you haven’t seen this one yet – here’s a wild potential data release that may interest some of you. (ht BW)

____
*it’s highly possible i saw this same title in someone else’s mention of this elsewhere today. but if so, i can’t for the life of me recall where.


Free Textbooks on Networks

December 10, 2009

David Easley and Jon Kleinberg have a textbook coming out called Networks, Crowds, and Markets: Reasoning About a Highly Connected World.  It looks great, and for now you can download a preprint of the whole thing for free.  Cornell, home of founding co-blogger Matthew Brashears, seems like a great place to do work on networks.

Robert Hanneman (with coauthors Riddle and Izquierdo) also has a free textbook or three for you to download.  I won’t try to summarize any of these books since you are just a click away from viewing them, but I will point out that they aren’t competitors… they each have a lot of unique material.

See more discussion of social network curriculum/pedagogy at Jimi’s post here.


Are Social Networks Fundamental?

November 19, 2009

Are social networks fundamental?  That is how Daniel Little frames this interesting post.  At first, I wasn’t quite sure what he meant.  I thought, “No, social networks can’t be understood without understanding the people that comprise them, the society they exist within, etc.,” but then I actually started reading his post and realized he is asking whether the concept of a social network is central to most social explanation.  This is something I am more inclined to agree with.  Was I the only one briefly thrown for a loop by that title?


homophily in sexual networks

November 10, 2009

Today in my social networks class, we covered Bearman, Moody and Stovel’s 2004 AJS paper. Virtually every time i have heard that paper presented (by Peter or Jim, i’ve never actually met Stovel in person), and the couple of times i’ve talked about it in class, some variant of the following has been part of that discussion:

So, we know that there are any number of factors on which homophily operates in the formation of social networks in general. And in the particular case of romantic networks, the story’s much the same. However, there is one particular trait on which homophily is a much more robust predictor of a relationship than any others. Any guesses what that might be?

At which point the audience guesses a long list of traits, and virtually never hits #1 on the head (my class today got it on about guess 14, when i made it multiple choice from about 15 options). So, i pose the question to you, fair reader. What do you think it is?* If you’ve read the footnotes/appendices to the paper closely you can likely guess, but i’m not sure whether it’s actually explicitly stated anywhere in that particular paper or not.

*If no one gets it, i’ll post the answer in the comments in a day or two.