April 28, 2010
Teppo Felin already blogged this at Org Theory but I thought I’d raise one question about Herb Gintis’s proposal for the unification of the behavioral sciences (paper and lecture). My question is this: would unification of the behavioral sciences discourage methodological and theoretical innovation?
As an interdisciplinary scholar, I am often frustrated by my fellow social scientists' lack of regard for the insights gained in sister disciplines. Unification would seem to fix that problem, but some might argue that more unified academic standards would discourage innovation. The idea is that each discipline is currently like a separate experiment, and unifying them would be putting all our eggs in one basket.
I’d be interested to hear what other people think about this argument, but I’m inclined to believe we can pursue unification and intellectual diversity at the same time. (post edited for clarity)
April 21, 2010
A revision control system, for those with even less programming experience than myself, manages “changes to documents, programs, and other information stored as computer files.” The most advanced ones are used by teams of programmers who simultaneously edit the same code. Simpler revision control is built into things like wikis and word processors.
I’m wondering whether a revision control system would be helpful for me now, or in the future, even if all I’m doing is statistics.
I’m working with a big dataset (ok Scott, not that big) and I’ve written a fair bit of code. Nothing too complicated: it is half data preparation, half analysis and graphics. Every so often I save my code under a new name; that way, if I accidentally save bad changes, I can always revert to a previous state. I do the same thing with the dataset itself, and, in R, with my workspace. In fact, I have an extra reason to do this with the data and my R workspace: memory management. R often complains that it’s running out of memory, so I respond by deleting variables that I probably won’t need or could recreate without too much trouble.
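The save-under-a-new-name routine is, in effect, a hand-rolled version of what a revision control system automates. Here is a minimal Python sketch of that routine (the `snapshot` function, its file names, and the `snapshots` directory are all my invention for illustration):

```python
import shutil
import time
from pathlib import Path

def snapshot(path, archive_dir="snapshots"):
    """Copy `path` into `archive_dir` under a timestamped name, so an
    earlier state can always be recovered by hand."""
    src = Path(path)
    dest_dir = Path(archive_dir)
    dest_dir.mkdir(exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = dest_dir / f"{src.stem}-{stamp}{src.suffix}"
    shutil.copy2(src, dest)
    return dest
```

A system like git does this automatically on each commit, and adds the parts that are tedious to hand-roll: a log of what changed and why, diffs between any two states, and one-command reverts.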
It is sometimes annoying to find code that I have already written, simply because there is so much text to search through. I can only organize it one way: for example, I could put all the code that makes graphs together, but then the graphing code wouldn’t sit next to the code that creates the data the graphs are based on.
Is a revision control system overkill for what I’m doing? Any other thoughts?
April 16, 2010
This sounds like a great conference! Though the distinguished participants have a great deal of wisdom, and I fully support attempts to tackle big questions, the summary reminds me of the limits of this approach. How accurately do you think we can predict where the big advances will come in the next fifty years? Determining which advances in the past fifty years were most important would seem to be much easier, but there would be a lot of disagreement on that question too, even if you restricted the sample to sociologists. Shouldn’t we be a bit disturbed by this?
If you want to take a shot at these questions, I’d start by specifying the value of different types of advances, and then outline which types of advances are most likely.
April 6, 2010
A friend at Cloudera recently invited me to write a post for their corporate blog about how social scientists are using large scale computation.
I’ve been using Hadoop and MapReduce to study some really large datasets this year. I think it’s going to become more and more important and open the world of scientific computing to social scientists. I’m happy to evangelize for it.
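For readers who haven’t seen MapReduce, here is a hedged sketch of the canonical word-count example, written in the spirit of Hadoop Streaming, where the mapper and reducer are ordinary scripts; the function names and the local simulation are my own, not Hadoop’s API:

```python
from collections import defaultdict

def mapper(lines):
    """Emit a (word, 1) pair for every word. On a cluster, Hadoop runs
    this independently on each split of the input data."""
    for line in lines:
        for word in line.split():
            yield word.lower(), 1

def reducer(pairs):
    """Sum the counts for each word. Hadoop delivers the mapper's output
    to reducers grouped by key."""
    totals = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    return dict(totals)

# Running both stages locally simulates what the cluster distributes
# across many machines:
counts = reducer(mapper(["to be or not to be"]))
```

The appeal for social scientists is that the same two small functions work unchanged whether the input is one file or terabytes spread over a cluster.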
One of the ideas that didn’t make its way into the final version is that even though the tools and data are becoming more widely available to laypeople, asking good social science questions — and answering them correctly — is still hard. It’s comparatively easy to ask the wrong question, use the wrong data, draw the wrong inference, and so on, especially if the wrongness is subtle. As an example, I think the OkCupid blog is interesting, but it’s not social science.
Social science has long been concerned with sampling methods precisely because it’s dangerously easy to incorrectly extrapolate findings from a non-representative sample to an entire population. Drawing conclusions from internet-based interactions can be problematic because the sample frame doesn’t match the population of interest. Even though I learned to make a cigar box guitar from Make Magazine, I don’t assume I know that much about acoustic engineering. Likewise, recreational data analysis is fun, illuminating and perhaps suggestive of how our social world works, but one ought not conclude that correlations or trends tell the whole, correct story. However, if exploring and experimenting with data can spark an interest in quantitative analysis of our social world, then I think it’s all for the better.
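To make the sampling point concrete, here is a toy simulation with invented numbers: a population where older people are both less likely to be online and more likely to hold some opinion, so a sample drawn only from internet users understates how common the opinion really is.

```python
import random

random.seed(0)

# Invented population: 60% "young" (90% online, 20% hold opinion X)
# and 40% "old" (30% online, 60% hold opinion X).
population = []
for _ in range(100_000):
    if random.random() < 0.6:
        person = {"online": random.random() < 0.9,
                  "opinion": random.random() < 0.2}
    else:
        person = {"online": random.random() < 0.3,
                  "opinion": random.random() < 0.6}
    population.append(person)

def rate(people):
    """Share of a group holding opinion X."""
    return sum(p["opinion"] for p in people) / len(people)

true_rate = rate(population)  # about 0.36 in expectation
online_rate = rate([p for p in population if p["online"]])  # about 0.27
```

The internet-only sample misses a third of the effect not because the arithmetic is wrong, but because the sample frame doesn’t match the population — which is exactly the error that careful sampling design guards against.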