Ok, in my research methods class, we are hitting an overview of statistics in the closing weeks of the semester. As such, i would like to include some empirical examples to visualize the things we’re going to talk about that are fun / outside my typical wheelhouse. So, do you have any favorite (read: typical, atypical, surprising, bizarre, differentially distributed, etc.) examples of univariate distributions and/or bivariate associations that may “stick” in their memories when they see them presented visually? I have plenty of “standard” examples i could draw from, but they’re likely bored with the ones i think of first by this point in the term. So, what are yours? It’s fine if you just have the numbers, i can convert them to visualizations, but if you have visual pointers, all the better.
From class to news to research question. So, this morning in class I taught an article using the network scale-up method. It’s a great technique that’s been used to explore a number of interesting questions (e.g., war casualties and HIV/AIDS).
I came back from that class to this article pointing to a debate on voter ID laws, and I couldn’t help but think that there has to be a meaningful way to throw this method at this question to estimate plausible bounds for the actual potential impact of these laws. And furthermore, it seems especially important because people without IDs are likely quite hard to accurately enumerate on their own (as are those who’ve engaged in voter fraud).
So, has this study already been published and i just missed it? Else, does someone have the data we’d need for that? I’m hoping it’s a solved question, as i assume it’s something it would be better to have known a few months ago than a few weeks from now. Anywho, just puzzling over a salient question that linked together some events from my day.
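For anyone who hasn’t seen the method, here is a minimal sketch of the basic scale-up estimator: ask respondents how many people they know in the hard-to-count group, divide by their total personal network sizes, and scale that share up to the whole population. The function name and every number below are made up for illustration; real applications also have to estimate network sizes and adjust for reporting biases, which this sketch ignores.

```python
def scale_up_estimate(hidden_counts, network_sizes, total_population):
    """Basic network scale-up estimate of a hidden population's size.

    hidden_counts[i]  -- how many members of the hidden group respondent i knows
    network_sizes[i]  -- respondent i's (estimated) personal network size
    total_population  -- size of the general population being studied
    """
    if len(hidden_counts) != len(network_sizes):
        raise ValueError("need one network size per respondent")
    total_ties = sum(network_sizes)
    if total_ties <= 0:
        raise ValueError("network sizes must sum to a positive number")
    # Share of all reported ties that point at the hidden group
    share = sum(hidden_counts) / total_ties
    return share * total_population


# Hypothetical survey of five respondents (illustrative numbers only)
hidden_counts = [0, 2, 1, 0, 3]
network_sizes = [150, 400, 250, 100, 300]
estimate = scale_up_estimate(hidden_counts, network_sizes, 1_000_000)
print(round(estimate))
```

With these invented inputs, 6 of 1,200 reported ties point at the hidden group, so the estimate comes out at roughly 5,000 people in a population of one million.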
I need some help visualizing quite a bit of categorical information. Any assistance welcomed.
Originally posted on re-musing:
I’ve been thinking about how to visualize a set of categorical correspondence comparisons for a while now, and haven’t really come up with a solution i’m satisfied with. So, i’m asking if someone out there can help me out. Basically what i have is 30 observations, and they are each differentially distributed over 6 groups. I want a way to (ideally visually) convey those differences. I.e., which observations are in which group(s). The problem is i need all 30 and all 6, so it’s a lot of information trying to be conveyed at once. One near solution i’ve come up with is a “Mosaic” plot, available in R (see example after the jump). There are lots of ways i could tweak what’s there, but it just doesn’t quite feel right. Any pointers to better options gratefully accepted.
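One alternative that may be worth trying is a heatmap of row shares: one row per observation, one column per group, darker cells for larger shares, with rows sorted so that observations dominated by the same group sit together. The sketch below is not from the original post. It uses only the Python standard library and prints a rough text version of the grid, the 30-by-6 counts are randomly generated placeholders, and the helper names are hypothetical.

```python
import random

SHADES = " .:-=+*#"  # lightest to darkest


def row_shares(counts):
    """Convert one observation's counts into shares that sum to 1."""
    total = sum(counts)
    return [c / total if total else 0.0 for c in counts]


def shade(share):
    """Map a share in [0, 1] to a character from SHADES."""
    index = min(int(share * len(SHADES)), len(SHADES) - 1)
    return SHADES[index]


def text_heatmap(table):
    """Return one text line per observation, grouped by dominant group."""
    shares = {name: row_shares(counts) for name, counts in table.items()}

    def sort_key(name):
        top = max(shares[name])
        # Dominant group first, then strongest concentration first
        return (shares[name].index(top), -top)

    ordered = sorted(shares, key=sort_key)
    return [
        f"{name:>6} " + " ".join(shade(s) for s in shares[name])
        for name in ordered
    ]


# Placeholder data: 30 observations spread over 6 groups
random.seed(0)
table = {
    f"obs{i:02d}": [random.randint(0, 20) for _ in range(6)]
    for i in range(1, 31)
}
for line in text_heatmap(table):
    print(line)
```

The same row-share matrix and row ordering could be handed to any plotting library to draw a proper shaded grid; the sorting step is what makes the group structure visible.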
Originally posted on re-musing:
Yesterday and today i’ve been at our AU faculty “retreat” in Cambridge (no, not that one, or that one, Cambridge, MD on the Chesapeake). Yesterday’s keynote was by Scott Page, who basically worked his way through some of the main insights from his book The Difference. I really enjoyed both the book and the talk (which was really TED-like, in the good way, not the bad way*). A gross oversimplification of the main idea from the book is that when problems are “hard” it’s frequently the case that diversely composed teams outperform those composed of the “best” members. The book does a good job of working through the math of this and of specifying when it is and is not the case. There are a few questions I’m left thinking about following the book/talk that i don’t think are yet fully developed.
First, I’d think…
Dan Hirschman has a great review of the new book on quantitative and qualitative methodology by Goertz and Mahoney.
One of the things Goertz and Mahoney offer is a pair of lists describing the different tendencies of quantitative and qualitative work. I’d like to briefly comment on a couple of the contrasts that are accurate descriptions of common practice in quantitative methodology, but less so of best practice. The first is how quants and quals think about non-linearity; the second is their preference for within- vs. across-case variation.
After describing how qual researchers account for non-linearity, Dan says:
Of course, a quantitative model could accommodate these sorts of conceptual mass points, but it’s very much against the norms of the culture. Instead, we’d tend to load GDP/capita (or maybe log GDP/capita) into a regression equation, which thus implicitly assumes that all variation is meaningful, and that an extra $1000 is equally meaningful across the spectrum (or that a change of 10% is equally meaningful, in the log context).
I wouldn’t say modeling non-linearity is against the norms of the culture. In fact, a failure to do so is something quant experts consider an elementary flaw. It’s interesting that it nonetheless gets through peer review so often. Even if modeling non-linearity is part of agreed-upon best practices, it is interesting and important that, as Dan says, it often isn’t done.
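To make the contrast in Dan’s quote concrete, here is a small sketch of what each specification implies about an extra $1000 of GDP per capita. The coefficient of 1.0 and the income levels are arbitrary placeholders, not estimates from any study, and the function names are made up.

```python
import math


def linear_effect(beta, x, delta):
    """Predicted change in the outcome under y = a + beta * x."""
    # x is accepted only to mirror log_effect; the answer does not depend on it
    return beta * delta


def log_effect(beta, x, delta):
    """Predicted change in the outcome under y = a + beta * ln(x)."""
    return beta * (math.log(x + delta) - math.log(x))


beta = 1.0  # arbitrary placeholder coefficient
for gdp in (2_000, 20_000, 50_000):
    print(
        gdp,
        linear_effect(beta, gdp, 1_000),          # same at every income level
        round(log_effect(beta, gdp, 1_000), 4),   # shrinks as income rises
        round(log_effect(beta, gdp, 0.10 * gdp), 4),  # a 10% rise: same everywhere
    )
```

In the linear model the extra $1000 matters equally at every income level; in the log model it matters much more at $2,000 than at $50,000, while a 10% rise has the same predicted effect everywhere. Neither specification builds in thresholds or “mass points” on its own, which is why splines, categories, or other non-linear terms have to be added deliberately.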
The book also observes that quants, compared to quals, are more likely to emphasize between-case variation than within-case variation. I think there is something to this, but one of the things that distinguishes the most rigorous quantitative research is that it often capitalizes on within-case variation from panel data.
Keep in mind that I haven’t read the book, so I’m not sure of the extent to which I’m responding to Dan vs. responding to Goertz and Mahoney. But regardless, you should go read Dan’s review… it’s quite interesting.
I love blogging about blogs, so let me point you to a new working paper entitled “Do Political Blogs Matter? Corruption in State-Controlled Companies, Blog Postings, and DDoS Attacks.” I certainly like the idea that blogs can be tools to fight corruption. But, and I say this as someone who hasn’t read the paper, I don’t know how much we should care about the result that online criticism caused very short-term changes in stock prices. Perhaps Brayden King, with his interest in activism directed towards private companies, would have an interesting comment.
The authors are the economists Ruben Enikolopov, Maria Petrova, and Konstantin Sonin. The paper is here, and the abstract reads:
Though new media has become a popular source of information, it is less clear whether or not they have a real impact on economic activity. In authoritarian regimes, where the traditional media are not free, this potential impact might be especially important. We study consequences of blog postings of a popular Russian anti-corruption blogger and shareholder activist Alexei Navalny on the stock prices of state-controlled companies. In an event-study analysis, we find a negative effect of company-related blog postings on both daily abnormal returns and within-day 5-minute returns. To cope with identification problem, we use the incidence of distributed denial-of-services (DDoS) attacks as a variable that negatively affects blog postings, but is uncorrelated with other determinants of asset prices. There is a substantial positive effect of the DDoS attacks on abnormal returns of the companies Navalny wrote about, and this effect is increasing in amount of his attention to these companies. The effect is decreasing in attention to posts of other top bloggers, increasing in visitors’ attention to Navalny’s posts, and is consistent with more pronounced individual, in contrast to institutional, trading. Finally, there are long-term effects of certain types of posts on stock returns, trading volume, and volatility. Overall, our evidence implies that blog postings about corruption in state-controlled companies have a negative causal impact on stock performance of these companies.
I just finished Gabriel Rossman’s Climbing the Charts: What Radio Airplay Tells Us about the Diffusion of Innovation. Basically the question at the heart of the book is what makes a song (or songs in general) popular? Like Fabio Rojas, I found the book really interesting, enjoyable to think through and useful to think with. He summarizes one aspect i especially liked about the book:
Rossman has a simple, but powerful, idea. The different stories imply different diffusion curves (graphs that map market saturation vs. time). Each story comes with a different curve. The “lightning in a bottle” story (hot songs diffuse through market networks) has a classical S-shaped curve. Promotion by the record industry has a discontinuous step function…
I agree that’s one of the particular strengths of the book. I also think it’s readily teachable, and will likely make an appearance in a future iteration of intro and/or my undergrad networks class. I have only a couple of minor quibbles with it, which largely stem from my not being in the sociology of culture inner-circle, and may be readily apparent to those who are.