forecasting poorly

March 23, 2013

(moderately tweaked excerpt from here)

How hard would it be to get ALL of the first round games in the NCAA men’s basketball tournament wrong? I mean, that would be pretty tough, right? Given that among the multiple millions of brackets submitted to ESPN this year, none got all the first round games right, it would seem hard to do the inverse too, right? So i’m thinking that next year i organize the “anti-confidence” NCAA pool. Instead of gaining points for every game you correctly predict, it’ll consist of losing points for every game you get right. I.e., your aim will be to incorrectly pick as many games as possible. It would seem easy to incorrectly pick the champ, final-four and even the elite 8. But my hunch is that people would even struggle to get all Sweet 16 teams wrong (see e.g., this year’s Kansas State, Wisconsin, La Salle, Ole Miss “pod”), and missing every team making the round of 32 would be almost impossible.

I think we’re going to have to put this to the test. Something like -1 point for every first round game right, -2 for round 2, -4 for sweet 16, -8 for elite 8, -16 for final 4 picks, -32 for final 4 winners and -64 for getting the champ right. Highest score (closest to zero) wins. How poorly do you think you could do?
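A minimal sketch of how that scoring might be tallied, assuming each entry is summarized as a count of correct picks per category. The penalty values and category labels are taken straight from the list above; the function names and the sample entries are hypothetical, purely for illustration.

```python
# Penalty per correct pick, keyed by the labels used in the post.
PENALTIES = {
    "first round": 1,
    "round 2": 2,
    "sweet 16": 4,
    "elite 8": 8,
    "final 4 picks": 16,
    "final 4 winners": 32,
    "champ": 64,
}


def anti_confidence_score(correct_picks):
    """Return the pool score (zero or negative) for a dict of correct-pick counts."""
    unknown = set(correct_picks) - set(PENALTIES)
    if unknown:
        raise ValueError(f"unknown category labels: {sorted(unknown)}")
    # Every correct pick costs points, so the total is negated.
    return -sum(PENALTIES[label] * count for label, count in correct_picks.items())


def best_entry(entries):
    """Return the name of the entrant whose score is highest (closest to zero)."""
    return max(entries, key=lambda name: anti_confidence_score(entries[name]))


# Hypothetical entries: counts of picks each entrant got RIGHT.
entries = {
    "entrant_a": {"first round": 20, "round 2": 5},
    "entrant_b": {"first round": 12, "round 2": 3, "sweet 16": 1},
}
print(best_entry(entries))
```

Under this scheme a single correct champion pick costs as much as 64 correct first round picks, so the late rounds are cheap to get wrong but expensive to get right by accident.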


a case for single-blind review

January 23, 2013

(Cross posted from here)
When i was in grad school, at one of the academic meetings i regularly participate in, it became regular fare for 2 particular folks in my circles to engage in a prolonged debate about how we should overhaul the academic publishing system. This was so regular (i recall them having portions of this debate for 3 consecutive years over dinner) that the grad students in the bunch thought of this as a grenade in our back pockets we could toss into the fray if ever conversations took an unwelcome turn to the boring. I bring this up because there are lots of aspects of this process that i have quite a few thoughts on, but have never really formalized them too much more than is required for such elongated dinner conversations. And one particular aspect of that was raised on Facebook yesterday by a colleague – asking about the merits of single blind review. I started my answer there, but wanted to engage this a little more fully. So, i’m going to start a series of posts (not sure how many there will be at this point) on the publication/review process here, that i think could be interesting discussions. I hope others will chime in with opinions, questions, etc. These posts will likely be slightly longer than typical fare around here. I expect that some of my thoughts on these will be much more formulated than others.

So, let’s start with a case for single blind review. I think there are quite a few merits to single blind review (for a few other takes, see here and here). I won’t presume to cover them all here, but i will make a start. Feel free to add others, or tell me i’m completely off my rocker in the comments. Read the rest of this entry »


Neal Caren is on github, replication in social science!

December 11, 2012

I’m passionate about open-source science, so I had to give Big Ups to Neal Caren who I just learned is sharing code on github.  His latest offering  essentially replicates the Mark Regnerus study of children whose parents had same-sex relationships.  The writeup of this exercise is at Scatterplot.

My previous posts on github and sharing code are here and here.  If you’re on github, follow me.


Statistical Teaching (bleg)

November 20, 2012

Ok, in my research methods class, we are hitting an overview of statistics in the closing weeks of the semester. As such, i would prefer to include some empirical examples to visualize the things we’re going to talk about that are fun / outside my typical wheelhouse. So, do you have any favorite (read: typical, atypical, surprising, bizarre, differentially distributed, etc.) examples of univariate distributions and/or bivariate associations that may “stick” in their memories when they see them presented visually? I have plenty of “standard” examples i could draw from, but they’re likely bored with the ones i think of first by this point in the term. So, what are yours? It’s fine if you just have the numbers, i can convert them to visualizations, but if you have visual pointers, all the better.

(cross posted)


how many, indeed?

October 23, 2012

From class to news to research question. So, this morning in class I taught an article using the network scale-up method. It’s a great technique that’s been used to explore a number of interesting questions (e.g., war casualties, and HIV/AIDS).

I came back from that class to this article pointing to a debate on voter ID laws, and I couldn’t help but think that there has to be a meaningful way to throw this method at this question to estimate plausible bounds for the actual potential impact of these laws. And furthermore, it seems especially important because people without IDs are likely quite hard to accurately enumerate on their own (as are those who’ve engaged in voter fraud).
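For readers who haven’t met the method, here is a rough sketch of the basic scale-up estimator: ask a sample of respondents how many people they know in the hard-to-count group, divide by how many people they know in total, and scale that share up to the whole population. The function name and every number below are made up for illustration; a real application would need estimated network sizes and adjustments for the known biases of the method.

```python
def scale_up_estimate(known_in_group, network_sizes, total_population):
    """Basic network scale-up estimate of a hidden group's size.

    known_in_group: per respondent, how many group members they report knowing
    network_sizes: per respondent, their (estimated) total number of acquaintances
    total_population: size of the population the respondents are drawn from
    """
    if len(known_in_group) != len(network_sizes):
        raise ValueError("need one network size per respondent")
    total_ties = sum(network_sizes)
    if total_ties <= 0:
        raise ValueError("total network size must be positive")
    # Share of all reported ties that point into the group, scaled to the population.
    return total_population * sum(known_in_group) / total_ties


# Hypothetical survey: three respondents in a population of one million.
estimate = scale_up_estimate(
    known_in_group=[1, 0, 2],
    network_sizes=[100, 200, 300],
    total_population=1_000_000,
)
print(estimate)
```

Running the same calculation with low and high assumptions about network sizes is one simple way to get the kind of plausible bounds mentioned above.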

So, has this study already been published and i just missed it? Else, does someone have the data we’d need for that? I’m hoping it’s a solved question, as i assume it’s something it would be better to have known a few months ago than a few weeks from now. Anywho, just puzzling over a salient question that linked together some events from my day.

(Cross-posted)


visualization bleg

October 16, 2012

Reblogged from re-musing:


I've been thinking about how to visualize a set of categorical correspondence comparisons for a while now, and haven't really come up with a solution i'm satisfied with. So, i'm asking if someone out there can help me out. Basically what i have is 30 observations, and they are each differentially distributed over 6 groups. I want a way to (ideally visually) convey those differences.


I need some help visualizing quite a bit of categorical information. Any assistance welcomed.

Scott E. Page - The Difference

October 15, 2012

Reblogged from re-musing:

Yesterday and today i've been at our AU faculty "retreat" in Cambridge (no, not that one, or that one, Cambridge, MD on the Chesapeake). Yesterday's keynote was by Scott Page, who basically worked his way through some of the main insights from his book The Difference. I really enjoyed both the book and the talk (which was really TED-like, in the good way, not the bad way*).



Dan reviews Goertz & Mahoney on Quant vs Qual

October 9, 2012

Dan Hirschman has a great review of the new book on quantitative and qualitative methodology by Goertz and Mahoney.

One of the things Goertz and Mahoney offer is a pair of lists describing the different tendencies of quantitative and qualitative work.  I’d like to briefly comment on a couple of the contrasts which are accurate descriptions of common practice in quantitative methodology, but less so of best practice.  The first issue is how quants and quals think about non-linearity, the second is about their preference for within vs. across case variation.

After describing how qual researchers account for non-linearity, Dan says:

Of course, a quantitative model could accommodate these sorts of conceptual mass points, but it’s very much against the norms of the culture. Instead, we’d tend to load GDP/capita (or maybe log GDP/capita) into a regression equation, which thus implicitly assumes that all variation is meaningful, and that an extra $1000 is equally meaningful across the spectrum (or that a change of 10% is equally meaningful, in the log context).

I wouldn’t say modeling non-linearity is against the norms of the culture.  In fact, a failure to do so is something quant experts consider an elementary flaw.  It’s interesting that it nonetheless gets through peer review so often.  Even if modeling non-linearity is part of agreed-upon best practices, it is interesting and important that, as Dan says, it often isn’t done.
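To make the implicit assumption in the quoted passage concrete, here is a small sketch comparing what a linear term and a log term each say about a change in GDP/capita. The coefficient values and income levels are invented for illustration and come from no actual regression.

```python
import math


def linear_effect(beta, x_from, x_to):
    """Predicted change in the outcome when x enters the model linearly."""
    return beta * (x_to - x_from)


def log_effect(beta, x_from, x_to):
    """Predicted change in the outcome when log(x) enters the model."""
    return beta * (math.log(x_to) - math.log(x_from))


# Linear term: an extra $1000 counts the same at $1,000 as at $40,000.
poor_linear = linear_effect(0.002, 1_000, 2_000)
rich_linear = linear_effect(0.002, 40_000, 41_000)

# Log term: a 10% increase counts the same everywhere,
# so an extra $1000 matters far more at the bottom than at the top.
poor_log = log_effect(1.0, 1_000, 2_000)
rich_log = log_effect(1.0, 40_000, 41_000)

print(poor_linear, rich_linear)
print(poor_log, rich_log)
```

Neither specification allows for the conceptual mass points Dan describes; capturing those would take something like thresholds, splines, or categorical codings.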

The book also observes that quants, compared to quals, are more likely to emphasize between case variation as compared to within case variation.  I think there is something to this, but one of the things that distinguishes the most rigorous quantitative research is that it often capitalizes on within case variation from panel data.

Keep in mind that I haven’t read the book, so I’m not sure of the extent to which I’m responding to Dan vs. responding to Goertz and Mahoney.  But regardless, you should go read Dan’s review… it’s quite interesting.


New Media Matters!

October 1, 2012

I love blogging about blogs, so let me point you to a new working paper entitled “Do Political Blogs Matter? Corruption in State-Controlled Companies, Blog Postings, and DDoS Attacks.”  I certainly like the idea that blogs can be tools to fight corruption.  But, and I say this as someone who hasn’t read the paper, I don’t know how much we should care about the result that online criticism caused very short-term changes in stock prices.  Perhaps Brayden King, with his interest in activism directed towards private companies, would have an interesting comment.

The authors are economists: Ruben Enikolopov, Maria Petrova, and Konstantin Sonin.  The paper is here, and the abstract follows:

Though new media has become a popular source of information, it is less clear whether or not they have a real impact on economic activity. In authoritarian regimes, where the traditional media are not free, this potential impact might be especially important. We study consequences of blog postings of a popular Russian anti-corruption blogger and shareholder activist Alexei Navalny on the stock prices of state-controlled companies. In an event-study analysis, we find a negative effect of company-related blog postings on both daily abnormal returns and within-day 5-minute returns. To cope with identification problem, we use the incidence of distributed denial-of-services (DDoS) attacks as a variable that negatively affects blog postings, but is uncorrelated with other determinants of asset prices. There is a substantial positive effect of the DDoS attacks on abnormal returns of the companies Navalny wrote about, and this effect is increasing in amount of his attention to these companies. The effect is decreasing in attention to posts of other top bloggers, increasing in visitors’ attention to Navalny’s posts, and is consistent with more pronounced individual, in contrast to institutional, trading. Finally, there are long-term effects of certain types of posts on stock returns, trading volume, and volatility. Overall, our evidence implies that blog postings about corruption in state-controlled companies have a negative causal impact on stock performance of these companies.


Diffusion and the Pop Song

September 29, 2012

(cross-posted here)
I just completed Gabriel Rossman’s Climbing the Charts: What Radio Airplay Tells Us about the Diffusion of Innovation. Basically the question at the heart of the book is what makes a song (or songs in general) popular? As with Fabio Rojas’s take on it, I found the book really interesting, enjoyable to think through and useful to think with. Rojas summarizes one aspect i especially liked about the book:

Rossman has a simple, but powerful, idea. The different stories imply different diffusion curves (graphs that map market saturation vs. time). Each story comes with a different curve. The “lightning in a bottle” story (hot songs diffuse through market networks) has a classical S-shaped curve. Promotion by the record industry has a discontinuous step function…

I agree that’s one of the particular strengths of the book. I also think it’s readily teachable, and will likely make an appearance in a future iteration of intro and/or my undergrad networks class. I have only a couple of minor quibbles with it, which largely stem from my not being in the sociology of culture inner-circle, and may be readily apparent to those who are.
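For teaching purposes, the contrast between curve shapes in the quoted passage can be sketched with a toy mixed-influence diffusion model. This is an illustrative assumption and not the models Rossman actually estimates; the parameter names and values are hypothetical.

```python
def diffusion_curve(external, internal, steps):
    """Cumulative adoption share over time under a toy mixed-influence model.

    external: per-step chance a non-adopter adopts regardless of others
              (e.g. a promotional push)
    internal: strength of adoption driven by the current share of adopters
              (e.g. stations imitating one another)
    """
    share = 0.0
    curve = []
    for _ in range(steps):
        # Adoption chance for each remaining non-adopter this step.
        hazard = min(1.0, external + internal * share)
        share += hazard * (1.0 - share)
        curve.append(share)
    return curve


# Mostly endogenous spread: slow start, acceleration, then saturation (S-shape).
s_curve = diffusion_curve(external=0.001, internal=0.5, steps=40)

# Mostly exogenous push: fastest growth at the very start, then flattening.
push_curve = diffusion_curve(external=0.3, internal=0.0, steps=40)

print([round(x, 3) for x in s_curve[:5]])
print([round(x, 3) for x in push_curve[:5]])
```

Raising `external` toward 1.0 makes the second curve jump almost all at once, which is one loose way to picture the step-like pattern the quote attributes to industry promotion.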

Read the rest of this entry »

