This past weekend I found myself listening to *This American Life,* a quirky show that tells a variety of stories about the American experience. The most recent episode included a discussion of the potential and pitfalls of economic forecasting. As it turns out, predictive models of the national economy aren’t very good, with margins of error wide enough to straddle the range from “sluggish with rising unemployment” to “robust with decreasing unemployment.” That’s a little like going to your doctor and being told that, given your test results, you’re either going to live for another thirty years or be dead in six months. Most of us would probably not find such a prognosis terribly useful. Yet what emerged on the show was not only an acknowledgement that economic forecasting is chancy at best (there was some discussion that predictions should instead be phrased along the lines of “growth will be two-ish percent”) but a wry commentary on the degree of precision in those estimates. Indeed, while the margin of error is wide enough to encompass both boom and bust, the predictions themselves regularly include two or more decimal places. It is as though the doctor had said that you will live for either thirty years and one hundred days, or six months and eight hours. Once the range is that large, including the extra bits seems a tad silly. Yet, pointless or not, the precision is in the estimates and, more to the point, is actually demanded by their consumers. Even though the people who use these forecasts are aware of how inaccurate they can be, they nevertheless seem to want all those extraneous decimal places.

The commentators on *This American Life* tried to tackle the question of why, but they didn’t get very far. I won’t get very far either. But it seems to me that the desire for those decimal places stems from a belief that measuring and analyzing something using mathematics necessarily makes it more accurate, reliable, or even useful. And if you believe that, then obviously using math with more decimal places must work even better, right? Well, no, and that’s the problem. Measuring something using mathematics doesn’t make it more accurate or more reliable; it just makes it numeric. And it can be easy to lose sight of that.

When we measure something, we’re making certain assumptions. We’re assuming, for example, that a good indicator of economic health is your salary from primary employment, or the value of your home. When we analyze something mathematically, we assume that the thing itself behaves in a certain way: that it has a linear effect on the dependent variable, for example, or that it follows an appropriate distribution. Sometimes we can confirm these assumptions, sometimes we can’t, but they’re always there and they always influence what our results actually mean. Mathematics can be an enormous benefit to the research process, and I am a firm believer in it, but the elegance of our models will never relieve us of the burden of clarifying our ideas.
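To make the point concrete, here is a minimal sketch (the data are made up, and the quadratic “truth” is purely hypothetical): a model that assumes linearity will happily return coefficients to six decimal places even when the assumption behind them is wrong.

```python
import random

# Hypothetical illustration: the true relationship is quadratic,
# but the model assumes a linear effect on the outcome.
random.seed(0)
xs = [i / 20 for i in range(201)]                      # x from 0 to 10
ys = [0.5 * x**2 + random.gauss(0, 1) for x in xs]     # truth is curved

# Ordinary least squares by hand: precise-looking coefficients
# come out regardless of whether the linearity assumption holds.
n = len(xs)
mx = sum(xs) / n
my = sum(ys) / n
slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
intercept = my - slope * mx
print(f"slope = {slope:.6f}, intercept = {intercept:.6f}")
# The six decimal places look authoritative, but the fitted line
# systematically misses the curve; the precision of the numbers
# says nothing about the soundness of the assumption.
```

The decimal places are real in a narrow sense (the least-squares solution is computed to machine precision), but they describe the wrong model.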

And this is something that we would do well to remember and, particularly, to remind our grad students of. Grad students (and I include myself in this, once upon a time) are often seduced by the apparent power of modern mathematical methods. On first exposure to multiple regression or formal models, it may seem as though they have been given the keys to heaven and, like the sorcerer’s apprentice, they may try to use them with reckless abandon. But math isn’t magic, and its answers are only as good as the questions put to it.

Good quantitative analysis isn’t just about pushing the right buttons and running the right programs; it’s about having the awareness to really think about what you’re doing and what it means.

Good point, and I think your extension to graduate training is spot on. That said, I’m puzzling over potential sources of the extraneous precision.

First, as little economic forecasting as I read, I couldn’t help but wonder how wide the gap is between “sluggish with rising unemployment” and “robust with decreasing unemployment.” If a few percentage points are sufficient to cover the full range (say 0-5%), a greater degree of precision is warranted than if we’re talking about possibilities that lie between -100 and 100. But I suspect this isn’t the sort of problem you’re talking about.

Potentially more likely: oftentimes I feel like precision in forecasting is more about being able to delineate accurate from inaccurate predictions than it is about giving expectation bounds for the predicted outcome. Thus the exercise in extraneous precision, which, as you indicate, is largely silly for the latter purpose, provides a better opportunity to determine whether a single prediction hit the target or not (I’m thinking along the lines of why sports betting lines are always +/-0.5 points, even though “half points” are impossible to score).

Indeed, we must be honest with ourselves about the limits of our models and uncertainty of our predictions.

Interesting post.

As an aside, we get “This American Life” re-broadcast over the summer on ABC Radio National (http://www.abc.net.au/rn/). I particularly liked the episode about prisoners doing Hamlet (http://bit.ly/5XiEjo).

With regards to your point about precision of predictions, this is sometimes raised in the reporting of results in journals. For example, is it appropriate to report correlations in a correlation matrix to three decimal places when the 95% confidence interval is plus or minus 0.2?
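As a quick sketch of how wide that interval actually is, here is the standard Fisher z confidence interval for a correlation (the sample size n = 100 and r = 0.30 are hypothetical, chosen to roughly match the plus-or-minus 0.2 case):

```python
import math

# Sketch: 95% confidence interval for a correlation via the
# Fisher z transform. n = 100 and r = 0.30 are made-up values.
def correlation_ci(r, n, z_crit=1.96):
    z = math.atanh(r)                    # Fisher z transform of r
    se = 1 / math.sqrt(n - 3)            # standard error on the z scale
    lo, hi = z - z_crit * se, z + z_crit * se
    return math.tanh(lo), math.tanh(hi)  # back-transform to the r scale

lo, hi = correlation_ci(0.30, 100)
print(f"r = 0.30, n = 100, 95% CI = ({lo:.2f}, {hi:.2f})")
# The interval runs from about 0.11 to about 0.47, so reporting
# r = 0.302 implies a precision the data cannot support.
```

With an interval that wide, the third decimal place of the point estimate is pure noise for a reader judging the size of the effect.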

The optimal number of decimal places depends somewhat on the purpose of the reader.

If I wanted to run a meta-analysis or a structural equation model on a published correlation matrix, more decimal places would be better, since they would prevent rounding errors from cascading through subsequent analyses.
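A toy sketch of that cascading: recovering standardized regression coefficients from a published correlation matrix, once with three decimal places and once as the same matrix rounded to two. The correlations here are invented, and the two predictors are deliberately collinear so the rounding bites hard.

```python
# Hypothetical two-predictor case: r1y, r2y are each predictor's
# correlation with the outcome, r12 the correlation between them.
def std_betas(r1y, r2y, r12):
    # Solve the two-predictor normal equations on the correlation scale.
    det = 1 - r12**2
    b1 = (r1y - r12 * r2y) / det
    b2 = (r2y - r12 * r1y) / det
    return b1, b2

full = std_betas(0.512, 0.489, 0.948)    # matrix as published, 3 dp
rounded = std_betas(0.51, 0.49, 0.95)    # same matrix rounded to 2 dp
print(f"beta1 from full matrix:    {full[0]:.3f}")
print(f"beta1 from rounded matrix: {rounded[0]:.3f}")
# The two answers differ in the second decimal place, so the
# "extra" digits in the published matrix do real work downstream.
```

The effect is small when predictors are nearly independent, but under collinearity the denominator is close to zero and rounding in the inputs gets amplified in the outputs.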

However, if I were a casual reader trying to get an impression of the results and the general trends, anything more than two decimal places is likely to be distracting.

Thus, on one level precision sends a signal of over-confidence and can be distracting. However, this precision is useful for some applications.

I wonder how far we could get with greater reporting of confidence intervals and more training of consumers in how to reason with uncertain predictions.

Jeromy,

One thing we could do is use fewer extra decimal places in the original publication but make supplementary tables (and code) available online. Another opportunity for me to reference: http://www.jeremyfreese.com/docs/freese-reproducibility-webdraft.pdf and http://www.jeremyfreese.com/docs/Freese%20-%20OpenSourceSocialScience%20-%20062807.pdf

Great point. I’m not sure if you’ve seen a link to this research: http://www.boingboing.net/2009/10/15/complex-derivatives.html, but the basic idea is that complex derivatives are intractable. You gotta love it!

Informative post.

I’m going to bookmark this one.