I enjoy reading “The Numbers Guy” column in the WSJ, but today, guest writer Cari Tuna really hit it out of the park with her piece on a problem in statistical analysis called Simpson’s Paradox. Cari reports that, because of the paradox, the headline numbers understate the severity of unemployment.
A few weeks ago, I wrote a piece on three ways to look at the DOL employment data. Now, a fourth view emerges because of Simpson’s Paradox.
First, what is the paradox? Here’s Wikipedia’s version:
Simpson’s paradox (or the Yule-Simpson effect) is an apparent paradox in which the successes of groups seem reversed when the groups are combined. This result is often encountered in social and medical science statistics, and occurs when frequency data are hastily given causal interpretation; the paradox disappears when causal relations are derived systematically, through formal analysis.
Or, more simply (my version): Sometimes, because one subgroup of data is much larger than another, or because the subgroup sizes shift between the periods being compared, the total average looks better (or worse) than what’s really going on inside each subgroup.
For unemployment, what Cari reported is that for college graduates and high school graduates, unemployment is actually worse than in the early ’80s, but the total average doesn’t reflect this reality on the ground. Consider this:
[Chart: Total Unemployment, 25 and Older]
[Chart: For College Grads]
[Chart: For High School Dropouts]
And there’s the paradox at work: both college grads and high school dropouts are worse off, but the total unemployment figure does not reflect that reality. Why? College grads, who have a lower unemployment rate than less-educated workers, make up a bigger share of the working population today (1/3) than in 1983 (1/4). That shift pulls the overall average down and skews comparisons between the two recessions!
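To see the arithmetic behind this, here is a minimal Python sketch. The 1/4 and 1/3 college-grad shares come from the paragraph above; the unemployment rates are made-up illustrative numbers, not the DOL or WSJ figures, and the workforce is simplified to just two groups (college grads and everyone else).

```python
def overall_rate(groups):
    """Overall unemployment rate as a share-weighted average of group rates."""
    return sum(share * rate for share, rate in groups)

# Each entry is (share of working population, unemployment rate in percent).
# Shares follow the post (1/4 then, 1/3 now); the rates are HYPOTHETICAL.
then = [(1 / 4, 3.0), (3 / 4, 10.0)]   # college grads, everyone else
now = [(1 / 3, 3.5), (2 / 3, 10.5)]    # both groups are worse off than before

print(f"Then: {overall_rate(then):.2f}%")  # Then: 8.25%
print(f"Now:  {overall_rate(now):.2f}%")   # Now:  8.17%
```

Both groups have a higher rate in the "now" row, yet the overall rate comes out lower, simply because more of the weight sits on the low-unemployment group.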
The WSJ article is good reading. It should be noted that the analysis was done by Henry Farber, an economist at Princeton, whose work can be found here.