Friday, September 25, 2009

How to lie with statistics

Here's how to tell you're being hoodwinked when consuming polling data: look for a nonrandom pattern of trailing digits. Nate Silver has the lowdown:
One of the things I learned while exploring the statistical proprieties of the Iranian election, the results of which were probably forged, is that human beings are really bad at randomization. Tell a human to come up with a set of random numbers, and they will be surprisingly inept at trying to do so. Most humans, for instance, when asked to flip an imaginary coin and record the results, will succumb to the Gambler's Fallacy and be more likely to record a toss of 'tails' if the last couple of tosses had been heads, or vice versa. This feels right to most of us -- but it isn't. We're actually introducing patterns into what is supposed to be random noise.

Sometimes, as is the case with certain applications of Benford's Law, this characteristic can be used as a fraud-detection mechanism. If, for example, one of your less-trustworthy employees is submitting a series of receipts, and an unusually high number end with the trailing digit '7' ($27, $107, $297, etc.), there is a decent chance that he is falsifying his expenses. The IRS uses techniques like this to detect tax fraud.

Yesterday, I posed several pointed questions to David E. Johnson, the founder of Strategic Vision, LLC, an Atlanta-based PR firm which also occasionally releases political polls. One of the questions, in light of Strategic Vision LLC's repeated failure to disclose even basic details about its polling methodology, is whether the firm is in fact conducting polling at all, or rather, is creating fake but plausible-looking results in order to increase traffic and attention to its core business as a PR and literary firm.

I posed that question largely as a hypothetical yesterday. But today, I pose it much more literally. Certain statistical properties of the results reported by Strategic Vision, LLC suggest, perhaps strongly, the possibility of fraud, although they certainly do not prove it and further investigation will be required.

The specific evidence in question is as follows. I looked at all polling results reported by Strategic Vision LLC since the beginning of 2005; results from 2008 onward are available at their website; other polls were recovered through archive.org. This is a lot of data -- well over 100 polls, each of which asked an average of about 15-20 questions.
Hopefully those first few paragraphs are enough of a teaser to get you to read the whole thing, including the charts. The bottom line is that something seems fishy about the data from Strategic Vision's polls. Accusing a person or an organization of doctoring data is pretty damned serious, but from my initial reading there is good reason for suspicion. At bare minimum, I would treat poll results from this particular organization with even more skepticism than I normally apply to other polling organizations' results.
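
For the curious, here is a rough sketch of the kind of trailing-digit check Silver describes. It is my own illustration, not his actual method: it assumes that honest trailing digits would be spread roughly evenly across 0-9 (which may not hold for real poll percentages), and the function names and sample numbers are made up. It tallies the last digit of each value and computes a Pearson chi-square statistic against an even split, then compares it to the standard 5% critical value for 9 degrees of freedom.

```python
from collections import Counter

# Standard table value: chi-square critical value, 9 degrees of freedom, 5% level.
CHI2_CRIT_DF9_05 = 16.919


def trailing_digit_counts(values):
    """Return a list of 10 counts: how often each digit 0-9 is the last digit."""
    counts = Counter(abs(int(v)) % 10 for v in values)
    return [counts.get(d, 0) for d in range(10)]


def chi_square_uniform(counts):
    """Pearson chi-square statistic against equal expected counts (an assumption)."""
    total = sum(counts)
    expected = total / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


def looks_nonrandom(values):
    """Return (statistic, flag); flag is True if the statistic exceeds the 5% cutoff."""
    stat = chi_square_uniform(trailing_digit_counts(values))
    return stat, stat > CHI2_CRIT_DF9_05


if __name__ == "__main__":
    # Hypothetical expense amounts, heavy on trailing 7s like the receipts example.
    suspicious = [27, 107, 297] * 20 + list(range(40))
    print(looks_nonrandom(suspicious))
```

A large statistic only says the digits are unlikely to be evenly spread. As Silver notes about his own findings, that suggests a closer look is warranted; it does not prove fraud.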

You know that old adage "numbers don't lie"? The truth is a bit more complicated. Numbers themselves are inanimate, so in a very narrow sense they can't lie. But the people who present the numbers can, and sometimes do.
