March 18, 2012

A Fun DIY Science Goodie: Proof Yourself against Sensationalized Stats

This article was published in Scientific American’s former blog network and reflects the views of the author, not necessarily those of Scientific American

For my book Brain Trust, I interviewed Keith Devlin, NPR’s “Math Guy,” a World Economic Forum fellow, and math professor at Stanford. And being a mathematician, Devlin thinks about things differently than the world at large. For example, in his very good monthly column Devlin’s Angle, he quotes the following problem, originally designed by puzzle master Gary Foshee: “I tell you that I have two children, and that (at least) one of them is a boy born on Tuesday. What probability should you assign to the event that I have two boys?”

Does this sound like a bunch of confounding mumbo jumbo meant to obscure the obvious fact that the other kid has exactly a 50/50 chance of being a boy and so if one kid’s definitely a boy, the probability of them both being boys is one in two? Yes, yes it does.

But that’s not the case.

Without the “Tuesday” part, this is a famous problem first published in Scientific American by the venerable mathematician and puzzler Martin Gardner. Imagine the possible genders and birth orders of two kids: B-B, B-G, G-B, G-G. Now, in Gardner’s problem you know that at least one child is a boy, so you can nix only G-G as a possibility, leaving B-B, B-G, and G-B. In only one of these remaining three possibilities are both children boys, so instead of the knee-jerk one in two probability any sane person would expect, mathematicians like Devlin give only a one in three probability that, given one child is a boy, both kids are boys.

Yikes.
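If you'd rather count than take a mathematician's word for it, Gardner's one-in-three answer can be checked by brute force. The following is a minimal sketch in Python, assuming (as the puzzle does) that boys and girls are equally likely and that the four birth orders are equally likely; the variable names are just illustrative.

```python
from fractions import Fraction
from itertools import product

# Every (first child, second child) gender pair, assumed equally likely:
# B-B, B-G, G-B, G-G.
families = list(product("BG", repeat=2))

# Gardner's condition: at least one child is a boy (this nixes G-G).
at_least_one_boy = [f for f in families if "B" in f]

# Of the remaining families, the ones where both children are boys.
both_boys = [f for f in at_least_one_boy if f == ("B", "B")]

p_two_boys = Fraction(len(both_boys), len(at_least_one_boy))
print(p_two_boys)  # 1/3
```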

But the Tuesday bit can’t possibly matter, can it?

“It depends if you ask a mathematician or a statistician,” says Devlin. The mathematician would simply extend the possibilities that were available in the original puzzle and then nix the possibilities that could be nixed. If we didn’t know that one of the kids was born on a Tuesday, each child could be any one of 14 types (B-Mo, B-Tu, B-We, B-Th, B-Fr, B-Sa, B-Su, G-Mo, G-Tu, G-We, G-Th, G-Fr, G-Sa, G-Su), and our possibilities would be all the possible pairings of a first child with a second child.

Cool so far?

Now, you know that either the first or the second child is a boy born on Tuesday, and here’s how Devlin lays out the revised possibilities:

First child B-Tu, second child: B-Mo, B-Tu, B-We, B-Th, B-Fr, B-Sa, B-Su, G-Mo, G-Tu, G-We, G-Th, G-Fr, G-Sa, G-Su.
Second child B-Tu, first child: B-Mo, B-We, B-Th, B-Fr, B-Sa, B-Su, G-Mo, G-Tu, G-We, G-Th, G-Fr, G-Sa, G-Su.

Since “both boys born on Tuesday” is already listed in the first set, we don’t need to list it again in the second, making 27 (instead of 28) possible combinations of gender and day of the week for two kids, if at least one is a boy born on Tuesday. And of these 27 possibilities, 13 of them include a second boy. So the answer is (instead of a one in two or one in three chance) a 13/27 chance that both will be boys.
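The 27 and the 13 can be counted by machine, too. Here is a rough sketch that lists every pairing of gender and weekday for two kids, keeps the ones with at least one Tuesday boy, and counts how many of those have two boys. It assumes all 14 gender-and-day types are equally likely, which is the mathematician's simplification rather than a fact about real births.

```python
from fractions import Fraction
from itertools import product

DAYS = ["Mo", "Tu", "We", "Th", "Fr", "Sa", "Su"]

# One child is a (gender, weekday) pair: 14 types, assumed equally likely.
children = list(product("BG", DAYS))

# A family is an ordered (first child, second child) pair: 14 * 14 = 196 cases.
families = list(product(children, repeat=2))

tuesday_boy = ("B", "Tu")

# Keep families where at least one child is a boy born on Tuesday.
matching = [f for f in families if tuesday_boy in f]

# Among those, keep the families where both children are boys.
two_boys = [f for f in matching if f[0][0] == "B" and f[1][0] == "B"]

print(len(matching), len(two_boys))            # 27 13
print(Fraction(len(two_boys), len(matching)))  # 13/27
```

Because the family with two Tuesday boys shows up only once in the list, the count lands on 27 rather than 28 without any special handling.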

D’you hear that crackling sound? That’s the sound of your neurons trying to deal with the previous five hundred words. Don’t say you weren’t warned. But stick with it. It’s worth it. You can do it.

Now, on to statisticians, who take another view entirely. To them it matters what else could have been said and the interpretations that can pop up when math is released into the real world. “For example,” says Devlin, “we’re taught that multiplication is commutative, that 3 × 4 is the same as 4 × 3; but in the real world three bags of four apples isn’t the same as four bags of three apples.” Similarly, he points out in his blog that if you’re told that a quarter pound of ham costs $2 and then asked what three pounds will cost, a mathematician would tell you $24, but a statistician who’s been to a supermarket knows there’s not enough information to answer the question -- of course, every supermarket discounts for bulk.

In the case of the Tuesday boy problem, imagine you’re from a culture that requires you to speak about an elder child first, before mentioning the younger. That means it’s the elder child who’s the boy, and you rule out not only G-G (as before) but also G-B, leaving the possibilities B-B and B-G, and a one in two probability of both being boys.
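The elder-child reading is easy to check the same way. This sketch assumes, purely for illustration, that the speaker is always describing the elder child, and it runs the count twice: once with genders only, and once with weekdays included, under the same equal-likelihood assumptions as before.

```python
from fractions import Fraction
from itertools import product

DAYS = ["Mo", "Tu", "We", "Th", "Fr", "Sa", "Su"]

# Genders only, ordered as (elder, younger).
families = list(product("BG", repeat=2))
elder_is_boy = [f for f in families if f[0] == "B"]       # B-B and B-G
both_boys = [f for f in elder_is_boy if f[1] == "B"]      # B-B
p_genders_only = Fraction(len(both_boys), len(elder_is_boy))
print(p_genders_only)  # 1/2

# Same reading with weekdays added: the elder child is the Tuesday boy.
children = list(product("BG", DAYS))
with_days = list(product(children, repeat=2))
elder_tuesday_boy = [f for f in with_days if f[0] == ("B", "Tu")]
two_boys = [f for f in elder_tuesday_boy if f[1][0] == "B"]
p_with_days = Fraction(len(two_boys), len(elder_tuesday_boy))
print(p_with_days)  # 1/2
```

Under this interpretation the Tuesday detail stops mattering: once you know which child is being described, the other child is a plain coin flip.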

So there are two broad interpretations of almost all real-world numbers problems -- the stripped-down, mathematicians’ approach and the interpretive statisticians’ approach. And it’s in this wiggle room of interpretation where pure math hits the real world that misleading statistics are born. For example, in 1993 the columnist George Will was mathematically correct when he wrote in the Washington Post that “the ten states with the lowest per-pupil spending included four – North Dakota, South Dakota, Tennessee, Utah – among the ten states with the top SAT scores. Only one of the ten states with the highest per-pupil expenditures – Wisconsin – was among the ten states with the highest SAT scores. New Jersey has the highest per-pupil expenditures, an astonishing $10,561…New Jersey’s rank regarding SAT scores? Thirty-ninth.”

Take a minute and see if you can spot the moment at which pure math became a misleading statistic.

I found this quote in a 1999 article in the Journal of Statistics Education that points out one important fact: in New Jersey all college-bound students take the SAT, whereas in North Dakota, South Dakota, Tennessee, and Utah, only the kids applying to out-of-state schools take the SAT. And you can bet these students applying out of state are the cream of the crop. This is selection bias, and it pops up everywhere. Yes, it seems odd that nine out of ten dentists recommend Crass toothpaste, and nine out of ten also recommend Goldgate, but it’s as easy as finding the right ten dentists to ask.
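To see how much work selection bias can do on its own, here is a toy sketch. The scores are entirely made up (they are not SAT data or real state figures): two imaginary states have identical students, and the only difference is who sits the test.

```python
from statistics import mean

# Hypothetical, made-up scores: both "states" have the same 100 students,
# scoring 1 through 100 on some test. This is not real SAT data.
students = list(range(1, 101))

# "State A": every student sits the test.
state_a_takers = students

# "State B": only the top 10 students (say, those applying out of state) sit it.
state_b_takers = sorted(students)[-10:]

print(mean(state_a_takers))  # 50.5
print(mean(state_b_takers))  # 95.5
```

The two student bodies are identical, yet the second state's average looks almost twice as good, simply because of who was counted.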

Or take the following headline (from WorldHealth.net), which demonstrates a trick central to a pop science writer’s existence: “Sincere Smiling Promotes Longevity.” Sure enough, the data in the original study show that people who flash sincere smiles in photographs live longer; the original study title is “Smile Intensity in Photographs Predicts Longevity.”

Again, take a minute and see if you can spot the difference.

The trick is that the study demonstrates correlation, while the article implies causation. Does a Duchenne smile “predict” longevity? Yes. Does it “promote” longevity? Not necessarily. Mightn’t it be more likely that these smilers are happy and that something in happiness and not the smile itself actually promotes longevity? Similarly, it’s mathematically correct that gun owners have 2.7 times the chance of being murdered compared to non–gun owners. Does owning a gun cause the owner to be murdered, or might it be something in the character of people likely to own guns?

For another, take the 2010 claim by health reform director Nancy-Ann DeParle that due to the then recently passed health care bill, the average annual cost of insurance coverage would drop by one thousand dollars by 2019. Taken at face value, it’s true. But the reason it’s true is that nearly free health care would be extended to 32 million Americans who were currently without care, meaning that the cost to people who were already insured in 2010 would actually go up to cover the newly added.

This is an apples-to-oranges comparison, like decrying the increase in the average cost of a gallon of gas from $0.99/gal in February 1992 to $3.81/gal in March 2012 without adjusting for inflation. You can’t compare the two, because the rules of comparison have changed. On the flip side of the political spectrum, conservative UK politician Chris Grayling cited a 35 percent increase in “violent” crime starting in 2002 as evidence of failed liberal law enforcement policies. But 2002 was the year civilians and not police were given the right to designate a crime “violent,” and many chose to see violence where the police might not have. The "35 percent increase" was the difference between apples and oranges.

Finally, take data showing that the TSA misses 5 percent of people hired to test air security by trying to smuggle dangerous contraband. Yikes! One in twenty people sitting around you on the plane is packing an underwear bomb!

What’s the error?

It’s in sampling. Though some days it feels this way, not everyone is out to get you. In fact, imagine that even one of the two million passengers who try to fly over the United States every day is a deadly terrorist, and imagine the TSA misses 5 percent of them. This means that only one in 40 million people who actually board is a deadly terrorist. Even on a Boeing 767 with a 300-person seating capacity, you’d have to fly more than 130,000 times to sit on a plane with a terrorist. (OK, that’s misleading, too: statisticians would point out that a 1 in 130,000 chance means you could be on a plane with a terrorist at any point; it’s just not ever very likely.) Compare that to a 1 in 100 lifetime chance of dying in a car crash. Actually, please do, because that’s a mathematically correct, misleading statistic, too – what if you don’t drive, or drive cautiously, or are already over age twenty-five?
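The arithmetic behind those numbers fits in a few lines. This sketch uses only the figures from the paragraph above (two million passengers a day, one imagined terrorist, a 5 percent miss rate, 300 seats). The function p_at_least_once is a hypothetical helper added here, and it leans on the rough assumptions that every plane is full and every flight is independent.

```python
from fractions import Fraction

passengers_per_day = 2_000_000   # figure from the text
terrorists_per_day = 1           # the text's "imagine" scenario
miss_rate = Fraction(5, 100)     # share of attempts the TSA fails to catch
seats = 300                      # seating capacity used in the text

# Chance that any one boarding passenger is a terrorist who slipped through.
p_per_passenger = terrorists_per_day * miss_rate / passengers_per_day

people_per_terrorist = 1 / p_per_passenger            # 40,000,000
flights_per_terrorist = people_per_terrorist / seats  # about 133,333

print(int(people_per_terrorist))     # 40000000
print(round(flights_per_terrorist))  # 133333


def p_at_least_once(flights: int) -> float:
    """Rough chance of sharing a plane with a terrorist at least once.

    Assumes full planes and independent flights (a simplification).
    """
    p_per_flight = float(seats * p_per_passenger)
    return 1 - (1 - p_per_flight) ** flights


print(p_at_least_once(1_000))
```

The last line speaks to the statisticians' caveat: the risk is never zero on any given flight, but even a thousand flights leaves it well under one percent under these assumptions.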

So the moral of this long and somewhat convoluted tale is that first there’s math, then there’s stats, and finally there are headlines. And like a game of telephone, it’s easy to lose meaning along the way due to things like selection bias, correlation/causation, apples/oranges, and population error.

Mark Twain said there are lies, damned lies, and statistics. Illuminating this, renowned business professor Aaron Levenstein said that statistics are like bikinis – what they reveal is suggestive but what they conceal is vital. But not for you. You now know how to reveal what is vital.