...a companion blog to "Math-Frolic," specifically for interviews, book reviews, weekly-linkfests, and longer posts or commentary than usually found at the Math-Frolic site.

*************************************************************************************************
"Mathematics, rightly viewed, possesses not only truth, but supreme beauty – a beauty cold and austere, like that of sculpture, without appeal to any part of our weaker nature, without the gorgeous trappings of painting or music, yet sublimely pure, and capable of a stern perfection such as only the greatest art can show." ---Bertrand Russell (1907) Rob Gluck

"I have come to believe, though very reluctantly, that it [mathematics] consists of tautologies. I fear that, to a mind of sufficient intellectual power, the whole of mathematics would appear trivial, as trivial as the statement that a four-legged animal is an animal." ---Bertrand Russell (1957)

******************************************************************** Rob Gluck

Sunday, August 16, 2015

Still Legal... Torturing Data


Review of "Standard Deviations" by Gary Smith
"Lying with statistics is a time-honored con. In Standard Deviations, economics professor Gary Smith walks us through the various tricks and traps people use to back up their own crackpot theories. Today, data is so plentiful that researchers spend precious little time distinguishing between goo, meaningful indicators, and total nonsense. Not only do others use data to fool us, we fool ourselves."
-- from the back cover of the book


This is one fun read! And a volume that hasn't received enough attention. It's 300 pages of easy-to-follow, enlightening, illustrative (and non-technical) material that is actually very important: a veritable romp through the minefield of troublesome statistics, be they from mass media, academia, or scientists themselves!

Gary Smith's book came out in 2014 at a time when I felt overdosed on popular statistics treatments, so I didn't give it much attention. It's now out in paperback and had I read it earlier it would've been on my "best books" list of 2014. It adds to the growing arsenal of work critiquing our statistical naivete. The phrase "lies, damned lies, and statistics," coined well over a century ago, has never been truer than today.

Smith himself is an economist, but he draws examples for this statistics salad from every nook-and-cranny of life: sports, Wall Street and finance, gambling/lotteries, advertising, medicine, research, ESP, etc. The book offers example after example after example of statistical tomfoolery, shenanigans, trickiness, and plain honest mistakes. If you've read much in this genre, many of Smith's examples will be familiar, even time-worn, but still his firehose spray of cases is well organized, impressive, fun, AND educational.

Various important themes run through the book:

1)  One major theme is how humans are pre-wired to look for and find patterns in their observations... and how easily that can lead them astray. As he writes at one point, "data clusters are everywhere, even in random data." Patterns need to be tempered by common sense... if a pattern just doesn't make sense, then don't believe it, but look for confounding variables, or sheer coincidence. The use of 'common sense' and reason in tandem with data permeates these pages. A theory without good data to back it up isn't worth much, but so too, provocative data without a good theory to explain it is dubious -- data and theory ought to go together like hand and glove.

2)  Graphs and visual displays are often a source of bias or distortion -- always check the labeling and scaling/spacing of axes or other depictions. Don't assume that data are collected, analyzed, or reported accurately.

3)  It's not always the data as presented that is a problem... it can also be the data that ISN'T presented -- either it was never collected, or it was collected, but for reasons not spelled out, then deleted from presentation. And what is missing may be more important than what is shown.

4)  "Regression to the mean" is the subject of another whole chapter, emphasizing that extreme or outlying data, performances, or events, often tend to revert to closer-to-the-mean values over time.
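
For anyone who'd like to see regression to the mean fall out of nothing but chance, here is a minimal sketch. It is my own toy illustration, not an example from Smith's book: the model (observed score = fixed skill + independent luck), the function name `top_group_means`, and all the numbers are assumptions made up for the demonstration.

```python
import random

def top_group_means(n_players=10_000, top_fraction=0.10, seed=0):
    """Return (season-1 mean, season-2 mean) for the season-1 top performers."""
    rng = random.Random(seed)
    # Hypothetical model: observed score = fixed skill + independent luck.
    skill = [rng.gauss(0, 1) for _ in range(n_players)]
    season1 = [s + rng.gauss(0, 1) for s in skill]
    season2 = [s + rng.gauss(0, 1) for s in skill]

    # Pick the players who looked best in season 1 only.
    k = int(n_players * top_fraction)
    top = sorted(range(n_players), key=lambda i: season1[i], reverse=True)[:k]

    # Compare how that same group scores in each season.
    mean1 = sum(season1[i] for i in top) / k
    mean2 = sum(season2[i] for i in top) / k
    return mean1, mean2

mean1, mean2 = top_group_means()
print(f"top group, season 1: {mean1:.2f}")
print(f"same group, season 2: {mean2:.2f}")
```

Nobody's skill changes between the two seasons in this sketch, yet the season-1 stars still come out above average, but noticeably less so, in season 2. Their season-1 ranking was partly luck, and the luck doesn't carry over.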

Of course sometimes a research study may actually be good, but the popular press reporting of it is flawed or oversimplified -- details and nuances being stripped away for the sake of time or space.

One reviewer faults Smith for being "relentlessly negative." That may be an overstatement, but even if true, I view it as a positive!... the book essentially says, 'Look here, and here, and over there, and at this here; at all these examples of the misuse of data leading us astray.' And THIS is a message we need to hear MORE, not less of, in today's data-saturated lives!

One thing I like about the book is that Smith doesn't mince his words. While he has positive things to say about such heavyweights as Daniel Kahneman and Dan Ariely, he doesn't hesitate to criticize other popular writers, including the authors of "Freakonomics," or a sociologist named David Phillips, or "The Motley Fool" writers, when they have erred. Some may find him too dismissive in a few instances where the issues aren't altogether settled, but I like his blunt, critical approach.
He also disparages some of the common arguments for the famous 'man with two children' probability paradox, which has a number of variations, and has been extensively debated (Smith gives the answer as a probability of 1/2, not 1/3, as many do).
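
To show why the two-children puzzle gets debated at all, here is a small enumeration of two common readings of it. This is only a sketch under my own assumptions (boys and girls equally likely, the two children independent), and it is not a reconstruction of Smith's argument; the variable names are invented for the illustration.

```python
from fractions import Fraction
from itertools import product

# All equally likely two-child families, older child first: BB, BG, GB, GG.
families = list(product("BG", repeat=2))

# Reading A: all we are told is "at least one of the two is a boy".
with_a_boy = [f for f in families if "B" in f]
p_at_least_one = Fraction(
    sum(f == ("B", "B") for f in with_a_boy), len(with_a_boy)
)

# Reading B: we happen to see one child, chosen at random, and it is a boy.
# Each (family, index of the child seen) pair is equally likely.
sightings = [(f, i) for f in families for i in (0, 1)]
saw_a_boy = [(f, i) for f, i in sightings if f[i] == "B"]
p_observed = Fraction(
    sum(f == ("B", "B") for f, _ in saw_a_boy), len(saw_a_boy)
)

print("both boys, given at least one boy:", p_at_least_one)
print("both boys, given a randomly seen child is a boy:", p_observed)
```

The first reading gives 1/3 and the second gives 1/2, so the "right" answer hinges on how the information about the boy was obtained, which is exactly the sort of unstated assumption the variations of the paradox turn on.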

Each chapter ends with a short paragraph summary of the main points, and the final chapter of the book also summarizes the essence of each previous chapter. In short, and without being too redundant, Smith drives home the essential ideas he wants you to come away with from the plethora of examples provided.
Bottom-line, it isn't just difficult, but virtually impossible, to take into account all the 'confounding' factors that may affect a scientific study and its reportage, so a watchful, skeptical eye is in order.

One of my beefs with self-described 'skeptics' is how much time they spend on what I call 'low-hanging fruit'... astrology, ESP, UFOs, homeopathy, etc. while giving a light touch to articles in scientific journals that are weak, poorly-done, poorly reported, or even fraudulent. Excellent science is hard to do, but we ought at least to be holding out for "good" science.  I HOPE a book like Smith's helps inculcate a greater wariness of supposedly reputable scientific evidence. The "evidence" of "evidence-based science" (perhaps better-called 'publish-or-perish-science'!) is often incomplete or skewed, and considerably more subjective, biased, or based on faulty assumptions, than acknowledged; it is rarely incontrovertible, and yet all-too-often escapes keen examination (especially via a broken peer-review process).
Professor John Ioannidis is famous for concluding that 'most research findings are false'. I'm more comfortable simply saying that most research findings are oversimplified, potentially-misleading, and ill-contrived, yet too-easily lapped-up by both uncritical skeptics and the public. The main defense against this state-of-affairs is an educated, on-guard citizenry (and more open-source peer review)... and Smith's book is a diligent effort toward that goal.
In the end, this isn't only a fun book; it's actually a highly important treatise!

