The Phantom Pattern Problem
"Data don’t speak. The interpreter speaks – often with an agenda and equipped with tools to distort the results to fit a pre-ordained narrative. Smith and Cordes highlight a number of red flags and focus on how we can avoid being misled by bad science in this era of big data." — Campbell R. Harvey, 2016 President, American Finance Association
"Arguing that our instinctive recognition of patterns leads to pervasive perceptual error, the authors provide trenchant analysis of a wealth of examples confusing correlation with causation by layman and academic researchers alike and a critical examination of the analogous pitfalls in 'big data'” — William Brainard, Arthur M Okun Professor of Economics Emeritus, Yale University
“The challenge for data scientists today is to avoid misinterpreting what the data communicate—and this book is an immensely valuable resource of compelling examples and a must-read for professionals who would like to build infallible stories from their data. Professor Smith’s books are well known for their lucidity and clarity of thought—and this book is yet another example, packed with an exciting narrative of real-life examples of the misuse and abuse of data.” — Shanti Swaroop Mokkapati, Head - Analytics and Market Intelligence, Abbott Labs, a Fortune 500 Healthcare Company
“It’s refreshing to see a book on paleo that’s about distance running, pattern recognition, and jokes, rather than scarfing steaks, pumping iron, and violence. But, as Gary Smith and Jay Cordes explain and demonstrate, pattern recognition can lead to superficially appealing but ultimately misleading conclusions.” — Andrew Gelman, Professor of Statistics and Computer Science, Columbia University
"Gary and Jay hit the ball out of the park with ‘The Phantom Problem: The Mirage of Big Data.’ Full of fun stories and spurious correlations and patterns, the book excels at its aim: Explaining the hazards of big data, how many can easily be fooled by putting too much trust in blind statistics, as well as highlighting many pitfalls such as overfitting, data mining with out-of-sample data, over-reliance on backtesting, and ‘Hypothesizing after the Results are Known,’ or HARKing. The text is a home run on the importance of building models guided by human expertise, the critical process of theory before data, and is a welcome addition to any reader’s library." — Brian Nelson, CFA,
President, Investment Research, Valuentum Securities, Inc.
"A nice little antidote to big claims about big benefits of Big Data." — Marc Abrahams, Editor of the Annals of Improbable Research, founder of the Ig Nobel Prize ceremony
"In his earlier life as the very successful author of statistics textbooks, Gary Smith had a knack for creating creative applications that helped students learn important statistical concepts in a fun and intuitive way. In this collaboration with Jay Cordes, Smith takes this same approach mainstream with entertaining example after entertaining example highlighting their central point that not all patterns are meaningful. This fundamental fact – that not all patterns are informative – presents a dilemma for the person who wants to use data to make better decisions. Smith and Cordes argue that the solution to dilemma is not more data, but rather more intelligent theorizing about how the world works. Readers should heed their warning – and, those who don’t should not be surprised if they make an appearance in the next Smith and Cordes book as a cautionary tale! Jay and Gary are nothing if not keenly observant." — Shawn Bushway, Senior Policy Researcher, Behavioral and Policy Sciences Department, RAND Corporation
"The legendary economist Ronald Coase once famously said, 'If you torture the data long enough, it will confess.' As Smith and Cordes demonstrate in spades, the era of Big Data has only exacerbated Coase's assertion. Packed with great examples and solid research, The Phantom Pattern Problem is a cri de coeur to those who believe in the unassailable power of data!" — Phil Simon, award winning author of Too Big to Ignore: The Business Case for Big Data
"Using easily understood examples from sports, the stock market, economics, medical testing and gambling, Smith and Cordes illustrate how data analytics and big data can be seductively misleading. I learned a lot." — Robert J. Marks II, Ph.D.
Distinguished Professor of Electrical & Computer Engineering, Baylor University
Director, The Walter Bradley Center for Natural & Artificial Intelligence