You will get a chortle or two from Spurious Correlations, a web page devoted to graphically persuasive relationships among pairs of sets of entirely unrelated data. For example, you can see the graph of “US spending on science, space and technology” superimposed on that of “Suicides by hanging, strangulation and suffocation.” The staggering 99.79% overlap is a classic in correlation without causation.
Likewise, “Per capita cheese consumption” and “People who died by becoming tangled in their bedsheets” have a correlation of 94.71%. And the correlation between “People who drowned after falling out of a fishing boat” and the “Marriage rate in Kentucky” is 95.24%.
Common sense tells us to treat these coincidences as jokes. But in his fascinating new book The AI Delusion, economics professor Gary Smith reminds us that computers don’t have common sense. He also notes that, as data gets larger and larger, nonsensical coincidences become more probable, not less.
Texas Sharpshooter Fallacy #1
In his book, Smith discusses the Texas Sharpshooter Fallacies, to which many fall prey as they search for statistical significance in data. Fallacy #1, for example, is illustrated by a big wall on which a large number of small targets are painted. No matter where the sharpshooter’s bullet hits the wall, it’s close to a bull’s-eye. An example Smith offers is an actual research project into the causes of pancreatic cancer. A great deal of data was gathered about the habits of people with pancreatic cancer. The researchers were searching for possible causes like smoking and alcohol consumption. But the bullet hole was not close to the smoking or alcohol bull’s-eyes. What about cigar smoking or drinking tea? Nope. No correlation. So let’s try coffee.
The bullet hit close to the coffee bull’s-eye. Coffee’s statistics showed a direct linkage to pancreatic cancer!1 Brian MacMahon, the lead author of the study, quit coffee, as did others frightened by the findings, which were published in the prestigious New England Journal of Medicine.
But the bullet fired by MacMahon’s research team landed close to one of the many bull’s-eyes on the wall only by coincidence. It was a fluke. A later, in-depth study, “Coffee Consumption and Pancreatic Cancer Risk: An Update Meta-analysis of Cohort Studies”, tested the MacMahon team’s findings using more data. The new study came to the opposite conclusion: “High coffee consumption is associated with a reduced pancreatic cancer risk.”2 Go figure.
Here’s a takeaway: The next time someone tells you something like “Don’t you know eating raisins causes plantar warts? It’s been proven in this peer-reviewed publication, right here!”, be politely skeptical. The conclusion might be a Texas Sharpshooter Fallacy.
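The arithmetic behind Fallacy #1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each candidate factor is treated as an independent test with a 5% chance of a false alarm, and the function name chance_of_fluke is made up for this example. None of these numbers come from the MacMahon study.

```python
def chance_of_fluke(num_factors: int, alpha: float = 0.05) -> float:
    """Probability that at least one of num_factors unrelated factors
    looks "significant" purely by chance.

    Assumption: the tests are independent and each has a false-alarm
    rate of alpha (0.05 is a common convention, not a value from the article).
    """
    return 1.0 - (1.0 - alpha) ** num_factors


# More targets on the wall means a better chance that a stray bullet
# lands near one of them.
for k in (1, 5, 20, 100):
    print(f"{k:>3} factors tested: {chance_of_fluke(k):.0%} chance of at least one fluke")
```

Under these assumptions, testing 20 unrelated habits gives roughly a 64% chance that at least one of them looks like a cause, even when none of them is.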
Texas Sharpshooter Fallacy #2
Smith’s Texas Sharpshooter Fallacy #2 also uses a target as a metaphor. A bullet is fired and hits the wall. To make the shot look accurate, a target is then painted onto the wall, so that the bullet appears right in the middle of the bull’s-eye.
Here’s how Fallacy #2 works (assuming you wanted to try it): You first examine the data secretly. You note a correlation in your data between “Math doctorates awarded annually” and “Uranium stored at US nuclear power plants.”3 But you don’t tell anyone you discovered this curious correlation. Instead, you come up with a hypothesis about a secret conspiracy to kill everyone who has a doctorate in math, using some sort of radiation. The just-in-time storage of sufficient uranium might be evidence for this conspiracy. But you don’t, of course, come right out and say that. With counterfeit innocence, you propose, “Let’s see what the data says.” You look at the data and, Eureka!, your hypothesis was right! You publish a paper, get funded, and get a promotion.
In this hypothetical example, the bullet hole is the curious correlation you “discovered.” Forming the hypothesis and doing the study corresponds to painting the target around the bullet hole. Morally compromised practitioners of the Texas Sharpshooter Fallacy #2 are motivated by the desire to get published, funded, and ultimately promoted in their careers. As with any fraud, the fields of study—and ultimately you and I—are the victims.
Ninety percent of published medical research is flawed?!
How common are Texas Sharpshooter Fallacies? John Ioannidis guesstimates in an open-access PLOS paper that 90% of published medical research is flawed. That’s a jaw-dropping claim! It makes me wonder about all the scare headlines I have read over the years:
· Consuming aspartame causes cancer.4
· Organic food is healthier than GMO (genetically modified organism) food.
· Drinking wine is good for the blood.
· Living under high voltage transmission lines will slowly kill you.
I know people who defend almost to the death opinions such as “If you eat GMOs, you will die!” Frankly, it’s difficult to know what to believe, in many cases. It’s hard to think of any tempting food that has not been demonized in the headlines at some time in my life. We were even told once to avoid delicious Washington apples because they bore traces of insecticide.
Can we avoid Smith’s Texas Sharpshooter Fallacies?
First, we should recognize that there are obvious facts that are not disputed. For example,
· Smoking causes cancer.
· Driving drunk kills people.
· Sticking your tongue on a frozen metal flagpole has immediate health consequences.
For other, fuzzier issues, Ronald Reagan’s motto, “Trust but verify,” is good practice. I once served as a program co-chair for a conference titled Computational Intelligence for Financial Engineering. The Conference Co-Chair was John Marshall, who later became the world’s first Professor of Financial Engineering.
Marshall was repeatedly approached by people who claimed they had trained an artificial neural network to successfully forecast the stock market. He told me he didn’t even need to look at their computer code or results. He assessed the true success of their software with the simple question “What kind of car do you drive?” In other words, if they could predict the stock market with their software, they should be driving a luxury car like a Lamborghini.
In machine learning, performance is assessed using cross-validation: How does the trained machine perform on data it was not trained with? Often, data is divided into training and validation sets. As the machine, e.g. a neural network, is trained using the training data, its accuracy is determined by comparing its outputs on the validation data with the known correct answers.
Some programmers train different types of neural networks with the same training and validation sets. When they finally identify a good neural network architecture, they risk becoming victims of Texas Sharpshooter Fallacy #1: A neural network architecture matching the training and validation set has been found!
No. Not really. The programmers have only hit close to one of the bull’s-eyes of the many targets painted on the wall. The best of the many targets on the wall was the most successful neural network type. So further validation is needed. For this, a third hunk of data called test data must be applied. In the case of a neural network that predicts the stock market, the best test data is the real-time data generated by the stock market. And if your AI that forecasts the stock market works, you can go right out and buy your Lamborghini.
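Why the third hunk of data matters can be shown with a toy simulation. The sketch below is hypothetical and is not anyone’s actual forecasting software: every “architecture” is a pure coin-flipper with no skill at all, the “market” is a random up/down sequence, and the function names and sizes (100 models, 50 days, 50 repeated trials) are assumptions chosen only for illustration.

```python
import random


def accuracy(guesses, outcomes):
    """Fraction of days on which the guess matched the outcome."""
    return sum(g == o for g, o in zip(guesses, outcomes)) / len(outcomes)


def pick_best_on_validation(num_models, num_days, rng):
    """Try many skill-free models, keep the one with the best validation
    score, and report (its validation accuracy, its test accuracy)."""
    validation = [rng.getrandbits(1) for _ in range(num_days)]  # up/down days
    test = [rng.getrandbits(1) for _ in range(num_days)]        # fresh, unseen days
    best_val, test_of_best = -1.0, 0.0
    for _ in range(num_models):
        # Each "architecture" just guesses; any success is luck.
        val_guesses = [rng.getrandbits(1) for _ in range(num_days)]
        test_guesses = [rng.getrandbits(1) for _ in range(num_days)]
        val_acc = accuracy(val_guesses, validation)
        if val_acc > best_val:
            best_val = val_acc
            test_of_best = accuracy(test_guesses, test)
    return best_val, test_of_best


def average_over_trials(trials=50, num_models=100, num_days=50, seed=0):
    """Repeat the experiment and average the winner's two scores."""
    rng = random.Random(seed)
    results = [pick_best_on_validation(num_models, num_days, rng) for _ in range(trials)]
    avg_val = sum(v for v, _ in results) / trials
    avg_test = sum(t for _, t in results) / trials
    return avg_val, avg_test


avg_val, avg_test = average_over_trials()
print(f"winner's average validation accuracy: {avg_val:.2f}")
print(f"winner's average test accuracy:       {avg_test:.2f}")
```

The winner looks skilled on the validation data only because it was chosen for its score there. On the untouched test data, the same model should fall back to roughly coin-flip accuracy.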
Fallacious studies, intended or not, are exposed when subjected to additional test data. The coffee-causes-pancreatic-cancer study was debunked when a fresh set of test data external to the data used in the original study was applied. Many papers with as-yet hidden fallacies have not been tested. Until they are, these studies must be viewed with skepticism.
One of Big Data’s Big Problems
If humans mining data fall victim to Texas Sharpshooter Fallacies, what will the consequences be when Big Data is tasked with mining correlations using AI autonomously? Enormous data sets have a higher probability of meaningless correlations. More than ever, common sense is needed. And common sense only comes from programmers writing their own common sense into the software.
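The claim that bigger data sets breed more meaningless correlations can be checked with a small experiment. The following is a rough sketch under assumptions of my own choosing: every series is ten points of independent random noise (think of ten yearly values), a “strong” correlation means an absolute Pearson correlation of at least 0.9, and the helper names are invented for this example.

```python
import math
import random
from itertools import combinations


def standardize(series):
    """Shift to zero mean and scale to unit length, so that the Pearson
    correlation of two standardized series is just their dot product."""
    mean = sum(series) / len(series)
    centered = [x - mean for x in series]
    norm = math.sqrt(sum(c * c for c in centered))
    return [c / norm for c in centered]


def count_strong_pairs(num_series, length=10, threshold=0.9, seed=1):
    """Count pairs of unrelated random series whose correlation is at
    least `threshold` in absolute value."""
    rng = random.Random(seed)
    data = [
        standardize([rng.gauss(0, 1) for _ in range(length)])
        for _ in range(num_series)
    ]
    strong = 0
    for a, b in combinations(data, 2):
        r = sum(x * y for x, y in zip(a, b))  # Pearson correlation
        if abs(r) >= threshold:
            strong += 1
    return strong


for n in (30, 100, 300):
    pairs = n * (n - 1) // 2
    print(f"{n:>3} series, {pairs:>5} pairs, "
          f"{count_strong_pairs(n)} with |correlation| >= 0.9")
```

The number of pairs grows roughly with the square of the number of series, so even a tiny chance of a fluke per pair turns into a steady supply of impressive-looking, meaningless matches.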
If you want to learn more on this topic, pick up a copy of Gary Smith’s fun book The AI Delusion. Or, like me, listen to it on Audible. A pinch of background knowledge about statistical testing is helpful, but in any case you will learn a lot. And don’t forget to have a look at Spurious Correlations. It’s a hoot.
Note: The aphorism about torturing Big Data in the subtitle above is attributed to Ronald Coase.
1 MacMahon, Brian, Stella Yen, Dimitrios Trichopoulos, Kenneth Warren, and George Nardi. “Coffee and cancer of the pancreas.” New England Journal of Medicine 304, no. 11 (1981): 630-633.
2 They added, “However, the result should be accepted with caution.”
3 According to Spurious Correlations, the correlation is 95.23%.
4 The National Cancer Society now says aspartame doesn’t cause cancer.
Robert J. Marks II, Ph.D., is Distinguished Professor of Engineering in the Department of Electrical & Computer Engineering at Baylor University. Marks is the founding Director of the Walter Bradley Center for Natural & Artificial Intelligence and hosts the podcast Mind Matters. He is the Editor-in-Chief of BIO-Complexity and the former Editor-in-Chief of the IEEE Transactions on Neural Networks. He served as the first President of the IEEE Neural Networks Council, now the IEEE Computational Intelligence Society. He is a Fellow of the IEEE and a Fellow of the Optical Society of America. His latest book is Introduction to Evolutionary Informatics coauthored with William Dembski and Winston Ewert. A Christian, Marks served for 17 years as the faculty advisor for CRU at the University of Washington and currently is a faculty advisor at Baylor University for the student groups the American Scientific Affiliation and Oso Logos, a Christian apologetics group.