Pomona College economics professor Gary Smith, author with Jay Cordes of The Phantom Pattern Problem (Oxford, October 1, 2020), tackles an age-old glitch in human thinking: We tend to assume that if we find a pattern, it is meaningful. Add that to the weaknesses of current artificial intelligence and “Houston, we have a problem,” he warns:
The scientific method tests theories with data. Data-mining computer algorithms dispense with theory and search through data for patterns, often aided and abetted by slicing, dicing, and otherwise mangling data to create patterns.
Gary Smith, “Phantom patterns: The big data delusion” at IAI News (August 24, 2020)
Many of the patterns so detected are obviously spurious, for example:
A computer algorithm for evaluating job applicants noticed that many good programmers visited a particular Japanese manga site and concluded that people who visit this site are likely to be good programmers. Another algorithm concluded that job applicants who went to all-women’s colleges are unlikely to be good software engineers because very few of the company’s current engineers went to all-women’s colleges. A car insurance company created an algorithm for evaluating applicants based on Facebook posts, including whether one likes Leonard Cohen.
Gary Smith, “Phantom patterns: The big data delusion” at IAI News (August 24, 2020)
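How a theory-free search turns up patterns like these can be illustrated with a small simulation. The sketch below is not Smith's method or any company's algorithm; it is a minimal illustration under stated assumptions: every "feature" and the "target" are pure random noise, and the names `pearson` and `mine_random_data`, the sample sizes, and the seed are all hypothetical choices made for the example.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def mine_random_data(n_rows=50, n_features=1000, seed=0):
    """Search pure noise for the feature that best 'predicts' a noise target.

    Returns (in_sample, fresh): the absolute correlation of the winning
    feature on the mined data, and the absolute correlation obtained when
    the same unrelated feature and target are drawn again.
    """
    rng = random.Random(seed)  # seed is an arbitrary illustrative choice

    # Assumption: nothing here is related to anything else by construction.
    target = [rng.gauss(0, 1) for _ in range(n_rows)]
    features = [
        [rng.gauss(0, 1) for _ in range(n_rows)] for _ in range(n_features)
    ]

    # The "data mining" step: keep whichever feature correlates best.
    correlations = [abs(pearson(f, target)) for f in features]
    in_sample = max(correlations)

    # The "replication" step: new observations of the same unrelated pair.
    fresh_target = [rng.gauss(0, 1) for _ in range(n_rows)]
    fresh_feature = [rng.gauss(0, 1) for _ in range(n_rows)]
    fresh = abs(pearson(fresh_feature, fresh_target))

    return in_sample, fresh


in_sample, fresh = mine_random_data()
print(f"best correlation found by mining noise: {in_sample:.2f}")
print(f"same feature on fresh data:             {fresh:.2f}")
```

With a thousand candidate features and only fifty rows, the best in-sample correlation tends to look impressive even though no relationship exists, while the fresh-data figure is just another random draw. The point of the sketch is the search itself: the more candidates examined, the more certain it is that something will appear to fit.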
He cites a number of similar implausible patterns. But, of course, if a pattern doesn’t sound implausible, we might accept and act on it, even if it is incorrect. In fact, he says, that’s been happening in science today:
John Ioannidis, author of an insightful paper with the scandalous name, “Why Most Published Research Findings Are False,” looked at 45 of the most widely respected medical studies that claimed to have demonstrated effective treatments for various ailments. He found that replication attempts had been done for 34 of these treatments, with the original conclusions confirmed in only 20 cases. The numbers are surely worse for ordinary research in ordinary journals.
Gary Smith, “Phantom patterns: The big data delusion” at IAI News (August 24, 2020)
If later researchers cannot reproduce the results, chances are, the pattern detected in the data isn’t really there.
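The figures Smith quotes can be turned into rates with a few lines of arithmetic. The counts below come straight from the passage; the variable names are illustrative only, and the percentages are simple ratios, not results reported in that form by Smith or Ioannidis.

```python
# Counts as quoted in the passage above.
studies_examined = 45
replication_attempted = 34
confirmed = 20

# Derived quantities (simple ratios, not figures stated in the source).
not_confirmed = replication_attempted - confirmed          # 14
confirmed_share = confirmed / replication_attempted        # about 0.588
attempted_share = replication_attempted / studies_examined # about 0.756

print(f"replication attempted: {attempted_share:.1%} of studies examined")
print(f"confirmed on replication: {confirmed_share:.1%} of attempts")
print(f"not confirmed: {not_confirmed} of {replication_attempted} attempts")
```

In other words, roughly four in ten of the replication attempts on these highly regarded studies did not confirm the original conclusion.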
Smith concludes with a warning that Big Data does not necessarily lead to more knowledge: “In the age of Big Data and powerful computers, human wisdom, commonsense, and expertise are needed more than ever.”
You may also enjoy:
Data mining: A plague, not a cure: It is tempting to believe that patterns are unusual and their discovery meaningful; in large data sets, patterns are inevitable and generally meaningless
Ransacking flawed data for hidden treasures seldom ends well: The Internet provides a firehose of data that financial market researchers can use to interpret human behavior, but cherry-picked patterns usually vanish.