In May of this year, The Scientist ran a series of pieces suggesting that we could automate the process of acquiring scientific knowledge. That is, AI could replace scientists. One piece discussed automatic processing of biological data (“Robert Murphy Bets Self-Driving Instruments Will Crack Biology’s Mysteries”). Another suggested that the entire process of scientific discovery could be automatable (“This ability to take stock of multiple hypotheses and explore them (that is, conduct experiments and acquire data to validate hypotheses), all while recognizing the cost of exploration, can be a big boost to scientific discovery.”)
While not every piece in the May issue touted artificial intelligence as the universal solution to scientific problems (most pieces acknowledged at least some difficulties), the overall impression was that the biggest barrier to new scientific knowledge is our inability to sift through vast quantities of data, and that artificial intelligence will trudge through that data for us and find the answers.
Unfortunately, that isn’t entirely true. As Mind Matters author Gary Smith has shown in The AI Delusion, without appropriate human supervision, AI is just as likely to find false or unimportant patterns as real ones. Additionally, the overuse of AI in science is actually leading to a reproducibility crisis:
The “reproducibility crisis” in science refers to the alarming number of research results that are not repeated when another group of scientists tries the same experiment. It can mean that the initial results were wrong. One analysis suggested that up to 85% of all biomedical research carried out in the world is wasted effort.
It is a crisis that has been growing for two decades and has come about because experiments are not designed well enough to ensure that the scientists don’t fool themselves and see what they want to see in the results.
Pallab Ghosh, “AAAS: Machine learning ‘causing science crisis’” at Mind Matters News
There are three main reasons for the reproducibility crisis:
The first reason is that, because AI only finds patterns and cannot make judgments about those patterns, it will simply find a lot of false correlations. Then, when further studies try to reproduce those patterns, they will fail to show the effect.
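A minimal sketch of how this happens: if you screen enough candidate variables against a small dataset, some of them will correlate with the outcome purely by chance, and the "discovery" vanishes in a replication sample. The numbers below are simulated, not drawn from any real study.

```python
import numpy as np

rng = np.random.default_rng(0)

# Purely random data: 1,000 candidate "features" measured on 20 samples,
# and a target that is also pure noise -- there is no real effect anywhere.
n_samples, n_features = 20, 1000
X = rng.normal(size=(n_samples, n_features))
y = rng.normal(size=n_samples)

# Screen every feature and keep the one with the strongest correlation.
corrs = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(n_features)])
best = int(np.abs(corrs).argmax())
print(f"best in-sample correlation: {corrs[best]:.2f}")

# A fresh "replication" dataset: the same feature no longer shows the effect.
X_new = rng.normal(size=(n_samples, n_features))
y_new = rng.normal(size=n_samples)
new_corr = np.corrcoef(X_new[:, best], y_new)[0, 1]
print(f"correlation on new data:   {new_corr:.2f}")
```

With 1,000 chances to find a pattern in 20 noise samples, the winning correlation is typically large, yet it is an artifact of the search, which is why the replication fails.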
The second problem for reproducibility is that many machine learning algorithms are just as likely to model noise as they are to model data. Mind Matters authors have shown that recent advances in AI using Generalized Information can help alleviate this problem to a large extent. However, many current machine learning tools are still at risk for generating models that fall prey to noise instead of data.
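The noise-fitting problem can be seen even without a neural network. In this toy sketch (synthetic data, my own illustration, not any particular tool's behavior), an overly flexible model drives its training error toward zero by memorizing noise, and then performs worse than a simple model on fresh data:

```python
import numpy as np

rng = np.random.default_rng(1)

# The "truth" is a simple linear relationship plus measurement noise.
def make_data(n):
    x = rng.uniform(-1, 1, n)
    y = 2 * x + rng.normal(scale=0.5, size=n)
    return x, y

x_train, y_train = make_data(15)
x_test, y_test = make_data(15)

results = {}
for degree in (1, 12):  # a simple model vs. an overly flexible one
    coeffs = np.polyfit(x_train, y_train, degree)
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    results[degree] = (train_err, test_err)
    print(f"degree {degree:2d}: train MSE {train_err:.3f}, test MSE {test_err:.3f}")
```

The degree-12 polynomial fits the training points almost perfectly, because it has modeled the noise, but that "fit" does not carry over to the test set.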
Pete Warden has written on a third, more insidious, reason for reproducibility problems in AI-based science. The tools themselves are non-deterministic. That is, each time you run an AI-based tool, it will likely spit out a different (even if only slightly different) model. Therefore, it is impossible even to reproduce a machine learning model from the exact same data. Each run will likely show slightly different results.
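This is easy to demonstrate with a stripped-down training loop. In the sketch below (a hypothetical, minimal stochastic-gradient trainer, not any production library), the data are held fixed; only the random weight initialization and the shuffling of training examples differ between runs, yet the two runs produce different models:

```python
import numpy as np

# A tiny logistic-regression-style model trained with stochastic
# gradient descent. Randomness enters in two places: the initial
# weights and the order in which training examples are visited.
def train(seed, X, y, epochs=50, lr=0.1):
    rng = np.random.default_rng(seed)
    w = rng.normal(scale=0.5, size=X.shape[1])  # random initialization
    idx = np.arange(len(y))
    for _ in range(epochs):
        rng.shuffle(idx)                        # stochastic example ordering
        for i in idx:
            p = 1 / (1 + np.exp(-X[i] @ w))
            w += lr * (y[i] - p) * X[i]
    return w

# Fixed dataset: a linearly separable labeling of random points.
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 3))
y = (X @ np.array([1.0, -2.0, 0.5]) > 0).astype(float)

w1 = train(seed=0, X=X, y=y)
w2 = train(seed=1, X=X, y=y)
print("run 1 weights:", np.round(w1, 3))
print("run 2 weights:", np.round(w2, 3))
print("identical?", np.allclose(w1, w2))  # False: same data, different model
```

Both runs may classify the data about equally well, but the fitted parameters differ, so an exact reproduction of "the model" from the same data is not possible without also fixing every source of randomness.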
In short, modern AI technology aims to find patterns in big datasets. However, the goal of science is not only to find truths but to articulate the supporting reasons for believing them to be truths. For example, suppose we want to know how many exoplanets (planets that orbit stars other than our sun) might have life. Simple pattern-finding is, at best, a small step in this process. Basic correlations in data don’t lead directly to knowledge about causation and certainly don’t tell us why the correlations exist or how to interpret them. We could discover that roughly 1600 exoplanets seem similar to Earth in many ways. That is an interesting pattern, but it is not decisive about whether those planets have life. Factors we have not taken into account might play a big role, and some recurring patterns we have spotted may be irrelevant to the existence of possible life forms.
It is even more important to recognize that many truths simply aren’t discoverable through automation. Automated research, for example, can’t tell us whether we should look for another planet or fix up the one we’ve got. Much as many readers will think that the best answer is obvious, chances are, they did not arrive at it by calculation. And calculation will probably play only a supportive role in developing their future ideas on the topic.
Counting back: 2019 AI Hype Countdown
2019 AI Hype Countdown #7: “Robot rights” grabs the mike. If we could make intelligent and sentient AIs, wouldn’t that mean we would have to stop programming them? AI programs are just that: programs. Nothing in such a program could make it conscious. We may as well think that if we make sci-fi life-like enough, we should start worrying about Darth Vader really taking over the galaxy.
8: Media started doing their job! Yes, this year, there has been a reassuring trend: Media are offering more critical assessment of off-the-wall AI hype. One factor in the growing sobriety may be that, as AI technology transitions from dreams to reality, the future belongs to leaders who are pragmatic about its abilities and limitations.
9: Hype fought the law. Autonomy had real software, but the hype around Big Data had discouraged Hewlett Packard from taking a closer look. Autonomy CFO Sushovan Hussain was sentenced this year to a five-year prison term and a $10 million fine because he was held “ultimately responsible for Autonomy’s revenues having been overinflated by $193m between 2009 and the first half of fiscal 2011.”
10: Sophia the Robot still gives “interviews,” and few popular media ask critical questions. As a humanoid robot, Sophia certainly represents some impressive engineering. It is sad that the engineering fronts ridiculous claims about the state of AI, using partially scripted interactions as if they were real communication.