Larson did an interesting podcast with the Brookings Institution through its Lawfare Blog shortly after the release of his book. It’s well worth a listen, and in that interview Larson elucidates many of the key points in his book. The one topic on which I wish he had elaborated further is abductive inference (aka retroductive inference or inference to the best explanation). For me, the key to understanding why computers cannot, and most likely never will be able to, perform abductive inferences is the problem of underdetermination of explanation by data. This may seem like a mouthful, but the idea is straightforward. For context, if you are going to get a computer to achieve anything like understanding in some subject area, it needs a lot of knowledge. That knowledge, in all the cases we know, needs to be painstakingly programmed. This is true even of machine learning, where the underlying knowledge framework needs to be explicitly programmed (for instance, even Go programs that achieve world-class playing status need many rules and heuristics explicitly programmed).
The Underdetermination Problem
Humans, on the other hand, need none of this. On the basis of very limited or incomplete data, we nonetheless come to the right conclusion about many things (yes, we are fallible, but the miracle is that we are right so often). Noam Chomsky’s entire claim to fame in linguistics really amounts to exploring this underdetermination problem, which he referred to as “the poverty of the stimulus.” Humans pick up language despite very varied experiences with other human language speakers. Babies born in abusive and sensory-deprived environments pick up language. Babies subjected to Mozart from the womb and raised in rich sensory environments pick up language. Language results from growing up with cultured and articulate parents. Language results from growing up with boorish and inarticulate parents. Yet in all cases, the actual amount of language exposure is minimal compared to the language ability that emerges and the knowledge of the world that results. On the basis of that language exposure, many different ways of understanding the world might have developed, and yet we seem to get things right (much of the time). Harvard philosopher Willard Quine, in his classic Word and Object (1960), struggled with this phenomenon, arguing for what he called the indeterminacy of translation to make sense of it.
The problem of underdetermination of explanation by data appears not just in language acquisition but in abductive inference as well. It’s a deep fact of mathematical logic (a consequence of the Löwenheim-Skolem theorem) that a consistent collection of first-order statements (think data), if it has any infinite model at all, has infinitely many non-isomorphic mathematical models (think explanations). This fact is reflected in ordinary everyday abductive inferences. We are confronted with certain data, such as missing documents from a bank safety deposit box. There are many, many ways this might be explained: a thermodynamic accident in which the documents spontaneously combusted and disappeared, a corrupt bank official who stole the documents, a nefarious relative who got access and stole the documents, etc.
No End to “Et Cetera”
But the “et cetera” here has no end. Maybe it was space aliens. Maybe you yourself took and hid the documents, and are now suffering amnesia. The number of possible explanations is virtually infinite. And yet, somehow, we are often able to determine the best explanation, perhaps with the addition of more data/evidence. But even adding more data/evidence doesn’t eliminate the problem, because however much you add, the underdetermination problem remains. You may eliminate some hypotheses (perhaps the hypothesis that the bank official did it), but not others. And by adding more data/evidence, you’ll also invite new hypotheses. How, moreover, do you know which hypotheses are even in the right ballpark, i.e., that they’re relevant? Why is the hypothesis that the bank official took the documents more relevant than the hypothesis that the local ice cream vendor took them? What about the local zoo keeper? We have no clue how to program relevance, and a fortiori we have no clue how to program abductive inference (which depends on assessing relevance). Larson makes this point brilliantly in his book.
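The claim that more evidence narrows the field without ever closing it has a simple numerical analogue. What follows is only a toy sketch of my own, not anything from Larson’s book, and certainly not a model of abductive inference. It assumes the “data” are a handful of input/output pairs and the “explanations” are integer-valued rules; the names `make_hypothesis`, `family`, and `surviving` are hypothetical and chosen purely for illustration.

```python
from typing import Callable, Dict, List, Tuple

Observation = Tuple[int, int]  # (input, observed output)
Hypothesis = Callable[[int], int]


def make_hypothesis(points: List[int], k: int) -> Hypothesis:
    """Hypothetical rival: h(x) = x + k * (x - p1) * (x - p2) * ...

    The product vanishes at every p in `points`, so h(p) == p there
    for any integer k. Away from those points the rivals diverge.
    """
    def h(x: int) -> int:
        bump = k
        for p in points:
            bump *= x - p
        return x + bump
    return h


def family(points: List[int]) -> Dict[str, Hypothesis]:
    """Seven illustrative rivals (k = -3..3) built to agree on `points`."""
    return {f"k={k}": make_hypothesis(points, k) for k in range(-3, 4)}


def surviving(hypotheses: Dict[str, Hypothesis],
              data: List[Observation]) -> List[str]:
    """Names of the hypotheses that reproduce every observation."""
    return [name for name, h in hypotheses.items()
            if all(h(x) == y for x, y in data)]


# Toy "evidence": three observations, all consistent with y = x.
data = [(0, 0), (1, 1), (2, 2)]
old_rivals = family([0, 1, 2])
survivors_before = surviving(old_rivals, data)
predictions_at_3 = {h(3) for h in old_rivals.values()}

# New evidence eliminates most of the old rivals...
more_data = data + [(3, 3)]
survivors_after = surviving(old_rivals, more_data)

# ...but a fresh family fits the enlarged data set just as well.
new_rivals = family([0, 1, 2, 3])
new_survivors = surviving(new_rivals, more_data)

print(len(survivors_before), len(predictions_at_3))
print(survivors_after)
print(len(new_survivors))
```

All seven rivals fit the first three observations while disagreeing about the next one. The fourth observation removes every old rival but one, yet a newly constructed family fits all four just as well. Notice, too, what the sketch leaves out: nothing in it says which rivals are worth taking seriously in the first place. That judgment of relevance was supplied by the programmer, not by the program.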