Mind Matters Natural and Artificial Intelligence News and Analysis

Why the Unknowable Number Exists But Is Uncomputable

Sensing that a computer program is “elegant” requires discernment. Proving mathematically that it is elegant is, Chaitin shows, impossible

In this week’s podcast, “The Chaitin Interview IV: Knowability and Unknowability,” Walter Bradley Center director Robert J. Marks interviewed mathematician Gregory Chaitin on his “unknowable number.” That’s the topic of this series, based on the fourth podcast. Last week, we tried getting to know the unknowable number. Today, let’s look at the question of how we know that the number is unknowable — instead of merely non-computable. Lots of things are non-computable but we do not expect that to be true of numbers. Let’s see what’s happening here, as Chaitin offers a walk through his proof that it really is unknowable:

This portion begins at 09:43 min. A partial transcript, Show Notes, and Additional Resources follow.

Robert J. Marks: Let me tell you about pushback that I got from the idea of “unknowable.” I mentioned I’m a big fan of your proof that elegant numbers are unknowable.

Gregory Chaitin: It’s a very simple proof also.

Robert J. Marks: It’s a simple proof, and, I think, according to Paul Erdős (pictured), it would probably go into God’s book as the simplest explanation of something.

Note: Paul Erdős (1913–1996) “posed and solved problems in number theory and other areas and founded the field of discrete mathematics.” (MacTutor) We are told “Famously, the mathematician Paul Erdős, often referred to ‘The Book’ in which God keeps the most elegant proof of each mathematical theorem.” (Gil Kalai’s Blog)

Robert J. Marks: Here’s the pushback that I got. I mentioned this proof that elegance was unknowable to somebody and they said, “Well, that doesn’t mean it’s unknowable. It means only that it’s non-computable.” What’s your take on that?

Gregory Chaitin: There’s no program to calculate it. There’s no way to prove it.

Now, for all practical purposes, you can determine, for example, whether a program is elegant. It’s like the halting problem. If you limit the time to some reasonable amount like one day of calculation, which is certainly a lot, you just run it for a day and you see whether it halts or not. If it hasn’t halted, you just assume it’s never going to halt. For all practical purposes, that may be a good answer. But if you don’t put a time limit…

Now, that’s a mathematical fantasy. But mathematics deals with fantasies because they have clean, beautiful properties. You can prove theorems. So that mathematical fantasy has often served practical purposes in, for example, theoretical physics.

Now, for example, if you toss a coin a lot of times, almost certainly the result is algorithmically unstructured or random. It cannot be compressed into a smaller program. And you can even get estimates on how very probable this is. It can be enormously high probability. So will you allow an argument, “I’ll toss a coin and get a lot of zeros and ones”? You do it a reasonable number of times, not just three times. Most probably it will be very close to maximum program size complexity, which would be n bits. There won’t be any programs substantially shorter than n and you can quantify that, what the cutoff is. So something is almost certainly true, but you could never prove it from mathematical axioms, which are less complicated than the number of bits of the sequence you are trying to show is random, is unstructured.

And there’s also the question, which axioms are you using? My belief is that pure mathematics should evolve, should be creative. You can add new principles. And, in fact, during my lifetime, I’ve seen a number of new principles added to pure mathematics.

So these questions are all complicated to discuss. Like most philosophical questions, “On the one hand, this and on the other hand, that.” And they’re all good arguments, and you pays your money and you takes your choice, right?

Robert J. Marks (pictured): Yes. Just to remind listeners, an elegant program is the smallest program to achieve some objective. I like to think of it in terms of images. If you have a big image, what is the most that you can compress that image. That would be the elegant program for the image.

Greg, I know that you’ve spent a lot of time with your proof that elegant programs are unknowable. Do you have a simple explanation for our listeners? Let’s give it a try.

Gregory Chaitin: In my “Experiment in Autobiography” that’s included in the book, Unravelling Complexity, I go through the essential ideas in the proof. I’ve often tried to prove it when I’m invited to give a talk at a university, and I have a funny test for whether I explained it well. When I explain it well, people laugh. When they realize the paradox of how the proof works, people laugh. And when people just stare at me with glazed eyes, obviously, or they don’t laugh, it means I didn’t explain it well.

Robert J. Marks: Well, Greg, now I’m paranoid. I don’t know. I hope I laugh after your explanation. We’ll see.

Gregory Chaitin: Here’s the idea. “Provable” means provable from a fixed set of axioms and rules of inference, right? This is a formalization of axioms for mathematics. David Hilbert thought there would be one system that all mathematicians could agree on. So once you’ve taken all the subjective element out, it’s like a computer program. You just run through all possible proofs, you check which ones are correct, you filter out the correct proofs, and that way you get all the theorems. All the theorems of mathematics — if Hilbert had come up with a formal axiomatic theory for all of math.

Note: David Hilbert (1862–1943, pictured) is best known for work in geometry that “had the greatest influence in that area after Euclid. A systematic study of the axioms of Euclidean geometry led Hilbert to propose 21 such axioms and he analysed their significance. He made contributions in many areas of mathematics and physics.” (MacTutor)

Gregory Chaitin: Okay. Whatever the system you’re looking at, with whatever axioms and rules of logic, it can be implemented as a computer program. It’s sort of a proof-checking algorithm which always gives an answer … the proof is correct, the proof’s incorrect … to an infinite runtime algorithm that just generates all the theorems in order of the size of the proofs. It is but a small step. So you just look at the size and bits of the algorithm that runs through, checks all possible proofs, and gives you all the theorems. This is the algorithmic information content — or the complexity — of the formal axiomatic system you’re studying to see what it can achieve.

Robert J. Marks: If I remember right, you step through numbers one at a time and check if it’s meaningful or not, right?

Gregory Chaitin: You check through all possible proofs one at a time.

Robert J. Marks: And those correspond to the numbers. Is that correct?

Gregory Chaitin: Well, they correspond to the strings of characters in the alphabet. You can also just go through the tree of all possible proofs. That’s another way to get all possible theorems and one by one… endless computation. So this thing will require a certain number of bits, the sort of bits of axioms you’re using in this version of mathematics.

Gregory Chaitin (pictured, with his children, Juan and Maria Clara): So let’s say the program that does this, the proof checker or the one that has the tree of all possible proofs or the one that checks all possible proofs in size order and gives you all the theorems… Let’s say this is n bits, whatever that is. So now you start running through all possible theorems until you find a proof that a program that is substantially more than n bits long is elegant, which means it’s the smallest program that calculates the output it does.

So you want to see if your formalization of mathematics and all its principles enables you to prove that a program that is larger than the program which is the software embodiment of those mathematical principles, of that formal axiomatic system, it’s axioms and so forth — you keep running through all the theorems, all possible proofs and all the theorems, until you find a proof that a program that is substantially bigger in bits than the number of bits for the software implementation of this process to run through all possible proofs and get all the theorems.

Then you take that program, which is elegant, and you run it, and then you see what its output is. So we have a process, a program, that is basically the number of bits in size of the number of bits of axioms in your mathematical theory. And we’re using this theory to attempt to prove that the programs are elegant, and we keep looking through all the theorems until we find a proof that a string of zeros and ones, a finite string of zeros and ones that is substantially larger than the software implementation of your mathematical system, is elegant.

This process will come to an end and give you an output. Now, look at this program that I’ve described in words. It has a number of bits which is basically the number of bits for the software implementation of your axiomatic theory, and it’s giving you an output which is the output of the first provably elegant program that is substantially larger than the number of bits in your mathematical theory.

Now, this is a contradiction. You’ve proven the program was elegant from these axioms and that means that this is the smallest, most concise program that can calculate the output that it does. But you’ve calculated with a substantially smaller number of bits, because you found it by running through all possible theorems, all possible proofs, in your mathematical theory. And by the construction of this paradox, you keep running this process until you prove that a program is elegant that has substantially more bits than the program which is the software implementation of your mathematical theory.

So now, this has given you a smaller program than the supposedly elegant program to calculate this object. That’s impossible by the definition of elegance. So either you’re proving false theorems or you can never find this program. If you only prove two theorems, the elegant program for the thing you calculated is substantially larger than the number in the program that found this proof that this program is elegant. So you’ve actually compressed the output from this supposedly elegant program even more into a shorter program, and that’s impossible by the definition of elegance.

So if you assume that only elegant programs can be proved to be elegant… in other words, that the theorems you’re proving are true, you never found a proof that a program is elegant if it has substantially more bits than the software implementation of your mathematical theory.

The only way you avoid the contradiction is if this process never finds a proof that this program — that is substantially larger than your software implementation of your mathematical theory — is elegant.

In other words, any mathematical theory can only prove that finitely many programs are elegant; they have to be smaller in size than the software implementation of the mathematical theory. But there are an infinite number of elegant programs, because…

Robert J. Marks: They get bigger and bigger and bigger.

Gregory Chaitin: Yeah. Because for any programming task, there is a most concise program for it. The problem is you’re not going to be able to prove that you’ve got it if the number of bits in your program is larger than the number of bits in your mathematical theory. I don’t know if this was understandable or not. It depends on the notion of a completely formalized objective, not subjective, notion of mathematics.

What I’m criticizing is not the ability of mathematics to do things. Gödel thinks that mathematicians are not limited by his theorem but they can directly intuit facts from the Platonic world of ideas. So what I’ve given you only applies to totally formalized, computerized mathematical theories. But Gödel doesn’t think it applies human mathematicians, in fact.

But to be able to say what you can prove and show that there are limitations, you have to give a very precise definition of the methods you’re allowing in the proofs. And once you do that, you’re in trouble because, if it’s n bits of methods that you’re allowing for mathematical proofs, then elegant programs that are more than n bits long, you’re not going to be able to prove that they’re elegant. Does that sound more understandable?

Robert J. Marks: So it’s a proof by contradiction. You assume that the elegant program detector was algorithmic and then you showed that there was a contradiction in your assumption. Therefore, it can’t exist.

Gregory Chaitin: Yeah. Nobody laughed so I guess I fumbled the ball.

Robert J. Marks: Okay. What was the joke? I missed it?

Gregory Chaitin: Well, it’s the contradiction!

Robert J. Marks: Okay. The contradiction. Maybe it was because of my familiarity with the proof.

Gregory Chaitin: I explained it badly.

Robert J. Marks: No, no. In my class when I explain proof by contradiction, let me tell you my favorite example. It’s the proof that all positive integers are interesting.

Next: Why all positive integers are interesting: A proof by contradiction.

Note: The photo of Paul Erdős is courtesy Kmhkmh (Creative Commons 3.0)

Don’t miss the stories and links from the previous podcasts:

From Podcast 4:

Getting to know the unknowable number (more or less). Only an infinite mind could calculate each bit. Gregory Chaitin’s unknowable number, the “halting probability omega,” shows why, in general, we can’t prove that programs are “elegant.”

From Podcast 3:

A question every scientist dreads: Has science passed the peak? Gregory Chaitin worries about the growth of bureaucracy in science: You have to learn from your failures. If you don’t fail, it means you’re not innovating enough. Robert J. Marks and Gregory Chaitin discuss the reasons high tech companies are leaving Silicon Valley for Texas and elsewhere.

Gregory Chaitin on how bureaucracy chokes science today. He complains, They’re managing to make it impossible for anybody to do any real research. You have to say in advance what you’re going to accomplish. You have to have milestones, reports. In Chaitin’s view, a key problem is that the current system cannot afford failure — but the risk of some failures is often the price of later success.

How Stephen Wolfram revolutionized math computing. Wolfram has not made computers creative but he certainly took a lot of the drudgery out of the profession. Gregory Chaitin also discusses the amazing ideas early mathematicians developed without the software-based methods we are so lucky to have today.

Why Elon Musk, and others like him, can’t afford to follow rules. Mathematician Gregory Chaitin explains why Elon Musk is, perhaps unexpectedly, his hero. Very creative people like Musk often have quirks and strange ideas (Gödel and Cantor, for example) which do not prevent them from making major advances.

Why don’t we see many great books on math any more? Decades ago, Gregory Chaitin reminds us, mathematicians were not forced by the rules of the academic establishment to keep producing papers, so they could write key books. Chaitin himself succeeded with significant work (see Chaitin’s Unknowable Number) by working in time spared from IBM research rather than the academic rat race.

Mathematics: Did we invent it or did we merely discover it? What does it say about our universe if the deeper mathematics has always been there for us to find, if we can? Gregory Chaitin, best known for Chaitin’s Unknowable Number, discusses the way deep math is discovered whereas trivial math is merely invented.

From the transcripts of the second podcast: Hard math can be entertaining — with the right musical score! Gregory Chaitin discusses with Robert J. Marks the fun side of solving hard math problems, some of which come with million-dollar prizes. The musical Fermat’s Last Tango features the ghost of mathematician Pierre de Fermat pestering the math nerd who solved his unfinished Last Conjecture.

Chaitin’s discovery of a way of describing true randomness. He found that concepts f rom computer programming worked well because, if the data is not random, the program should be smaller than the data. So, Chaitin on randomness: The simplest theory is best; if no theory is simpler than the data you are trying to explain, then the data is random.

How did Ray Solomonoff kickstart algorithmic information theory? He started off the long pursuit of the shortest effective string of information that describes an object. Gregory Chaitin reminisces on his interactions with Ray Solomonoff and Marvin Minsky, fellow founders of Algorithmic Information Theory.

Gregory Chaitin’s “almost” meeting with Kurt Gödel. This hard-to-find anecdote gives some sense of the encouraging but eccentric math genius. Chaitin recalls, based on this and other episodes, “There was a surreal quality to Gödel and to communicating with Gödel.”

Gregory Chaitin on the great mathematicians, East and West: Himself a “game-changer” in mathematics, Chaitin muses on what made the great thinkers stand out. Chaitin discusses the almost supernatural awareness some mathematicians have had of the foundations of our shared reality in the mathematics of the universe.

and

How Kurt Gödel destroyed a popular form of atheism. We don’t hear much about logical positivism now but it was very fashionable in the early twentieth century. Gödel’s incompleteness theorems showed that we cannot devise a complete set of axioms that accounts for all of reality — bad news for positivist atheism.

You may also wish to read: Things exist that are unknowable: A tutorial on Chaitin’s number (Robert J. Marks)

and

Five surprising facts about famous scientists we bet you never knew: How about juggling, riding a unicycle, and playing bongo? Or catching criminals or cracking safes? Or believing devoutly in God… (Robert J. Marks)

Show Notes

• 00:23 | Introducing Gregory Chaitin
• 00:40 | What is unknowability?
• 06:07 | Does non-computable mean unknowable?
• 09:43 | A simple explanation
• 21:34 | Is creativity non-computable?
• 25:55 | Defining creativity
• 28:19 | Panpsychism