Good Math/Bad Math

Saturday, March 18, 2006

Another Take on Information Theory

Jeffrey Shallit, over at his Recursivity blog, has also taken on the creationists' misuse of information theory. Alas, he calls it "Kolmogorov" information theory; I come from the school that says Kolmogorov made a critical mistake that can produce invalid results. Greg Chaitin started from exactly the same place as Kolmogorov, but a few years later he caught the mistake and fixed it. (For those interested, the difference is that in Kolmogorov's version, the program that you're analyzing doesn't signal when it's done generating its output; in Chaitin's, it does.)
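
To make "signals when it's done" concrete, here's a toy sketch in Python (my illustration, not anything from Chaitin -- and note that real Kolmogorov complexity is uncomputable, so this only demonstrates the self-delimiting property itself). A prefix-free code such as Elias gamma lets a decoder read a stream of codewords and know exactly where each one ends, with no separators; that's precisely the property Chaitin demands of programs.

```python
def elias_gamma_encode(n):
    """Encode a positive integer so the codeword signals its own end:
    (len-1) zeros, then the binary expansion itself."""
    assert n >= 1
    binary = bin(n)[2:]                      # e.g. 9 -> '1001'
    return '0' * (len(binary) - 1) + binary  # e.g. 9 -> '0001001'

def elias_gamma_decode_stream(bits):
    """Decode a concatenation of codewords -- no separators needed,
    because each codeword announces its own length."""
    out, i = [], 0
    while i < len(bits):
        zeros = 0
        while bits[i] == '0':                # leading zeros give the length
            zeros += 1
            i += 1
        out.append(int(bits[i:i + zeros + 1], 2))
        i += zeros + 1
    return out

stream = ''.join(elias_gamma_encode(n) for n in [5, 1, 12])
print(stream)                                # '001011' + '0001100' run together
print(elias_gamma_decode_stream(stream))     # [5, 1, 12]
```

In Kolmogorov's setup, by contrast, the machine is simply handed the whole program, so the program's endpoint (and hence its length) is free side information -- and that freebie is exactly what causes trouble for infinite strings, as the comments below explain.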

3 Comments:

  • Mark, the difference between Kolmogorov's version of complexity and Chaitin's is pretty minor -- the only difference being that Chaitin insists on self-delimiting programs. This is not a "mistake" of Kolmogorov, but rather a different interpretation. Chaitin's version is more useful in some circumstances; Kolmogorov's is more useful in others. And it only affects the complexity of a string of length n by an additive factor of about log n (a precise version of this bound appears after the comments).

    Have you read Li and Vitanyi's book on Kolmogorov complexity? They certainly distinguish between these two different notions, and give them two different notations: C(x) for the plain (Kolmogorov) version and K(x) for the prefix (Chaitin) version. This book is widely acknowledged to be the reference in the area, and Li and Vitanyi do not claim that Kolmogorov made a "mistake". They also don't title their book "Chaitin complexity" or even "Kolmogorov-Chaitin complexity". In fact, there is a detailed historical section that goes over the history of who discovered what, and when.

    What are the "invalid" results that you believe can result from Kolmogorov complexity?

    By Blogger Jeffrey Shallit, at 8:36 AM  

  • To be honest, I don't remember an example of the error. What I do remember is that it was a case where a lower bound on an algorithm's complexity computed using K came out smaller than the one computed using C.

    As I said before, I know Greg Chaitin personally, and I've heard several of his lectures. He's mighty convincing :-).

    In grad school, when I studied this stuff, it was part of a seminar on algorithmic complexity. The prof (an appallingly bad teacher, one of the two worst I ever had... it still amazes me that even after his class, I still enjoy this stuff) used an LNCS volume with a collection of complexity papers in honor of Juris Hartmanis. It wasn't specifically a Kolmogorov complexity text, but rather a collection of papers on different techniques for analyzing algorithmic complexity. The prof didn't particularly like Kolmogorov complexity (he was more interested in the complexity classes and how they were related - the whole BPP, P, NP, etc.), so I mostly studied it on my own, because I thought it was fascinating. I started with the papers in that volume, and others I found around the net. Then after I came to IBM, I met Greg.

    If you haven't met Greg, all I can say is, don't miss the opportunity if it comes up. He's an amazing lecturer, and a thoroughly pleasant, if somewhat egotistical, guy. (But don't ever talk with him in a small room :-) Greg always talks like he's in a lecture hall with 100 people. In a small room, he'll blast your eardrums out.)

    By Blogger MarkCC, at 9:40 AM  

  • About the difference between Kolmogorov complexity and Chaitin complexity: for finite strings, it doesn't make much difference. However, for infinite strings, Kolmogorov runs into a problem.

    For an infinite string of bits, Kolmogorov initially tried to say that a string is "random" if it is incompressible; that is, if for each number n, it takes at least an n-bit program to compute the first n bits of the string.

    It turned out that this definition is vacuous; every sequence of bits has infinitely many "complexity dips" -- spots where it takes fewer than n bits to compute the first n bits. (A toy illustration of where these dips come from follows the comments.)

    Chaitin's small change, requiring that the programs be self-delimiting, fixes this problem. With self-delimiting programs, the dips are no longer forced: there really are sequences whose first n bits have complexity at least n, for every n. So he can define a "random" sequence to be one such that the complexity of the first n bits is always n or greater.

    This is explained here.

    By Blogger Daryl McCullough, at 4:44 PM  
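
A quick note on the "about log n" additive factor Jeff mentions above -- here's the standard relationship between the two measures, sketched as a math block (my summary of the usual textbook bound, not something from the thread):

```latex
% For a string x of length n, with C(x) the plain (Kolmogorov) complexity
% and K(x) the prefix (Chaitin) complexity:
%
%   C(x) <= K(x) + O(1)       -- a plain machine can run any
%                                self-delimiting program as-is
%   K(x) <= C(x) + 2 log n + O(1)
%
% For the second bound: take a shortest plain program p for x and prefix
% it with |p| written in binary with each bit doubled, terminated by "01".
% The header is self-delimiting, the decoder then reads exactly |p| more
% bits, and the header costs only 2 log |p| + O(1) <= 2 log n + O(1) bits.
\[
  C(x) \le K(x) + O(1), \qquad K(x) \le C(x) + 2\log n + O(1).
\]
```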
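
And to make Daryl's "complexity dips" concrete, here's a toy Python decoder -- my sketch of the standard argument (usually credited to Martin-Löf), simplified to sequences whose first bit is 1. The point is that a plain program's length is free side information: if the first m bits of a sequence spell out the binary expansion of some number n, then bits m+1 through n alone serve as a program for the whole n-bit prefix, a description of length only about n - log n.

```python
def dip_decoder(p):
    """A toy "plain" machine that exploits its program's length.
    Find the (m, n) split consistent with len(p), then output
    binary(n) followed by p -- the first m bits come for free."""
    for m in range(1, len(p) + 2):
        n = m + len(p)                # candidate prefix length
        if len(bin(n)[2:]) == m:      # binary(n) must be exactly m bits
            return bin(n)[2:] + p
    raise ValueError("no consistent split")

# Any sequence starting with the binary expansion of some n has a dip at n.
# Here the prefix '1011' is binary(11), so n = 11, m = 4: the 11-bit
# prefix is computed from a 7-bit program.
x = '1011' + '0110100'                # first 11 bits of some sequence
program = x[4:]                       # bits 5 through 11 only
assert dip_decoder(program) == x
print(len(program), "program bits describe the first", len(x), "bits")
```

Since the first m bits of any sequence (starting with a 1) spell out some number n for every m, these dips recur infinitely often -- which is what kills Kolmogorov's original definition of randomness.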
