Good Math/Bad Math

Friday, March 31, 2006

Turing Machine Tricks

Now that you know what Turing machines are, we can play some games with them. As things stand, though, talking or writing about Turing machine programs is really painful. So we can work out a bunch of extensions to the Turing machine. They're not really extensions in the sense of adding a capability to the TM that it didn't have before, but more in the sense of shorthands: things that we can do with a normal Turing machine, but where the extension gives us an easier, more concise way of saying it.

The point of these extensions isn't to give the machine the ability to do anything it couldn't do before; it's to give us an easier way to talk about the algorithms that Turing machines use, without writing out state tables.

Non-writing Transitions

In our original Turing machine description, we said that on every step, the transition function had to specify a target state, a symbol to write to the tape, and a direction to move the head. But as we saw in state 1 of the subtraction machine, some transitions just don't change the symbol on the tape cell. In the basic Turing machine, that means we have to have an entry for every symbol, each specifying a transition to the same target state, moving in the same direction, and writing back the same symbol that was just read from the tape.

For example, imagine we had four symbols, like the subtraction machine: "1", " ", "-", and "=". Remember that the part of the transition function for state 1 was:

(State, Symbol)    (State, Symbol, Direction)
(1, " ")           (1, " ", forward)
(1, "1")           (1, "1", forward)
(1, "-")           (1, "-", forward)
(1, "=")           (2, " ", back)

We need three entries in the transition function just to say "move forward without changing the symbol on the tape". Instead, we can let the transition function say "nothing" for the value to write, which means that the transition shouldn't change the value in the cell; and we can let the input symbol be any of a set of choices: (1, {"1", "-", " "}) -> (1, nothing, forward).

Moving the Head

We can add a couple of alternative ways to talk about moving the tape head.

move forward/backward to symbol X, then switch to state S

This is usually called a seek operation. The way you add it to a Turing machine is by adding one state to the machine; that state moves in the correct direction, leaving the tape unchanged, until it gets to the correct symbol, and then switches to the new state. That's state 1 in the subtraction machine - it moves forward, not changing the tape, until it reaches an "=". We can just say "In state 1, move forward until '=', then switch to state 2".
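
To make the shorthand concrete, here's a little sketch in Python (my own illustration - the helper name and the alphabet choice are mine, not part of the original machine) that expands a "seek forward to X, then switch to state S" instruction into ordinary transition-table entries:

ALPHABET = ['1', ' ', '-', '=']

def seek(state, target, next_state, direction='forward'):
    # Expand "in `state`, move `direction` until `target`, then go to `next_state`"
    # into one ordinary transition per symbol, never changing the tape.
    table = {}
    for sym in ALPHABET:
        if sym == target:
            table[(state, sym)] = (next_state, sym, direction)
        else:
            table[(state, sym)] = (state, sym, direction)
    return table

for key, value in seek(1, '=', 2).items():
    print(key, '->', value)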

Move forward/backward N cells, then switch to state S

This is a tad more tricky, but still not bad. We need to add N states s_1, s_2, ... s_n:

(State, Symbol)      (State, Symbol, Direction)
(s_1, any)           (s_2, nothing, forward)
(s_2, any)           (s_3, nothing, forward)
...                  ...
(s_(n-1), any)       (s_n, nothing, forward)
(s_n, any)           (S, nothing, forward)
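
And here's the same kind of sketch for "move forward N cells, then switch to state S" - again just an illustration in Python, generating the helper states s_1 through s_n from the table above:

ALPHABET = ['1', ' ', '-', '=']

def move_n(n, final_state, direction='forward'):
    # Build the n helper states s_1 .. s_n; each one steps the head once,
    # re-writing whatever symbol it read (the "nothing" write).
    table = {}
    for i in range(1, n + 1):
        state = f's_{i}'
        target = f's_{i + 1}' if i < n else final_state
        for sym in ALPHABET:
            table[(state, sym)] = (target, sym, direction)
    return table

for key, value in sorted(move_n(3, 'S').items()):
    print(key, '->', value)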


Marking the Tape and Moving to a Mark

We can add the ability to add a marker to a cell on the tape without erasing the symbol already there, and then use a version of the "move to symbol A" instruction. The way we do that is by modifying the set of symbols that we can have on the tape.

Suppose we wanted to be able to have two marks on a machine that uses the same four symbols as our original subtraction machine. We would expand the alphabet from four symbols to twelve: that is, from { '1', '-', ' ', '=' } to { ('1', none), ('1', m1), ('1', m2), (' ', none), (' ', m1), (' ', m2), ('-', none), ('-', m1), ('-', m2), ('=', none), ('=', m1), ('=', m2) }.

So when we write a mark, we're really changing the symbol on the tape from something like ('1', none) to ('1', m1). And when we move forward to a mark, we're just doing move forward to (anything, m1).
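
In code terms, the expanded alphabet is just the cross product of the original symbols with the possible mark values. A quick Python sketch (the mark names m1 and m2 are arbitrary):

from itertools import product

symbols = ['1', ' ', '-', '=']
marks = [None, 'm1', 'm2']

marked_alphabet = list(product(symbols, marks))
print(len(marked_alphabet))            # 4 symbols x 3 mark values = 12
# "Writing mark m1" on a cell holding '1' changes ('1', None) to ('1', 'm1'),
# and "move forward to mark m1" just seeks any symbol of the form (anything, 'm1').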

Adding Variables

We can add "variables" to the machine by increasing the set of states. A variable is a part of the state of the machine where we can store the a symbol or a marker. The easiest way to add them is to just multiply the number of states by the number of symbols that can be stored in the variable. For example, in our "subtraction" machine, we had four states and three symbols. If we wanted to have a variable that could remember a symbol, we would expand to 12 states: each state would have the "original" state from the machine, plus it would "encode" the value of the variable. So if we called the variable x, the states would be {(1, x='1'), (1,x='-'), (1,x='='), (1,x=' '),(2, x='1'), (2,x='-'), (2,x='='), (2,x=' '),(3, x='1'), (3,x='-'), (3,x='='), (3,x=' '),(4, x='1'), (4,x='-'), (4,x='='), (4,x=' ')}.

Using the variable, we can add "instructions" to read and write variables. Read instructions change both the numbered state and set the value of the variable to the symbol on the tape; write instructions change the numbered state and write the symbol stored in the variable onto the tape.
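
Here's a rough Python sketch of what a "read the tape into variable x" instruction expands into - the function name and the state encoding are just my own illustration:

symbols = ['1', ' ', '-', '=']

def read_into_x(state, next_state, direction='forward'):
    # Expand "read the tape into x, then go to next_state": the value of x
    # is folded into the name of the target state, and the tape is unchanged.
    return {(state, sym): ((next_state, ('x', sym)), sym, direction)
            for sym in symbols}

for key, value in read_into_x(1, 2).items():
    print(key, '->', value)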

Copying Sections of Tape

We can copy sections of tape using the pieces we described above. To be able to do it, we need to be able to add a few things to the machine.
  1. We need to add some states to the machine: a few states for each character in the alphabet that can be copied (one set of states for copying forward; one for copying backward).
  2. We need to add three markers: one to mark the beginning of the region to be copied (the source region); one to mark the end of the source region; and one to mark the beginning of the place to copy to (the target region).
  3. We need to add states for seeking markers.
To set up to do a copy, you need to place the three marks; and you need to know whether the place to copy to is before or after the section you want to copy. For now, let's assume it's after. Then you need to position the head at the start of the target region. (Which you can do with a seek for the mark.)

Now, to do the copy:
  1. Seek back from the target mark until you reach either the start mark or the end mark of the copy region.
  2. If you reach the start mark before you see an end mark, that means you've finished the copy; so just seek forward to the target mark, and then stop.
  3. If you reach the end mark first, then there's more to copy. So seek back to the start mark.
  4. Read the symbol under the start mark into a variable; erase the start mark, and move forward one cell. Write the start mark onto that cell. (If the "end" mark is on that cell, it gets erased by the start mark.)
  5. Seek forward to the target mark. When you find it, write the symbol stored in the variable, erase the target mark, step forward, and write the target mark.
  6. Now you've finished copying one character; go back to the beginning, and keep doing this until the "end" mark gets erased.
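
Here's that copy procedure as a rough Python sketch, with the tape as a list of (symbol, mark) pairs. It's an illustration of the algorithm above, not a real Turing machine program, and it assumes that the "end" mark sits on the cell just past the last character to be copied:

def seek(tape, pos, step, marks):
    # Move the head one cell at a time (step is +1 or -1) until it lands on
    # a cell carrying one of the given marks.
    while tape[pos][1] not in marks:
        pos += step
    return pos

def copy_region(tape, target_pos):
    pos = target_pos
    while True:
        # 1. Scan back from the target until we hit the start or end mark.
        pos = seek(tape, pos, -1, {'S', 'E'})
        if tape[pos][1] == 'S':
            # 2. Start mark found first: the end mark is gone, so we're done;
            #    return to the target mark and stop.
            return seek(tape, pos, +1, {'T'})
        # 3. End mark found first: more to copy, so scan back to the start mark.
        pos = seek(tape, pos, -1, {'S'})
        # 4. Read the marked symbol into a "variable", erase the start mark,
        #    and move it one cell forward (clobbering 'E' if it's there).
        x = tape[pos][0]
        tape[pos] = (tape[pos][0], None)
        tape[pos + 1] = (tape[pos + 1][0], 'S')
        # 5. Seek forward to the target mark, write the saved symbol, and push
        #    the target mark one cell forward.
        pos = seek(tape, pos, +1, {'T'})
        tape[pos] = (x, None)
        tape[pos + 1] = (tape[pos + 1][0], 'T')

# Copy "abc": S on the first source cell, E just past the last one, T at the
# start of the (blank) target region.
tape = [('a', 'S'), ('b', None), ('c', None), (' ', 'E'),
        (' ', 'T'), (' ', None), (' ', None), (' ', None)]
copy_region(tape, 4)
print(''.join(sym for sym, mark in tape))   # -> 'abc abc '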

Using the Extensions: Multiplication on the Turing Machine

Now, to show why it was worth all the trouble of figuring out how to add all of those extensions, we'll describe how to build a Turing machine that does multiplication. Basically, it takes two numbers in unary on the tape, like "111*1111=", and runs until it's written the product after the "=": "111*1111=111111111111".

So here's what the machine does:
  1. Put "start" and "end" marks at the beginning and end of the second number on the tape.
  2. Put a "counter" mark at the start of the first number on the tape.
  3. Put a "target" mark at the "=" on the tape.
  4. Scan back to the counter mark. If it's on "*", halt. If not, move it forward by one cell.
  5. Do a copy from the start and end marks to the target mark.
  6. Put the target mark on the cell after the last digit that you copied.
  7. Scan back to the "start" mark, and replace it with the end mark.
  8. Scan back to the character after the "*", and write the start mark.
  9. Go back to step 4.
Just to give you an idea of how much easier an algorithmic description of the machine using these extensions is to read, here's the state transition table for a Turing machine, which includes some of our extensions, for essentially this multiplication algorithm. (I found this program here after I wrote the algorithm above, so it's not an exact match.)
# Turing machine to multiply two numbers:
# Number n is represented by string of n+1 1's.
# Initially, tape is assumed to be in form of _n1_n2_ (where _ is blank),
# and head is assumed to be positioned at leftmost non-blank square.
# Initial internal machine state is "start".
# At halt, tape is _n1_n2_n3_, where n3=n1*n2,
# and head is positioned at leftmost non-blank square.


State,Symbol State,SymbolToWrite, Direction comment
start, 1 move1right, W, R # mark first bit of 1st argument
move1right, 1 move1right, 1, R # move right til past 1st argument
move1right, _ mark2start, _, R # square between 1st and 2nd arguments found
mark2start, 1 move2right, Y, R # mark first bit of 2nd argument
move2right, 1 move2right, 1, R # move right til past 2nd argument
move2right, _ initialize, _, R # square between 2nd argument and answer found
initialize, _ backup, 1, L # put a 1 at start of answer
backup, _ backup, _, L # move back to leftmost unused bit of 1st arg
backup, 1 backup, 1, L # ditto
backup, Z backup, Z, L # ditto
backup, Y backup, Y, L # ditto
backup, X nextpass, X, R # in position to start next pass
backup, W nextpass, W, R # ditto
nextpass, _ finishup, _, R # if square is blank, we're done. finish up
nextpass, 1 findarg2, X, R # if square is not blank, go to work. mark bit
findarg2, 1 findarg2, 1, R # move past 1st argument
findarg2, _ findarg2, _, R # square between 1st and 2nd arguments
findarg2, Y testarg2, Y, R # start of 2nd arg. skip this bit, copy rest
testarg2, _ cleanup2, _, L # if blank, we are done with this pass
testarg2, 1 findans, Z, R # if not, increment ans. mark bit, move there
findans, 1 findans, 1, R # still in 2nd argument
findans, _ atans, _, R # square between 2nd argument and answer
atans, 1 atans, 1, R # move through answer
atans, _ backarg2, 1, L # at end of answer--write a 1 here, go back
backarg2, 1 backarg2, 1, L # move left to first unused bit of 2nd arg
backarg2, _ backarg2, _, L # ditto
backarg2, Z testarg2, Z, R # just past it. move right, and test it
backarg2, Y testarg2, Y, R # ditto
cleanup2, 1 cleanup2, 1, L # move back through answer
cleanup2, _ cleanup2, _, L # square between 2nd arg and answer
cleanup2, Z cleanup2, 1, L # restore bits of 2nd argument
cleanup2, Y backup, Y, L # done with that. backup to start next pass
finishup, Y finishup, 1, L # restore first bit of 2nd argument
finishup, _ finishup, _, L # 2nd argument restored, move back to 1st
finishup, X finishup, 1, L # restore bits of 1st argument
finishup, W almostdone, 1, L # restore first bit of 1st arg. almost done
almostdone, _ halt, _, R # done with work. position properly and halt

Friday Random Ten - now with links!

It's friday again, so it's time for a very geeky random ten, this time with links to the more obscure groups. It turned out pretty folky this week.

  1. Ode to Big Blue, Trout Fishing in America. No, this song is not about my employer. It's about a blue whale. TFiA is a really great duo that does albums for adults, and also albums for children that won't drive adults out of their heads. And who wouldn't love a group that could write lyrics like "Is is 'till it isn't, then it was; was was till it wasn't. Maybe might be is in a second."

  2. Garden Party, Marillion. Marillion's a favorite band of mine, and has been for a long time. This is a delightful little obscene song off of their first album.

  3. The World Turned Upside Down, Broadside Electric. Broadside Electric is a local band that mostly does modernized electric versions of very, very old folk music - Child ballads, etc. I've done sound for them at a local festival, and aside from being a great band, they're tremendously nice people.

  4. Happy Hour on Planet Zarg, ProjeKct Two. ProjeKct Two is a King Crimson spinoff: Robert Fripp, Adrian Belew, and Trey Gunn. They call themselves P2 when they're doing unplanned, unstructured free improv. Cool stuff, but mighty weird.

  5. Big Gravel, Psychograss. Psychograss made the random 10 last week too. And they've got a new album out!

  6. Mato Grosso, from "ITAIPU" by Philip Glass. Magnificent piece of music. Glass has lately been doing some very ambitious work in a more traditional classical vein; this is a great example of that style. If you can listen to this and not be moved by the music, you have no soul.

  7. Flutopia, Flook. Flook makes my R10 again. Flook is the greatest.

  8. Kelo, Miles Davis. What can I say about Miles Davis? I'm not worthy.

  9. Christianopel, The Flower Kings. The Flower Kings are a neo-progressive band. If you like progressive rock, you've got to give the Kings a listen.

  10. Working on a Building, the Wayfaring Strangers. The Strangers are an odd little group - a bit of jazz, a bit of bluegrass, a bit of folk, a bit of rock. Tony Trischka on banjo, Matt Glaser on violin, and several other deities on their respective instruments.

Thursday, March 30, 2006

Berlinski's Bad Math

In my never-ending quest for bad math to mock, I was taking a look at the Discovery Institute's website, where I found an essay, On the Origin of Life, by David Berlinski. Bad math? Oh, yeah. Bad, sloppy, crappy math. Some of which is just duplication of things I've criticized before, but there's a few different tricks in this mess.

Before I jump in to look at a bit of it, I'd like to point out a general technique that's used in this article. It's very wordy. It rambles, it wanders off on tangents, it mixes quotes from various people into its argument in superfluous ways. The point of this seems to be to keep you, the reader, somewhat off balance; it's harder to analyze an argument when the argument is so scattered around, and it's easier to miss errors when the steps of the argument are separated by large quantities of cutesy writing. Because of this, the section I'm going to quote is fairly long; it's the shortest I could find that actually contained enough of the argument I want to talk about to be coherent.



The historical task assigned to this era is a double one: forming chains of nucleic acids from nucleotides, and discovering among them those capable of reproducing themselves. Without the first, there is no RNA; and without the second, there is no life.

In living systems, polymerization or chain-formation proceeds by means of the cell’s invaluable enzymes. But in the grim inhospitable pre-biotic, no enzymes were available. And so chemists have assigned their task to various inorganic catalysts. J.P. Ferris and G. Ertem, for instance, have reported that activated nucleotides bond covalently when embedded on the surface of montmorillonite, a kind of clay. This example, combining technical complexity with general inconclusiveness, may stand for many others.

In any event, polymerization having been concluded—by whatever means—the result was (in the words of Gerald Joyce and Leslie Orgel) “a random ensemble of polynucleotide sequences”: long molecules emerging from short ones, like fronds on the surface of a pond. Among these fronds, nature is said to have discovered a self-replicating molecule. But how?

Darwinian evolution is plainly unavailing in this exercise or that era, since Darwinian evolution begins with self-replication, and self-replication is precisely what needs to be explained. But if Darwinian evolution is unavailing, so, too, is chemistry. The fronds comprise “a random ensemble of polynucleotide sequences” (emphasis added); but no principle of organic chemistry suggests that aimless encounters among nucleic acids must lead to a chain capable of self-replication.

If chemistry is unavailing and Darwin indisposed, what is left as a mechanism? The evolutionary biologist’s finest friend: sheer dumb luck.

Was nature lucky? It depends on the payoff and the odds. The payoff is clear: an ancestral form of RNA capable of replication. Without that payoff, there is no life, and obviously, at some point, the payoff paid off. The question is the odds.

For the moment, no one knows how precisely to compute those odds, if only because within the laboratory, no one has conducted an experiment leading to a self-replicating ribozyme. But the minimum length or “sequence” that is needed for a contemporary ribozyme to undertake what the distinguished geochemist Gustaf Arrhenius calls “demonstrated ligase activity” is known. It is roughly 100 nucleotides.

Whereupon, just as one might expect, things blow up very quickly. As Arrhenius notes, there are 4^100 or roughly 10^60 nucleotide sequences that are 100 nucleotides in length. This is an unfathomably large number. It exceeds the number of atoms contained in the universe, as well as the age of the universe in seconds. If the odds in favor of self-replication are 1 in 10^60, no betting man would take them, no matter how attractive the payoff, and neither presumably would nature.

“Solace from the tyranny of nucleotide combinatorials,” Arrhenius remarks in discussing this very point, “is sought in the feeling that strict sequence specificity may not be required through all the domains of a functional oligmer, thus making a large number of library items eligible for participation in the construction of the ultimate functional entity.” Allow me to translate: why assume that self-replicating sequences are apt to be rare just because they are long? They might have been quite common.

They might well have been. And yet all experience is against it. Why should self-replicating RNA molecules have been common 3.6 billion years ago when they are impossible to discern under laboratory conditions today? No one, for that matter, has ever seen a ribozyme capable of any form of catalytic action that is not very specific in its sequence and thus unlike even closely related sequences. No one has ever seen a ribozyme able to undertake chemical action without a suite of enzymes in attendance. No one has ever seen anything like it.

The odds, then, are daunting; and when considered realistically, they are even worse than this already alarming account might suggest. The discovery of a single molecule with the power to initiate replication would hardly be sufficient to establish replication. What template would it replicate against? We need, in other words, at least two, causing the odds of their joint discovery to increase from 1 in 10^60 to 1 in 10^120. Those two sequences would have been needed in roughly the same place. And at the same time. And organized in such a way as to favor base pairing. And somehow held in place. And buffered against competing reactions. And productive enough so that their duplicates would not at once vanish in the soundless sea.

In contemplating the discovery by chance of two RNA sequences a mere 40 nucleotides in length, Joyce and Orgel concluded that the requisite “library” would require 10^48 possible sequences. Given the weight of RNA, they observed gloomily, the relevant sample space would exceed the mass of the earth. And this is the same Leslie Orgel, it will be remembered, who observed that “it was almost certain that there once was an RNA world.”

To the accumulating agenda of assumptions, then, let us add two more: that without enzymes, nucleotides were somehow formed into chains, and that by means we cannot duplicate in the laboratory, a pre-biotic molecule discovered how to reproduce itself.


Ok. Lots of stuff there, huh? Let's boil it down.

The basic argument is the good old "big numbers". Berlinski wants to come up with some really whoppingly big numbers to make things look bad. So, he makes his first big-numbers appeal by looking at polymer chains that could have self-replicated. He argues (not terribly well) that the minimum length for a self-replicating polymer is 100 nucleotides. From this, he then argues that the odds of creating a self-replicating chain are 1 in 10^60.

Wow, that's a big number. He goes to some trouble to stress just what a whopping big number it is. Yes Dave, it's a big number. In fact, it's not just a big number, it's a bloody huge number. The shame of it is, it's wrong; and what's worse, he knows it's wrong. Right after he introduces it, he quotes a geochemist who pointed out that those are stupid odds, because there's probably more than one replicator in there. In fact, we can be pretty certain that there's more than one: we know lots of ways of modifying RNA/DNA chains that don't affect their ability to replicate. How many of those 10^60 cases self-replicate? We don't know. Berlinski just handwaves. Let's look again at how he works around that:


They might well have been. And yet all experience is against it. Why should self-replicating RNA molecules have been common 3.6 billion years ago when they are impossible to discern under laboratory conditions today? No one, for that matter, has ever seen a ribozyme capable of any form of catalytic action that is not very specific in its sequence and thus unlike even closely related sequences. No one has ever seen a ribozyme able to undertake chemical action without a suite of enzymes in attendance. No one has ever seen anything like it.


So - first, he takes a jump away from the math, so that he can wave his hands around. Then he tries to strengthen the appeal to big numbers by pointing out that we don't see simple self-replicators in nature today.

Remember what I said in my post about Professor Culshaw, the HIV-AIDS denialist? You can't apply a mathematical model designed for one environment in another environment without changing the model to match the change in the environment. The fact that it's damned unlikely that we'll see new simple self-replicators showing up today is irrelevant to discussing the odds of them showing up billions of years ago. Why? Because the environment is different. In the days when a first self-replicator developed, there was no competition for resources. Today, any time you have the set of precursors to a replicator, they're part of a highly active, competitive biological system.

And then, he goes back to try to re-invoke the big-numbers argument by making it look even worse; and he does it by using an absolutely splendid example of bad combinatorics:



The odds, then, are daunting; and when considered realistically, they are even worse than this already alarming account might suggest. The discovery of a single molecule with the power to initiate replication would hardly be sufficient to establish replication. What template would it replicate against? We need, in other words, at least two, causing the odds of their joint discovery to increase from 1 in 10^60 to 1 in 10^120. Those two sequences would have been needed in roughly the same place. And at the same time. And organized in such a way as to favor base pairing. And somehow held in place. And buffered against competing reactions. And productive enough so that their duplicates would not at once vanish in the soundless sea.


The odds of one self-replicating molecule developing out of a soup of pre-biotic chemicals is, according to Berlinski, 1 in 10^60. But then, the replicator can't replicate unless it has a "template" to replicate against; and the odds of that are also, he claims, 1 in 10^60. Therefore, the probability of having both the replicator and the "template" is the product of the probabilities of either one, or 1 in 10^120.

Only, that product formulation for combining probabilities works only if the two events are completely independent. They aren't. If you've got a soup of nucleotides and polymers, and you get a self-replicating polymer, it's in an environment where the "target template" is quite likely to occur. So the events are not independent - and so you can't use the product rule.
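
If you want to see the independence point concretely, here's a toy Monte Carlo sketch in Python. It has nothing to do with chemistry; it just shows that for two dependent events, the joint probability can be much bigger than the product of the individual probabilities, so the product rule gives the wrong answer:

import random

rng = random.Random(0)
trials = 200_000
both = a_count = b_count = 0

for _ in range(trials):
    # A hidden common cause makes the two events dependent.
    common = rng.random() < 0.5
    a = rng.random() < (0.5 if common else 0.1)
    b = rng.random() < (0.5 if common else 0.1)
    a_count += a
    b_count += b
    both += a and b

p_a, p_b, p_both = a_count / trials, b_count / trials, both / trials
print(f"P(A) = {p_a:.3f}, P(B) = {p_b:.3f}")
print(f"P(A and B)  = {p_both:.3f}")
print(f"P(A) * P(B) = {p_a * p_b:.3f}   # the 'independent' estimate is much too small")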

Not to mention that he repeats the same error he made before: assuming that there's exactly one "template" molecule that can be used for replication.

And even that is just looking at a tiny aspect of the mathematical part: the entire argument about a template is a strawman; no one argues that the earliest self-replicator could only replicate by finding another perfectly matched molecule of exactly the same size that it could reshape into a duplicate of itself.

And finally, he rehashes his invalid-model argument: because we don't see primitive self-replicators in today's environment, that must mean that they were unlikely in a pre-biotic environment.

This is what mathematicians call "slop", also known as "crap". Bad reasoning, fake numbers pulled out of thin air, assertions based on big numbers, deliberately using wrong numbers, invalid combinatorics, and misapplication of models. It's hard to imagine what else he could have gotten wrong.

Skeptics Circle is Out

The 31st meeting of the Skeptics Circle is out. Go read it!

Wednesday, March 29, 2006

Calculus - no, not that calculus!

This post is about more logic. It's also about a kind of calculus.

When you hear the word calculus, you probably think of derivatives and integrals, differential equations, and stuff like that. Well, that's a calculus: differential calculus. But it's not the only calculus. In fact, a calculus is really any system of rules for manipulating symbolic expressions, which has a formal mathematical or logical basis for the meanings of its rules.

That's really a fancy way of saying that a calculus is a bunch of mechanical rules with some special properties.

When I talked about propositional logic, the rules for working with propositions were really simple. In basic propositional logic, there's a finite set of propositions, and so there aren't a lot of deep things you can infer. Overall, it's really pretty trivial. When you get into more complicated logics, like first order predicate logic (FOPL), then things can get really interesting. In FOPL, there are variables, and you can find statements where you can create infinite loops using inference rules to try to prove some statement.

In logics like FOPL, the inference rules form a general symbolic/mechanical reasoning system. You can manipulate the rules and derive inferences without understanding the semantics of what you're inferring. You can treat it as a purely mechanical system. But anything you can produce with those purely mechanical manipulations of the symbols, expressions, and inference rules will be a meaningful statement in FOPL.

That mechanical reasoning system is a calculus: the first order predicate calculus.

Remember a few days ago, I mentioned the idea of a model? Well, the intuitive meaning of a model is a way of assigning meanings to the symbolic statements in a calculus. For a calculus to be meaningful, you need to be able to show a model for it where every statement that you can derive in the calculus is true in the model - that is, if you take the statement, and figure out what the model says it means, it will be a true statement about something real. The reason that models are important is that it's easy to create a calculus that looks valid, that looks like its rules should produce meaningful results, but that contains a subtle error, so that the mechanical inferences that you can build using the rules will not be true. But more on that in another post.


Let's take a look at first order predicate logic and its calculus.

Syntax: Writing statements in FOPL



FOPL statements are made up of a few kinds of elements:


  • Variables: variables are symbols which represent values which can be reasoned about using the logic. I'll write variables as lower case letters: x, y, z

  • Constants: symbols which are used as the names of specific fixed values. I'll write constants as either numbers, or inside of quotes: 1, 2, 3, "GoodMath", etc.

  • Predicates: predicates are symbols which are used to make assertions about constants and values. Each predicate has a name, which I'll write starting with an uppercase letter, and a number of parameters. A statement with a predicate is written: "Name(arg,arg,arg)", where each "arg" is either a constant or a variable.

  • Quantifiers: quantifiers introduce new variables. There are two quantifiers: "for all", written ∀ , and "there exists", written ∃.

  • Operators: operators are symbols which are used to join other statements together. Operators in first order predicate logic are AND, written ∧; OR, written ∨; IMPLIES, written ⇒; IF AND ONLY IF, written ⇔; and NOT, written ¬.

  • Parens: parentheses are used for grouping, and for surrounding the parameters of a predicate.


A statement in FOPL is one of the following:

  • A predicate, with its parameters all filled in with either variables or constants. If it is filled with variables, it must be part of a statement with a quantifier that defines the variable. Any reference to an undefined variable makes the statement invalid.
  • Two valid statements joined by AND, OR, IMPLIES, or IFF: "A(x) ∨ B(x)"
  • A NOT placed in front of valid statement: "¬ A(x)"
  • A valid statement with a quantifier clause. A quantifier
    clause is written "quantifier variable : statement"; for example,
    "∀ x : GoesToSchool(x)".
  • A valid statement surrounded by parens.



Ok. So that's a bit about what it looks like. Let's look at a few examples. I'll translate the examples into english to help you read them, but it's important to remember that we haven't gotten to FOPL semantics yet; the translations are just to help you understand how to read the logical statements.


∀x:IsPerson(x)⇒IsMammal(x)
∀x:IsMammal(x)⇒(∃y:IsMammal(y) ∧ IsMother(y,x))


The first statement says that for every x, if x is a person, then x is a mammal. The second statement says that if x is a mammal, then there is another mammal which is its mother.


∀x:IsMammal(x)⇒¬ColdBlooded(x)



This one says that if x is a mammal, then x is not cold-blooded.


Now we're getting to the interesting part. How do we reason using predicate calculus? We don't need the semantics here: we can just work in terms of the symbols.

We need to start with a set of facts. A fact is a statement of the logic that we know is true. Using the predicate calculus, we can add to this set of facts.

We can name a set of facts; I'll mainly use F (uppercase F) to indicate the set of facts that are known. Reasoning is taking a set of facts F and a set of inference rules, and seeing what facts you can add using those rules. When a set of facts F allows you to conclude that some statement s is true, we say that F entails s, written F :- s.

Now, we can write reasoning rules. There are two kinds of rules; inference rules (written X -> Y), which say "If I know X, then I can conclude Y"; and equivalence rules, (written X≡Y) which say "X means exactly the same thing as Y"; it's the same as two inference rules "X -> Y" and "Y -> X".

F :- A ⇒ B, F :- A -> F :- B


That's one of the implication rules from propositional logic: if A implies B, and we know A is true, then we can conclude that B is true. All of the inference rules of propositional logic work in predicate logic as well.

For FOPL, we need to add inference rules for the new things: quantifiers and variables. There are two terms that come up in the rules about variables: free, and bound. A variable is free if the predicate which encloses it is not inside of a quantifier which defines the variable. A variable is bound if it is inside of a quantifier which defines it.
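
To make the free/bound distinction concrete, here's a small Python sketch, with formulas written as nested tuples (my own throwaway representation, nothing standard):

# Formulas as nested tuples: ('P', 'x'), ('AND', f, g), ('NOT', f),
# ('FORALL', 'x', f), ('EXISTS', 'x', f). Predicate arguments are assumed to
# be strings: all-lowercase for variables, anything else for constants.

def free_vars(formula):
    op = formula[0]
    if op in ('FORALL', 'EXISTS'):
        _, var, body = formula
        return free_vars(body) - {var}
    if op == 'NOT':
        return free_vars(formula[1])
    if op in ('AND', 'OR', 'IMPLIES', 'IFF'):
        return free_vars(formula[1]) | free_vars(formula[2])
    # Otherwise it's a predicate: keep the arguments that look like variables.
    return {arg for arg in formula[1:] if arg.islower()}

f = ('FORALL', 'x', ('IMPLIES', ('IsMammal', 'x'), ('IsMother', 'y', 'x')))
print(free_vars(f))   # {'y'} -- y is free, x is bound by the quantifier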

First, we'll go through the equivalences:

¬(∀x:P(x)) ≡ ∃x:¬P(x)
¬(∃x:P(x)) ≡ ∀x:¬P(x)
∀x:(P(x) ∧ Q(x)) ≡ (∀x:P(x)) ∧ (∀x:Q(x))
∃x:(P(x) ∨ Q(x)) ≡ (∃x:P(x)) ∨ (∃x:Q(x))

The equivalences basically do two things: they tell you how to switch between universal (forall) and existential (there exists) statements; and how to combine or separate compound clauses inside of a quantifier. Inference rules do pretty much the same kinds of things, but instead of being equivalences, where the two sides of the equivalence mean exactly the same thing, inference rules are implications: they say that given the left side, you can conclude the right. Given the right-hand side of an inference rule, you can't conclude anything about the left: inference is one-way.

And now, the implications:

∀x:P(x) -> P(c), for any constant c
P(c) -> ∃x:P(x), for any constant c
∀x:P(x) -> ∃x:P(x)

Hopefully, they're pretty sensible. They mainly show how you can introduce or eliminate quantifiers while reasoning.

Semantics is for another day; but what we have here is enough to do full logical reasoning. There's a straightforward algorithm which you can follow using these rules to test if a particular statement can be inferred using first-order predicate logic, which is essentially why we can call this first-order predicate calculus: because without knowing the semantics, we can do a purely mechanical symbolic manipulation to figure out if a particular statement is true.

Now, this post is getting awfully long, so I think I'll take this as a break point; I'll show you the mechanics of using FOPC in another post.

Off topic, but must be said

It's not math, and I know it's off topic. But I've got to chime in on this. There are times when something must be said, regardless of whether it's appropriate. Most of the media seems to be doing that glowing hindsight thing just because the rotten bastard is dead; well, he got to live to a nice happy old age, smearing people practically to his dying breath; but thousands of innocents didn't get to live a long pleasant life like his, because of what he did.

Times like this, I'm sorry that I'm Jewish; we don't believe in hell. I'd really like to be able to believe that he's suffering now.

Tuesday, March 28, 2006

Playing with mathematical machines

In computer science, one of the tricks that we like to play is to talk about imaginary machines, in order to use them to think about what computing machines can do, and what the limits of mechanical computation are. One of the most fundamental of these is called the Turing machine, after its inventor, the great Alan Turing.

There are actually a few standard variants of the Turing machine, so the particular version that I'm going to talk about might be very slightly different than what you'd see elsewhere; but this is the version I find easiest to understand.

A Turing machine is a very simple machine, but it's capable of computing anything that can be computed by any mechanical computing device. A Turing machine consists of five pieces:
  1. A set of states, one of which is the initial state - the state that the machine is in when you turn it on.
  2. A set of symbols.
  3. A transition function from a pair (state,symbol) to a triple (state, symbol, direction). [direction is either "forwards" or "backwards"]
  4. An infinitely long "tape" made up of cells, each of which can contain exactly one symbol.
  5. A head, which can point to a single cell and read the symbol in that cell, and write a symbol onto the cell replacing whatever is there. The head can also move.
The way that a Turing machine works is really simple. You write something onto the tape - that's the input to the machine. You position the head on the very first square of the tape. Then you turn it on. Each step, it looks at the symbol on the cell under the head. Then it calls the transition function on the pair of the current state and the symbol under the head. It gets back a triple: a new state, a symbol, and a direction. It changes its state to the new state, writes the symbol onto the tape, and moves one cell in the indicated direction. And it keeps going.

And that is it. Any program that can be written, anything that can be computed, can be computed with something that simple. Hard to imagine, isn't it?

So let's take a look at a few examples of Turing machines.

Here's a simple Turing machine that does subtraction, based on a program for a Turing machine simulator I found online. It uses unary numbers: any number n in unary consists of a string of n 1's. So 6 would be "111111", 3 would be "111", etc. It starts with a tape that looks like a subtraction problem: "x-y=", where the x and y are unary numbers. So, for instance, 6-3 would be put on the tape as "111111-111=". After the machine runs, what's left on the tape is the answer in unary, plus some blank space.

The set of symbols that we can have on the tape is: "1", " " (blank), "-" (minus), and "=".

The machine needs 4 states, plus a special fifth to indicate that the machine has stopped, called the HALT state.

The state transition function is in the following table:
(State, Symbol)    (State, Symbol, Direction)
(1, " ")           (1, " ", forward)
(1, "1")           (1, "1", forward)
(1, "-")           (1, "-", forward)
(1, "=")           (2, " ", back)
(2, "1")           (3, "=", back)
(2, "-")           (HALT, " ", back)
(3, "1")           (3, "1", back)
(3, "-")           (4, "-", back)
(4, " ")           (4, " ", back)
(4, "1")           (1, " ", forward)

Now, it looks pretty mysterious looking at that mess. So here's an explanation, and follow the link above to actually watch the program run.
  • The machine starts on the leftmost cell, which contains the first digit of the first unary number.
  • In state one, it scans forward until it finds a "="; while going forward, it never changes the symbol
    on the tape - it writes whatever it read.
  • When it finds an "=", it switches to state 2, and reverses direction.
  • In state two, it scans backwards one space, and if there's a "1" there, that means it's not
    done subtracting, so it erases that one by moving the "=" back one space; and then it shifts to
    state 3. If there's a "-" on the cell instead of a "1", that means it's erased all of the digits of
    the second number, so it's done, and halts.
  • In state three, it scans backwards over the rest of the "1"s in the second number of the tape, until it gets
    to the "-" that separates the two numbers. Then it switches to state 4.
  • In state 4, it scans back over and blanks until it finds one of the digits of the first number; when it does, it erases it by replacing it with a space, and then it switches back to state 1.
So how is this subtraction? Basically, it's scanning forward, finding the last digit of the second number, erasing it, and then erasing one digit of the first number; and then it repeats until there are no digits left in the second number, at which point it erases the "-" and the "=", so there's nothing left but the answer.
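
If you want to play with this yourself, here's a minimal Turing machine simulator in Python - a sketch of my own, not the simulator the original program came from - with the subtraction machine's transition table plugged in:

def run(table, tape, state=1, halt='HALT'):
    tape = list(tape)
    pos = 0
    while state != halt:
        new_state, symbol, direction = table[(state, tape[pos])]
        tape[pos] = symbol
        state = new_state
        pos += 1 if direction == 'forward' else -1
        if pos == len(tape):          # the "infinite" tape: grow on demand
            tape.append(' ')
        elif pos < 0:
            tape.insert(0, ' ')
            pos = 0
    return ''.join(tape)

subtract = {
    (1, ' '): (1, ' ', 'forward'),
    (1, '1'): (1, '1', 'forward'),
    (1, '-'): (1, '-', 'forward'),
    (1, '='): (2, ' ', 'back'),
    (2, '1'): (3, '=', 'back'),
    (2, '-'): ('HALT', ' ', 'back'),
    (3, '1'): (3, '1', 'back'),
    (3, '-'): (4, '-', 'back'),
    (4, ' '): (4, ' ', 'back'),
    (4, '1'): (1, ' ', 'forward'),
}

print(run(subtract, '111111-111=').strip())   # 6 - 3 -> '111'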

So let's see what it looks like running. Here's a picture of a turing machine executing each step; the floating box is the head; the arrow points at the current tape cell; the number inside the head is the current state.

  • Starting at step 1, the machine is in state 1, and it goes forward each step in state 1 until it reaches the equal sign in step 8.
  • In step 8, it writes a space (deleting the "="), and starts going backwards, and goes to step 2.
  • In step 10, it looks to see if there's a one under the head. If there is, it's not done subtracting yet. So it writes an equal sign down, switches to state 3, and keeps going backwards.
  • Starting at step 11, state three runs until it finds a minus sign, and then switches to state 4. That doesn't take long: it finds the minus right away.
  • State 4 skips over spaces until it finds a one: then it erases it, switches to state 1, and starts scanning forwards again. It switches to state 1 right before step 13. At this point, now, instead of having the tape read "4-2", it reads "3-1".
  • It repeats the process to get rid of a digit from each of the numbers on the tape in steps 13-21.
  • Finally, there are no more digits in the second number - it's been erased. So in steps 22 and 23, states 2 and 3, it erases the "=" and the "-", leaving just the final answer on the tape, and then it halts.
One important thing to notice: this turing machine is hard-wired - it's a machine that does exactly one thing. It can't be changed or reprogrammed, because it doesn't have a program. The state transition table is part of the hardware of the machine. So in a very important sense, this is a very limited machine: each turing machine has a "program" hardcoded into it: it can't be "reprogrammed", because it isn't programmed at all. It's just built with those states, and those actions.

But here's where Turing got really clever. What if you were to build a turing machine which understood turing machines? That is, what if you built a machine which took as input the state transition table for any other turing machine?

That's called a universal Turing machine (UTM). And while it was far from easy for Turing to design one (his first publication of the UTM had an error in it), hackers since then have played around with UTMs, and it turns out to be not that difficult to do (once you've seen how it's done). Last I checked, Minsky had the record for the smallest UTM, using 7 states and four characters. (You can see Minsky's machine at mathworld.) Since then, Wolfram apparently came up with a 2-state 5 character UTM!

Monday, March 27, 2006

Sorry for the slow day

Sorry for the slow monday. It's been a rough day. Our home internet service got upgraded this morning, which of course meant that we didn't have internet service all day. (Why, oh why, do these things never go smoothly?)

Now that I bought a new router, got it configured, and rewired the hopeless tangle of cables, I'm back online. I've got two posts in progress, but I don't know if I'll have time to finish writing either of them tonight. Things will be back to normal tomorrow.

Sunday, March 26, 2006

Question for the readers: Should I use MathML?

For talking about things like logic, it would be useful to be able to use something that lets me put stuff into posts formatted in a mathematical style. There's an HTML extension called MathML which would make it easy to do that. But, alas, it's not yet supported by all browsers.

For Internet Explorer on windows, there's no built-in support, but there's a free extension that you can install to view MathML.

Recent versions of Mozilla and Firefox for all platforms have built-in support for viewing MathML.

For Mac users who use Safari (like me), there's a slightly kludgy applet that you can install to view MathML.

So, dear readers, would you still read Good Math, Bad Math if you needed to use a browser that did MathML to be able to read it?

Friday, March 24, 2006

On how not to apply mathematical models

Over at Aetiology, there's a furious war going on in the comments between Tara and an AIDS-HIV denialist. One of the things that the denialist has referenced is an essay by Professor Rebecca Culshaw, a mathematician who has worked on mathematical models of AIDS and HIV transmission, and who has become an HIV denialist.

I'll leave the science debate over at Aetiology, where it belongs. But there's definitely a mathematical aspect to this. Professor Culshaw lends her authority as a mathematician to the HIV denialist folks. Does her math support what she's saying?

Alas, no.

Professor Culshaw is not a bad mathematician - quite the opposite. What I can read of her publications shows very solid mathematical work, done extremely well.

The problem is that when she tries to apply the mathematics to the science of epidemiology, she fails miserably, and the reason why is mathematical.

Let me first present an excerpt of her essay. (I suggest reading the thing in its entirety; it's an interesting piece of writing, even if I think she's wrong.)


Over the past ten years, my attitude toward HIV and AIDS has undergone a dramatic shift. This shift was catalyzed by the work I did as a graduate student, analyzing mathematical models of HIV and the immune system. As a mathematician, I found virtually every model I studied to be unrealistic. The biological assumptions on which the models were based varied from author to author, and this made no sense to me. It was around this time, too, that I became increasingly perplexed by the stories I heard about long-term survivors. From my admittedly inexpert viewpoint, the major thing they all had in common – other than HIV – was that they lived extremely healthy lifestyles. Part of me was becoming suspicious that being HIV-positive didn’t necessarily mean you would ever get AIDS.

By a rather curious twist of fate, it was on my way to a conference to present the results of a model of HIV that I had proposed together with my advisor, that I came across an article by Dr. David Rasnick about AIDS and the corruption of modern science. As I sat on the airplane reading this story, in which he said "the more I examined HIV, the less it made sense that this largely inactive, barely detectable virus could cause such devastation," everything he wrote started making sense to me in a way that the currently accepted model did not. I didn’t have anywhere near all the information, but my instincts told me that what he said seemed to fit.

Over the past ten years, I nevertheless continued my research into mathematical models of HIV infection, all the while keeping an ear open for dissenting voices. By now, I have read hundreds of articles on HIV and AIDS, many from the dissident point of view but far, far more from that of the establishment, which unequivocally promotes the idea that HIV causes AIDS and that the case is closed. In that time, I even published four papers on HIV (from a modeling perspective). I justified my contributions to a theory I wasn’t convinced of by telling myself these were purely theoretical, mathematical constructs, never to be applied in the real world. I suppose, in some sense also, I wanted to keep an open mind.


Reading this, you would conclude that what she had studied in her research was mathematical models of HIV transmission between people. You'd be wrong. Her work is on the transmission of HIV between infected and non-infected cells in culture. For example, here's one abstract of a paper available on the web:


We consider a two-dimensional model of cell-to-cell spread of HIV-1 in tissue cultures, assuming that infection is spread directly from infected cells to healthy cells and neglecting the effects of free virus. The intracellular incubation period is modeled by a gamma distribution and the model is a system of two differential equations with distributed delay, which includes the differential equations model with a discrete delay and the ordinary differential equations model as special cases. We study the stability in all three types of models. It is shown that the ODE model is globally stable while both delay models exhibit Hopf bifurcations by using the (average) delay as a bifurcation parameter. The results indicate that, differing from the cell-to-free virus spread models, the cell-to-cell spread models can produce infective oscillations in typical tissue culture parameter regimes and the latently infected cells are instrumental in sustaining the infection. Our delayed cell-to-cell models may be applicable to study other types of viral infections such as human T-cell leukaemia virus type 1 (HTLV-1).


And there lies the problem. Mathematical models are highly dependent on the details of their formulation and their parameters. Her work on modeling viral behavior is not modeling viral behavior of HIV in a human host; it's modeling viral behavior in a cell culture. That's a useful thing to understand: it helps us understand how HIV attacks and spreads between cells. But it doesn't really tell us anything directly about how HIV should spread between cells in a dramatically different environment. Viruses don't behave the same in culture and in a complete host. (For example, see this paper, describing how differently Hepatitis-A behaves in culture from in a host.)

She's trying to take a mathematical model created to specifically model one environment, and to apply that model to a different environment, without modifying the model for the new environment.

You can't do that. It doesn't work. A good mathematical model of HIV infection in a human host is not the same as a good mathematical model of HIV infection in a cell culture. You can't use the math that you derived for describing a cell culture, and apply it to the new environment without modifying the model to account for the differences between the environments. That's just bad math. Professor Culshaw appears to be a good enough mathematician that she should know better.

More bad probability: Cheating with Chance from AiG

Remember back in this post, I put together a taxonomy of the ways probability was commonly misused?

Well, here's a really great example of the kind of thing I was talking about. This comes from Answers in Genesis, in an article titled, in a striking bit of honesty, "Cheating with chance":



The argument from probability that life could not form by natural processes but must have been created is sometimes acknowledged by evolutionists as a strong argument. The probability of the chance formation of a hypothetical functional ‘simple’ cell, given all the ingredients, is acknowledged to be worse than 1 in 10^57800. This is a chance of 1 in a number with 57,800 zeros. It would take 11 full pages of magazine type to print this number. To try to put this in perspective, there are about 10^80 (a number with 80 zeros) electrons in the universe. Even if every electron in our universe were another universe the same size as ours that would ‘only’ amount to 10^160 electrons.



What kind of errors are there in just this one paragraph?


  1. Big numbers: the whole point of this is to stress that the probability against it is so large that you should consider it impossible.
  2. Fake numbers: Where does 1 in 10^57800 come from? Nowhere. It's just randomly pulled out of a hat. There's no justification for it. It's just a frighteningly big number, which is effective for making the "big numbers" tactic work.
  3. Misshapen search space: the argument is based on a single step from nothing to a fully functional single cell. Yeah, if you could compute the odds of that happening, they'd be pretty damned large. But that's not what the theory that they're trying to refute proposes: in the theory of evolution, the search space would have millions of steps between inorganic chemicals and a complete cell. The probability of successfully reaching a cell in a properly structured search space is dramatically different from the probability of jumping from a pile of inorganic chemicals to a complete cell in a search space with no intermediate states. (There's a quick illustration of this difference in the sketch below.)
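
Just to illustrate how much the shape of the search space matters, here's a quick toy sketch in Python. It is emphatically not a model of abiogenesis; it only shows how drastically the numbers change when a search can keep intermediate results instead of having to get everything right in a single shot:

import random
import string

ALPHABET = string.ascii_uppercase + ' '
TARGET = 'GOOD MATH'

def single_shot_odds():
    # One random draw of the whole string at once.
    return (1 / len(ALPHABET)) ** len(TARGET)

def stepwise_search(rng=random.Random(0)):
    # Fix one position at a time, keeping every partial success.
    tries = 0
    for ch in TARGET:
        while rng.choice(ALPHABET) != ch:
            tries += 1
        tries += 1
    return tries

print(f"single shot: about 1 in {1 / single_shot_odds():.2e}")
print(f"with intermediate steps kept: {stepwise_search()} tries")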


Alas, this is typical of the kinds of arguments that creationists like to use: fake math and strawmen to try to fool people into believing them.

Friday Random Ten

It's friday again.


  1. Cultural Genetics, Frameshift. Would you believe, progressive metal from a Dream Theater spinoff, where all of the songs are based on essays by Richard Dawkins?
  2. The New Madera Waltz, Salamander Crossing. A quiet waltz, by a bluegrass band from New England.
  3. Light Mass Prayers, Porcupine Tree. Great band, which started out as a prank.
  4. Matte Kudasai, King Crimson. King Crimson, probably the best progressive band ever.
  5. Jim Coleman's/Spellan's Fiddle, Harry Bradley. Very sparse traditional Irish music by (arguably) the finest Irish flautist in the world. If you want to hear what Irish music is supposed to sound like, Harry is pretty much the guy to listen to.
  6. Deadwing, Porcupine Tree. PT comes up again today. Gosh, what are the odds of that? Some people would say that couldn't have happened randomly, since I've got thousands of songs on my ipod. :-)
  7. Ego Tripping at the Gates of Hell, the Flaming Lips. A new acquisition, which I'm not thrilled with. I'd heard great things about the Flaming Lips, but this album leaves me cold.
  8. Jeremy Reel, Psychograss. Really fantastic wacko bluegrass from a bunch o'Loonies including my old banjo teacher, the truly great Tony Trischka. (Psychograss also include Darol Anger, Todd Phillips, Mike Marshall, and David Grier.)
  9. What can you mach?, Shirim Klezmer Orchestra. Very cool klezmer album.
  10. Scents and Subtle Sounds, Phish. I really like Phish; shame I didn't discover them until after they broke up.

Thursday, March 23, 2006

Spam in the comments

Unfortunately, we're having our first outbreak of spam in the comments. For now, I'm deleting them as I find them. But I thought it would be good to make clear why you're seeing deleted posts, and what I'm deleting.

The only thing I'm deleting from comments is true spam: "make money fast" schemes, and links to commercial websites trying to boost their google pagerank. I'm not touching substantive comments of any kind.

A bit of logic

The other day, I was at work, hacking away. Now, most days, I wind up working pretty much without interruptions, because I'm on assignment to a product development group, and I'm the only guy in New York working on this product.

But, once in a while, someone bangs on my door to talk. So I'm chatting away, actually talking about Battlestar Galactica, and the guy I'm talking with responds to something with "But that's not logical". I hate what Star Trek did to the common understanding of "logical".

All of this is a roundabout way of saying that I want to talk about logic.

What is Logic?



There's no way to look at a statement in isolation and say that it's not logical. In fact, it's pretty darned close to impossible to say whether something is logical at all without first agreeing on just what logic you're talking about!

Logic is not a single set of rules. It's a structure - or more correctly, a way of studying a particular kind of structured system of rules. Logic is a way of building up a structure that lets you decide whether or not a particular set of statements is meaningful; and if they are, what they mean, and what kind of reasoning you can do using them.

What we usually think of as logic is actually one particular logic,
called first order predicate logic (FOPL). It's the logic of basic "and", "or", and "not" statements, with the ability to talk about "all" of some group of objects having a property, or the existence of an item with a property. (I'll define it in more detail later.)

FOPL is far from the only useful logic out there. Just to give a quick idea of a few different logics, here's a couple with brief descriptions.


Intuitionistic Logic
Something close to FOPL, but it's got the very interesting property that "A or (not A)" is not always true.

Temporal Logic
A logic that is focused on reasoning about how things happen as time passes. Quantifiers in temporal logic aren't things like "For all X, P(x) is true", but things like "Eventually there will be an X where P(X) is true", or "For all time, P(x) is true". Temporal logic is widely used in something called model checking, which is a technique used for (among other things) verifying the correctness of operations in a CPU design.

Fuzzy Logic
A logic where statements aren't specifically true or false, but have degrees of truth associated with them. (corrected from "probablity of truth" based on comment -markcc)



Logic is crucial to mathematics, because math is nothing but the formal systems that you can build using logics.

So - what is a logic?

There are three components to a formal logic:


  1. Syntax: what does a valid statement in the logic look like?
  2. Semantics: what does a valid statement in the logic mean?
  3. Inference Rules: how can you use meaningful statements in the logic to
    infer other meaningful statements?


An Example Logic: Basic Propositional Logic



For a simple example, we can look at basic propositional logic. Propositional logic is a very basic logic: it has a set of primitive statements, and just four operators: AND, OR, NOT, and IMPLIES. So it's easy to explain.

Syntax of Valid Statements in Propositional Logic



So, let's start with syntax.

In propositional logic, we have a set of basic, atomic propositions. (Atomic means that they can't be subdivided into smaller propositions.) We write those with capital letters. Any single basic proposition is a valid statement in the logic.

We also have the operators: AND, OR, NOT, and IMPLIES. If f and g are valid statements, then not f, not g, f and g, f or g, and f implies g are all valid statements.

Finally, we have parenthesis for grouping: if f is a valid statement, then ( f ) is a valid statement.

So, each of the following is a valid statement:


A
B
A or B
A and B
A and B and C or (D and E) or (E implies F)
not A
not A implies (C and D or E)



That's it for syntax.

Semantics: The meaning of propositional logic



For semantics, we start by saying _which_ of the basic propositions is true. So, for example, if every upper-case letter were a proposition, we could say that A through M were true, N through Z were false.

Then we define the meanings of the operators:

x AND y       x is TRUE   x is FALSE
y is TRUE     TRUE        FALSE
y is FALSE    FALSE       FALSE

x OR y        x is TRUE   x is FALSE
y is TRUE     TRUE        TRUE
y is FALSE    TRUE        FALSE

x IMPLIES y   x is TRUE   x is FALSE
y is TRUE     TRUE        TRUE
y is FALSE    FALSE       TRUE

NOT x
x is TRUE     FALSE
x is FALSE    TRUE


Given those tables and a truth binding for all of the atomic propositions, we can work out whether any given statement is true or false.

Inference rules: reasoning using propositional logic



Now, finally, inference rules. Suppose that we aren't given an exhaustive list of all of the atomic propositions and whether they're true or false. We'd like to be able to use statements that we know to decide whether or not other statements are true. There's a set of rules we can use to do this:


  • If we know: x is TRUE, and x IMPLIES y is TRUE;
    then we can conclude: y is TRUE.
  • If we know: y is FALSE and x IMPLIES y is TRUE;
    then we can conclude that x is FALSE.
  • If we know: x is TRUE and x AND y is TRUE;
    then we can conclude that y is TRUE.
  • If we know: x is TRUE and NOT (x AND y) is TRUE;
    then we can conclude that y is FALSE.
  • If we know: x is FALSE and x OR y is TRUE;
    then we can conclude that y is TRUE.
  • If we know: x OR y is FALSE;
    then we can conclude: x is FALSE and y is FALSE.


And so on. To be complete, we'd need to add some more, but the above are the basics. (We would need a rule for NOT; for AND and OR we would need some rules for commutativity, etc.)

So, finally, let's take an example of propositional logic.

Let's take some basic propositions (note that I'm not yet saying whether any of these are true or false!):

Proposition A = I am an annoying person.
Proposition B = I am talking to you.
Proposition C = You are annoyed at me.


We assert that the following two things are TRUE:

  1. (A AND B) IMPLIES C. (In English, "IF I am an annoying person AND I am talking to you, THEN you are annoyed at me.")
  2. NOT C. (In English, "You are not annoyed at me." Remember, this is just for illustration's sake; by now, you may well be annoyed at me in reality!)


Now, what can we figure out by knowing that those two statements are true?

By one of the inference rules we gave above, "If we know: y is FALSE and x IMPLIES y is TRUE; then we can conclude that x is FALSE." We can use this rule by taking (A AND B) for x, and C for y: if we know that C is false, and (A AND B) IMPLIES C is true, then we know that (A AND B) is false. So we can conclude NOT (A AND B), which means "It is not the case that I am an annoying person and I am talking to you"; which can be switched (through a sequence of changes) to "Either I am not annoying, or I am not talking to you (or both)."

Wednesday, March 22, 2006

Bad Math for Bad Politics

I was forwarded a link this afternoon, to some really bad political math. It seems that the great blowhard, Rush Limbaugh, had the following image on his website:




It has apparently since disappeared; but there've been a lot of similar pieces showing up in the conservative press. So let's just take a look at the numbers. They claim to be putting the number of American deaths in Iraq into perspective. So, they pick a few ways that a lot of people have died, and give the numbers:


  1. Death by auto accident: 120,000
  2. Death by falling: 45,000
  3. Death by poisoning: 27,000
  4. Death by drowning: 12,000
  5. War in Iraq, all causes: 2,300


So, what's wrong with this picture?

The problem is that it's comparing apples to orchards.

The military doesn't give the exact number of soldiers in Iraq. Various reports I found on the web give the number anywhere from 120,000 to 150,000. So, let's be generous, and assume that the number of soldiers there is at the large end. That means that we're talking about 2,300 deaths out of about 150,000 people.

According to the US census bureau, the number of American citizens is 295,734,134. We'll round that down to 295,000,000. And we'll ignore the fact that there are a lot of non-citizens in America, and that the death numbers cited other than the Iraq war are for American residents, not American citizens. So we're talking about 216,000 deaths out of about 295,000,000 people.

Let's normalize those numbers, to deaths/1000 people.


  1. Death by auto accident: 0.4/1000
  2. Death by falling: 0.15/1000
  3. Death by poisoning: 0.1/1000
  4. Death by drowning: 0.04/1000
  5. Death of US soldiers in Iraq: 15/1000


So, if we actually look at the numbers in a realistic context - that is, normalized so that we can compare them in a meaningful way - the death rate of soldiers in Iraq is about 22 times the combined rate of the most common accidental deaths in the US. Put another way, if the death rate of Americans in general were the same as the death rate of American soldiers in Iraq, then since the beginning of the Iraq war, we would have seen 4,425,000 accidental deaths instead of 216,300.

And as an aside, I've got to say, the image of Rush Limbaugh holding an American flag while trying to diminish the magnitude of the sacrifice made by American soldiers in this foolish war is deeply, deeply sickening. Give our soldiers the respect they deserve for the kinds of sacrifices they're making: what they're doing is a whole lot harder and more dangerous than anything any of us do in our normal lives. To pretend that the deaths of these people - not to mention the deaths of all of the innocent Iraqis - are the equivalent of getting into a car accident is disgraceful.

Tuesday, March 21, 2006

Crackpot math: Velikovskians and Dinosaurs

As I always say, one of the best ways of recognizing a crackpot is through math. You can make nice arguments about all sorts of theories using words, but if the math doesn't back it up, the words are worthless.

One of my favorite examples of this comes from back in the days when I was active on the usenet talk.origins newsgroup, where there were a bunch of Velikovskians. My personal favorite was a wacko named Ted Holden. Ted believes that gravity is not the dominant force in orbital mechanics, but that it's all electromagnetic. (But of course, there's no math in that argument!)

Ted also believes in a strange theory where Saturn used to be an "electromagnetic star" positioned very close to the earth, and that this created a zone of weakened gravity on earth. His argument for this is based, among other things, on the idea that dinosaurs couldn't possibly have stood in current earth gravity. Why not? Well, it's obvious! If we scale a human weightlifter up to the size of a dinosaur, he won't be able to stand up.

In Ted's own words:


Any animal has to be able to lift its own weight off the ground, i.e. stand up, with no more difficulty than Kazmaier experiences doing a 1000 lb. squat. Consider, however, what would happen to Mr. Kazmaier, were he to be scaled up to 70,000 lb., the weight commonly given for the brontosaur. Kazmaier's maximum effort at standing, fully warmed up, assuming the 1000 lb. squat, was 1340 lb. (1000 for the bar and 340 for himself). The scaled maximum lift would be a solution to:
1340/340^.667 = x/70,000^.667 or 47,558 lb..
He'd not be able to lift his weight off the ground!
A sauropod dinosaur had four legs you might say; what happens if Mr. Kazmaier uses arms AND legs at 70,000 lb.. The truth is that the squat uses almost every muscle in the athlete's body very nearly to the limits, but in this case, it doesn't even matter. A near maximum benchpress effort for Mr. Kazmaier would fall around 600 lb.. This merely changes the 1340 to 1940 in the equation above, and the answer comes out as 68,853. Even using all muscles, some more than once, the strongest man who we know anything about would not be able to lift his own weight off the ground at 70,000 lb.!

Moreover, Kazmaier is using glutteal and lower back muscles in the squat, and pectorals in the benchpress, i.e. extra muscle groups which the sauropod he is being compared to would not be assisted by in standing. Any tiny advantage in leverage which a sauropod might have over the human lifter for any reason, would be overwhelmed by the huge edge in available musculature and the usage of the extra muscle groups on the part of the human in the comparison.

To believe then, that a brontosaur could stand at 70,000 lb., one has to believe that a creature whose weight was largely gut and the vast digestive mechanism involved in processing huge amounts of low-value foodstuffs, was somehow stronger than a creature its size which was almost entirely muscle, and that far better trained and conditioned than would ever be found amongst grazing animals. That is not only ludicrous in the case of the brontosaur, but the calculations only get worse when you begin trying to scale upwards to the supersaur and ultrasaur at their sizes.

How heavy can an animal still get to be in our world, then? How heavy would Mr. Kazmaier be at the point at which the square-cube problem made it as difficult for him just to stand up as it is for him to do 1000 lb. squats at his present size of 340 lb.? The answer is simply the solution to:

1340/340^.667 = x/x^.667

or just under 21,000 lb.. In reality, elephants do not appear to get quite to that point. McGowan (DINOSAURS, SPITFIRES, & SEA DRAGONS, p. 97) claims that a Toronto Zoo specimen was the largest in North America at 14,300 lb., and Smithsonian personnel once informed the author that the gigantic bush elephant specimen which appears at their Museum of Natural History weighed around 8 tons.
Again, in all cases, we are comparing the absolute max effort for a human weight lifter to lift and hold something for two seconds versus the sauropod's requirement to move around and walk all day long with scaled weight greater than these weights involved in the maximum, one-shot, two-second effort. That just can't happen.


Yes indeedy - because a scaled-up version of the weightlifter Bill Kazmaier couldn't lift his own weight off the ground at dinosaur size, a dinosaur couldn't possibly have stood up.

That's the argument: if a human scaled up with perfect proportions to the weight of a dinosaur, he wouldn't be able to stand up. The only math is the multiplication needed to scale a human weightlifter to dinosaur size.

What's wrong with the math? Mainly the fact that it's totally invalid.

It's true that pretty much, muscle is muscle. There's no dramatic difference between the strength of muscles on different animals. But: there's a huge difference in anatomy.

The weight that a muscle can lift isn't just based on how much tension the muscle can create. It's based on the structure of the thing it's pulling against. Remember the lever principle: the torque that I can exert by pushing or pulling on a lever is proportional to how far I am from the fulcrum. Muscles act the same way: to know how much a muscle can lift, you need to know the pivot points and the attachment points, to work out the effective length of the lever arm.

Humans doing squat lifts or bench presses are terrible at this. Our joints are extremely skinny; the attachment points are set up with relatively small moment arms - which means that our joints do not lift well. Look at a human skeleton's knee joint - and compare it to a dinosaur's knee joint:





Look at the width of the bone at the knee joint! That's one serious lever. Measuring it with some screen ruler tools, I get human knee joints as around 2x the width of the leg bone, but that doubling is shared on both sides of the bone - it's close to symmetrical. The dinosaur bone is very asymmetrical, and it projects forwards approximately 4x(!) the width of the lower leg bone. Ignoring any other factors, the dinosaur will have at least three times the lifting strength in its leg that the human has. And the knee is a relatively strong joint for a human; our backs are terribly weak in an upright posture. Comparatively, dinosaurs had very different vertebrae and hips from us, which sacrificed mobility in exchange for strength. (See, for example, here.)

So does it make any mathematical sense to extrapolate what a dinosaur could do based on comparison with a scaled human? Hell no. The math just doesn't support it.

Some site setup news

Just to let folks know, I've been doing a bit of work on polishing the website. There's now an RSS feed, down by the end of the blogroll on the right-hand side. There are a few other things that I've done to make things work a bit more smoothly, but that's the main one that people have been asking me about.

Clarifying Recursion

I managed to confuse a lot of people yesterday in my post on building math, because I forgot that normal people don't understand recursion. For a computer scientist, or a mathematician involved in logic, recursion is so fundamental that it's easy to forget that when you first see it, it looks like some kind of bizarre magic. In my day job, where I'm currently doing software development, I use recursion every day.

Hopefully I can clear up that confusion.

The cleverest definition that I've seen of recursion comes from the Hacker's Dictionary. In there, it has:



Recursion

see "Recursion"




Recursion is about defining things in terms of themselves. For example, think about the factorial. For any positive integer N, the factorial of N (written N!) is the product of all of the integers from 1 up to and including N:

  • 1! = 1
  • 2! = 1 * 2 = 2
  • 3! = 1 * 2 * 3 = 6
  • 4! = 1 * 2 * 3 * 4 = 24
  • 5! = 1 * 2 * 3 * 4 * 5 = 120
  • ...



But that's a cumbersome way of saying it. Instead, if you look at it, you can see that there's a pattern: the factorial of each number N is the product of a sequence of numbers, and that sequence is exactly the same as the sequence for the number before it, with N added onto the end.

Let's use that fact to make the list a bit simpler; let's just replace the sequence for everything but N with the product of the numbers in that part of the sequence:


  • 1! = 1
  • 2! = 1 * 2 = 2
  • 3! = 2 * 3 = 6
  • 4! = 6 * 4 = 24
  • 5! = 24 * 5 = 120
  • ...



Now the pattern should be reasonably easy to see: look at 4! and 5!; 4! = 24; 5! = 5 * 24. Since 4! is 24, we can then say that 5! = 5 * 4!.

So we can say that in general, N! = N * (N-1)!.

Except that that doesn't quite work. To see why, let's use it to compute 3!: 3! = 3 * 2! = 3 * 2 * 1! = 3 * 2 * 1 * 0! = 3 * 2 * 1 * 0 * (-1)! = ...

To make recursion work, you need to define a base case: that is, you need to define a starting point. For factorial, the way we do that is say that the factorial of 0 = 1. Then things work: factorial is only supposed to work for positive numbers; and for any positive number, the recursive definition will expand until it hits 0!, and then it will stop. So let's look at 3! again:

3! = 3 * 2! = 3 * 2 * 1! = 3 * 2 * 1 * 0! = 3 * 2 * 1 * 1 = 6.

The way that we write a recursive definition is to state the two cases using conditions, in something that looks almost like a little bit of a program:


  • N! = 1 if N = 0
  • N! = N * (N-1)! if N > 0


Recursion is often used in math in another way: often, one of the easiest ways to prove something is called induction; induction is nothing but using recursion to define a proof.

For a very typical example, we can look at something that works a lot like recursion. What if we want to take the sum of every number from 1 to N? Let's write that as S_N. Well, there's an equation for it: the sum of all of the integers from 1 to N = S_N = N*(N+1)/2. Here's how we can prove that.


  • We start with a base case. Let's do two of them, just for the heck of it:

    • if N = 1, then the sum of all integers from 1 to 1 = 1. According to the equation, that's 1(1+1)/2 = 2/2 = 1.
    • if N = 2, then it's the sum of all integers from 1 to 2 = 1 + 2 = 3. According to the equation, that's 2(2+1)/2 = 2*3/2 = 3.


  • Now we look at the inductive case. We assume that the equation is true for a value N, and we prove that if it's true for N, then it will also be true for N+1.


    • Assume that S_N = N*(N+1)/2
    • We want to prove that S_(N+1) = (N+1)(N+2)/2.
    • We know that S_(N+1) = (N+1) + S_N
    • We can expand that to: (N+1) + N(N+1)/2
    • We can expand further: (N+1) + (N^2 + N)/2
    • We can multiply (N+1) by (2/2) so that we can combine things; since 2/2=1, that doesn't change anything: (2N+2)/2 + (N^2 + N)/2
    • Now we can add the numerators: (N^2 + 3N + 2)/2
    • Now we factor the top: (N+1)(N+2)/2.


  • So, we've shown that for 1 and 2, the equation holds. Since it holds for two, by the induction it holds for 3. If it holds for 3, by the induction, it holds for 4, and so on.


If you followed this, congratulations, you understand recursion! If you didn't, please ask questions in the comments, so that I can try to fix this until it's comprehensible.

Monday, March 20, 2006

Building Math

Some of the things that I find myself saying in comments, or that I'd like to be able to talk about require understanding something about basic mathematics that most people are not taught. I think the way that we teach math is a damn shame, but that's the way things are, at least for now. But it does take a lot of the fun out of it.

Say the word mathematics to most people, and what they think of first is numbers. But numbers aren't the only thing in math - in fact, you can do all sorts of interesting math without numbers.

What math is really about is creating structured systems with formal rules describing what is and is not a valid statement in the system, and for determining what is and is not true in that system.

As I mentioned in some earlier posts, Godel blew a gaping hole in mathematics. But for most purposes, that's OK. We understand the limits that Godel puts on us, and most of the time, we can avoid the problems of self-reference that trigger Godel's incompleteness results.

The reason that numbers are what people think of in mathematics is because that's where it all began. First people thought of numbers, because they needed to be able to count things; and then they started to think about what they meant. The whole edifice of mathematics and logic grew out of there. What we're usually taught in school is mathematics constructed from a combination of number theory and set theory.

For today, I'm going to take basic set theory, and show you how we can build a very simple version of mathematics. What I'll do is show you how, using sets - and nothing but sets - we can build numbers, with addition and subtraction operators. To keep things simple, I'll also just stick to the natural numbers - we can construct the other kinds of numbers given nothing but the naturals. (How we do that is a topic for another day - but for now, just remember that we do this with computers - computers only really understand integers, but they can encode other things using sets of integers - ratios, decimals, even irrational numbers.)

So let's start off by recalling what a set is. A set is a construct which can
enumerate some collection of values. What those values are, we'll leave for a few moments. But given a set, we can ask what values it contains. If it's a finite set, we can see the full list of all values. Also, given a set, we can provide a value, and check if the set contains that value. This is called a membership test.

Also, given two sets, we can compute two basic sets from the combination: union, and intersection. The members of the union of sets A and B contain all of the values that are members of either A OR B. The members of the intersection of A and B are all of the values that are in A AND B.

Ok. Now, here's where it gets tricky. There's nothing but sets so far. So the only things that exist are the empty set, the set containing the empty set, and sets built up that way.

To get numbers, we use set wrapping. We use the empty set as zero (and I'll write it as zero as well). One is the set containing the empty set, or { 0 }. Two is the set containing the set containing the empty set, or { { 0 } }.

To get any number at all, we'll define an increment function, and we'll define decrement as the inverse of increment. (We'll talk about what functions are, later.)


inc(x) = { x }
dec(inc(x)) = x


Now we've got numbers. Addition is simple:


x + y =


  • if x = 0 then y
  • otherwise dec(x) + inc(y)




Now, we can start thinking about sets of numbers. That's easy - just sets containing our encoding of values.

So, how can we do functions? Well, a function is a (possibly infinite) set of pairs: the first element of each pair is from the domain of the function, and the second is from the range. So, for instance, the increment function I mentioned up above is an infinite set of pairs:


{ (0,1), (1,2), (2,3), ... (x,x+1)... }


So to represent functions, all we need to do is represent pairs.

Pairs are easy: they're sets with two values. The only catch is that because sets are intrinsically unordered, we need some way of distinguishing the first element of the pair (the index) from the second element of the pair (the value).

So: we divide the numbers into two sets. Even numbers we still treat as numbers, but an even number n represents the value n/2. Odd numbers are position labels: an odd number n labels position (n-1)/2.

So, a pair is just a set of two numbers, one even, one odd. The ordered pair (1,2) would be set-represented as { 3, 4 }, or { { { { 0 } } }, { { { { 0 } } } } }.

So now we have numbers, pairs, and functions. We can actually do arbitrary tuples with as many values as we want, which lets us do multi-parameter functions. So in fact, we have everything we need to do all of the math that can be done with the conventional math of natural numbers, but it's done in terms of nothing but sets - absolutely nothing but sets.

You can "build" mathematics this way from almost anything. I've seen full mathematical constructions from computing machines, functions (lamba calculus), category theory, bag theory, pure logic, communicating machines (pi calculus), and more.

There is one very big catch to this, and that's called the model. Essentially, the model is a construct in logic that shows that there is a way of making sense out of this. A model, roughly speaking, is one of two things: a set of logic rules that show how the rules of your math can map onto something real; or a set of transformation rules that convert any meaningful statement in your mathematics into a statement in another form of mathematics with a model.

Models are tricky things. It took me a frightfully long time to really understand what they were about, and why people cared about them. I'll do a post (or more likely a couple) about models at some point; but for now, the short version of why you need a model is:

You can define a mathematical system that looks perfectly reasonable, but accidentally contains an error that allows it to be totally inconsistent - that is, that makes every statement true: not just good statements, like 3+2=5, but all statements: 1 = 0, A implies NOT A, 3 + 2 = 27, etc. The model is the grounding that shows that your system is as complete and consistent as a formal system can be.

The conflict between IC and IT arguments

As I was responding to comments over the weekend, something interesting struck me about the irreducible complexity (Behe) and the information theory arguments against evolution that I discussed last week. The two arguments are actually contradictory. Anyone who accepts one really should not under any circumstances accept the other.

Behe's argument is strictly constructive: that is, it relies on monotonically increasing complexity (which means monotonically increasing information). It argues that in evolution, structures can only be created by adding elements. If structures can become less complex, then the irreducible complexity argument fails.

To see why, just think about a complexity landscape. (Now, for clarity, remember that the complexity landscape is different from the fitness landscape; they both cover the same graph of nodes, but the values at any point in the landscape are different for fitness and complexity.) The complexity landscape can be seen as a three-dimensional surface, with the elevation of any point being its complexity value.

Dembski basically argues that an evolutionary process can only climb hills: it can only increase the complexity, so it can't go downhill. So it can't get to an IC system, unless the IC system is on an uphill slope; and he argues that there's no way to get to that point - essentially that the surface is discontinuous under the point of the IC system, so you can't climb to it. But if you can go downhill, then you can climb over a peak, and descend to the IC system by throwing away unneeded parts. If you can ever decrease complexity, then there is no way to avoid the fact that you can reduce to a minimum point. (Per Chaitin, you can't be sure that that's the lowest point, just that it's a local minimum. There might be a lower point, but you'll need to climb up out of the local minimum "valley" to get to someplace where you go down
to reach it.) (As an aside: if anyone knows of good surface-drawing software for a macintosh, please let me know; this would be a lot easier to explain with a picture, but I don't have any software handy where I can quickly and easily make a surface.)


The "information theory" argument is that evolutionary systems cannot be constructive. They argue that an evolutionary process can never produce information - and thus cannot ever produce new structures. If this is the case, then you can't create an irreducibly complex structure by addition - if it happens naturally, it can only happen by destruction of information: reducing the complexity of structures until they have the minimum information that they need to produce a required structure.

You can't have it both ways. If you want to make the Behe argument about irreducible complexity, then you must argue that evolution is monotonically increasing with respect to information/complexity. If it can ever decrease, then it becomes possible for a random process to reach an IC state. If you want to make the so-called information theory argument, then you're setting a basic premise that information/complexity in a random system is monotonically decreasing.

So anyone who tries to use both arguments without explicitly separating them is either thoroughly clueless or deliberately deceptive.

Saturday, March 18, 2006

Another Take on Information Theory

Jeffrey Shallit, over at his Recursivity blog, has also taken on the creationists' misuse of information theory. Alas, he calls it "Kolmogorov" information theory; I come from the school that says Kolmogorov made a critical mistake that can result in invalid results; Greg Chaitin started off at exactly the same place as Kolmogorov, but a few years later, caught on to the mistake and fixed it. (For those interested, the difference is that in Kolmogorov's version, the program that you're analyzing doesn't signal when it's done generating its output; in Chaitin's, it does.)

Friday, March 17, 2006

The Problem with Irreducible Complexity

Today, I thought I'd take on another of the intelligent design sacred cows: irreducible complexity. This is the cornerstone of some of the really bad arguments used by people like Michael Behe.

To quote Behe himself:


By irreducibly complex I mean a single system composed of several well-matched, interacting parts that contribute to the basic function, wherein the removal of any one of the parts causes the system to effectively cease functioning. An irreducibly complex system cannot be produced directly (that is, by continuously improving the initial function, which continues to work by the same mechanism) by slight, successive modifications of a precursor system, because any precursor to an irreducibly complex system that is missing a part is by definition nonfunctional. An irreducibly complex biological system, if there is such a thing, would be a powerful challenge to Darwinian evolution. Since natural selection can only choose systems that are already working, then if a biological system cannot be produced gradually it would have to arise as an integrated unit, in one fell swoop, for natural selection to have any thing to act on.


Now, to be clear and honest upfront: Behe does not claim that this is a mathematical argument. But that doesn't mean that I don't get to use math to shred it.

There are a ton of problems with the whole IC argument, but I'm going to take a different tack, and say that even if those other flaws weren't there, it's still a meaningless argument. Because from a mathematical point of view, there's a critical, fundamental problem with the entire idea of irreducible complexity: you can't prove that something is irreducibly complex.

This is a result of some work done by Greg Chaitin in Algorithmic Complexity Theory. A fairly nifty version of this can be found on Greg's page.

The fundamental result is that given a complex system, you cannot in general show that there is no smaller/simpler system.

As usual for algorithmic information theory, the proof is in terms of computer programs, but it works beyond that; you can think of the programs as programs to simulate a device.

First, suppose that we have a programming system, which we'll treat as a function. P(x) = the result of running program x on P. x is both a program, and its input data, coded into a single string, so x=(c,d), where c is code, and d is data.

Now, suppose we have a formal axiomatic system, which describes the basic rules that P operates under. We can call this FAS.

If it's possible to tell whether you have a minimal program using the axiomatic system, then you can write a program that examines other programs, and determines if they're minimal. Even better: you can write a program that will generate a list of every possible minimal program, sorted by size.




Let's jump aside for just a second to show how you can do that - generate every possible minimal program:


  1. First, write a program which generates every possible string of one character, then every possible string of two characters, etc., and outputs them in sequence.
  2. Connect the output of that program to another program, which checks each string that it receives as input to see if it's a syntactically valid program. If it is, it outputs it. If it isn't, it just discards it.

  3. At this point, we've got a program which is generating every possible program for P. Now, remember that we said that using FAS, we can write a program that tests an input program to determine if it's minimal. So, we use that program to test our inputs, to see if they're minimal. If they are, we output them; if they aren't, we discard them.


Now, let's take a second, and write out the program in mathematical terms:

Remember that P is a function modeling our computing system, FAS is the formal axiomatic system. We can describe P as a function from a combination of program and data to an output: P(c,d)=result.

In this case, c is the program above; d is FAS. So P(c,FAS)=a list of minimal programs.




Ok, now, back to the main track.

Take our program, c, and our formal axiomatic system, FAS, and compute their
length. Call that l.

Now, run P(c,FAS) until it generates a string longer than l.

Ok. Now, write a program c' for P that runs P(c,FAS) until the length of the output is larger than l + some constant. Set the constant to the length of c'.

What does this program do? It runs a program which generates a sequence of minimal programs until it finds one larger than itself plus all of its data, then it runs that and emits the output.

So - c' finds and outputs a supposedly minimal program larger than itself. But since c' is a program which emits the same result but is smaller, that other program isn't minimal.

Evil, huh?

But the point of it is quite deep. It's not just a mathematical game. We can't tell when something complicated is minimal. Even if we knew every relevant fact about all of the chemistry and physics that affects things, even if the world were perfectly deterministic, we can't tell when something is as simple as it can possibly be.

So irreducible complexity is useless in an argument, because we can't know when something is irreducibly complex.