## Tuesday, May 30, 2006

### Types in Lambda Calculus

Now that we've got intuitionistic logic under our belts, we're ready to get back to lambda calculus: we've got the logical tools we need to define the model. Of course, nothing is ever quite that simple, is it?

What we've talked about so far is the simple untyped lambda calculus. That's how LC started; it's the first version that Church invented. But it did end up with some problems, and to get around that, a concept of types was introduced, leading to the simple typed lambda calculus, and on from there to a world of variants - SystemT, SystemF, the Lambda Cube (not related to the time cube :-)), and so one. And eventually, people realized that the untyped lambda calculus was really just a simple pathologically simple case of a typed lambda calculus - it's a typed LC with only one type.

The semantics of lambda calculus are easiest to talk about in a typed version. For now, I'll talk about the simplest typed LC, known as the simply typed lambda calculus; and how it's ultimately semantically equivalent to an intuitionistic logic. (In fact, every version of typed LC corresponds to an IL; and every beta reduction in an LC corresponds to an inference step in an IL proof; that's why we needed to sidetrack into intuitionistic logic to get here.)

The main thing that typed lambda calculus adds to the mix is a concept called base types. In a typed lambda calculus, you have some universe of atomic values which you can manipulate; those values are partitioned into simple types. Base types are usually named by single lower-case greek letters; since that's a pain on Blogger, I'll use upper-case english letters for type names. So, for example, we could have a type "N", which consists of the set of natural numbers; a type "B" which corresponds to boolean true/false values; and a type "S" which corresponds to strings.

Once we have basic types, then we can talk about the type of a function. A function maps from values of one type (the type of parameter) to values of a second type (the type of the return value). For a function that takes a parameter of type "`A`", and returns a value of type "`B`", we write its type as "`A -> B`". "->" is called the function type constructor; it associates to the right, so "`A -> B -> C`" means "`A -> (B -> C)`".

To apply types to the lambda calculus, we do a couple of things. First, we need a syntax update so that we can include type information in lambda terms. And second, we need to add a set of rules to show what it means for a typed program to be valid.

The syntax part is easy. We add a ":" to the notation; the colon has an expression or variable binding on its left, and a type specification on its right. It asserts that whatever is on the left side of the colon has the type specified on the right side. A few examples:
1. `lambda x : N . x + 3`. This asserts that the parameter, `x`, has type "`N`", which is the natural numbers. There is no assertion of the type of the result of the function; but since we know that "+" is a function with type "`N -> N`", which can infer that the result type of this function will be "N".
2. `(lambda x . x + 3) : N -> N`. This is the same as the previous, but with the type declaration moved out, so that it asserts the type for the lambda expression as a whole. This time we can infer that "`x : N`" because the function type is "`N -> N`", which means that the function parameter has type "`N`".
3. `lambda x : N, y : B . if y then x * x else x`. A two parameter function; the first parameter is type "`N`", the second type "`B`". We can infer the return type, which is "`N`". So the type of the full function is "`N -> B -> N`". This may seem surprising at first; but remember that lambda calculus really works in terms of single parameter functions; multi-parameter functions are a shorthand for currying. So really, this function is: `lambda x : N . (lambda y : B . if y then x * x else x)`; the inner lambda is type "`B -> N`"; the outer lambda is type "`N -> (B -> N)`".
To talk about whether a program is valid with respect to types (aka well-typed), we need to introduce a set of rules for type inference. When the type of an expression is inferred using one of these rules, we call that a type judgements. Type inference and judgements allow us to reason about types in a lambda expression; and if any part of an expression winds up with an inconsistent type judgement, then the expression is invalid. (When Church started doing typed LC, one of the motivations was to distinguish between values representing "atoms", and values representing "predicates"; he was trying to avoid the Godel-esque paradoxes, by using types to ensure that predicates couldn't operate on predicates.)

I'm going to have to adopt a slightly unorthodox notation for type judgements; the standard notation is just too hard to render correctly with the software I'm using currently. The usual notation is something that looks like a fraction; the numerator consists of statements that we know to be true; the denominator is what we can infer from those. In the numerator, we normally have statements using a context, which is a set of type judgements that we already know; it's usually written as an uppercase greek letter. For type contexts, I'll use the names of greek letters written in uppercase letters. If a type context includes the statement that "`x : A`", I'll write that as "`CONTEXT |- x : A`". For the fraction inference notations, I'll use a two lines labelled "Given:" and "Infer:". (You can see what the normal notation looks like by visiting wikipedia's STLC page.)

Rule1: (Type Identity)`Given: nothingInfer: x : A |- x : A`
The simplest rule: if we have no other information except a declaration of the type of a variable, then we know that that variable has the type it was declared with.

Rule2: (Type Invariance)`Given: GAMMA |- x : A, x != yInfer: (GAMMA + y : B) |- x : A`
This is a statement of non-interference. If we know that "`x : A`", then inferring a type judgement about any other term cannot change our type judgement for "x".

Rule3: (Parameter to function inference)` Given: (GAMMA + x : A) |- y : BInfer: GAMMA |- (lambda x : A . y) : A -> B`
This statement allows us to infer function types: if we know the type of the parameter to a function is "`A`"; and we know that the type of the value returned by the function is "`B`", then we know that the type of the function is "`A -> B`".

And finally, Rule4: (Function application inference)`Given: GAMMA |- x : A -> B, Gamma |- y : AInfer: GAMMA |- (x y) : B`
If we know that a function has type "`A -> B`", and we apply it to a value of type "`A`", the result is an expression of type "`B`".

These four rules are it. If we can take a lambda expression, and come up with a consistent set of type judgements for every term in the expression, then the expression is well-typed. If not, then the expression is invalid.

Just for kicks, here's how we'd write types for the SKI combinators. These are incomplete types - I'm using type variables, rather than specific types. In a real program using the combinators, you could work out the necessary substitutions of concrete types for the type variables. Don't worry, I'll show you an example to clarify that.
• I combinator: `(lambda x . x) : A -> A`
• K combinator:: ` (lambda x : A . ((lambda y : B . x) : B -> A)): A -> B -> A`
• S combinator: `(lambda x : A -> B-> C . (lambda y : A -> B . (lambda z : A . (xz : B -> C) (yz : B)))) : (A -> B -> C) -> (A -> B) -> C`
Now, let's look at a simple lambda calculus expression: `lambda x y . y x`. Without any type declarations or parameters, we don't know the exact type. But we do know that "x" has some type; we'll call that "A"; and we know that "y" is a function that will be applied with "x" as a parameter, so it's got parameter type A, but its result type is unknown. So using type variables, we can say "`x : A, y : A -> B`". We can figure out what "`A`" and "`B`" are by looking at a complete expression. So, let's work out the typing of it with x="3", and y="`lambda a : N. a*a". We'll assume that our type context already includes "*" as a function of type "N -> N -> N".`
• `(lambda x y . y x) 3 (lambda a : N . a * a)`
• Since 3 is a literal integer, we know its type: `3 : N`.
• By rule 4, we can infer that the type of the expression "`a * a`" where "`a : N`" is "`N`"; (* : N -> N -> N; it's applied to an N and an N) and therefore, by rule 3 the lambda expression has type "N -> N". So with type labelling, our expression is now: `(lambda x y . y x) (3 : N) (lambda a : N . (a * a):N) : N -> N`
• So - now, we know that the parameter "x" of the first lambda must be "N"; and "y" must be "`N -> N`"; so by rule 4, we know that the type of the application expression "`y x`" must be "N"; and then by rule 3, the lambda has type: `N -> (N -> N) -> N`.
• So, for this one, both A and B end up being "N".
So, now we have a simply typed lambda calculus. The reason that it's simply typed is because the type treatment here is minimal: the only way of building new types is through the unavoidable "->" constructor. Other typed lambda calculi include the ability to define parametric types, which are types expressed as functions ranging over types.