A perspective on four languages

November 13, 2009

C, Common Lisp, Forth, and Haskell represent extremes of the different media in which programmers can work. When I say medium, I mean it in the same way as an artist would. Clay and marble are different media for sculptors. Pen and ink differ from oil paints. Each is a combination of substrate, tools, and idiom for ordering the world. Similarly, these four programming languages represent the far corners of the ways a programmer may approach his creations.

What makes these languages really different? I believe there are two aspects. The first is the underlying mathematical substrate. C and Forth are built around the manipulation of blocks of memory. Common Lisp and Haskell are built around variants of the lambda calculus. Each substrate makes certain approaches to problems natural, though neither is a natural vehicle for human thought. The brain's adaptation to the lambda calculus is usually experienced as an epiphany. There is a similar adaptation to machine memory, but it takes place without the epiphany.

The other aspect is more difficult to explain, but is what led me to this classification in the first place. When writing Haskell I found myself coming up against things I was unable to express efficiently, usually involving letting the program change as it ran. Whenever I found myself in this difficulty, my brain reflexively turned to Lisp. Trying to envision Lisp from within Haskell was bizarre. I knew how to envision Haskell within Lisp, but not the other way around. After struggling, I found a useful metaphor. I know how to manipulate symbols according to rules on paper; what does manipulating symbols on paper look like from within an axiomatic system?

Common Lisp is one answer to this question. Its core is an axiomatic system, the lambda calculus, which John McCarthy extended just enough to write an interpreter for the resulting calculus in itself. Common Lisp augments the interpreter with other automata on the same substrate which make construction straightforward. Though the choice of automata may vary, the essential idea of reifying the act of construction is universal.

The separation between Haskell and Common Lisp also arises between C and Forth. C applies its own grammar, independent of the underlying hardware, to the manipulation of memory. It was designed by Dennis Ritchie at Bell Labs, in the company of Kernighan and Thompson, to abstract the machine. Forth was written by a man with the opposite priorities. Charles Moore was trying to reach in and touch his machine. The machine itself was his medium, and he has devoted decades since then to the design and construction of beautiful machines. Forth is a pattern in memory on these machines, the equivalent of a blank piece of paper, which is to be worked until the machine bears the desired form.

Interestingly, using the lambda calculus as a substrate has driven Common Lisp and Haskell farther apart than Forth and C. A variable in both Forth and C represents a block of memory, while a variable in Haskell and a variable in Common Lisp are utterly different things. Haskell's variables are mathematical equivalences. a = 3 is a statement that anywhere a occurs in a program, 3 could just as well occur, and the compiler may substitute 3 for a wherever it thinks prudent.
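
The C half of that claim can be made concrete. What follows is only an illustrative sketch, and the function name overwrite_through_bytes is invented here. It changes a variable by writing to its bytes, never assigning to it by name, which only makes sense if the variable is a block of memory.

```c
#include <assert.h>
#include <string.h>

/* Illustrative sketch; overwrite_through_bytes is a made-up name.
   Changes the variable a without ever assigning to it by name. */
int overwrite_through_bytes(int replacement)
{
    int a = 3;                                   /* a names sizeof(int) bytes of storage */
    unsigned char *block = (unsigned char *)&a;  /* the same storage, viewed as raw bytes */

    memcpy(block, &replacement, sizeof a);       /* rewrite the bytes, not "a" */

    assert(a == replacement);                    /* a is whatever its bytes now say */
    return a;
}
```

Calling overwrite_through_bytes(7) returns 7 even though the only statement that mentions a by name sets it to 3. Nothing comparable can be aimed at the Haskell definition a = 3, because there is no storage behind a to reach into, only an equation.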

In Common Lisp, a variable is really a symbol. Symbols are real entities in the program. They can have values attached to them, functions attached to them, arbitrary other things attached to them, but they are things in their own right. So when we write (+ a b) in Common Lisp, we are creating a list of the three symbols: +, a, and b. Because of where it finds them, the compiler expects that + has a function attached to it, and a and b have values attached to them, but that is because we are trying to feed them through an automaton for a particular purpose. We could just as easily write another automaton which expected completely different properties from them.
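
To keep to one language for examples, here is a rough model of the idea in C rather than in Lisp. It is an assumption-laden sketch, not how any Common Lisp implementation actually lays out a symbol, and every name in it (Symbol, evaluate, printed_length, plus_a_b) is invented for illustration. The form (+ a b) is just three symbols in a row. One automaton demands a function on the first and values on the other two; a second automaton ignores those slots and looks only at the names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a symbol: a thing in its own right, with
   slots that may or may not be filled. */
typedef struct Symbol {
    const char *name;
    int has_value;                  /* is the value slot filled? */
    long value;                     /* value slot */
    long (*function)(long, long);   /* function slot, NULL when empty */
} Symbol;

static long add(long x, long y) { return x + y; }

Symbol sym_plus = { "+", 0, 0, add };
Symbol sym_a    = { "a", 1, 1, NULL };
Symbol sym_b    = { "b", 1, 2, NULL };

/* The form (+ a b): nothing more than a list of three symbols. */
Symbol *const plus_a_b[3] = { &sym_plus, &sym_a, &sym_b };

/* Automaton 1, an evaluator: expects a function attached to the
   first symbol and values attached to the other two. */
long evaluate(Symbol *const *form)
{
    assert(form[0]->function != NULL);
    assert(form[1]->has_value && form[2]->has_value);
    return form[0]->function(form[1]->value, form[2]->value);
}

/* Automaton 2: expects only names, and measures how many characters
   the form takes up when written out as "(+ a b)". */
size_t printed_length(Symbol *const *form)
{
    size_t n = 2;                   /* opening and closing parentheses */
    for (int i = 0; i < 3; i++)
        n += strlen(form[i]->name);
    return n + 2;                   /* two separating spaces */
}
```

Fed the same list, evaluate(plus_a_b) yields 3 and printed_length(plus_a_b) yields 7. The symbols did not change between the two calls; only the expectations of the automaton reading them did.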

But I claimed more than that these languages are distinct. I claimed that they are the extremes of a programmer's media. While I think no one will deny that Forth is an extreme in whatever direction it finds itself, the others require some justification. Common Lisp is an extreme by process. Most of those who worked in this direction put their labor into Common Lisp, and there has been little advance in the past twenty years in how such a language should work. So it remains a stable statement of this medium.

Unlike C, most structured programming languages, such as ALGOL and Pascal, do not define variables to be blocks of memory; ALGOL 68 explicitly does not. They were formulated to run on a much broader range of computer architectures than C. But more importantly, we are talking about media, means of expression. C was shaped by poets. Kernighan, Ritchie, Pike, and Thompson built the language alongside the idiom. This had an effect equivalent to Petrarch writing in Italian. The language suddenly found itself in possession of existence and poetry at once. It has no theoretical basis. It has no overarching logic. It is full of corner cases, horrid syntax, and frightening design decisions, but, in the hands of a master, it scintillates as its brethren do not.

Haskell is temporarily the extreme. There are other languages hovering around this same fringe such as Clean, and others which may well be beyond it like Cayenne. There is a whole trail of previous languages -- Prolog, Scheme, and the various dialects of ML -- leading to this point. I believe this is why so much of programming language research focuses on this corner. There is still work to be done.

A good programmer should have explored all the media available to him, and it makes sense to explore them in their extreme form. This means these four languages:

approach              lambda calculus as core    machine memory as core
language as given     Haskell                    C
language as medium    Common Lisp                Forth

Fred Ross
13 November 2009
Lausanne, Switzerland
