Algebras in programming

Status: Draft
Confidence: Very likely

Math in this page not rendering? See the fix

An example: say you have multiple copies of a data structure in different places, and when one gets updated, you send a message to the other copies to have them merge in that change. What you want is to have any two states have a well defined merge, which we will denote $\wedge$ . You want

$a \wedge b = b \wedge a$

so you can send message both ways; you want

$(a \wedge b) \wedge c = a \wedge (b \wedge c)$

so it doesn’t matter what order the messages arrive in. And you want to be able to send messages multiple times in case they get lost but not have to worry about double counting anything, so

$a \wedge a = a.$

Mathematicians have a name for such an operation: the meet of a semilattice. They’re well studied. You design your merge operation to satisfy these properties. In one direction, this makes the task of implementing it more straightforward, since you know what you need to do. In the other, you can reason about the behavior of the larger system from these properties.

This turns out to be a really great way to work. You start designing programs by figuring out what algebra parts of your program need to be a model of, and then designing the system so they have those properties.

Aside. Design patterns in programming are what happened when people without mathematical maturity glimpsed these algebras in their work, and then exapted a misunderstood form of Christopher Alexander’s pattern languages to describe them.

And it turns out that a handful of algebraic structures, all with various links to each other, describe the majority of the useful cases. Most of them fall along three lines from sets without any more structure. One line adds operations and goes through magmas, monoids, semigroups, group, rings, and fields, with digressions through tropical semirings and the like. These represent things like numbers or vectors. The second goes through partial orders, semilattices, lattices, directed acyclic graphs, and off to general graphs. These represent nested things and connected pieces. The third is a path that describes fragments of programs and behaviors, starting with monoids, going through applicative functors, monads, and arrows. These aren’t straight lines of proper inclusion, but things do tend to fall around these three rough directions.

So, monads…

While I’m here, I can’t resist explaining what monads are.¹

When I am writing an imperative program, I expect to be able to write statements one after another, and to bind values to variables and use them. At the most basic level we expect

f(x) is the same as y = x; f(y)
f(x) is the same as g = f; g(x)
a; (b; c) is the same as (a; b); c

In the notation we usually use for an imperative language, this seems obvious (which means it’s a well adapted notation). Any time we are writing sequences of statements where variables are a thing we probably want these properties. These properties define an algebra called a monad.

This doesn’t look anything like what is usually bandied around in monad tutorials and Haskell, so it’s worth showing the translation into the usual form there.

Aside. Actually, it looks a lot like what you write in Haskell’s do notation. Monads are usually thought of as part of category theory, but I prefer to approach all this from the point of view of model theory. Model theory looks at sets of formal properties that define an algebra (called a theory) and asks what can be said about things that satisfy those properties (models of the theory). When you are designing software, we care about what theories/algebras we want and whether we can build models of them in our software. From this point of view what we write in the do notation is the important thing. How we cobble a model of it together under the hood is implementation details.

To show the link, we’re going to wildly change notation to one that is less well adapted to monads but is better adapted to functional programming. In the process some things that seemed incorporeal in our notation will become explicit.

First, we are going to model successive statements as nested closures. a; b becomes a >> b. Variable binding turns into passing the value to a special function that embeds it in our model of a monad and then binding it into a following function: y=x; f(y) becomes success(x) >>= \y -> f(y).

If we translate our three properties above into this notation, we get

success(a) >>= f $\equiv$ f(a)
f(a) >>= success $\equiv$ f(a)
p >>= (q >>= r) $\equiv$ (p >>= q) >>= r

This notation makes them look really scary. It’s really just a way of describing imperative languages with an unfamiliar name. Remember that back in the day structured programming was regarded as too abstract and difficult for real world use. Then a small group of people forced it into the first year programming curriculum so the students never learned any other way, and suddenly it was basic stuff that everyone knew. The same is true for basic data structures. Monads, applicative functors, and arrows haven’t been dumped into this initial brainwashing yet, so they’re still regarded as difficult.

Monads get described as all kinds of crazy things, and among programmers are mostly associated with how Haskell handles IO. They’re not intrinsic to IO. Clean uses a totally different approach. Nor is IO intrinsic to monads. There are plenty of languages with no side effects that form monads.↩︎

« Back to Programming | Home