December 5, 2013

[A friend sent me a link to a talk on monads in his area, saying he was interested but that they were way above his head. This is what I emailed back to him.]

Monads are made way scarier than they are. Here's what you need to know.

First, design patterns are algebraic structures glimpsed dimly through a distorted glass. So are a lot of other things. When you realize this, you start designing programs by figuring out what algebraic properties your elements need to have, and then designing the system so they have those properties.

For example, say you have multiple copies of a data structure in different places, and when one gets updated, you send a message to the other copies to have them merge in that change. What you want is to have any two states have a well-defined merge, which we will denote $\wedge$. You want

$a \wedge b = b \wedge a$

so you can send messages both ways; you want

$(a \wedge b) \wedge c = a \wedge (b \wedge c)$

so it doesn't matter what order the messages arrive in. And you want to be able to send messages multiple times in case they get lost but not have to worry about double counting anything, so

$a \wedge a = a.$

Mathematicians have a name for such an operation: the meet of a semilattice. They're well studied. So you design your merge to be such a thing.
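As a minimal sketch (assuming the simplest possible replica state, a set of observed updates, which is purely illustrative), set union already has all three properties, so it is a lawful semilattice merge:

```haskell
import qualified Data.Set as Set

-- Hypothetical replica state: the set of updates each copy has seen.
-- Set union is commutative, associative, and idempotent, so it is a
-- valid semilattice meet.
merge :: Ord a => Set.Set a -> Set.Set a -> Set.Set a
merge = Set.union

main :: IO ()
main = do
  let a = Set.fromList [1, 2 :: Int]
      b = Set.fromList [2, 3]
      c = Set.fromList [3, 4]
  print (merge a b == merge b a)                      -- a ∧ b = b ∧ a
  print (merge (merge a b) c == merge a (merge b c))  -- (a ∧ b) ∧ c = a ∧ (b ∧ c)
  print (merge a a == a)                              -- a ∧ a = a
```

Real replicated data structures carry more state than a set, but the design discipline is the same: whatever `merge` you write, check it against these three equations.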

And it turns out that a handful of algebraic structures, all with various links to each other, describe the majority of the useful cases. Most of them fall along two lines from sets without any more structure. One line adds operations and goes through magmas, semigroups, monoids, groups, rings, and fields, with digressions through tropical semirings and the like. The second goes through partial orders, semilattices, lattices, directed acyclic graphs, and off to general graphs. These aren't straight lines of proper inclusion, but things do tend to fall around these two rough directions.

Monads are a variation along the way to fields. Say I have functions that can either return a value or fail. Then I want to design my program to correspond to an algebra which will take care of all handling of failures.

What does that require? I want my typical program transformations of inlining (x = 3; f(x) is the same as f(3)) and juxtaposing (f(x); g(y) followed by h(z) should be the same as f(x) followed by g(y); h(z)).

So what does that mean in the algebra? I'm going to wildly change notation, but it's still the same as above. You often have to name things that seemed incorporeal to make the algebras come clear.

So, x = 3; f(x) is going to become success(3) >>= \x -> f(x), where success(3) is an evaluation that always succeeds and returns 3, and \x -> f(x) is a function with argument x and body f(x). >>= means: short-circuit and fail immediately if its left-hand side failed; otherwise take the value of the success and call the right-hand side with it as an argument.
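In Haskell this value-or-failure type already exists as Maybe, with >>= built in; what I'm calling success is spelled Just there (its standard name is return or pure). A minimal sketch, using a made-up safeDiv that fails on division by zero:

```haskell
-- A computation that can fail: Nothing on failure, Just v on success.
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing           -- failure
safeDiv x y = Just (x `div` y)  -- success(x/y)

-- Chain two divisions; >>= short-circuits at the first Nothing,
-- so no failure handling appears in this code at all.
calc :: Int -> Int -> Int -> Maybe Int
calc a b c = safeDiv a b >>= \x -> safeDiv x c

main :: IO ()
main = do
  print (calc 100 5 2)  -- Just 10
  print (calc 100 0 2)  -- Nothing: the first division fails, the rest is skipped
```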

We have two operations in our algebra: success and >>=. What do we demand of them to be able to manipulate programs the way we expect?

Well, inlining: success(a) >>= \x -> f(x) $\equiv$ f(a). We can clean that up. \x -> f(x) is just f, so we can say

success(a) >>= f $\equiv$ f(a)

And when you're debugging you often do things like change x = f(g(q)) to p = g(q); x = f(p) so you can see what's going on in the debugger. If that's going to work, you need to be able to say x = f(a); x $\equiv$ f(a), or in our new notation f(a) >>= \x -> success(x) $\equiv$ f(a). Again, \x -> success(x) is just success, so we have

f(a) >>= success $\equiv$ f(a)

And last, we want to be able to juxtapose lines arbitrarily, so f(x) followed by g(y); h(z) should be the same as f(x); g(y) followed by h(z). So >>= needs to be associative, up to a lambda that passes the intermediate value along (since the right-hand side of >>= is a function, not a plain value):

(p >>= q) >>= r $\equiv$ p >>= \x -> (q(x) >>= r)

Those are the three monad laws. They just describe what you expect to be able to do with code. Actually, any time you have some kind of sequential operations in code you expect to be able to do that, so whenever you're trying to build some kind of language of ordered operations, no matter how crazy their semantics, you know that you want it to look like a monad, and you design it accordingly.
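For the failure monad above (with success spelled Just, and f and g as arbitrary example functions I've made up), the three laws can be checked directly at concrete values:

```haskell
-- Two arbitrary failure-returning functions for illustration.
f :: Int -> Maybe Int
f x = if x > 0 then Just (x * 2) else Nothing

g :: Int -> Maybe Int
g x = Just (x + 1)

main :: IO ()
main = do
  -- Inlining: success(a) >>= f ≡ f(a)
  print ((Just 3 >>= f) == f 3)
  -- Un-inlining: f(a) >>= success ≡ f(a)
  print ((f 3 >>= Just) == f 3)
  -- Juxtaposition: (p >>= f) >>= g ≡ p >>= \x -> f(x) >>= g
  let p = Just 3
  print (((p >>= f) >>= g) == (p >>= \x -> f x >>= g))
```

Spot checks like this aren't a proof, of course; the laws have to hold for every value, which for Maybe they do.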

Now, not everything in this class of algebras should look like a monad. Consider audio processing pipelines. They branch and merge, because a lot of the processing should be done in parallel. Monads intrinsically deal with "thing after thing after thing" situations. When you have processing pipelines you want another structure called an arrow. There are all kinds of variations on these. There is a slightly weaker algebra than a monad that goes by the name applicative functor, and a variety of other structures in this family.

Monads are no scarier than trees or DAGs or maps. You design your programs so they have known properties, and the same kind of properties show up again and again because they guarantee the manipulations you really want to be able to make. Hashmaps and balanced binary trees have different complexities for their operations, but it's still the same set of operations that we define across them. The Maybe monad (what I described above) and input/output are very different in what they do, but we still expect to be able to do the same kind of basic code refactorings across them.

Remember that back in the day structured programming was regarded as too abstract and difficult for real world use. Then a small group of people forced it into the first year programming curriculum so the students never learned any other way, and suddenly it was basic stuff that everyone knew. The same is true for basic data structures. Monads, applicative functors, and arrows haven't been dumped into this initial brainwashing yet, so they're still regarded as difficult.
