Category Theory Intro

While I was at the NE Scala Symposium there was a lot of discussion of the finer points of functional programming. There was a lot of discussion of Category Theory and Monads; while I’d like to say I understood everything that was going on I’m not going to claim I did. I had seen discussion of the topics before but never really got it, or why it was valuable. I got an understanding of the basic concepts for the first time and wanted to try and write it out with the beginner in mind, since the biggest issue is the terminology and wikipedia assumes you’ve got a significant education in abstract math. As someone who isn’t an expert on this I’m going to simplify and probably end up misusing some terms, but hopefully it will get the basic ideas through in a way that you understand enough of the terminology to read more advanced works, or to understand why you don’t need to.

Starting from the theoretical aspect, you’ve got a category that is made up of three different aspects: objects, arrows, and the means to compose multiple arrows in a way that is commutative and has an identity. The arrows represent transitions between objects. That’s it.

There is a special class of categories called monoids, which are categories with only one object. At first this seems kind of pointless, but depending on how you define that object they become interesting. For example, if you define the object as the set of integers, and the arrows as individual numbers then the composition of numbers becomes an operator like addition. Addition is commutative so that’s not an issue, and adding 0 is an identity. It is a somewhat odd way to define addition, but it ensures you get a couple of different properties out of the system.

But what good is this monoid structure? It is the mathematical underpinning of why you can fold over a collection and why MapReduce is provably correct.

A monad is a transformation within a category (i.e., an arrow) that defines two additional properties, where there is an identity available and it is associative. Where have we seen that before? It’s a monoid! So adding 3 to an integer is a monad. What’s the big deal? The integer example is straightforward if dubiously valuable; with something more complex it might make more sense. So imagine you have some structure, with the ability to transform into another structure that is similar, has an identity and is associative. So what’s is that? It’s flatMap.

I’m going to switch to the other end and come at this from the programming side now for a minute. We’ve got these structures that have values and can be converted into other similar structures. That sounds kind of like a program to me. If you’ve got a List[String] with some strings in it and you convert that into another List[String], you could have taken those urls and returned the contents of the url, or you could have taken a list of file names and read the contents of the files. This tells us a couple of things: string typed things interoperate with all sorts of stuff and the act of the transformation can be abstracted from the data of the transformation. The first part embodies the idea that if a functional program compiles, it’s probably right since functional programs will define more specific data types that ensure you are putting the pieces together. The second represents the separation of data from the transformations of data or from the actual business logic of the application. This is the opposite of object oriented programming where code and data are coupled together in an object. So we’ve separated the code and the data, and without objects you’ve essentially got free functions that eventually get composed together into larger and larger computational pieces. No reason those transformations can’t be set up to be associative and have identities. So now you have a program represented as a series of monad transformations, easy.

There is some space between the programming perspective and the math perspective. But a lot of it is just constructing categories in specific ways so the transformations and compositions are what you want so I’m going to leave that alone, but I would like to discuss some of what this matters. The separation of the transformation and the data also allows you to separate the definition of the transformation from the execution of the transformation. That makes your code inherently testable, as well as imbuing it with the ability to apply large classes of optimizations. Those optimizations are part of what makes functional programming parallel processing friendly and highly performant.

So while you probably don’t care that the code has these underlying mathematical properties, you do care what those properties can imbue upon your program. They make it easy to test and easy to be fast. You don’t really need to understand monads and category theory at a theoretical level to take advantage of all of this and what it means for your program. Libraries like Cats are full of the category theory terminology, making it harder to find what you want without understanding the underlying terminology. Having a combinable rather than a semigroup would make it more accessible, but harder to understand why it works.


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s