Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

"Monoid" is an adjective that describes a data type.

Anything you describe as a monoid has to have three properties: you can add them together, there's an "empty" or "zero" value, and (a + b) + c = a + (b + c).

For example, strings with concatenation are a monoid, because you can use an empty string and the string + operator.

For example, integers with addition are a monoid, because you can use 0 and the integer + operator.

For example, sets are a monoid, because you can use the empty set and the union operator.

For example, a "Picture" data type could be a monoid, because you could have a fully-transparent picture and use an "overlay" operator to put one picture on top of another.

Why do you care? Well, for one thing, it just makes it easy to find the function or operator you want to use: as the OP said, if I feel like my data type should have a concat or append operator but I don't know what it's called, I just use mappend / mempty / mconcat.

But, as another practical example: "sum" in Python works for numbers, but I've seen folks assume that it works for anything that you can use + on, and so try and use strings with it. But, because "sum" isn't a general purpose tool, that doesn't work! In Haskell, on the other hand, the equivalent -- mconcat -- will work on anything that is a monoid. (And can be specialized to work faster for specific data structures.)

Other languages also have monoids. If someone tells you Haskell has monoids and other languages don't, what they really mean is, Haskell makes explicit the monoid pattern / interface, where other languages have it implicitly for different data types. Talking about the pattern explicitly isn't revolutionary, but it can be pretty useful for discoverability and writing "general" algorithms.

Instead of calling data structures monoids, you'd get the same effect if, as a programming language community, you decided that as many types as possible should support "+", and that every data type that can should have a makeEmptyThing() method, and that it would be weird and not ok if x + makeEmptyThing() didn't equal x and if (x + y) + z didn't equal x + (y + z), and then subtly shamed any libraries that defined makeEmptyThing() and + in ways that didn't follow that pattern. But if your programming language community did this (for consistency) you'd probably want to come up with a name for it -- "Addable", "Concatable", etc -- and the Haskell community chose "Monoid" (because of relationships to theory etc etc).



You're actually wrong about Python's sum function, it works for anything you can + on BUT strings [0], this is to avoid quadratic behavior with string concatenation.

[0] https://github.com/python/cpython/blob/5837d0418f47933b2e3c1...


> But, as another practical example: "sum" in Python works for numbers, but I've seen folks assume that it works for anything that you can use + on, and so try and use strings with it.

I don't know python very well, but aren't string special cased? It does work with lists.

    sum([[1,2,3],[4,5,6]], [])


It does if you specify a sequence of list as the first argument, and an empty list as an initial value.


This is because the start argument defaults to 0. It doesn't make sense to add a list and 0.

The same thing does not work for a sequence of strings and an empty string.

    sum(["abc","def"], "")
Throws TypeError: sum() can't sum strings [use ''.join(seq) instead]


I agree with much of what you say here. Some refinements and small disagreements follow:

A monoid isn't just a data type. Mathematically, it's a set ("data type" is plenty close enough, in this context) and an operation. Integers with addition are a monoid. But so are integers with multiplication. And these are two different monoids. Haskell confuses this issue a bit - they way the Haskell libraries are set up you can only name one Monoid instance per type, and so types that form monoids in several interesting ways get wrapped in "new types" so you can name each of the several. This mostly works fine, but is a bit hacky from a math POV, and I think doesn't lead as well as it might to a more general understanding.

It's worth noting, too, that some of the Haskell libraries - particularly those from the early days - make some unfortunate decisions about what instances to bless. I often complain about the Monoid instance for Map. Map forms a monoid under union so long as we combine values (on collision) associatively. Unfortunately, the library doesn't let us pick how they are combined, it just takes the left one. That's usually not what I want, and it's particularly painful when I've been combining Sets with monoid operations and now realize they need to carry some extra info.

... and then of course there's floating point, which probably shouldn't even be Num.

> But if your programming language community did this (for consistency) you'd probably want to come up with a name for it -- "Addable", "Concatable", etc -- and the Haskell community chose "Monoid" (because of relationships to theory etc etc).

There are two things I really like about using the name Monoid, compared to the others.

First, it's much clearer what the rules are. We aren't stuck wondering whether strings or products are "really" "Addable", whether functions of the form (a -> a) are "really" "Concatable" (... and what about nonempty strings? you can concatenate them, but they have no identity object...).

Clarity in an interface means I know what properties I can rely on. You can define those properties precisely with "more intuitive" names, but then the actual interface doesn't match the intuition and people won't realize it, which is worse. You could get precise and intuitive by adding verbosity - "AssociativeOperationWithIdentity". My principle objection there is readability and aesthetics, but if that's what a community wants to go with okay... "Monoid" is short and unambiguous and well established in related fields.

That last point touches on the other thing I like. There are mathematical results that can be useful to programmers. Reducing the translation needed helps make those more accessible. And everyone could do with knowing just a bit more algebra :-P




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: