20 September 2016 9:33 PM (concurrency | go | ml | ignorance | guile)
Peoples! Lately I've been navigating the guile-ship through waters unknown. This post is something of an echolocation to figure out where the hell this ship is and where it should go.
Concretely, I have been working on getting a nice lightweight concurrency system rolling for Guile. I'll write more about that later, but you can think of it as being modelled on Go, though built as a library. (I had previously described it as "Erlang-like", but that's just not accurate.)
Earlier this year at Curry On this topic was burning in my mind and of course when I saw the language-hacker fam there I had to bend their ears. My targets: Matthew Flatt, the amazing boundary-crossing engineer, hacker, teacher, researcher, and implementor of Racket, and Matthias Felleisen, the godfather of the PLT research family. I saw them sitting together and I thought, you know what, what can they have to say to each other? These people have been talking together for 30 years right? Surely they are actually waiting for some ignorant dude to saunter up to the PL genius bar, right?
So saunter I do, saying, "if someone says to you that they want to build a server that will handle 100K or so simultaneous connections on Racket, what abstraction do you tell them to use? Racket threads?" Apparently: yes. A definitive yes, in the case of Matthias, with a pointer to Robby Findler's paper on kill-safe abstractions; and still a yes from Matthew with the caveat that for the concrete level of concurrency that I described, you'd have to run tests. More fundamentally, I was advised to look at Concurrent ML (on which Racket's concurrency facilities were based), that CML was much better put together than many modern variants like Go.
This was very interesting and new to me. As y'all probably know, I don't have a formal background in programming languages, and although I've read a lot of literature, reading things only makes you aware of the growing dimension of the not-yet-read. Concurrent ML was even beyond my not-yet-read horizon.
So I went back and read a bunch of papers. Turns out Concurrent ML is like Lisp in that it has a tribe and a tightly-clutched history and a diaspora that reimplements it in whatever language they happen to be working in at the moment. Kinda cool, and, um... a bit hard to appreciate in the current-day context when the only good references are papers from 10 or 20 years ago.
However, after reading a bunch of John Reppy papers, here is my understanding of what Concurrent ML is. I welcome corrections; surely I am getting this wrong.
1. CML is like Go, composed of channels and goroutines. (Forgive the modern referent; I assume most folks know Go at this point.)
2. Unlike Go, in CML a channel is never buffered. To make a buffered channel in CML, you spawn a thread that manages a buffer between two channels.
3. Message send and receive operations in CML are built on a lower-level primitive called "events". (send ch x) is instead euivalent to (sync (send-event ch x)). It's like an event is the derivative of a message send with respect to time, or something.
4. Events can be combined and transformed using the choose and wrap combinators.
5. Doing a sync on an event created by choose allows a user to build select in "user-space", as a library. Cool stuff. So this is what events are for.
6. There are separate event type implementations for timeouts, channel send/recv blocking operations, file descriptor blocking operations, syscalls, thread joins, and the like. These are supported by the CML implementation.
7. The early implementations of Concurrent ML were concurrent but not parallel; they did not run multiple "goroutines" on separate CPU cores at the same time. It was only in like 2009 that people started to do CML in parallel. I do not know if this late parallelism has a practical impact on the viability of CML.
What is the relationship of CML to Go? Specifically, is CML more expressive than Go? (I assume the reverse is not the case, but that would also be an interesting result!)
There are a few languages that only allow you to select over message receives (not sends), but Go's select doesn't have this limitation, so that's not a differentiator.
Some people say that it's nice to have events as the common denominator, but I don't get this argument. If the only event under consideration is message send or receive over a channel, events + choose + sync is the same in expressive power as a built-in select, as far as I can see. If there are other events, then your runtime already has to support them either way, and something like (let ((ch (make-channel))) (spawn-fiber (lambda () (put-message ch exp))) (get-message ch)) should be sufficient for any runtime-supported event in exp, like sleeps or timeouts or thread joins or whatever.
To me it seems like Go has made the right choices here. I do not see the difference, and that's why I wrote all this, is to be shown the error of my ways. Choosing channels, send, receive, and select as the primitives seems to have the same power as SML events.
Let this post be a pentagram on the floor, then, to summon the CML cognoscenti. Well-actuallies are very welcome; hit me up in the comments!
[edit: Sam Tobin-Hochstadt tells me I got it wrong and I believe him :) In the meantime while I work out how I was wrong, examples are welcome!]