sidelong glimpses

Evening, tubes. I don't usually post about works-in-progress, but this one has been on my mind for two or three weeks, really paralyzing my mojo, so perhaps getting it out in textual form will help me move on a bit.

operation delimited continuation

So yes, to my non-programmer audience, let this be a pleasantly written but unintelligible lullaby, a word-train rocking you to sleep; but to the nerds out there, I'm quite fascinated with this topic, and I hope I'm able to convey some of that fascination to you.

Consider the following C program:

#include <stdlib.h>

int main ()
  abort ();
  return 0;

Let's assume we have it in foo.c, and compile it. We'll do so from tcsh, for reasons to be explained later.

$ tcsh
% limit coredumpsize unlimited
% gcc foo.c

OK, we should have a binary in a.out. Now, we run it, with predictable effects:

% ./a.out
Abort (core dumped)
% ls -l core
-rw------- 1 wingo wingo 151552 2010-02-14 21:37 core

So, what happened was that the program ran, then it got to the abort, which caused it to unexpectedly exit. Where did it exit to? In this case, back to the prompt.

That program is no longer running, but when it exited, it left behind a copy of its state -- of its continuation. A continuation is what's left of a computation. In this case, what's left is for abort to return, then main returns 0, then probably some system-specific code, then we'd be back at the prompt again. Aborting just brought us back to the prompt a little faster.

Anyway, the core dump has all of the data that would be necessary to complete the process' computation. With a debugger you can get in and see what that data was, but you can't run the program again from that point -- you'd need some extra OS-level support for that.

why are you talking about this stuff

Glad you asked! Core dumps are really a specific instance of control features in programming languages. Control features are what allow you to skip around in your program. In this case, we skipped back to the prompt.

Scheme includes support for a very low-level control feature known as continuations. With continuations, you can implement all kinds of things, like exceptions, coroutines, generators, and short-cut operators. You can make up new control operators in a Scheme library and add them to an innocent, unsuspecting program.

Many people are aware of this, but fewer know that continuations have a problem, a big problem. The origin of the problem is that a continuation captures the whole future of a computation, not just a part of it. So if you implemented exceptions with continuations, imagine dumping core every time you threw an exception. Not very efficient, no?

Well the truth is more complicated of course. If you have access to the entirety of a program, the compiler can transform your code into what is known as continuation-passing style (which the laity may recognize by the weaker names of single-static-assignment or A-normal form) and in that case invoking a continuation is just like calling a function. But for the rest of us, we have to play tricks with dynamic extents, incrementally dumping parts of the core as the program enters different dynamic extents.

(I'm simplifying here, to continue the core-dump metaphor, in the knowledge that the clergy will correct this doctrine in the comments.)

But in the general case, capturing and reinstating continuations can be quite expensive, because in the absence of sophisticated analysis, one simply dumps the whole core. It's a terrible but common implementation strategy.

There are other, deeper problems with continuations as present in Scheme, notably the lack of composability. Though I do understand that this problem is deeper, I don't understand its depths, so I'm going to forgo a discussion of that here.

the solution: more power

So, continuations are great, but they can be expensive to implement. OK. In this context it's amusing that the solution to this problem is actually to give continuations more power.

The problem is that in Scheme you have control over where you dump core, but not how far you dump core. While it's true that continuations logically capture the whole future of a computation, in practice that continuation has a limit: the process boundary. Our a.out was spawned at the tcsh prompt, but it has no access to the implementation of the shell itself, not to mention the kernel.

So, more power, as we were saying: let's allow programs to delimit the range of computation captured by a continuation. Instead of capturing continuations up to the process boundary, a program can capture a continuation only up to a certain point in a program.

Pleasantly, that point is known as a %, pronounced "prompt". (Dorai Sitaram notes that he would have preferred >, but it was taken by greater-than. Also, this comes from back in 1993, so perhaps he can be forgiven the tcsh-style prompt.)

It's as if you were implementing a shell in your program, as if your program were an operating system for other programs.

Here's a translation of foo.c into a Scheme that supports prompts:

(define (foo)

And now we run it, in a prompt:

(% (foo)
   (lambda (core . args)
     (format #t "Abort (core dumped): ~a\n" core)
     (format #t "Arguments to abort: ~a\n" args)
;; =| Abort (core dumped): #<continuation ... >
;; =| Arguments to abort: 
;; => 1

The second argument to % is the "handler". When abort is called, control returns to the handler, and the continuation -- the "core dump" -- is passed as the first argument. Unlike in UNIX, abort in Scheme can take any number of arguments. Those arguments are passed to the handler.

Also unlike UNIX, the core dump may be revived; but normal Scheme continuations do have a UNIX similarity in that reviving them is like invoking exec(3) -- they replace the current process image with a new one. This might be the right thing for user space to do, but if you're implementing a kernel, scheduling a process shouldn't give that process control over the whole system.

Instead, delimited continuations are more like functions, in that you can call them for a value. Like this:

(% (foo)
   (lambda (core . args)
     (format #t "Abort (core dumped, but continuing): ~a\n" core)
     (format #t "Arguments to abort: ~a\n" args)
     (core))) ;; revivify the core, letting the program finish
;; =| Abort (core dumped, but continuing): #<continuation ...>
;; =| Arguments to abort: 
;; => 0

gloves off

If you didn't know about continuations before, and you've gotten this far, then drinks all around! If not, then, um, drinks all around? In any case, we're going more hard-core now, so you'll need it.

Delimited continuations are not quite a new topic. The first reference that I'm aware of that begins to understand the problem is The theory and practice of first-class prompts, by Matthias Felleisen, in 1988.

Unfortunately that document is only available behind a paywall, because the ACM is a pack of scrooge mcducks. So instead you'll probably have to deal with Sitaram and Felleisen's 1990 contribution, Control delimiters and their hierarchies.

But don't read that one first. First, read Sitaram's beautifully clear 1993 paper, Handling control, which is a lovely exposition of the offerings that delimited continuations have to the working Scheme programmer.

An obvious extension of Sitaram's work is in the implementation of abstractions that look like operating systems. Back in those days it wasn't quite possible to imagine, but the current example would be the web server, where each session is a different process. So we return HTML and wait for the user to come back, and in the meantime schedule other processes; and when they do come back, we resume their session via resuming their computational process.

It's typically said that with continuations you can implement any control feature, but that's not true if your target language also has continuations. For example, you can implement exceptions with continuations plus mutable state, but if the program you are running also uses continuations in their standard Scheme form, the program's use can conflict with the implementation of exceptions, producing incorrect programs.

feature on top of feature

This is why it's 2010 now and no major programming language includes delimited continuations in its specification -- because it's tough to see how they interact with other primitive elements of the language. There has been a lot of research on this recently, with some promising results.

But before I move onto the shining light, I should mention a pit of darkness, which is the profusion of delimited control operators in the literature.

Sitaram in his papers proposes prompt (or %) and control (similar to abort) as his primary delimited continuation operators. But there are others. The ones that are seen most often in the literature are shift and reset, and also there is cupto. It was only in 2007 that Dybvig, Peyton-Jones, and Sabry were able to tease out their relationship; if you are interested, I highly recommend the first third of A monadic framework for delimited continuations, despite the title.

This result was mentioned by Matthew Flatt's 2007 paper, co-written by Yu, Findler, and Felleisen, Adding Delimited and Composable Control to a Production Programming Environment. All this article has really been a prelude to mentioning this paper, and I think it merits a section of its own.

the flatt contribution

Go ahead and open up the Flatt paper, and skim through it.

I'll wait.

OK! First impressions?

Mine were, "what is the deal with all those pictures?". They seemed to be an affectation of friendliness, but appeasing neither the uninitiates nor the academics. But reading the paper in more depth, I was really impressed by their ability to convey information to the Scheme practitioner.

Because let's face it, there are piles and piles of terrible papers out there; and, what's worse, piles and piles of potentially good papers that simply aren't intelligible to the practicing programmer. OK, I dig on the lambda calculus; but what is your supposedly-proven result from your toy grammar going to do to help me make programs?

So I've always been skeptical, perhaps overly skeptical, of the desire to prove properties about programs, of the need to justify with symbology what you already know to be right. I still haven't seen the light. But at least now I know it exists, thanks to Flatt et al's paper.

The practical problem that I had was that Guile's catch and throw still weren't quite integrated with its new virtual machine; they caused the virtual machine to recurse, calling itself reentrantly. Besides being inelegant, this forced the catch expression and handler to be allocated as closures, and restricted potential optimizations.

So I started to look at how I could express catch and throw in Guile's intermediate language. It was not clear that I could add them in an orthogonal way, until I remembered something I read about delimited continuations; but it wasn't until the Flatt paper that I really understood how to implement catch and throw in terms of these lower-level primitives.

But again, it seems I'm writing into the night, so I'll try to wrap up briefly. The form of catch is like this:

catch key thunk handler [pre-unwind-handler]

Then on the other side, you have throw:

throw key arg ...

I'm not trying to argue these are the best control operators, not by a long shot; but they are what we have now, so we need to support them.

So, following Flatt, imagine we have a fluid (dynamically-bound) variable, %eh, the exception handler. Then throw is easy:

throw key arg ... =
  (fluid-ref %eh) key arg ...

Yay psuedocode. catch is trickier, though:

catch key thunk handler =
  let prompt-tag = (fresh-tag)
    % prompt-tag
      with-fluids %eh = (throw-handler prompt-tag key)
      λ (cont thrown-key arg ...)
       handler thrown-key arg ...

Aside from the pseudo-code being obtuse, perhaps you're confused that the % there has three arguments -- a tag, then an expression, then the handler (a lambda). Well the tag is useful to identify a particular prompt, so that an abort can jump back to a particular location, not just the most recent prompt.

Also you're probably underwhelmed, that most of the interesting work was pushed off into throw-handler, which takes the given handler and does some magic to it. Well indeed, there is some magic:

throw-handler prompt-tag catch-key =
  let prev-eh == (fluid-ref %eh)
    λ (thrown-key arg ...)
     if thrown-key is catch-key or catch-key is true
       abort prompt-tag thrown-key arg ...
       prev-eh thrown-key arg ...

So the resulting throw-handler would run in the context of the throw invocation, and if the key matches, or we are catching all throws (catch-key is true), we'll jump back to the prompt.

Note that in our definition of catch, the first argument is the captured continuation, but that argument is never referenced. This simple observation allows us to note at compile-time that the continuation does not need to be reified -- there's no need to dump core, it's just a longjmp().

Guile also supports pre-unwind handlers: handlers that run in the context of the throw instead of the context of the catch. Since Flatt breezed over this in his paper, I'll mention an implementation of that here:

catch key thunk handler pre-unwind-handler =
  let prompt-tag = (fresh-tag)
    % prompt-tag
          %eh = (custom-throw-handler prompt-tag key pre-unwind-handler)
      λ (cont thrown-key arg ...)
       handler thrown-key arg ...

So, all the same, except the handler is made by custom-throw-handler.

custom-throw-handler prompt-tag key f =
  let prev = fluid-ref %eh,
      running = make-fluid false
    λ (thrown-key arg ...)
     if thrown-key is key or key is true
       let was-running = fluid-ref running
         with-fluids running = true
           if not was-running
             f thrown-key arg ...
           # fall through in any case
           if prompt-tag
             abort prompt-tag thrown-key arg ...
             prev thrown-key arg ...
       prev thrown-key arg ...


I know your eyes are glazing over by now. Mine are too, and it's not just the wine. But there's something important here.

The important thing is that small sigil, =; those definitions really are it. If you can prove anything about the behavior of catch now, you should be able to prove it about the new implementation above -- even though it works completely differently.

That's power! Power to refactor, knowing that your refactorings are correct. Because I was lost in a morass, paralyzed in front of my keyboard, wondering about the ghosts awaiting me if I poked code. Now I know. I need an efficient with-fluids, some pieces of prompt and abort, and to add that all to the bootstrap language of Guile, and I have it. It's right. And I know it is because Flatt et al have already done all of the calculations for me :)

the sound of mandolins

I realize it's something of a fizzleout ending, but I think it has to remain that way before I get it all into git. Inshallah we'll be rolling like this to Guile 1.9.8, though it could very well be 1.9.9. But either way, I didn't imagine myself saying this, but thank you, Flatt & crew: your formal work has led me to see the way out of my practical problem. Cheers!

4 responses

  1. Jakub Piotr Cłapa says:

    Flatt, Felleisen & the whole PLT Scheme crew have many papers which quite nicely mix programming language theory and practice. And their mailing list is insightful too.

  2. Jakub Piotr Cłapa says:

    It would be nice to have & automatically escape to &amp; in the comments. Either this or a better error message. ;)

  3. Brian Gough says:

    Thanks for the nice explanation!

  4. Andy Moreton says:

    For more reading on this sort of thing, have a look at:
    and all of the other goodness from Oleg.

Comments are closed.