wingolog: a mostly dorky weblog by Andy Wingo (https://wingolog.org)

2013-11-02, https://wingolog.org/2013/11/02/scheme-quiz-time

Scheme quiz time!

Consider the following two functions:

(define (test1 get)
  (let ((v (make-vector 2 #f)))
    (vector-set! v 0 (get 0))
    (vector-set! v 1 (get 1))
    v))

(define (test2 get)
  (let* ((a (get 0))
         (b (get 1)))
    (vector a b)))

Assume the usual definitions for all of the free variables like make-vector and so on. These functions both create a vector with two elements. The first element is the result of a call to (get 0), where get is a function the user passes in as an argument. Likewise the second comes from (get 1).

(test1 (lambda (n) n)) => #(0 1)
(test2 (lambda (n) n)) => #(0 1)

So the functions are the same.

Or are they?

Your challenge: write a standard Scheme function discriminate that, when passed either test1 or test2 as an argument, can figure out which one it is.

. . .

Ready? If you know Scheme, you should think on this a little bit before looking at the answer. I'll wait.

. . .

OK!

We know that in both functions, two calls are made to the get function, in the same order, and so really there should be no difference whatsoever.

However there is a difference in the continuations of the get calls. In test1, the continuation includes the identity of the result vector -- because the vector was allocated before the get calls. On the other hand test2 only allocates the result after the calls to get. So the trick is just to muck around with continuations so that you return twice from a call to the test function, and see if both returns are the same or not.

(define (discriminate f)
  (let ((get-zero-cont #f)
        (first-result #f))
    (define (get n)
      (when (zero? n)
        (call/cc (lambda (k)
                   (set! get-zero-cont k))))
      n)
    (let ((result (f get)))
      (cond
       (first-result
        (eq? result first-result))
       (else
        (set! first-result result)
        (get-zero-cont))))))

In the call to f, we capture the continuation of the entry to the (get 0) call. Then later we re-instate that continuation, making the call to f return for a second time. Then we see if both return values are the same object.

(discriminate test1) => #t
(discriminate test2) => #f

If they are the same object, then the continuation captured the identity of the result vector -- and if not, the result was only allocated after the get calls.

so what?

Unhappily, this has practical ramifications. In many compilers it would be advantageous to replace calls to vector with calls to make-vector plus a series of vector-set! operations. Such a transformation lowers the live-variable pressure. If you have a macro that generates a bison-like parser whose parse table is built by a call to vector with 400 arguments -- this happens -- you'd rather not have 400 live variables in the function that builds that table. But this isn't a safe transformation to make, unless you can prove that no argument captures the current continuation. Happily, for the parser generator macro this is the case, but it's not something to bet on.

It gets worse, though. Since test1 returns the same object, it is possible to use continuations to mutate previous return values, with nary a vector-set! in sight!

(define (discriminate2 f)
  (let ((get-zero-cont #f)
        (escape #f))
    (define (get n)
      (case n
        ((0) (call/cc (lambda (k)
                        (set! get-zero-cont k)
                        0)))
        ((1) (if escape
                 (escape)
                 1))))
    (let ((result (f get)))
      (call/cc
       (lambda (k)
         (set! escape k)
         (get-zero-cont 42)))
      result)))

(discriminate2 test1) => #(42 1)
(discriminate2 test2) => #(0 1)

This... this is crazy.

story time

Now it's story time. Guile has a bytecode VM, and usually all code is compiled to that VM. But it also has an interpreter, for various purposes, and that interpreter is fairly classic: it's a recursive function that takes a "memoized expression" and an environment as parameters. Only, the environment was silly -- it was just a list of values. Before evaluating, a "memoizer" runs to resolve lexical references to indexes in that list, and entering a new lexical contour conses on that list.

Well of course that makes lexical variable lookup expensive. It usually doesn't matter as everything is compiled, but it's a bit shameful, so I rewrote it recently to use two-dimensional environments. Let me drop some ASCII on you:

   +------+------+------+------+------+
   | prev |slot 0|slot 1| ...  |slot N|
   +------+------+------+------+------+
      \/
   +------+------+------+------+------+
   | prev |slot 0|slot 1| ...  |slot N|
   +------+------+------+------+------+
      \/
     ...
      \/
   toplevel

It's a chain of vectors, linked through their first elements. Resolving a lexical in this environment has two dimensions, the depth and the width.
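Lookup, then, walks the prev links for the depth and indexes into the vector for the width. A sketch of the accessors (env-ref and env-set! here are my stand-in names for Guile's internals, matching the env-set! used in the eval clause below):

```scheme
;; An environment is a vector: slot 0 is the enclosing environment,
;; slots 1..N hold the lexicals bound in this contour.
(define (env-ref env depth width)
  (if (zero? depth)
      (vector-ref env (+ width 1))          ; skip the prev link
      (env-ref (vector-ref env 0) (- depth 1) width)))

(define (env-set! env depth width val)
  (if (zero? depth)
      (vector-set! env (+ width 1) val)
      (env-set! (vector-ref env 0) (- depth 1) width val)))
```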

Vectors.

You see where I'm going with this?

I implemented the "push-new-environment" operation as a make-vector followed by an eval-and-vector-set! loop. Here's the actual clause that implements this:

(define (eval exp env)
  (match exp
    ...
    (('let (inits . body))
     (let* ((width (vector-length inits))
            (new-env (make-env width #f env)))
       (let lp ((i 0))
         (when (< i width)
           (env-set! new-env 0 i (eval (vector-ref inits i) env))
           (lp (1+ i))))
       (eval body new-env)))
    ...))

This worked fine. It was fast, and correct. Or so I thought. I used this interpreter to bootstrap a fresh Guile compile and all was good. Until running those damned test suites that use call/cc to return multiple times from let initializers, as in my discriminate2 test. While the identity of env isn't visible to a program as such, the ability of call/cc to peel apart allocation and initialization of the environment vector makes this particular implementation strategy not viable.

In the end I'll inline a few arities, and then have a general case that allocates heap storage for the temporaries:

(case (vector-length inits)
  ((0) (vector env))
  ((1) (vector env (eval (vector-ref inits 0) env)))
  ...
  (else
   (list->vector
    (cons env
          (map (lambda (x) (eval x env))
               (vector->list inits))))))

Of course I'd use a macro to generate that. It's terrible, but oh well. It is what it is.

generators in v8 (Andy Wingo, 2013-05-08)
https://wingolog.org/2013/05/08/generators-in-v8

Hey y'all, ES6 generators have landed in V8! Excellent!

Many of you know what that means already, but for those of you that don't, a little story.

A few months ago I was talking with Andrew Paprocki over at Bloomberg. They use JavaScript in all kinds of ways over there, and as is usually the case in JS, it often involves communicating with remote machines. This happens on the server side and on the client side. JavaScript has some great implementations, but as a language it doesn't make asynchronous communication particularly easy. Sure, you can do anything you want with node.js, but it's pretty easy to get stuck in callback hell waiting for data from the other side.

Of course if you do a lot of JS you learn ways to deal with it. The eponymous Callback Hell site lists some, and lately many people think of Promises as the answer. But it would be nice if sometimes you could write a function and have it just suspend in the middle, wait for your request to the remote computer to come through, then continue on.

So Andrew asked me if we could somehow make asynchronous programming easier to write and read, maybe by implementing something like C#'s await operator in the V8 JavaScript engine. I told him that the way into V8 was through standards, but that fortunately the upcoming ECMAScript 6 standard was likely to include an equivalent to await: generators.

ES6 generators

When you first call a generator, instead of returning a value, it returns an iterator:

// notice: function* instead of function
function* values() {
  for (var i = 0; i < arguments.length; i++) {
    yield arguments[i];
  }
}

var o = values(1, 2, 3);  // => [object Generator]

Calling next on the iterator resumes the generator, lets it run until the next yield or return, and then suspends it again, resulting in a value:

o.next(); // => { value: 1, done: false }
o.next(); // => { value: 2, done: false }
o.next(); // => { value: 3, done: false }
o.next(); // => { value: undefined, done: true }

Maybe you're the kind of person that likes imprecise, incomplete, overly abstract analogies. Yes? Well generators are like functions, with their bodies taken to the first derivative. Calling next integrates between two yield points. Chew on that truthy nugget!

asynchrony

Anyway! Suspending execution, waiting for something to happen, then picking up where you left off: put these together and you have a nice facility for asynchronous programming. And happily, it works really well with promises, a tool for asynchrony that lots of JS programmers are using these days.

Q is a popular promises library for JS. There are some 250+ packages that depend on it in NPM, node's package manager. So cool, let's take their example of the "Pyramid of Doom" from the github page:

step1(function (value1) {
    step2(value1, function(value2) {
        step3(value2, function(value3) {
            step4(value3, function(value4) {
                // Do something with value4
            });
        });
    });
});

The promises solution does at least fix the pyramid problem:

Q.fcall(step1)
.then(step2)
.then(step3)
.then(step4)
.then(function (value4) {
    // Do something with value4
}, function (error) {
    // Handle any error from step1 through step4
})
.done();

But to my ignorant eye, some kind of solution involving a straight-line function would be even better. Remember what generators do: they suspend computation, wait for someone to pass back in a result, and then continue on. So whenever you would register a callback, whenever you would chain a then onto your promise, instead you suspend computation by yielding.

Q.async(function* () {
  try {
    var value1 = yield step1();
    var value2 = yield step2(value1);
    var value3 = yield step3(value2);
    var value4 = yield step4(value3);
    // Do something with value4
  } catch (e) {
    // Handle any error from step1 through step4
  }
});

And for a super-mega-bonus, we actually get to use try and catch to handle exceptions, just as Gods and Brendan intended.

Now I know you're a keen reader and there are two things that you probably noticed here. One is, where are the promises, and where are the callbacks? Who's making these promises anyway? Well you probably saw already in the second example that using promises well means that you have functions that return promises instead of functions that take callbacks. The form of functions like step1 and such is different when you use promises. So in a way the comparison between the pyramid and the promise isn't quite fair because the functions aren't quite the same. But we're all reasonable people so we'll deal with it.

Note that it's not actually necessary that any of the stepN functions return promises. The promises library will lift a value to a promise if needed.

The second and bigger question would be, how does the generator resume? Of course you've already seen the answer: the whole generator function is decorated by Q.async, which takes care of resuming the generator when the yielded promises are fulfilled.
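To make that resumption concrete, here is a sketch of how a Q.async-style driver could work. The name spawn is my stand-in, not Q's API, and Q's real implementation differs in detail:

```javascript
// Drive a generator function: each yielded value is lifted to a
// promise, and the generator is resumed with the promise's result.
function spawn(generatorFn) {
  return function () {
    var gen = generatorFn.apply(this, arguments);
    return new Promise(function (resolve, reject) {
      function step(method, input) {
        var result;
        try {
          result = gen[method](input);  // resume the generator
        } catch (e) {
          return reject(e);             // an error escaped the generator
        }
        if (result.done) return resolve(result.value);
        // Lift the yielded value to a promise, then resume the generator
        // with its result, or throw its error back into the generator.
        Promise.resolve(result.value).then(
          function (v) { step('next', v); },
          function (e) { step('throw', e); });
      }
      step('next', undefined);
    });
  };
}

// Usage: yield promises (or plain values); `return` fulfills the result.
var main = spawn(function* () {
  var value1 = yield Promise.resolve(40);
  var value2 = yield value1 + 2;   // plain values get lifted too
  return value2;
});
main().then(function (v) { console.log(v); });  // logs 42
```

Note how try/catch around the resume, plus the 'throw' method, is what lets exceptions propagate into the generator's own try/catch blocks.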

You don't have to use generators of course, and using them with Q does mean you have to understand more things: not only standard JavaScript, but newfangled features, and promises to boot. Well, if it's not your thing, that's cool. But it seems to me that the widespread appreciation of await in C# bodes well for generators in JS.

ES6, unevenly distributed

Q.async has been part of Q for a while now, because Firefox has been shipping generators for some time.

Note however that the proposed ES6 semantics for generators are slightly different from what Firefox ships: calling next on the iterator returns an object with value and done properties, whereas the SpiderMonkey JS engine used by Firefox uses an exception to indicate that the iterator is finished.

This change was the result of some discussions at the TC39 meeting in March, and hasn't made it into a draft specification or to the harmony:generators page yet. All could change, but there seems to be a consensus.

I made a patch to Q to allow it to work both with the old SpiderMonkey-style generators as well as with the new ES6 style, and something like it should go in soon.

give it to me already!

So yeah, generators in V8! I've been working closely with V8 hackers Michael Starzinger and Andreas Rossberg on the design and implementation of generators in V8, and I'm happy to say that it's all upstream now, modulo yield* which should go in soon. Along with other future ES6 features, it's all behind a runtime flag. Pass --harmony or --harmony-generators to your V8 to enable it.

Barring unforeseen issues, this will probably see the light in Chromium 29, corresponding to V8 3.19. For now though this hack is so hot out of the fire that it hasn't even had time to cool down and make it to the Chrome Canary channel yet. Perhaps within a few weeks; whenever the V8 dependency in Chrome gets updated to the 3.19 tree.

As far as Node goes, they usually track the latest stable V8 release, and so for them it will probably also be a few weeks before it goes in. You'll still have to run Node with the --harmony flag. However if you want a sneak preview of the future, I uploaded a branch of Node with V8 3.19 that you can build from source. It might mulch your cat, but life is not without risk on the bleeding_edge.

Finally for Q, as I said ES6 compatibility should come soon; track the progress or check out your own copy here.

final notes

Thanks to the V8 team for their support in this endeavor, especially to Michael Starzinger for enduring my constant questions. There's lots of interesting things in the V8 pipeline coming out of Munich. Thanks also to Andrew Paprocki and the Bloomberg crew for giving me the opportunity to fill out this missing piece of real-world ES6. The Bloomberg folks really get the web.

This has been a hack from Igalia, your friendly neighborhood browser experts. Have fun with generators, and happy hacking!

guile and delimited continuations (Andy Wingo, 2010-02-26)
https://wingolog.org/2010/02/26/guile-and-delimited-continuations

Guile now does delimited continuations.

Ahem! I say: Guile now does delimited continuations! Whoop whoop!

Practically speaking, this means that Guile implements prompt and abort, as described in Dorai Sitaram's 1993 paper, Handling Control. (Guile's abort is the operator that Sitaram calls control.)

I used to joke that all of this Guile compilation work was to bring Guile back into the 2000s, from being stuck in the 1990s. But delimited continuations are definitely a twenty-teens topic: besides PLT Scheme, I don't know of any other production language implementation that has delimited continuations as part of its core language. (Please correct my ignorance.) If this works out, and the implementation proves to be sufficiently bug-free, this is a great step forward not only for Guile but for the concept itself of delimited continuations.

Edit: Readers suggest that I add Scheme48, OCaml, and Scala to the list. Still, too few implementations for such a lovely construct.

Furthermore, I retrofitted Guile's existing catch and throw APIs, implementing them in terms of prompt and control, and providing compatible shims for Guile's C API.

Thanks again to Matthew Flatt for blazing the trail here.

rambling

So, for the benefit of those dozen or so people that will probably implement delimited continuations after me, here's a few strategies and pitfalls. For the rest of you, I recommend rancid wine.

First, I should make a disclaimer. Here, my focus is on a so-called "direct" implementation of delimited continuations; that is to say, I don't rely on a global continuation-passing style (CPS) transform of the source code.

There was recently an article posted to the Scheme reddit about Femtolisp, a Lisp implementation by Jeff Bezanson. I thought it was pretty neat. I was especially pleased that he decided to support shift and reset; but bummed when I found out that he did so by local CPS-transformation. So you couldn't reset from arbitrary functions.

A local CPS-rewrite strategy doesn't work very well, because prompts are fundamentally dynamic: they are part of the dynamic environment (like dynamic-wind). So to support that via CPS, you end up needing some kind of double- or triple-barrelled CPS, with abort continuations as well. I think? I think.

So for me, direct implementation means that you use the language's native stack representation, not requiring global or local transformations.

Also I should confess that I didn't glean anything from Gasbichler and Sperber's Final Shift paper; in all likelihood, again due to my ignorance. So if I say things that they've said better, well, stories have to be retold to live, and it won't hurt if I add my interpretation.

the seitan of the matter

So, Digresso Riddikulus: poof! Where were we? And what's up with this jack-in-the-box? Yes, direct implementations of delimited continuations.

Here's a C program:

int baz ();
int bar () { return baz (); }
int foo () { return bar (); }
int main () { return foo (); }

And here's a C stack:

Right! So you see that function calls are pushing frames on the stack.

When you're programming normally, you have a top-down view of the stack. It's like standing on a ladder and looking down -- you see the top rung clearly, but the rest is obscured by your toolbelt.

Laying out the frames flat like this allows us to talk instead about the whole future of this program -- what it's doing, and what it's going to do. The future of the program is known as its continuation -- a funny word, but I hope the meaning is clear in this context.

Now, consider a C program that calls into some hypothetical Scheme, assuming appropriate Scheme definitions of foo, bar, and baz:

int main () { return eval ("(foo)"); }

And the corresponding pixels:

It should be clear that there are logically two stacks at work here. However they both correspond, in this case, to the same logical control-flow stack -- eval doesn't return until foo, bar, and baz have returned.

Also note that the "C" stack has been renamed the "foreign" stack, indicating that Scheme is actually where you want to be, and whatever is not in the Scheme world is heathen. This model maintains its truthiness regardless of the implementation strategy of the Scheme -- whether it reuses the C stack, or evaluates Scheme expressions on their own stack.

So! Delimited continuations, right? Let's try one. Here's some Scheme code that begins a prompt, and calls baz.

(% (baz) (lambda (k) k))

If baz returns normally, well that's the value; but if it aborts, the partial continuation will be saved, and returned. Verily, pixels:

The red dot and line indicates that that position in the continuation has been saved, and if we abort, we'll abort back to that line. It's like grabbing a cherry from the tree, and then falling down a couple of rungs on the ladder. Yes? Yes.

Expanding the example with an implementation of baz, we have:

(define (baz) (abort) 3)

So, baz will abort to the nearest prompt, and if the abort comes back, it will return 3. Like this:

Abort itself needs to capture a partial continuation -- that part of the continuation that is delimited by prompt and abort. (That's why they're called delimited continuations.)

In practice, for the implementor, this has a number of fine points. The one I'll mention here is that you don't actually capture the prompt frame itself, or the presence of the prompt on the dynamic stack. I tried to indicate that in my drawings by making the red line disjoint from the red box. The red box is what's captured by the continuation.

See Dybvig's "A monadic framework for delimited continuations" for a more complete discussion of what to capture.

OK! Assume that we save away that partial continuation somewhere, let's say, in k. Now the evaluator ends up calling qux, which calls the partial continuation:

(define (qux)
  (+ (k) 38))

Calling the partial continuation in this context means splatting it on the stack, and then fixing things up so that it looks like we actually called baz from qux.
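Putting the pieces together, in this post's notation (a sketch; Guile's actual API spells these operators a bit differently):

```scheme
;; The handler just returns the partial continuation:
(define k (% (baz) (lambda (k) k)))

(k)    ;; => 3  -- abort returns, and baz finishes by returning 3
(qux)  ;; => 41 -- the reinstated (k) returns 3 into (+ _ 38)
```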

I'll get to the precise set of things you have to fix up, but let's pause a moment with these pixels, and make one thing clear: you can't splat foreign frames back on the stack. Maybe you can in some circumstances, but not in the general case.

The practical implication of this is twofold: one, delimited continuation support needs to be a core part of your language. You can't evaluate the "body" part of the prompt by recursing through a foreign function, for example.

The second (and deeper) implication is that if any intervening code does recurse through a foreign function, the resulting partial continuation cannot be reinstated.

One could simply error in this case, but it's more useful to allow nonlocal exits, and then error on an attempted re-entry.

So yes, things you need to fix up: obviously the frame pointer stack. The dynamic winds (and more generally, the dynamic environment). And, any captured prompts themselves. Consider:

Here we see a prompt, which has a prompt inside it, which calls baz, which then aborts back to the first prompt. (You will need prompt tags for this.) Now, note: there are two horizontal lines here: two prompts active in the dynamic environment.

Let's see what happens when we reinstate the continuation, represented by the red dotted box:

Here we see that the blue line corresponding to the captured prompt is in a different place (relative to the base of the stack). That's because when you reinstate a prompt, any captured prompts are logically made new: one must recapture their registers, both real (via setjmp) and virtual (in the case of a virtual machine) when rewinding the stack.

And that's all!

postre

Meanwhile, back at the ranch: while searching for the correct spelling of riddikulus, I stumbled upon a delightful interview with JK Rowling, in three parts. Enjoy!

sidelong glimpses (Andy Wingo, 2010-02-14)
https://wingolog.org/2010/02/14/sidelong-glimpses

Evening, tubes. I don't usually post about works-in-progress, but this one has been on my mind for two or three weeks, really paralyzing my mojo, so perhaps getting it out in textual form will help me move on a bit.

operation delimited continuation

So yes, to my non-programmer audience, let this be a pleasantly written but unintelligible lullaby, a word-train rocking you to sleep; but to the nerds out there, I'm quite fascinated with this topic, and I hope I'm able to convey some of that fascination to you.

Consider the following C program:

#include <stdlib.h>

int main ()
{
  abort ();
  return 0;
}

Let's assume we have it in foo.c, and compile it. We'll do so from tcsh, for reasons to be explained later.

$ tcsh
% limit coredumpsize unlimited
% gcc foo.c

OK, we should have a binary in a.out. Now, we run it, with predictable effects:

% ./a.out
Abort (core dumped)
% ls -l core
-rw------- 1 wingo wingo 151552 2010-02-14 21:37 core

So, what happened was that the program ran, then it got to the abort, which caused it to unexpectedly exit. Where did it exit to? In this case, back to the prompt.

That program is no longer running, but when it exited, it left behind a copy of its state -- of its continuation. A continuation is what's left of a computation. In this case, what's left is for abort to return, then main returns 0, then probably some system-specific code, then we'd be back at the prompt again. Aborting just brought us back to the prompt a little faster.

Anyway, the core dump has all of the data that would be necessary to complete the process' computation. With a debugger you can get in and see what that data was, but you can't run the program again from that point -- you'd need some extra OS-level support for that.

why are you talking about this stuff

Glad you asked! Core dumps are really a specific instance of control features in programming languages. Control features are what allow you to skip around in your program. In this case, we skipped back to the prompt.

Scheme includes support for a very low-level control feature known as continuations. With continuations, you can implement all kinds of things, like exceptions, coroutines, generators, and short-cut operators. You can make up new control operators in a Scheme library and add them to an innocent, unsuspecting program.

Many people are aware of this, but fewer know that continuations have a problem, a big problem. The origin of the problem is that a continuation captures the whole future of a computation, not just a part of it. So if you implemented exceptions with continuations, imagine dumping core every time you threw an exception. Not very efficient, no?

Well the truth is more complicated of course. If you have access to the entirety of a program, the compiler can transform your code into what is known as continuation-passing style (which the laity may recognize by the weaker names of static single assignment or A-normal form) and in that case invoking a continuation is just like calling a function. But for the rest of us, we have to play tricks with dynamic extents, incrementally dumping parts of the core as the program enters different dynamic extents.
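To make that concrete, here is a tiny expression and its continuation-passing rendering (a sketch; a full transform would also CPS the primitives, which I leave in direct style for readability):

```scheme
;; direct style:
(+ 1 (f x))

;; continuation-passing style: the continuation k of the whole
;; expression becomes an explicit argument, and "returning" to it
;; is just a function call
(f x (lambda (v) (k (+ 1 v))))
```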

(I'm simplifying here, to continue the core-dump metaphor, in the knowledge that the clergy will correct this doctrine in the comments.)

But in the general case, capturing and reinstating continuations can be quite expensive, because in the absence of sophisticated analysis, one simply dumps the whole core. It's a terrible but common implementation strategy.

There are other, deeper problems with continuations as present in Scheme, notably the lack of composability. Though I do understand that this problem is deeper, I don't understand its depths, so I'm going to forgo a discussion of that here.

the solution: more power

So, continuations are great, but they can be expensive to implement. OK. In this context it's amusing that the solution to this problem is actually to give continuations more power.

The problem is that in Scheme you have control over where you dump core, but not how far you dump core. While it's true that continuations logically capture the whole future of a computation, in practice that continuation has a limit: the process boundary. Our a.out was spawned at the tcsh prompt, but it has no access to the implementation of the shell itself, not to mention the kernel.

So, more power, as we were saying: let's allow programs to delimit the range of computation captured by a continuation. Instead of capturing continuations up to the process boundary, a program can capture a continuation only up to a certain point in a program.

Pleasantly, that point is known as a %, pronounced "prompt". (Dorai Sitaram notes that he would have preferred >, but it was taken by greater-than. Also, this comes from back in 1993, so perhaps he can be forgiven the tcsh-style prompt.)

It's as if you were implementing a shell in your program, as if your program were an operating system for other programs.

Here's a translation of foo.c into a Scheme that supports prompts:

(define (foo)
  (abort)
  0)

And now we run it, in a prompt:

(% (foo)
   (lambda (core . args)
     (format #t "Abort (core dumped): ~a\n" core)
     (format #t "Arguments to abort: ~a\n" args)
     1))
;; =| Abort (core dumped): #<continuation ... >
;; =| Arguments to abort: 
;; => 1

The second argument to % is the "handler". When abort is called, control returns to the handler, and the continuation -- the "core dump" -- is passed as the first argument. Unlike in UNIX, abort in Scheme can take any number of arguments. Those arguments are passed to the handler.

Also unlike UNIX, the core dump may be revived; but normal Scheme continuations do have a UNIX similarity in that reviving them is like invoking exec(3) -- they replace the current process image with a new one. This might be the right thing for user space to do, but if you're implementing a kernel, scheduling a process shouldn't give that process control over the whole system.

Instead, delimited continuations are more like functions, in that you can call them for a value. Like this:

(% (foo)
   (lambda (core . args)
     (format #t "Abort (core dumped, but continuing): ~a\n" core)
     (format #t "Arguments to abort: ~a\n" args)
     (core))) ;; revivify the core, letting the program finish
;; =| Abort (core dumped, but continuing): #<continuation ...>
;; =| Arguments to abort: 
;; => 0

gloves off

If you didn't know about continuations before, and you've gotten this far, then drinks all around! If not, then, um, drinks all around? In any case, we're going more hard-core now, so you'll need it.

Delimited continuations are not quite a new topic. The first reference that I'm aware of that begins to understand the problem is The theory and practice of first-class prompts, by Matthias Felleisen, in 1988.

Unfortunately that document is only available behind a paywall, because the ACM is a pack of scrooge mcducks. So instead you'll probably have to deal with Sitaram and Felleisen's 1990 contribution, Control delimiters and their hierarchies.

But don't read that one first. First, read Sitaram's beautifully clear 1993 paper, Handling control, which is a lovely exposition of the offerings that delimited continuations have to the working Scheme programmer.

An obvious extension of Sitaram's work is in the implementation of abstractions that look like operating systems. Back in those days it wasn't quite possible to imagine, but the current example would be the web server, where each session is a different process. So we return HTML and wait for the user to come back, and in the meantime schedule other processes; and when they do come back, we resume their session via resuming their computational process.
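In that style, suspending a session is just an abort, and resuming it is calling the saved continuation. A sketch, in this post's notation (run-session, save-session!, and respond are hypothetical names, not any real web framework's API):

```scheme
;; Run a session until it needs the user; when it aborts, send the
;; page it passed out, and save its continuation for later.
(define (run-session session-thunk)
  (% (session-thunk)
     (lambda (core page)
       (respond page)             ; send HTML back to the user
       (save-session! core))))    ; later: call (core user-input) to resume

;; Inside a session, this suspends until the user comes back;
;; the user's input becomes the return value of the call.
(define (wait-for-user page)
  (abort page))
```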

It's typically said that with continuations you can implement any control feature, but that's not true if your target language also has continuations. For example, you can implement exceptions with continuations plus mutable state, but if the program you are running also uses continuations in their standard Scheme form, the program's use can conflict with the implementation of exceptions, producing incorrect programs.

feature on top of feature

This is why it's 2010 now and no major programming language includes delimited continuations in its specification -- because it's tough to see how they interact with other primitive elements of the language. There has been a lot of research on this recently, with some promising results.

But before I move onto the shining light, I should mention a pit of darkness, which is the profusion of delimited control operators in the literature.

Sitaram in his papers proposes prompt (or %) and control (similar to abort) as his primary delimited continuation operators. But there are others. The ones that are seen most often in the literature are shift and reset, and also there is cupto. It was only in 2007 that Dybvig, Peyton Jones, and Sabry were able to tease out their relationship; if you are interested, I highly recommend the first third of A monadic framework for delimited continuations, despite the title.

This result was picked up in Matthew Flatt's 2007 paper, co-written with Yu, Findler, and Felleisen, Adding Delimited and Composable Control to a Production Programming Environment. All this article has really been a prelude to mentioning this paper, and I think it merits a section of its own.

the flatt contribution

Go ahead and open up the Flatt paper, and skim through it.

I'll wait.

OK! First impressions?

Mine were, "what is the deal with all those pictures?". They seemed to be an affectation of friendliness, pleasing neither the uninitiated nor the academics. But reading the paper in more depth, I was really impressed by the authors' ability to convey information to the Scheme practitioner.

Because let's face it, there are piles and piles of terrible papers out there; and, what's worse, piles and piles of potentially good papers that simply aren't intelligible to the practicing programmer. OK, I dig on the lambda calculus; but what is your supposedly-proven result from your toy grammar going to do to help me make programs?

So I've always been skeptical, perhaps overly skeptical, of the desire to prove properties about programs, of the need to justify with symbology what you already know to be right. I still haven't seen the light. But at least now I know it exists, thanks to Flatt et al's paper.

The practical problem that I had was that Guile's catch and throw still weren't quite integrated with its new virtual machine; they caused the virtual machine to recurse, calling itself reentrantly. Besides being inelegant, this forced the catch expression and handler to be allocated as closures, and restricted potential optimizations.

So I started to look at how I could express catch and throw in Guile's intermediate language. It was not clear that I could add them in an orthogonal way, until I remembered something I read about delimited continuations; but it wasn't until the Flatt paper that I really understood how to implement catch and throw in terms of these lower-level primitives.

But again, it seems I'm writing into the night, so I'll try to wrap up briefly. The form of catch is like this:

catch key thunk handler [pre-unwind-handler]

Then on the other side, you have throw:

throw key arg ...

I'm not trying to argue these are the best control operators, not by a long shot; but they are what we have now, so we need to support them.

So, following Flatt, imagine we have a fluid (dynamically-bound) variable, %eh, the exception handler. Then throw is easy:

throw key arg ... =
  (fluid-ref %eh) key arg ...

Yay pseudocode. catch is trickier, though:

catch key thunk handler =
  let prompt-tag = (fresh-tag)
    % prompt-tag
      with-fluids %eh = (throw-handler prompt-tag key)
        (thunk)
      λ (cont thrown-key arg ...)
       handler thrown-key arg ...

Aside from the pseudo-code being obtuse, perhaps you're confused that the % there has three arguments -- a tag, then an expression, then the handler (a lambda). Well the tag is useful to identify a particular prompt, so that an abort can jump back to a particular location, not just the most recent prompt.

Also you're probably underwhelmed that most of the interesting work was pushed off into throw-handler, which takes the given handler and does some magic to it. Well indeed, there is some magic:

throw-handler prompt-tag catch-key =
  let prev-eh = (fluid-ref %eh)
    λ (thrown-key arg ...)
     if thrown-key is catch-key or catch-key is true
       abort prompt-tag thrown-key arg ...
     else
       prev-eh thrown-key arg ...

So the resulting throw-handler would run in the context of the throw invocation, and if the key matches, or we are catching all throws (catch-key is true), we'll jump back to the prompt.

Note that in our definition of catch, the first argument is the captured continuation, but that argument is never referenced. This simple observation allows us to note at compile-time that the continuation does not need to be reified -- there's no need to dump core, it's just a longjmp().
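
In modern Guile the pseudocode above can be transcribed almost directly, using call-with-prompt, abort-to-prompt, and fluids. A sketch (the my- prefixes are mine, to avoid shadowing Guile's own catch and throw):

```scheme
;; catch/throw in terms of prompts and fluids, following the
;; pseudocode above (Guile-flavored Scheme)
(define %eh (make-fluid #f))   ; the current exception handler

(define (my-throw key . args)
  (apply (fluid-ref %eh) key args))

(define (throw-handler prompt-tag catch-key)
  (let ((prev-eh (fluid-ref %eh)))
    (lambda (thrown-key . args)
      (if (or (eq? thrown-key catch-key) (eq? catch-key #t))
          ;; matching key (or catch-all): jump back to the prompt
          (apply abort-to-prompt prompt-tag thrown-key args)
          ;; otherwise defer to the previously installed handler
          (apply prev-eh thrown-key args)))))

(define (my-catch key thunk handler)
  (let ((prompt-tag (make-prompt-tag)))
    (call-with-prompt prompt-tag
      (lambda ()
        (with-fluid* %eh (throw-handler prompt-tag key) thunk))
      (lambda (cont thrown-key . args)
        ;; cont is never referenced, so it need not be reified
        (apply handler thrown-key args)))))
```

For example, (my-catch 'boom (lambda () (my-throw 'boom 42)) (lambda (k v) v)) evaluates to 42.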

Guile also supports pre-unwind handlers: handlers that run in the context of the throw instead of the context of the catch. Since Flatt breezed over this in his paper, I'll mention an implementation of that here:

catch key thunk handler pre-unwind-handler =
  let prompt-tag = (fresh-tag)
    % prompt-tag
      with-fluids
          %eh = (custom-throw-handler prompt-tag key pre-unwind-handler)
        (thunk)
      λ (cont thrown-key arg ...)
       handler thrown-key arg ...

So, all the same, except the handler is made by custom-throw-handler.

custom-throw-handler prompt-tag key f =
  let prev = fluid-ref %eh,
      running = make-fluid false
    λ (thrown-key arg ...)
     if thrown-key is key or key is true
       let was-running = fluid-ref running
         with-fluids running = true
           if not was-running
             f thrown-key arg ...
           # fall through in any case
           if prompt-tag
             abort prompt-tag thrown-key arg ...
           else
             prev thrown-key arg ...
     else
       prev thrown-key arg ...
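
That pseudocode can also be transcribed into Guile-flavored Scheme. A sketch, assuming the %eh fluid from the earlier pseudocode (defined here so the snippet stands alone):

```scheme
;; pre-unwind support: run f in the context of the throw, then
;; either abort to the prompt or chain to the previous handler
(define %eh (make-fluid #f))

(define (custom-throw-handler prompt-tag key f)
  (let ((prev (fluid-ref %eh))
        (running (make-fluid #f)))   ; guards against re-entry into f
    (lambda (thrown-key . args)
      (if (or (eq? thrown-key key) (eq? key #t))
          (let ((was-running (fluid-ref running)))
            (with-fluid* running #t
              (lambda ()
                (unless was-running
                  (apply f thrown-key args))
                ;; fall through in any case
                (if prompt-tag
                    (apply abort-to-prompt prompt-tag thrown-key args)
                    (apply prev thrown-key args)))))
          (apply prev thrown-key args)))))
```

The running fluid is what makes the handler safe against a pre-unwind handler that itself throws: the second time through, f is skipped and we go straight to the abort.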

what

I know your eyes are glazing over by now. Mine are too, and it's not just the wine. But there's something important here.

The important thing is that small sigil, =; those definitions really are it. If you can prove anything about the behavior of catch now, you should be able to prove it about the new implementation above -- even though it works completely differently.

That's power! Power to refactor, knowing that your refactorings are correct. Because I was lost in a morass, paralyzed in front of my keyboard, wondering about the ghosts awaiting me if I poked code. Now I know. I need an efficient with-fluids, some pieces of prompt and abort, and to add that all to the bootstrap language of Guile, and I have it. It's right. And I know it is because Flatt et al have already done all of the calculations for me :)

the sound of mandolins

I realize it's something of a fizzleout ending, but I think it has to remain that way before I get it all into git. Inshallah we'll be rolling like this to Guile 1.9.8, though it could very well be 1.9.9. But either way, I didn't imagine myself saying this, but thank you, Flatt & crew: your formal work has led me to see the way out of my practical problem. Cheers!

call/cc and ecmascript
Andy Wingo, 2009-02-25 (https://wingolog.org/2009/02/25/callcc-and-ecmascript)

I suppose it should have been obvious, but I was still delightfully surprised when Clinton Ebadi came up with these lines on IRC this morning:

ecmascript@(guile-user)> var callcc = this['call/cc'];
ecmascript@(guile-user)> var k = false;
ecmascript@(guile-user)> 1 + callcc (function (kont) { k = kont; return 0; });
1
ecmascript@(guile-user)> k (2);
3
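
For comparison, here is the same exchange at a plain Scheme REPL (a sketch; as with the transcript above, the second return only makes sense interactively):

```scheme
(define k #f)
(+ 1 (call/cc (lambda (kont) (set! k kont) 0)))  ; => 1
(k 2)   ; re-enters the addition: the REPL prints 3
```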

Good times indeed!

fosdem, presentations, foo
Andy Wingo, 2007-03-07 (https://wingolog.org/2007/03/07/fosdem-presentations-foo)

post-fosdem

Allow me to free-associate about FOSDEM: packed hallways, rainy sidewalks, buttery croissants... actually about food I could go on for quite a while.

People-wise it was pretty good, although with a pervasive feeling of oddness. The free software community has its perceived social hierarchies, and for many people, these conference events are attempts to transfer those hierarchies to real life. So until someone knows what position you occupy in their mental pecking order, they treat you with lots of distance; and depending on where they put you, potentially with lots of distance afterwards too.

The GNOME-FR people, on the other hand, were refreshingly exuberant. Very positive people, the Vincents and Guillaumes and the apparently unlinkable Christophe Fergeaus. Of course there were many other excellent folks there, and also of course, not enough time.

presentation

I presented a talk on writing GNOME applications in Scheme. It's available on the interwob as an incorrectly-rendered PDF, or as a tarball of SVGs to be presented with inkview. I was shocked: I had the title slide up for a good 5 minutes or so, but still about 40 people stayed in the room.

Of course with just the slides, you are missing my charm. For better?

meta-presentation

I got loads of wonderful comments on my last writing product, concerned with the mechanics of presentations. Especially useful was the tip that inkscape comes with a slide-show-like wrapper, inkview. Thanks Andy!

Also fascinating was Jos Hirth's XSVG, rendering SVG inside the browser (probably Firefox-only). Pretty cool, that. Maybe next time.

I should mention that I considered using some LaTeX package, but decided not to because I wanted to be able to easily tweak the presentation. That notwithstanding, several people mentioned latex-beamer as being quite good.

In the end I used my ghetto scripts, although the text-to-svg process was more work than it should have been. I looked at doing it with my existing XML processing tools, but it turns out that using functional programming techniques to do text layout is an underexplored area. I'm still poking at that.

other thangs, delimited by ⁋

My new year's vow to be more communicative is not going so well.

Interesting paper: Control Delimiters and Their Hierarchies by Dorai Sitaram and Matthias Felleisen. Summary: "Continuations in Scheme are cool but potentially expensive; we have a way of restricting them slightly but making them cheaper."

The library has rendered unto me an album by Miriam Makeba, which is providing much delight. (It's a collection, although I prefer her early-US stuff.)

I lit a fire on my terrace for the first time this year. Delicious global warming ensued.

My grandmother's not doing so well; I'm going home next week to visit.

I started yoga a little more than a month ago, at an Iyengar school. It's interesting, and way harder than I thought it would be.