wingolog

maxwell's equations of software

2009-12-11T14:23:10Z

There was some interesting feedback on my last article, and I ended up wanting to say so much in reply that here I am again, typing at the ether.

Stephen reflects the common confusion that somehow this is just a little file that might or might not work. No man, this is Guile! That's the implementation of Guile's eval. So unit tests, yes, we have more than 12000 of them. Only some of them specifically test the evaluator, but all of them go through the evaluator.

To be honest, I think unit tests help, but when I'm hacking the compiler the most useful test is simply to rebuild the compiler. If it successfully bootstraps, usually we're doing pretty well.

Zed raises a more direct criticism:

This is why I hate lisp: [link to my article] Long and dense, no comments, not semantic. That's "awesome code"?

I don't think I used the word "awesome", but yes, I believe so, in the sense of "inspiring awe" -- at least for me.

I'm going to call on my homie Alan Kay for some support here. There was an oft-cited interview with Alan Kay a few years back, in which he described eval as "Maxwell's equations of software". I quote:

[Interviewer:] If nothing else, Lisp was carefully defined in terms of Lisp.
[Alan Kay:] Yes, that was the big revelation to me when I was in graduate school—when I finally understood that the half page of code on the bottom of page 13 of the Lisp 1.5 manual was Lisp in itself. These were “Maxwell’s Equations of Software!” This is the whole world of programming in a few lines that I can put my hand over.
I realized that anytime I want to know what I’m doing, I can just write down the kernel of this thing in a half page and it’s not going to lose any power. In fact, it’s going to gain power by being able to reenter itself much more readily than most systems done the other way can possibly do.

Just for context, here is that half-a-page -- Maxwell's equations of software:

evalquote is defined by using two main functions, called eval and apply. apply handles a function and its arguments, while eval handles forms. Each of these functions also has another argument that is used as an association list for storing the values of bound variables and function names.

 evalquote[fn;x] = apply[fn;x;NIL]

where

 apply[fn;x;a] =
       [atom[fn] → [eq[fn;CAR] → caar[x];
                    eq[fn;CDR] → cdar[x];
                    eq[fn;CONS] → cons[car[x];cadr[x]];
                    eq[fn;ATOM] → atom[car[x]];
                    eq[fn;EQ] → eq[car[x];cadr[x]];
                    T → apply[eval[fn;a];x;a]];
        eq[car[fn];LAMBDA] → eval[caddr[fn];pairlis[cadr[fn];x;a]];
        eq[car[fn];LABEL] → apply[caddr[fn];x;cons[cons[cadr[fn];
                                                   caddr[fn]];a]]]

 eval[e;a] =
       [atom[e] → cdr[assoc[e;a]];
        atom[car[e]] → [eq[car[e];QUOTE] → cadr[e];
                        eq[car[e];COND] → evcon[cdr[e];a];
                        T → apply[car[e];evlis[cdr[e];a];a]];
        T → apply[car[e];evlis[cdr[e];a];a]]

pairlis and assoc have been previously defined.

 evcon[c;a] = [eval[caar[c];a] → eval[cadar[c];a];
               T → evcon[cdr[c];a]]

and

 evlis[m;a] = [null[m] →  NIL;
               T → cons[eval[car[m];a];evlis[cdr[m];a]]]

From the LISP 1.5 Programmer's Manual

What a mess! Where are the unit tests? Where are the comments? A little bit of whitespace, please! They use "cdar", "caar", and "cadr". They should be using named accessors! What if you eval an atom that's not found in the association list? Did they really name a function "evlis"? Et cetera.
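For readers who find M-expressions hard going, here is a rough transcription of that half page into Scheme. It is a sketch, not the manual's text: the names m-eval, m-apply and atom? are made up here to avoid clobbering Scheme's own eval and apply, the operator symbols are lowercase, and Scheme's #t and #f stand in for LISP 1.5's truth values, so the catch-all clause of a cond has to be written (quote t).

```scheme
;; Helpers that the manual defines before this page.
(define (atom? x) (not (pair? x)))

(define (pairlis vars vals a)
  (if (null? vars)
      a
      (cons (cons (car vars) (car vals))
            (pairlis (cdr vars) (cdr vals) a))))

;; apply[fn;x;a]: FN is a function, X its argument list, A the alist.
(define (m-apply fn x a)
  (cond
   ((atom? fn)
    (case fn
      ((car) (caar x))
      ((cdr) (cdar x))
      ((cons) (cons (car x) (cadr x)))
      ((atom) (atom? (car x)))
      ((eq) (eq? (car x) (cadr x)))
      (else (m-apply (m-eval fn a) x a))))
   ((eq? (car fn) 'lambda)
    (m-eval (caddr fn) (pairlis (cadr fn) x a)))
   ((eq? (car fn) 'label)
    (m-apply (caddr fn) x
             (cons (cons (cadr fn) (caddr fn)) a)))))

;; eval[e;a]: E is a form. Like the original, an unbound atom is an error.
(define (m-eval e a)
  (cond
   ((atom? e) (cdr (assq e a)))
   ((atom? (car e))
    (case (car e)
      ((quote) (cadr e))
      ((cond) (evcon (cdr e) a))
      (else (m-apply (car e) (evlis (cdr e) a) a))))
   (else (m-apply (car e) (evlis (cdr e) a) a))))

(define (evcon c a)
  (if (m-eval (caar c) a)
      (m-eval (cadar c) a)
      (evcon (cdr c) a)))

(define (evlis m a)
  (if (null? m)
      '()
      (cons (m-eval (car m) a) (evlis (cdr m) a))))

(define (evalquote fn x) (m-apply fn x '()))

(evalquote 'cons '(a (b c)))
;; => (a b c)
(evalquote '(lambda (x y) (cons (car x) y)) '((a b) (c d)))
;; => (a c d)
(evalquote '(label ff (lambda (x)
                        (cond ((atom x) x)
                              ((quote t) (ff (car x))))))
           '(((a b) c)))
;; => a
```

The last call shows label doing its job: ff is bound in the association list before the body runs, so the function can call itself by name.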

Now, I do think the LISP 1.5 (it wasn't yet spelled "Lisp" then) evaluator is awesome, as it was in McCarthy's first papers on the subject. If you disagree with that, that's cool, I see why one would "hate" my poor imitation.

But given the Lisp history of meta-circular evaluators, one doesn't need to comment every one as if it were the first. In the same way that one recognizes an if statement without the need for a design pattern behind it, I would want anyone who's hacking on Guile's evaluator to have read or perhaps even written several evaluators; and once you've written one, you know the pattern.

More substantively, Charles speculates on source-to-source transformations to ease interpretation. I admit almost total ignorance regarding PreScheme; I've been meaning to learn about it for years now. But the point of this evaluator was to have it use the same representation as VM-compiled procedures; ideally it should run fast, but the first priority is for it to run on the VM itself instead of on a separate stack. There are certainly many more clever things that can be done there, and thankfully since eval is implemented in Scheme and compiled like anything else, it will also reap advantages of an improving compiler.

Regarding Emacs, I hope to say more on that point within a week or so, when the video for my talk at the recent GNU Hackers Meeting gets put up.

Finally, Bubo asks if this work is in a released tarball. Not yet is the answer, but it will be in Tuesday's 1.9.6 release.

Happy hacking!

in which our protagonist forgoes modesty

2009-12-09T20:51:00Z

I wrote the most beautiful code of my life last week.

I would like to explain it to you all, but I have to tell a little bit of backstory first.

the distant future, the year 2000

Until earlier this year, Guile has been an interpreted Scheme. Guile ran your code by parsing it into tree-like data structures, then doing a depth-first evaluation of that tree. The evaluation itself was performed with a C function, ceval, which would recursively call itself when evaluating sub-expressions.

ceval was OK, but not so great. Instead of recursively traversing a tree, it's better to pre-examine the expressions you're going to evaluate, and then emit linear sequences of code to handle those steps. That's to say that it's better to have a compiler than an interpreter. So I dug up Keisuke Nishida's old bytecode compiler that he wrote back in 2001, and eventually hacked it into good enough shape that we merged it into Guile itself.
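To make the interpreter/compiler distinction concrete, here is a toy sketch for binary + and * expressions only. It is in no way Guile's compiler or VM; the names walk, compile-exp and run, and the two-instruction stack machine, are invented for the illustration.

```scheme
;; Tree-walking evaluation: recurse into sub-expressions, as ceval did
;; (on the host language's own stack).
(define (walk exp)
  (if (number? exp)
      exp
      (let ((op (car exp))
            (a (walk (cadr exp)))
            (b (walk (caddr exp))))
        (case op
          ((+) (+ a b))
          ((*) (* a b))))))

;; Compilation: examine the tree once, and emit a linear instruction list.
(define (compile-exp exp)
  (if (number? exp)
      (list (list 'push exp))
      (append (compile-exp (cadr exp))
              (compile-exp (caddr exp))
              (list (list (car exp))))))

;; A stack machine that runs the linear code in a simple loop,
;; keeping operands on its own explicit stack.
(define (run code)
  (let lp ((code code) (stack '()))
    (if (null? code)
        (car stack)
        (let ((insn (car code)))
          (case (car insn)
            ((push) (lp (cdr code) (cons (cadr insn) stack)))
            ((+) (lp (cdr code)
                     (cons (+ (cadr stack) (car stack)) (cddr stack))))
            ((*) (lp (cdr code)
                     (cons (* (cadr stack) (car stack)) (cddr stack)))))))))

(walk '(+ 1 (* 2 3)))
;; => 7
(compile-exp '(+ 1 (* 2 3)))
;; => ((push 1) (push 2) (push 3) (*) (+))
(run (compile-exp '(+ 1 (* 2 3))))
;; => 7
```

Both routes give the same answer; the difference is that the compiled form is analyzed once and then executed by a flat loop with its own stack.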

That was a pretty sweet hack, to retrofit a compiler into Guile. But it wasn't as beautiful as the code I wrote last week.

the present

See, the problem was that now we had two stacks: the C stack that ceval used, and the virtual machine stack used by byte-compiled code. This was a debugging headache, as to get backtraces you had to ping-pong back and forth between the two stacks, interleaving their frames together in the right order. Also with two stacks it's practically impossible to write a real debugger that does single-stepping, inspection and modification of stack frames, etc.

The two-stack solution (ahem) had another problem: you couldn't tail-call between interpreted and compiled code, because the procedures used different stacks. Normally this wasn't a big deal because all code was compiled, but it would occasionally bite you. (The usual case would be when you had a compiled Guile, but just pulled new code from the git repository, then tried to compile it again -- so some of your compiled code was out-of-date and therefore not loaded, you had a mix of compiled and interpreted code in some important places, and your loop started consuming stack.)

Finally, interpreted code behaved differently than compiled code in some cases. For example, consider the following code:

;; Returns two values: the value, if found,
;; and a flag indicating success.
(define (table-lookup table key)
  (let ((handle (assq key table)))
    (if handle
        (values (cdr handle) #t)
        (values #f #f))))

(define (trace-call f . args)
  (let ((result (apply f args)))
    (format #t "\nfunction returned ~a\n" result)
    result))

(trace-call table-lookup '((x . y)) 'x)

So if I try this at my Guile 1.8 prompt, I get this:

guile> (trace-call table-lookup '((x . y)) 'x)
function returned #
$1 = y
$2 = #t

We see that trace-call returns two values, and the tracing printout shows a "multiple values object" -- a Scheme object like any other, but that the primitive call-with-values knows how to destructure. The toplevel repl is wrapped in a call-with-values, so y and #t print separately.

Now if I fire up Guile 1.9, let's see what we get:

> (trace-call table-lookup '((x . y)) 'x)
function returned y
$1 = y

Guile 1.9's repl compiles its expressions, by default, and indeed we see different behavior -- the trace printout has a naked value, y, and only one value is returned.

Both of these behaviors are compatible with standard Scheme from R5RS on. The origin of the difference is that the behavior of values within a continuation that was not created with call-with-values is unspecified. Relatedly, it is not specified what will happen when you return N values to a continuation accepting M values, where N != M.

What's happening is that the compiler actually has two return addresses in each stack frame -- one for the normal singly-valued case, and one for multiple values. values will return to the multiple-value return address (MVRA), and anything else will go to the normal return address. So actually, compiled code can choose what to do when it gets multiple values. Instead of raising an error when two values are returned to the (let ((result ...)) ...) continuation, Guile chooses to do what you (probably) expect and just drop the second value.

In contrast, with a C evaluator, even noticing that two values were returned to a singly-valued continuation is a pain -- because you have to check and branch every time you recursively call ceval to see if you're getting a multiple-values object.
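If you want a tracer that behaves the same way under either implementation, the portable move is to receive the values with call-with-values rather than binding them with let. A minimal sketch, using only standard procedures; the name trace-call/values is made up here, and table-lookup is repeated so the block stands alone:

```scheme
(define (table-lookup table key)
  (let ((handle (assq key table)))
    (if handle
        (values (cdr handle) #t)
        (values #f #f))))

;; Like trace-call, but the continuation explicitly accepts any number
;; of values, prints them as a list, and passes them all along.
(define (trace-call/values f . args)
  (call-with-values
      (lambda () (apply f args))
    (lambda results
      (display "function returned ")
      (write results)
      (newline)
      (apply values results))))

(call-with-values
    (lambda () (trace-call/values table-lookup '((x . y)) 'x))
  list)
;; prints: function returned (y #t)
;; => (y #t)
```

Here nothing depends on unspecified behavior, because every continuation that might see two values was created by call-with-values.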

But I digress. I promised something nice, and here I am noodling about something else.

exit strategy

The solution to all these problems, of course, is to use just one stack, and have that stack be the same as the one that compiled code uses.

Practically what this means is that eval should not be a C function, because Guile does not compile to C; it should be something that ends up as compiled code.

(For now, compiled code is bytecode, run on the VM. I'm being a little vague here because Guile doesn't do native compilation yet, but it will, within a year or two, and the same considerations apply.)

I actually toyed with the idea of writing a hand-coded eval in VM bytecode, but I came to my senses soon enough, and the answer was delightful.

eval in scheme

Of course! Scheme's eval should be written in Scheme itself. Then we just compile it to bytecode, like any other Scheme procedure.

At this point, anyone who's actually had to do Scheme at university (not me) will recognize this as the meta-circular evaluator pattern. To be honest I had never written one before -- and I think the reason was that they always seemed so peripheral. When you write a meta-circular evaluator, the language you really work in is the one that implements the meta-circular evaluator, not the one implemented by the evaluator -- or at least, that's the case if you're trying to get something done, rather than learn about language.

But this is different. This time the meta-circular evaluator actually sits at the heart of Guile -- in fact, we use eval, as implemented in Scheme, and compiled to bytecode, to compile the compiler -- which itself is written in Scheme of course.

In the end, though, you have to have a Scheme compiler to compile eval.scm itself, so we do end up keeping around an evaluator in C. Its only purpose is to interpret the compiler, so we can compile eval.scm: then the compiled version of eval.scm compiles the rest of Guile, including the compiler.

Another option would have been to require a new-enough version of Guile itself to compile the compiler. But I want to be able to sanely bootstrap Guile's compiler, so that's out of the question. We could implement the compiler in portable Scheme, but that would forbid the compiler from making use of any of Guile's niceties.

the code

So here it is (and below). I don't claim that it is actually the most elegant code I have written, though I can think of none better at the moment; nor is it the fastest code, nor the most concise. But it sits in such a powerful place, and in so few lines, that I cannot help but be pleased with it.

(define primitive-eval
  (let ()
    ;; The "engine". EXP is a memoized expression.
    (define (eval exp env)
      (memoized-expression-case exp
        (('begin (first . rest))
         (let lp ((first first) (rest rest))
           (if (null? rest)
               (eval first env)
               (begin
                 (eval first env)
                 (lp (car rest) (cdr rest))))))
      
        (('if (test consequent . alternate))
         (if (eval test env)
             (eval consequent env)
             (eval alternate env)))
      
        (('let (inits . body))
         (let lp ((inits inits) (new-env (capture-env env)))
           (if (null? inits)
               (eval body new-env)
               (lp (cdr inits)
                   (cons (eval (car inits) env) new-env)))))
      
        (('lambda (nreq rest? . body))
         (let ((env (capture-env env)))
           (lambda args
             (let lp ((env env) (nreq nreq) (args args))
               (if (zero? nreq)
                   (eval body
                         (if rest?
                             (cons args env)
                             (if (not (null? args))
                                 (scm-error 'wrong-number-of-args
                                            "eval" "Wrong number of arguments"
                                            '() #f)
                                 env)))
                   (if (null? args)
                       (scm-error 'wrong-number-of-args
                                  "eval" "Wrong number of arguments"
                                  '() #f)
                       (lp (cons (car args) env)
                           (1- nreq)
                           (cdr args))))))))

        (('quote x)
         x)

        (('define (name . x))
         (define! name (eval x env)))
      
        (('apply (f args))
         (apply (eval f env) (eval args env)))

        (('call (f . args))
         (let ((proc (eval f env)))
           (let eval-args ((in args) (out '()))
             (if (null? in)
                 (apply proc (reverse out))
                 (eval-args (cdr in)
                            (cons (eval (car in) env) out))))))
      
        (('call/cc proc)
         (call/cc (eval proc env)))

        (('call-with-values (producer . consumer))
         (call-with-values (eval producer env)
           (eval consumer env)))

        (('lexical-ref n)
         (let lp ((n n) (env env))
           (if (zero? n)
               (car env)
               (lp (1- n) (cdr env)))))
      
        (('lexical-set! (n . x))
         (let ((val (eval x env)))
           (let lp ((n n) (env env))
             (if (zero? n)
                 (set-car! env val)
                 (lp (1- n) (cdr env))))))
        
        (('toplevel-ref var-or-sym)
         (variable-ref
          (if (variable? var-or-sym)
              var-or-sym
              (let lp ((env env))
                (if (pair? env)
                    (lp (cdr env))
                    (memoize-variable-access! exp (capture-env env)))))))

        (('toplevel-set! (var-or-sym . x))
         (variable-set!
          (if (variable? var-or-sym)
              var-or-sym
              (let lp ((env env))
                (if (pair? env)
                    (lp (cdr env))
                    (memoize-variable-access! exp (capture-env env)))))
          (eval x env)))
      
        (('module-ref var-or-spec)
         (variable-ref
          (if (variable? var-or-spec)
              var-or-spec
              (memoize-variable-access! exp #f))))

        (('module-set! (x . var-or-spec))
         (variable-set!
          (if (variable? var-or-spec)
              var-or-spec
              (memoize-variable-access! exp #f))
          (eval x env)))))
  
    ;; primitive-eval
    (lambda (exp)
      "Evaluate @var{exp} in the current module."
      (eval 
       (memoize-expression ((or (module-transformer (current-module))
                                (lambda (x) x))
                            exp))
       '()))))
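One detail worth spelling out is the environment: it is just a list of values, and lexical-ref carries an index into it instead of a name. The memoizer that computes those indices is not shown above, so the following is only a guess at the idea, with a hypothetical lookup-index helper standing in for it.

```scheme
;; Hypothetical helper: find SYM's index in SCOPE, a list of symbols with
;; the most recently bound name first. Returns #f for a free variable.
(define (lookup-index sym scope)
  (let lp ((scope scope) (n 0))
    (cond ((null? scope) #f)
          ((eq? (car scope) sym) n)
          (else (lp (cdr scope) (+ n 1))))))

;; The same walk that the lexical-ref case performs at run time.
(define (env-ref env n)
  (if (zero? n)
      (car env)
      (env-ref (cdr env) (- n 1))))

;; The lambda case conses arguments onto the environment one by one, so
;; calling (lambda (a b) ...) with 1 and 2 leaves the last argument at
;; index 0.
(define demo-scope '(b a))
(define demo-env '(2 1))

(env-ref demo-env (lookup-index 'a demo-scope))
;; => 1
(env-ref demo-env (lookup-index 'b demo-scope))
;; => 2
(lookup-index 'c demo-scope)
;; => #f
```

With names resolved ahead of time like this, a variable reference at run time is a short walk down a list, with no symbol comparison at all.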

ecmascript for guile

2009-02-22T16:45:03Z

Ladies, gentlemen: behold, an ECMAScript compiler for Guile!

$ guile
scheme@(guile-user)> ,language ecmascript
Guile ECMAScript interpreter 3.0 on Guile 1.9.0
Copyright (C) 2001-2008 Free Software Foundation, Inc.

Enter `,help' for help.
ecmascript@(guile-user)> 42 + " good times!";
$1 = "42 good times!"
ecmascript@(guile-user)> [0,1,2,3,4,5].length * 7;
$2 = 42
ecmascript@(guile-user)> var zoink = {
                           qux: 12,
                           frobate: function (x) {
                              return random(x * 2.0) * this.qux;
                           }
                         };
ecmascript@(guile-user)> zoink.frobate("4.2")
$3 = 37.3717848761822

The REPL above parses ECMAScript expressions from the current input port, compiling them for Guile's virtual machine, then runs them -- just like any other Guile program.

Above you can see some of the elements of ECMAScript in action. The compiler implements most of ECMAScript 3.0, and with another few days' effort should implement the whole thing. (It's easier to implement a specification than to document differences from what people expect.)

The "frobate" example also shows integration with Guile -- the random function comes from the current module, which is helpfully printed out in the prompt, (guile-user) above.

In addition, we can import other Guile modules as JavaScript, oops, I mean ECMAScript objects:

ecmascript@(guile-user)> require ('cairo');
$1 = #< b7192810>
ecmascript@(guile-user)> $1['cairo-version']();
$2 = 10800

I could automatically rename everything to names that are valid ES identifiers, but I figured that it's less confusing just to leave them as they are, and require that ['%strange-names!'] be accessed in brackets. Of course if the user wants, she can just rename them herself.

Neat hack, eh?

what the hell?

I realize that my readers might have a number of questions, especially those that have other things to do than to obsessively refresh my weblog. Well, since I had myself at my disposal, I decided to put some of these questions to me.

So, Andy, why did you implement this?

Well, I've been hacking on a compiler for Guile for the last year or so, and realized at some point that the compiler tower I had implemented gave me multi-language support for free.

But obviously I couldn't claim that Guile supported multiple languages without actually implementing another language, so that's what I did.

I chose ECMAScript because it's a relatively clean language, and one that doesn't have too large of a standard library -- because implementing standard libraries is a bit of a drag. Even this one isn't complete yet.

How does it perform? Is it as fast as those arachno-fish implementations I keep hearing about?

It's tough to tell, but it seems to be good enough. It's probably not as fast as compilers that produce native code, but because it hooks into Guile's compiler at a high level, as Guile's compiler improves and eventually gets native code compilation, it will be plenty fast. For now it feels snappy.

There is another way in which it feels much faster though, and that's development time -- having a real REPL with readline, a big set of library functions (Guile's), and fast compilation all make it seem like you're talking directly with the demon on the other side of the cursor.

It actually implements ECMAScript? And what about ES4?

ES3 is the current goal, though there are some bits that are lacking -- unimplemented parts of the core libraries, mainly. Probably there are some semantic differences as well, but those are considered bugs, not features. I'm just one man, except in interviews!

Well, there is one difference: how could you deny the full numeric tower to a language?

ecmascript@(guile-user)> 2 / 3 - 1 / 6;
$3 = 1/2
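For comparison, this is the same arithmetic written in Scheme, where exact rationals are part of the standard numeric tower. It is only an illustration of where the 1/2 comes from, using standard procedures:

```scheme
;; Division of exact integers yields an exact rational, not a float.
(- (/ 2 3) (/ 1 6))
;; => 1/2
(exact? (- (/ 2 3) (/ 1 6)))
;; => #t
;; Ask for a float explicitly if that is what you want.
(exact->inexact (- (/ 2 3) (/ 1 6)))
;; => 0.5
```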

And regarding future standards of ECMAScript, who knows what will happen. ES4 looks like a much larger language. Still, Guile is well-positioned to implement it -- we already have a powerful object system with multimethod support, and a compiler and runtime written in a high-level language, which count for a lot.

Awesome! So I can run my jQuery comet stuff on Guile!!1!!

You certainly could, in theory -- if you implemented XMLHttpRequest and the DOM and all the other things that JavaScript-in-a-web-browser implements. But that's not what I'm interested in, so you won't get that implementation from me!

Where do you see this going?

I see this compiler leading to me publishing a paper at some conference!

More seriously, I think it will have several effects. One will be that users of applications with Guile extensions will now be able to extend their applications in ECMAScript in addition to Scheme. Many more people know ECMAScript than Scheme, so this is a good thing.

Also, developers that want to support ES extensions don't have to abandon Scheme to do so. There are actually many people like this, who prefer Scheme, but who have some users that prefer ECMAScript. I'm a lover, not a fighter.

The compiler will also be an example for upcoming Elisp support. I think that Guile is the best chance we have at modernizing Emacs -- we can compile Elisp to Guile's VM, write some C shims so that all existing C code works, then we replace the Elisp engine with Guile. That would bring Scheme and ECMAScript and whatever other languages are implemented to be peers of Elisp -- and provide better typing, macros, modules, etc to Emacs. Everyone wins.

So how do I give this thing a spin?

Well, it's on a branch at the moment. Either you wait the 3 or 6 months for a Guile 2.0 release, or you check it out from git:

git clone git://git.sv.gnu.org/guile.git guile
cd guile
git fetch
git checkout -b vm origin/vm
./autogen.sh && ./configure && make
./pre-inst-guile

Once you're in Guile, type ,language ecmascript to switch languages. This will be better integrated in the future.

Why haven't you answered my mail?

Mom, I've been hacking on a compiler! I'll call tonight ;-)

visualizing statistical profiles with chartprof

2009-02-09T15:04:46Z

Greetings, hackers of the good hack!

In recent weeks, my good hack has been Guile's compiler and virtual machine. It's almost regression-free, and we're looking for a merge to master within a few weeks.

Things are looking really good. Compiled code conses significantly less than the evaluator, loads quickly, runs quickly of course, and on top of that has all kinds of fun features. Recently we made metadata have no cost, as we write it after the program text of compiled procedures, which are normally just mmap'd from disk. (I stole this idea from a paper on Self.)

In addition to our tower of language compilers, I recently added a tower of decompilers: from value, to objcode, to bytecode, to assembly... with the possibility of adding future decompilers, potentially going all the way back to Scheme itself. We have all of the debugging information to do it nicely.

However, there are still some regressions. Probably the biggest one is that GOOPS, Guile's object system, actually loads up more slowly with the VM than with the evaluator. So I spent last week giving GOOPS a closer look.

the hunt begins

Turns out, the slowness is in the compiler. But why should the compiler be running at runtime, you ask? Well, it's the dynamic recompilation stuff I spoke of before.

GOOPS compiles implementations of methods for each set of types that it sees at runtime, which is pretty neat. The PIC paper by Hölzle, Chambers, and Ungar describes the advantages of this approach:

The presence of PIC-based [runtime] type information fundamentally alters the nature of optimization of dynamically-typed object-oriented languages. In “traditional” systems such as the current SELF compiler, type information is scarce, and consequently the compiler is designed to make the best possible use of the type information. This effort is expensive both in terms of compile time and compiled code space, since the heuristics in the compiler are tuned to spend time and space if it helps extract or preserve type information. In contrast, a PIC-based recompiling system has a veritable wealth of type information: every message has a set of likely receiver types associated with it derived from the previously compiled version’s PICs. The compiler’s heuristics and perhaps even its fundamental design should be reconsidered once the information in PICs becomes available...

(Thanks to Keith for pointing me to that paper.)

Anyway, recompilation was slow. So I then started to look at exactly why compilation was slow. I had callgrind, which is good, but doesn't give you enough information about *why* you are in those specific C procedures -- it's a C profile, not a Scheme profile. I had statprof, which is good, but doesn't give you enough information about the specific calltrees.

chartprof

So what to do? Well, since statistical profilers already have to walk the stacks, it was a simple matter to just make statprof squirrel away the full stacks as they were sampled. Then, after the profiling run is done, you can process those stacks into call trees, and visualize that.

But how to visualize the call trees? I had some basic ideas of what I wanted, but no tool that I knew of that could present the information easily. But what I did have was an excellent toolbox: guile-cairo, and guile itself. So voilà chartprof:

Chartprof takes full call trees and produces a representation of where the time is spent. The cascading red part represents the control flow, as nested procedure invocations, and the numbers inside the red part indicate the cumulative time percentage spent in those procedure calls. The numbers out of the red part indicate "self time", when the sampler caught program execution in that procedure instead of in one of its call children.
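The stack-to-tree step can be sketched in a few lines. This is not chartprof's code: the node layout, the procedure names, and the idea of representing a sampled stack as a list of symbols (outermost frame first) are all assumptions made for the example.

```scheme
;; A node is (name cumulative self children); children is a list of nodes.
(define (make-node name) (list name 0 0 '()))
(define (node-name n) (car n))
(define (node-cumulative n) (cadr n))
(define (node-self n) (caddr n))
(define (node-children n) (cadddr n))

;; Record one sample under NODE. STACK holds the frames below NODE; if it
;; is empty, the sampler caught execution in NODE itself (self time).
(define (add-sample node stack)
  (let ((name (node-name node))
        (cum (+ 1 (node-cumulative node))))
    (if (null? stack)
        (list name cum (+ 1 (node-self node)) (node-children node))
        (list name cum (node-self node)
              (update-child (node-children node) stack)))))

;; Find or create the child named by the first frame, and recurse into it.
(define (update-child children stack)
  (cond ((null? children)
         (list (add-sample (make-node (car stack)) (cdr stack))))
        ((eq? (node-name (car children)) (car stack))
         (cons (add-sample (car children) (cdr stack)) (cdr children)))
        (else
         (cons (car children) (update-child (cdr children) stack)))))

(define (stacks->call-tree stacks)
  (let lp ((tree (make-node 'root)) (stacks stacks))
    (if (null? stacks)
        tree
        (lp (add-sample tree (car stacks)) (cdr stacks)))))

(stacks->call-tree
 '((main compile match)
   (main compile match)
   (main compile)
   (main load)))
;; => (root 4 0 ((main 4 0 ((compile 3 1 ((match 2 2 ()))) (load 1 1 ())))))
```

Dividing each cumulative or self count by the root's cumulative count gives the percentages: in this made-up run, compile accounts for 3 of 4 samples cumulatively, and 1 of 4 as self time.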

If you click on the thumbnail on the left, you can download the whole thing. It's big: 1.2 megabytes. There's lots of information in there, is the thing. I should figure out how to prune that information, if that can be done usefully.

I drew horizontal lines at any call that did not always dispatch to exactly one subcall. It's interesting, you do want to line up procedures and their call sites, but you don't want too many lines. This way seems to be a good compromise, though of course it's not the last word.

analysis

It seems that the culprit is a bit unfortunate. GOOPS, when it loads, enables extensibility in about 200 of Guile's primitive procedures, e.g. equal? and for-each, which allows those procedures to dispatch to methods of generic functions. Unfortunately, this slows down pattern matching in the compilers, as the pattern matcher uses equal?, and that ends up calling out to Scheme just to see if something is equal? to a symbol... badness, that.

The equal? is particularly egregious, as a call to scm_equal_p will do a number of built-in equality checks, then if they all fail it will dispatch to the equal? generic, which dispatches to a method that calls eqv?, which does some built-in checks, then dispatches to an eqv? generic, which finally returns #f just to say that 'foo is not the same as 'bar, something that should be a quick pointer comparison.
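To illustrate the "quick pointer comparison" point: symbols are interned, so eq? is enough to compare them, and a matcher could check for that case before reaching for equal?. The helper below is purely hypothetical, one possible direction rather than anything the pattern matcher or Guile actually does.

```scheme
;; Hypothetical literal check: symbols compare by identity; everything
;; else falls back to structural equality.
(define (literal-match? pattern datum)
  (if (symbol? pattern)
      (eq? pattern datum)
      (equal? pattern datum)))

(literal-match? 'foo 'bar)
;; => #f
(literal-match? 'foo 'foo)
;; => #t
(literal-match? '(1 2) (list 1 2))
;; => #t
```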

So I'm not exactly sure what I'm going to do to fix this, but at least now I know what to think about. I would have had no clue that it was the pattern matcher if it weren't for the graphical visualization that chartprof gave me. So yay for turning optimization into a tools problem.

code

The code is in (charting prof) from git guile-charting, which in turn needs guile-lib from git.

I'd be interested in hearing feedback about the visualizations, particularly if people have other ideas about how to visualize call graphs. I also have some other information that I could present somehow: the arguments to the procedure applications, and the source locations of the call sites.

Happy hacking!