wingolog: a mostly dorky weblog by Andy Wingo

class redefinition in guile
9 November 2009

Yes hello hello!

Long-time readers will perhaps recall this diagram:


figure zero: things as they were.

It comes from an article describing how Guile represents its objects in memory, with particular concern for Guile-GNOME.

I was hacking on this code recently, and realized that this representation was not as good as it could be. Our switch to the BDW garbage collector lets us be more flexible with the size of our allocations, and so we can actually allocate the "slots" of an object inline with the object itself:


figure one: things how maybe they could have been.

Alas, during the hack, I discovered a stumbling block: that this representation doesn't allow for classes to be redefined.

redefine a data type, what?

Yes, Guile's object-oriented programming system (GOOPS) allows you to redefine the types of your data. It's OK! CLOS lets you do this too; it's an old tradition. Redefining a class at runtime allows you to develop by incremental changes, without restarting your program.
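
To make that concrete, here is a minimal sketch of what redefinition looks like to the user (the <point> class and its slots are invented for the example):

(use-modules (oop goops))

;; a toy class
(define-class <point> ()
  (x #:init-keyword #:x #:accessor point-x))

(define p (make <point> #:x 1))

;; redefine <point> with an extra slot; existing instances like p
;; migrate lazily, the next time they are touched
(define-class <point> ()
  (x #:init-keyword #:x #:accessor point-x)
  (y #:init-value 0 #:accessor point-y))

(point-y p) ; => 0, the new slot picks up its #:init-value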

Of course, once you change a class, its instances probably need to change too, typically by reallocating their slots. So we have to reintroduce the indirection -- but allowing for locality in the normal, non-redefined case. Like this:


figure two: things how they are, almost.

So updating an instance is as simple as swapping a pointer!

Almost.

Well, not really. This is really something that's unique to Lisp, as far as I can tell, and not very widely-known in the programming community, and hey, I didn't completely understand it -- so man, do I have a topic for a blog or what.

step one: make a hole in the box

The way redefinition works is that first you make a new class, then you magically swap the new for the old, then instances lazily update -- as they are accessed, they check that their class is still valid, and if not update themselves. It's involved, yo, so I made a bunch of pictures.


figure three: class redefinition begins with defining a new class.

So yeah, figure three shows the new class, lying in wait beside the old one. Then comes the magic:


figure four: same identity, different state.

What just happened here? Well we just swapped the vtable and data pointers in the old and new classes. For all practical purposes, the old class is the new class, and vice versa. All purposes except one, that is: eq?. The old class maintains its identity, so that any code that references it, in a hash table for example, will see the same object, but with new state.

The class' identity is the same, but its state has changed. That's the key thing to notice here.
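
Here is a small REPL sketch of that identity-preserving swap (a made-up <thing> class, and assuming GOOPS is loaded as above):

(define-class <thing> () (a #:init-value 1))
(define saved <thing>)   ; hold on to the class object

(define-class <thing> () (a #:init-value 1) (b #:init-value 2))

(eq? saved <thing>) ; => #t -- same object, new state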

Now we mark the old class's data as being out of date, and the next time its instances check their class... what? Here we reach another stumbling block. The old class already has new state, so it is already fresh -- meaning that the instance will think nothing is wrong. It could be that the instance was allocated when its class declared two slots, but now the class says that instances have three slots. Badness, this; badness.

So what really needs to happen is for instances to point not to the identity of their classes, but to the state of their classes. In practice this means pointing directly to their slots. This is actually an efficiency win, as it removes an indirection for most use cases. Like so:


figure five: instances actually point to class state, not class identity.

As we see in the figure, a well-known slot in the class holds the redefinition information -- normally unset, but if the class is invalidated, it will allow the instance to know exactly which version of the class it is changing from and to.


figure six: new equilibrium.

And finally, figure six shows the new state of affairs -- in which slot access has been redirected for all redefined classes, and all of their instances, transitively.
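
In pseudo-Scheme, the check performed on slot access looks something like this; the helpers (instance-class-state, state-invalidated?, migrate-instance!, %direct-slot-ref) are invented for the sketch and are not Guile's actual internals:

(define (checked-slot-ref instance index)
  (let ((state (instance-class-state instance)))   ; pointer to the class's slots
    (if (state-invalidated? state)                 ; class was redefined?
        (migrate-instance! instance))              ; reallocate under the new layout
    (%direct-slot-ref instance index)))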

efficiency

All in all, this is quite OK efficiency-wise. Instance data is normally local, and class data is one indirection away. A redefined instance will have nonlocal data, but hey, not much you can do otherwise, without a copying collector.

There is one efficiency hack worth mentioning. Accessors, discussed in an earlier article, don't need to check and see if their class is up to date or not. This is because they are removed from the old class and re-added to the new one as part of the redefinition machinery.

summary

Redefinition is complicated, but pretty neat.

really, that's the summary?

Yes.

object closure and the negative specification
22 April 2008

Guile-GNOME was the first object-oriented framework that I had ever worked with in Scheme. I came to it with all kinds of bogus ideas, mostly inherited from my C to Python formational trajectory. I'd like to discuss one of those today: the object closure. That is, if an object is code bound up with data, how does the code have access to data?

In C++, object closure is a non-problem. If you have an object, w, and you want to access some data associated with it, you dereference the widget structure to reach the member that you need:

char *str = w->name;

Since the compiler knows the type of w, it knows the exact layout of the memory pointed to by w. The ->name dereference compiles into a memory fetch from a fixed offset from the widget pointer.

In contrast, data access in Python is computationally expensive. A simple expression like w.name must perform the following steps:

  1. look up the class of w (call it W)

  2. loop through all of the classes in W's "method resolution order" --- an ordered set of all of W's superclasses --- to see if the class defines a "descriptor" for this property. In some cases, this descriptor might be called to get the value for name.

  3. find the "dictionary", a hash table, associated with w. If the dictionary contains a value for name, return that.

  4. otherwise if there was a descriptor, call the descriptor to see what to do.

This process is run every time you see a . between two letters in python. OK, so getattr does have an opcode to itself in CPython's VM instruction set, and the above code is implemented mostly in C (see Objects/object.c:PyObject_GenericGetAttr). But that's about as fast as it can possibly get, because the structure of the Python language definition prohibits any implementation of Python from ever having enough information to implement the direct memory access that is possible in C++.

But, you claim, that's just what you get when you program in a dynamic language! What do you want to do, go back to C++?

straw man says nay

"First, do no harm", said a practitioner of another profession. Fundamental data structures should be chosen in such a way that needed optimizations are possible. Constructs such as Python's namespaces-as-dicts actively work against important optimizations, effectively putting an upper bound on how fast code can run.

So for example in the case of the object closure, if we are to permit direct memory access, we should allow data to be allocated at a fixed offset into the object's memory area.

Then, the basic language constructs that associate names with values should be provided in such a way that the compiler can determine what the offset is for each data element.

In dynamic languages, types and methods are defined and redefined at runtime. New object layouts come into being, and methods which operated on layouts of one type will see objects of new types as the program evolves. All of this means that to maintain this direct-access characteristic, the compiler must be present at runtime as well.

So, in my silly w.name example, there are two cases: one, in which the getattr method is seeing the combination of the class W and the slot name for the first time, and one in which we have seen this combination already. In the first case, the compiler runs, associating this particular combination of types with a new procedure, newly compiled to perform the direct access corresponding to where the name slot is allocated in instances of type W. Once this association is established, or looked up as in the second case, we jump into the compiled access procedure.
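
Here is a sketch of that two-case logic in Scheme, with invented helpers (slot-offset, direct-ref) standing in for whatever the runtime actually provides:

(define accessor-cache (make-hash-table))

(define (getattr obj slot-name)
  (let* ((class (class-of obj))
         (key (cons class slot-name))
         (access
          (or (hash-ref accessor-cache key)
              ;; first time we see this class/slot combination:
              ;; "compile" a direct-access procedure and cache it
              (let* ((offset (slot-offset class slot-name))
                     (proc (lambda (o) (direct-ref o offset))))
                (hash-set! accessor-cache key proc)
                proc))))
    (access obj)))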

Note that at this point, we haven't specified what the relationship is between layouts and subclassing. We could further specify that subclasses cannot alter the layout of slots defined by superclasses. Or, we could just leave it as it is, which is what Guile does.

Guile, you say? That slow, interpreted Scheme implementation? Well yes, I recently realized (read: was told) that Guile in fact implements this exact algorithm for dispatching its generic functions. Slot access does indeed compile down to direct access, as far as can be done in a semi-interpreted Scheme, anyway. The equivalent of the __mro__ traversal mentioned in the above description of python's getattr, which would be performed by slot-ref, is compiled out in Guile's slot accessor generics.

In fact, as a theoretical aside, since Guile dispatches lazily on the exact types of the arguments given to generic functions (and not just the specializer types declared on the individual methods), it can lazily compile methods knowing exactly what types they are operating on, with all the possibilities for direct access and avoidance of typechecking that that entails. But this optimization has not yet entered the realm of practice.

words on concision

Python did get one thing right, however: an object's code accesses its data via a single character.

It is generally true that we tend to believe that the expense of a programming construct is proportional to the amount of writer's cramp that it causes us (by "belief" I mean here an unconscious tendency rather than a fervent conviction). Indeed, this is not a bad psychological principle for language designers to keep in mind. We think of addition as cheap partly because we can notate it with a single character: "+". Even if we believe that a construct is expensive, we will often prefer it to a cheaper one if it will cut our writing effort in half.

Guy Steele, Debunking the 'Expensive Procedure Call' Myth, or, Procedure Call Implementations Considered Harmful, or, Lambda: The Ultimate GOTO (p.9)

Since starting with Guile, over 5 years ago now, I've struggled a lot with object-oriented notation. The problem has been to achieve that kind of Python-like concision while maintaining schemeliness. I started with the procedural slot access procedures:

(slot-ref w 'name)
(slot-set! w 'name "newname")

But these procedures are ugly and verbose. Besides that, since they are not implemented as generic functions, they prevent the lazy compilation mentioned above.

GOOPS, Guile's object system, does allow you to define slot accessor generic functions. So when you define the class, you pass the #:accessor keyword inside the slot definition:

(define-class <foo> ()
  (bar #:init-keyword #:bar #:accessor bar))

(define x (make <foo> #:bar 3))
(bar x) => 3
(set! (bar x) 4)

Now for me, typographically, this is pretty good. In addition, it's compilable, as mentioned above, and it's mappable: one can (map bar list-of-x), which compares favorably to the Python equivalent, [x.bar for x in list_of_x].

My problem with this solution, however, is its interaction with namespaces and modules. Suppose that your module provides the type, <foo>, or, more to the point, <gtk-window>. If <gtk-window> has 54 slots, and you define accessors for all of those slots, you have to export 54 more symbols as part of your module's interface.

This heavy "namespace footprint" is partly psychological, and partly real.

It is "only" psychological inasmuch as methods of generic functions do not "occupy" a whole name; they only specify what happens when a procedure is called with particular types of arguments. Thus, if opacity is an accessor, it doesn't occlude other procedures named opacity, it just specifies what happens when you call (opacity x) for certain types of x. It does conflict with other types of interface exports however (variables, classes, ...), although classes have their own <typographic-convention>. *Global-variables* do as well, and other kinds of exports are not common. So in theory the footprint is small.

On the other hand, there are real impacts to reading code written in this style. You read the code and think, "where does bar come from?" This mental computation is accompanied with machine computation. First, because in a Scheme like Guile that starts from scratch every time it's run, the accessor procedures have to be allocated and initialized every time the program runs. (The alternatives would be an emacs-like dump procedure, or R6RS-like separate module compilation.) Second, because the (drastically) increased number of names in the global namespace slows down name resolution.

lexical accessors

Recently, I came upon a compromise solution that works well for me: the with-accessors macro. For example, to scale the opacity of a window by a ratio, you could do it like this:

(define (scale-opacity w ratio)
  (with-accessors (opacity)
    (set! (opacity w)
          (* (opacity w) ratio))))

This way you have all of the benefits of accessors, with the added benefit that you (and the compiler) can see lexically where the opacity binding comes from.

Well, almost all of the benefits, anyway: for various reasons, for this construct to be implemented with accessors, Guile would need to support subclasses of generic functions, which it does not yet. But the user-level code is correct.

Note that opacity works on instances of any type that has an opacity slot, not just windows.
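
For example, with a made-up <layer> class, the same lexical accessor works on a non-window:

(define-class <layer> ()
  (opacity #:init-value 1.0))

(with-accessors (opacity)
  (opacity (make <layer>))) ; => 1.0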

Also note that the fact that we allow slots to be allocated in the object's memory area does not prohibit other slot allocations. In the case of <gtk-window>, the getters and setters for the opacity slot actually manipulate the opacity GObject property. As you would expect, no memory is allocated for the slot in the Scheme wrapper.

For posterity, here is a defmacro-style definition of with-accessors, for Guile:

(define-macro (with-accessors names . body)
  `(let (,@(map (lambda (name)
                  `(,name ,(make-procedure-with-setter
                            (lambda (x) (slot-ref x name))
                            (lambda (x y) (slot-set! x name y)))))
                names))
     ,@body))

final notes

Interacting with a system with a meta-object protocol has been a real eye-opener for me. Especially interesting has been the interplay between the specification, which specifies the affordances of the object system, and the largely unwritten "negative specification", which is the set of optimizations that the specification hopes to preserve. Interested readers may want to check out Gregor Kiczales' work on meta-object protocols, the canonical work being his "The Art of the Metaobject Protocol". All of Kiczales' work is beautiful, except the aspect-oriented programming stuff.

For completeness, I should mention the java-dot notation, which has been adopted by a number of lispy languages targeting the JVM or the CLR. Although I guess it meshes well with the underlying library systems, I find it to be ugly and non-Schemey.

And regarding Python, lest I be accused of ignoring __slots__: the getattr lookup process described is the same, even if your class defines __slots__. The __slots__ case is handled by descriptors in step 2. This is specified in the language definition. If slots were not implemented using descriptors, then you would still have to do the search to see if there were descriptors, although some version of the lazy compilation technique could apply.