wingolog

an admission

6 May 2008 9:50 PM (carl worth | cairo)

I subscribe to cairo-list just to read what Carl Worth has to say. Carl is smart, fair, level-headed, perceptive, very sympathetic to people that come with problems, yet opinionated as well. It's refreshing, inspiring, and humbling at the same time.

This blathering inspired by one random post, although it could have been any other post just as well.

catching memory leaks with valgrind's massif

5 May 2008 5:07 PM (valgrind | massif | memory leaks | mtrace | gstreamer | ffmpeg)

One of my first tasks at Oblong was to migrate their code for playing video from some very tricky, threaded ffmpeg + portaudio code to using GStreamer. The playback interface is fairly standard, but thorough: seeks to time, segment seeks, variable speed and reverse playback, frame stepping, etc. There were some twists: we do colorspace conversion on the GPU, and there's a strange concept of "masks", which is useful for operating on rotoscoped video, and there's integration with the GL main loop.

But anyway, I felt finished with all of that a while ago. The only problem was a lingering memory leak, especially egregious in the context of the art installation, which has yet to switch to my code.

So it was with a queasy, helpless feeling that I sat down and tried to systematize the problem, come up with a test case, and see if I could track down where the leak was. I tried code inspection at first, and I proved my code correct. (Foreshadowing, that.) I at least narrowed down the situations under which it occured. I then despaired for a while, before I hit on the way to make memory leak detection fun: turn it into a tools problem. Now instead of finding the leak, all I needed to do was to find the leak detector!

So I checked out valgrind from CVS; it crashed on me. Then I decided to see if libc had anything to offer me; indeed it does, mtrace. But alack, deadlocks. I even went so far as to include mtrace in my code, and applied Jambor's patch from the bug report on my sources, but lost because the ELF symbol resolution is intertwingled with libc's build system.

So back to valgrind, this time the 3.2.3 version packaged with Fedora, and lo and behold, exhibit A:

houston, we have a leak

My test is adding a video, waiting a while, removing the video, waiting again, then repeating. You can see the obvious video-playing versus video-removed phases.

Fortunately, you can also see the leak, and where it is: in the green part, corresponding to something that's calling g_try_malloc. This was comforting to find. It could have been something involving GL contexts or whatnot, and I'm using the babymunching nvidia drivers. So g_try_malloc was where it was coming from. But what was calling g_try_malloc?

For that, you have to dive into the textual output produced by massif. And sure enough, following things back far enough, you find that it is a GStreamer video buffer:

Context accounted for  7.2% of measured spacetime
  0x5F27E60: g_try_malloc (gmem.c:196)
  0x52E1487: gst_buffer_try_new_and_alloc (gstbuffer.c:359)
  0x530367B: gst_pad_alloc_buffer_full (gstpad.c:2702)
  0x53039FA: gst_pad_alloc_buffer (gstpad.c:2823)
  0xB9C11BF: gst_queue_bufferalloc (gstqueue.c:502)
  0x53034C0: gst_pad_alloc_buffer_full (gstpad.c:2668)
  0x53039E6: gst_pad_alloc_buffer_and_set_caps (gstpad.c:2850)
  0xBBECB4F: gst_base_transform_buffer_alloc (gstbasetransform.c:1112)
  0x53034C0: gst_pad_alloc_buffer_full (gstpad.c:2668)
  0x53039E6: gst_pad_alloc_buffer_and_set_caps (gstpad.c:2850)
  0xBBECB4F: gst_base_transform_buffer_alloc (gstbasetransform.c:1112)
  0x53034C0: gst_pad_alloc_buffer_full (gstpad.c:2668)
  0x53039FA: gst_pad_alloc_buffer (gstpad.c:2823)
  0x52F6C68: gst_proxy_pad_do_bufferalloc (gstghostpad.c:182)
  0x53034C0: gst_pad_alloc_buffer_full (gstpad.c:2668)
  0x53039E6: gst_pad_alloc_buffer_and_set_caps (gstpad.c:2850)
  0xC452722: alloc_output_buffer (gstffmpegdec.c:764)
  0xC454504: gst_ffmpegdec_frame (gstffmpegdec.c:1331)
  0xC45635D: gst_ffmpegdec_chain (gstffmpegdec.c:2236)
  0x5303C06: gst_pad_chain_unchecked (gstpad.c:3527)
  0x530421C: gst_pad_push (gstpad.c:3695)
  0xB9C0A3E: gst_queue_loop (gstqueue.c:1024)
  0x531E418: gst_task_func (gsttask.c:192)
  0x5F4338E: g_thread_pool_thread_proxy (gthreadpool.c:265)
  0x5F41C4F: g_thread_create_proxy (gthread.c:635)

For this level of information, you have to run massif with special options. I ran my test like this:

G_SLICE=always-malloc valgrind --tool=massif --depth=30 ./.libs/lt-vid-player

So now that I knew what was leaking, I decided to run with fewer, longer cycles to see the allocation characteristics were. And thus, exhibit B:

bucket effect -- watch the cyan trough fill up

You can see that after the video was removed from the scene the cyan part representing g_try_malloc allocation does not drop down to zero; indeed it starts to "fill up the trough", getting larger at each iteration.

Of course at this point I realized that I probably wasn't freeing the buffer that I kept as a queue between the GStreamer and GL threads on teardown. Indeed, indeed. Two lines later and we have the much more agreeable long-term plot:

memleak fixed

Moral of the story: "proof of correctness" is not proof of correctness.

Valgrind turned out to be much more useful to me in this instance than it was when I looked at before, when hacking Python. But again, the CVS/3.3 version was of no use, yet. Since then, 3.3 does indeed do graphs again, but in ascii. As a palliative, the textual output appears to have improved. Still, ascii graphs?

a preview look at quagmire

28 April 2008 3:57 PM (autotools | automake | quagmire | tromey | build systems | workflow)

We've been using git at work for a few months now, and finally settling down reasonable workflow. As reasonable as anything might be with a 9 hour difference between me and Los Angeles, and for a cross-platform project that uses autotools.

Regarding the latter, the autotools have been a constant source of pain for my co-workers. I don't mind it myself, but I feel bad because I had a hand in inflicting it on them. So now when hacking on a bit of test code, I decided to take a look at Quagmire, Tom Tromey's experiment in replacing automake, libtool, and (to a large degree) autoconf with portable GNU make.

With Quagmire, all of the autofoo is replaced with one makefile, Quagmire.mk. Its format is much like that of a Makefile.am. A simple project with a few source files, that depends on pkg-config, might look like this example, taken from the Quagmire distribution:

# this is Quagmire.mk, in the top-level source directory
# -*- makefile-gmake -*-

bin_PROGRAMS = ekeyring

lib_LIBRARIES = libzardoz.a

libzardoz.a_SOURCES = zardoz.c

ekeyring_SOURCES = ekeyring.c something.h
ekeyring_PACKAGES = gnome-keyring-1
ekeyring_CONFIG_HEADERS = config.h

config.h_FUNCTIONS = strcmp strdup
config.h_HEADERS = string.h nothing.h strings.h sys/types.h

data_SCRIPTS = ekeyring.c
data_DATA = something.h

EXTRA_DIST = README.simple

The syntax will be familiar to anyone who has used automake. It is further described in the README.

Then you add the following configure.ac to the same directory:

AC_INIT(ekeyring, 1.5.1)
AC_CONFIG_SRCDIR(ekeyring.c)
AC_ARG_PROGRAM
AM_QUAGMIRE
AC_OUTPUT

And copy quagmire.m4 (from http://quagmire.googlecode.com/svn/trunk/m4/quagmire.m4) to aclocal.m4, to provide the AM_QUAGMIRE definition. Then import the quagmire files into your source tree by running:

svn checkout http://quagmire.googlecode.com/svn/trunk/src quagmire

At this point you run autoconf, in order to produce configure, and then ./configure. It doesn't configure much. Most of the actual configure checks are run at make-time.

For example the ekeyring_PACKAGES line specifies the pkg-config files that are needed to compile ekeyring. Those checks are made as dependencies of the ekeyring target.

status

Quagmire is promising, and is quite fast, as is appropriate for an underfeatured project. It's not there yet though. It doesn't yet understand differing shared library extensions (.so versus .dylib), it doesn't (I don't think) know about static linking like libtool's .la files do, and I haven't gotten programs that depend on in-tree dynamic or static libraries to build yet.

And yet, it does a proper distcheck. It integrates with pkg-config. The makefiles are understandable -- I can get the whole process in my head. Tom Tromey is either exactly the person to fix autotools, considering that he wrote automake, or exactly the wrong person, depending on your religion. And GNU make is actually quite an OK language for expressing functional dependencies.

For now I'm going to hold off on trying to use Quagmire for a while at work, given that I need to do some static linking stuff on the mac. But it's an idea that will be in the back of my mind.

spamming the world into submission

25 April 2008 7:02 PM (guile | gnome | version bump | angry frenchman)

This morning's cafe-hack has brought forth a right jewel, Guile-GNOME 2.15.98. Since simplifying the base GObject wrapper, things seem to have gone fairly well. So with that bit out of the way, I decided we're ready to go stable, finally. The next release will be 2.16.0, which wraps the GNOME developer's platform at version 2.16: GTK+ 2.10, GLib 2.10, Pango 1.14, etc. More about that in a couple weeks.

Part of this release was to bump the "major version" already, in order to shake out any bugs in that mechanism. I mentioned this to Edward today at lunch, and he said that it makes two major version bumps today. Happy birthday Edward!

metablog

23 April 2008 12:54 PM (meta | tekuti | scheme)

Well, it's been a little while since I wrote about tekuti, my Scheme-powered, git-backed blog software. Time for an update.

features

I added a global tag cloud, in addition to the abridged cloud on the main page. Clicking on a tag takes you to a time-ordered list of posts having that tag, with a cloud of related tags. The post list could use some improvement, mostly because my titles are nonsensical.

I also implemented full-text search, which uses git grep under the hood. Amusing.

Also amusing is the "related posts" list, which shows up individual post pages. It's calculated as the set of posts which share the most tags with the post in question. The indexes that are automatically rebuilt when the master ref changes makes this a relatively cheap set to compute.

Also also: some artificial intelligence anti spam foo. Ha!

This stuff is actually fun to hack on, and is self-contained -- I'm almost spending more time writing about the features than I did implementing them.

documentation

I fleshed out tekuti's web page today, giving reasonably detailed install, deployment, and hacking instructions. It even includes a description on how to migrate from wordpress. I'd appreciate any comments that folks might have, probably better on this post than via email.

Stop the madness: uninstall PHP from your servers. We can do better than that!

object closure and the negative specification

22 April 2008 9:58 PM (scheme | goops | kiczales | mop | guile | python | object closure)

Guile-GNOME was the first object-oriented framework that I had ever worked with in Scheme. I came to it with all kinds of bogus ideas, mostly inherited from my C to Python formational trajectory. I'd like to discuss one of those today: the object closure. That is, if an object is code bound up with data, how does the code have access to data?

In C++, object closure is a non-problem. If you have an object, w, and you want to access some data associated with it, you dereference the widget structure to reach the member that you need:

char *str = w->name;

Since the compiler knows the type of w, it knows the exact layout of the memory pointed to by w. The ->name dereference compiles into a memory fetch from a fixed offset from the widget pointer.

In constrast, data access in Python is computationally expensive. A simple expression like w.name must perform the following steps:

  1. look up the class of w (call it W)

  2. loop through all of the classes in W's "method resolution order" --- an ordered set of all of W's superclasses --- to see if the class defines a "descriptor" for this property. In some cases, this descriptor might be called to get the value for name.

  3. find the "dictionary", a hash table, associated with w. If the dictionary contains a value for name, return that.

  4. otherwise if there was a descriptor, call the descriptor to see what to do.

This process is run every time you see a . between two letters in python. OK, so getattr does have an opcode to itself in CPython's VM instruction set, and the above code is implemented mostly in C (see Objects/object.c:PyObject_GenericGetAttr). But that's about as fast as it can possibly get, because the structure of the Python language definition prohibits any implementation of Python from ever having enough information to implement the direct memory access that is possible in C++.

But, you claim, that's just what you get when you program in a dynamic language! What do you want to do, go back to C++?

straw man says nay

"First, do no harm", said a practitioner of another profession. Fundamental data structures should be chosen in such a way that needed optimizations are possible. Constructs such as Python's namespaces-as-dicts actively work against important optimizations, effectively putting an upper bound on how fast code can run.

So for example in the case of the object closure, if we are to permit direct memory access, we should allow data to be allocated at a fixed offset into the object's memory area.

Then, the basic language constructs that associate names with values should be provided in such a way that the compiler can determine what the offset is for each data element.

In dynamic languages, types and methods are defined and redefined at runtime. New object layouts come into being, and methods which operated on layouts of one type will see objects of new types as the program evolves. All of this means that to maintain this direct-access characteristic, the compiler must be present at runtime as well.

So, in my silly w.name example, there are two cases: one, in which the getattr method is seeing the combination of the class W and the slot name for the first time, and one in which we have seen this combination already. In the first case, the compiler runs, associating this particular combination of types with a new procedure, newly compiled to perform the direct access corresponding to where the name slot is allocated in instances of type W. Once this association is established, or looked up as in the second case, we jump into the compiled access procedure.

Note that at this point, we haven't specified what the relationship is between layouts and subclassing. We could further specify that subclasses cannot alter the layout of slots defined by superclasses. Or, we could just leave it as it is, which is what Guile does.

Guile, you say? That slow, interpreted Scheme implementation? Well yes, I recently realized (read: was told) that Guile in fact implements this exact algorithm for dispatching its generic functions. Slot access does indeed compile down to direct access, as far as can be done in a semi-interpreted Scheme, anyway. The equivalent of the __mro__ traversal mentioned in the above description of python's getattr, which would be performed by slot-ref, is compiled out in Guile's slot accessor generics.

In fact, as a theoretical aside, since Guile dispatches lazily on the exact types of the arguments given to generic functions (and not just the specializer types declared on the individual methods), it can lazily compile methods knowing exactly what types they are operating on, with all the possiblities for direct access and avoidance of typechecking that that entails. But this optimization has not yet entered the realm of practice.

words on concision

Python did get one thing right, however: objects' code access their data via a single character.

It is generally true that we tend to believe that the expense of a programming construct is proportional to the amount of writer's cramp that it causes us (by "belief" I mean here an unconscious tendency rather than a fervent conviction). Indeed, this is not a bad psychological principle for language designers to keep in mind. We think of addition as cheap partly because we can notate it with a single character: "+". Even if we believe that a construct is expensive, we will often prefer it to a cheaper one if it will cut our writing effort in half.

Guy Steele, Debunking the 'Expensive Procedure Call' Myth, or, Procedure Call Implementations Considered Harmful, or, Lambda: The Ultimate GOTO (p.9)

Since starting with Guile, over 5 years ago now, I've struggled a lot with object-oriented notation. The problem has been to achieve that kind of Python-like concision while maintaining schemeliness. I started with the procedural slot access procedures:

(slot-ref w 'name)
(slot-set! w 'name "newname")

But these procedures are ugly and verbose. Besides that, since they are not implemented as generic functions, they prevent the lazy compilation mentioned above.

GOOPS, Guile's object system, does allow you to define slot accessor generic functions. So when you define the class, you pass the #:accessor keyword inside the slot definition:

(define-class <foo> ()
  (bar #:init-keyword #:bar #:accessor bar))

(define x (make <foo> #:bar 3))
(bar x) => 3
(set! (bar x) 4)

Now for me, typographically, this is pretty good. In addition, it's compilable, as mentioned above, and it's mappable: one can (map bar list-of-x), which compares favorably to the Python equivalent, [x.name for x in list_of_x].

My problem with this solution, however, is its interaction with namespaces and modules. Suppose that your module provides the type, <foo>, or, more to the point, <gtk-window>. If <gtk-window> has 54 slots, and you define accessors for all of those slots, you have to export 54 more symbols as part of your module's interface.

This heavy "namespace footprint" is partly psychological, and partly real.

It is "only" psychological inasmuch as methods of generic functions do not "occupy" a whole name; they only specify what happens when a procedure is called with particular types of arguments. Thus, if opacity is an accessor, it doesn't occlude other procedures named opacity, it just specifies what happens when you call (opacity x) for certain types of x. It does conflict with other types of interface exports however (variables, classes, ...), although classes have their own <typographic-convention>. *Global-variables* do as well, and other kinds of exports are not common. So in theory the footprint is small.

On the other hand, there are real impacts to reading code written in this style. You read the code and think, "where does bar come from?" This mental computation is accompanied with machine computation. First, because in a Scheme like Guile that starts from scratch every time it's run, the accessor procedures have to be allocated and initialized every time the program runs. (The alternatives would be an emacs-like dump procedure, or R6RS-like separate module compilation.) Second, because the (drastically) increased number of names in the global namespace slows down name resolution.

lexical accessors

Recently, I came upon a compromise solution that works well for me: the with-accessors macro. For example, to scale the opacity of a window by a ratio, you could do it like this:

(define (scale-opacity w ratio)
  (with-accessors (opacity)
    (set! (opacity w)
          (* (opacity w) ratio))))

This way you have all of the benefits of accessors, with the added benefit that you (and the compiler) can see lexically where the opacity binding comes from.

Well, almost all of the benefits, anyway: for various reasons, for this construct to be implemented with accessors, Guile would need to support subclasses of generic functions, which is does not yet. But the user-level code is correct.

Note that opacity works on instances of any type that has an opacity slot, not just windows.

Also note that the fact that we allow slots to be allocated in the object's memory area does not prohibit other slot allocations. In the case of <gtk-window>, the getters and setters for the opacity slot actually manipulate the opacity GObject property. As you would expect, no memory is allocated for the slot in the Scheme wrapper.

For posterity, here is a defmacro-style definition of with-accessors, for Guile:

(define-macro (with-accessors names . body)
  `(let (,@(map (lambda (name)
                  `(,name ,(make-procedure-with-setter
                            (lambda (x) (slot-ref x name))
                            (lambda (x y) (slot-set! x name y)))))
                names))
     ,@body))

final notes

Interacting with a system with a meta-object protocol has been a real eye-opener for me. Especially interesting has been the interplay between the specification, which specifies the affordances of the object system, and the largely unwritten "negative specification", which is the set of optimizations that the specification hopes to preserve. Interested readers may want to check out Gregor Kiczales' work on meta-object protocols, the canonical work being his "The Art of the Metaobject Protocol". All of Kiczales' work is beautiful, except the aspect-oriented programming stuff.

For completeness, I should mention the java-dot notation, which has been adopted by a number of lispy languages targetting the JVM or the CLR. Although I guess it meshes well with the underlying library systems, I find it to be ugly and non-Schemey.

And regarding Python, lest I be accused of ignoring __slots__: the getattr lookup process described is the same, even if your class defines __slots__. The __slots__ case is handled by descriptors in step 2. This is specified in the language definition. If slots were not implemented using descriptors, then you would still have to do the search to see if there were descriptors, although some version of the lazy compilation technique could apply.

readings, and expoundings thereon

12 April 2008 11:11 PM (mako | martin | democracy | anarchism | smalltalk | alan kay | lambda the ultimate)

the reading

Alan Kay's The Early History of Smalltalk.

New ideas go through stages of acceptance, both from within and without. From within, the sequence moves from "barely seeing" a pattern several times, then noting it but not perceiving its "cosmic" significance, then using it operationally in several areas, then comes a "grand rotation" in which the pattern becomes the center of a new way of thinking, and finally, it turns into the same kind of inflexible religion that it originally broke away from.

the expounding

The most interesting individual I ran into at last month's Campaign for Real Ale and Common Lisp meeting in LA was a Smalltalk hacker. I was showing off tekuti, boasting about how you could hack it from Emacs while it's running via GDS, the Guile debugging system, and the fellow scoffed, his beard moving up and down in laughter.

Smalltalk is a language, yes, he explained, but it's also a system: an integrated environment that you hack from the inside. The editor is not separated from the program via a process boundary: the objects of the program are immediately available to tinker with. So, for example, SLIME is but a pale echo of what is available to a Smalltalk hacker.

I admit I don't fully get this -- how do processes and threads enter into the picture? How can so much imperative state-modification scale to programming in the large? But what I do know is that especially in the young and naive field of computer science (ooh, a widget!), we forget more than we know. SLIME (and Guile's GDS) undoubtably does form a kind of message-passing, but it is necessarily limited, having been bolted on, not existing as a native linguistic construct.

Already now, I am unable to fully use the CPUs that I have at work. The programs I run are too primitive to make use of all of its cores. Processing capability is fast outracing the Von Neumann bottleneck, creating a "market" for different kinds of programs, ones that run in heterogenous, asynchronous environments -- distributed computing and NUMA and all of that.

Something new has to come, and the new thing will be built on the best strains of the past: ideas that cut with the grain of multicentered programming. These ideas have nothing to do with C++ or with pthreads.

This would be the point at which I should say that we have to mine the past to create a new future, but I am a lisp weenie, and so I hold that the past is the future, in some form. Also, regarding Smalltalk, Lambda, the ultimate political party: we lispers are too afraid of smalltalkers to call them our own. So it appears that the present will again slump into a future of an uneasy détente.

the reading

The Geek Shall Inherit the Earth: My Story of Unlearning, by Benjamin Mako Hill.

Before [taking a college course called "Cyberlaw"], I had only considered free software and my involvement with free software advocacy and development as a tool I might use to accomplish other goals and activities. Before this point I saw myself as writer, a programmer, or a computer scientist. At this point, I was exposed to people who were, first and foremost, activists, idealists, and advocates of a system that I understood. After this point, I allowed myself to be seen--by others and by myself--as an activist and a radical, first and foremost--on my terms and in a way that I felt completely comfortable with.

the expounding

Mako writes about his personal history, a story of his journey to and then with free software, and then to a larger conception of what it means to be a human being. His holistic approach to existence refreshes the soul. There are ties between what we do in the day, what we do in the night, and what happens in the world: to have that much-dream'd new world, we must change the practice of our days.

Also, an afterthought. As time goes by, I slip farther down Google's "wingo" search results. I was convinced this was a sign of the expanding internet, or a castigation for egotism, but today I was consoled a bit, seeing Mako, a much, much better known person to the internets, out of the top five for his name.

While I am not in a position to judge myself, certainly in Mako's case this is not a good thing: Mako's writings are much more interesting and important than the price of whatever stock is denoted by the MAKO ticker.

the reading

Democracy without elections, by Brian Martin (1995).

It should be a truism that elections empower the politicians and not the voters. Yet many social movements continually are drawn into electoral politics.

[Elections operate] to bring mass political activity into a manageable form: election campaigns and voting. People learn that they can participate: they are not totally excluded. They also learn the limits of participation. Voting occurs only occasionally, at times fixed by governments. Voting serves only to select leaders, not to directly decide policy. Finally, voting doesn't take passion into account: the vote of the indifferent or ill-informed voter counts just the same as that of the concerned and knowledgeable voter. Voting thus serves to tame political participation, making it a routine process that avoids mass uprisings. The expansion of suffrage helps to reduce the chance that a revolt by an oppressed or excluded group will be seen as justified; with the vote, it is easy for others to claim that they should have used 'orthodox channels.'

the expounding

The baldfaced fascism of the current American regime produces in many a sense of nostalgia for the benign Clinton regime, the one in which "the others" were consigned to opposition, and "we" controlled the presidency. The daily fire-and-motion given to us by the American news media machine, discussing the topics given to them by White House spokesmen, in White House terminology, makes the intellect so absorbed in reaction that constructive production is difficult, and impossible for many.

But let us not forget that even the conservative UNICEF estimates that 500,000 Iraqis, mostly children, died as a result of the Clinton-imposed Iraq sanctions. The real number is assuredly higher, and only with the most assiduous efforts is the Bush administration finally able to contemplate surpassing this number.

This is Genocide, by "both" parties.

Electing a "Democrat", of whatever color or gender, will not change this. No progressive change has ever resulted from the election of a so-called leftist politician: changes are the result of the grassroots action of organized people. Changes in the figureheads are a mere symptom.

I used to have the "unsophisticated" viewpoint described by Martin, in which I viewed both parties as fundamentally serving the cause of institutionalized violence, and therefore equal. Well while I might believe the former, the latter is not the case: strategic voting has effects. See for example the Communist party's abstention in the German elections in '33 or so, in which they considered the National Socialists to be equal to the Christian Democrats, and thus abstained with tragic results. The truth is that you always have to have your values with you, to see the choices that are available -- and, like Mako says, often they are not the choices given -- and to make your choices according to your values.

the listening

I adore Sonic Youth's "Washing Machine".

git: a transcriptional database

12 April 2008 7:11 PM (git | poetry | khayyam | translation | scheme | tekuti)

I had a thought yesterday while biking into town. Git is a transcriptional database -- it writes and writes and writes, and what we are left with is its transcript. I wouldn't call it a transactional database, since it has no rollback operator. It doesn't need one. Ref updates either succeed or fail. If they fail, well, write, write again:

The Moving Ref writes; and, having commit,
Moves on: cosmic Rays nor Zero-blit
Shall untrue its blobs, its trees Unbind,
Nor all your Pushes flip a Single bit.

It is perhaps not as beautiful as Fitzgerald's translation, but the bar was set quite high. To compensate, here is what is, to my knowledge, the first translation of Khayyam into Scheme:

(define (git-update-ref refname proc count)
  (let* ((ref (git-rev-parse refname))
         (commit (proc ref)))
    (cond
     ((zero? count) #f) ; failure
     ((false-if-git-error
         (git "update-ref" refname commit ref))
      commit)
     (else
      (git-update-ref (git-rev-parse refname) (1- count))))))

The rest of the text may be found in tekuti.

allocate, memory (part 2 of n)

11 April 2008 6:11 PM (guile | goops | metaobject protocol)

In my last installment, I started by saying something about a patch, then got lost in the vagaries of garbage collection. Hopefully in this writing product I can reign in the dangling pointers.

bit twiddling

If you are programming in an environment with a garbage collector, and you want to expose C resource to your managed environment, you will have to cooperate with the garbage collector. Specifically, you will need to know when an object becomes garbage, so that you can deallocate its resources.

In Guile, the historic way to do this is via a specific kind of boxed value, the "small object", or "SMOB". A SMOB is a double- or quad-word object whose first word is a tag designating the type of the object, and the rest is for C code to manipulate. SMOB types have to be registered with the Guile runtime, and have type-specific free, printing, and marking functions.

SMOBs are impoverished objects, however. There are only 8 bits in which to store the SMOB type, and they must be registered manually in C. It would be impossible to associate one SMOB type with each GType, for instance. So for Guile-GNOME, which is where I'm really getting with all of this, you have to wrap C objects on two levels: one generic SMOB for GTypeInstances, and one more object-oriented wrapper that exposes the GType.

This two-level wrapping is ugly, confusing, and bug-prone. Thinking about this, and looking at the (gnome gobject) module with an eye to a stable, supportable interface, I wondered: is there not some other way to get free() notifications from the garbage collector?

As you might guess, the answer is yes. But to fully explain, in my most verbose fashion, we'll have to take a look at how objects are represented in the Lisp family of languages. The following discussion is specific to Guile, but shares fundamental characteristics with Common Lisp and other standard Lisp systems.

objects in scheme

Guile's object system, GOOPS, is derived from TinyCLOS, via STklos. This includes a full meta-object protocol, so all details of e.g. instance, slot, and class allocation are fully extensible, while maintaining compilability. The following graphic describes the memory layout produced by the default allocation protocol:

click for the svg

Instances contain two important words, one to point their classes, and one to point to a data array. The data array is as long as the number of Scheme objects that need be associated with the instance: the set of slots associated with the object. Of course, other slot allocation strategies are possible, for example storing slot values in a hash table as Python does by default.

By way of illustration, the process of slot access, via (slot-ref foo 'bar) goes like this:

  1. Get the class of the object (dereferencing the vtable pointer)

  2. Find the allocation information for the slot

    • It is part of the class' "slot-definitions" slot

  3. If the slot is allocated in the data array, there is a fast-path array access

  4. Otherwise the slot-definition provides getter and setter procedures for the slot

In effect, the class tells you how to read the instance. The allocation lookup process can be optimized: if the class of the object is known beforehand, either via inferencing or declaration, the lookup can be performed at compilation time. Otherwise, and more likely, an accessor can be made that compiles getters and setters on the first lookup.

and then I was like, anyway..

It turns out that two little-known facts enable us to shove C pointers into GOOPS objects. First, slots in the default data array may be allocated either as Scheme values (the normal case), or as raw, untagged words. The latter case allows us to put random C data in Scheme object without confusing the garbage collector or the interpreter. Second, classes have such a "raw word" slot containing a pointer to a free function that will be called when instances are freed, ostensibly to free the data array. However we can override this value to perform type-specific deallocation, and then free the data array.

This realization let me collapse the old memory layout:

click for the svg

into this:

click for the svg

The sum total is that now a wrapper for a GObject is just 3 words: the vtable and data pointers, and a 1-element array to hold the GTypeInstance* pointer. That is, 12 or 24 bytes, for 32- or 64-bit words, respectively. (There are a couple of inefficiencies that increase this number of words to about 6, but those will go away with time.)

That patch took me about 3.5 months to bake fully, and is present in the newly-released Guile-GNOME 2.15.97. I haven't had time to update the online documentation yet, but the number of exported procedures and variables is down by about 100 or 150, with no loss in expressive power. I'm pretty pleased!

allocate, memory (part one of n)

10 April 2008 7:33 PM (nargery | scheme | guile | garbage collection | memory management | emacs | conservative)

In this article, I'd like to justify to myself in public a patch I've been cooking for a few months. It's a bit of nargery (I love that word, thank you Thomas), but hey, I'm a bit of a narg. Well, maybe more than a bit: I'm even going to break this into two parts.

in theory

So one of the things that all high-level computer languages have in common is a garbage collector. (Notice how I implicitly define what an HLL is; sly, that.) What it means is that instead of having to manually allocate and free memory, you just allocate with wild abandon, and trust the computer to figure out what's junk and what's not.

There are a number of algorithms for this "junk detection" used in various systems, but most of them involve, at some stage, a process by which a set X of "live" objects is progressively enlarged to include all objects which are referenced by anything in set X. Objects which are not in that set are then thrown away.

So if I have a list in set X, eventually after running this "mark" phase, all of the elements of the list will also be in set X.

The trick is to determine what exactly is in set X at the start of the mark phase. This set is commonly called the "root set", because those objects form the roots of the live object tree. Global variables would belong to the root set, for example. Another important source of live object roots is the stack: the implicit chain of computations that is waiting to complete at any given moment, along with all of the resources needed to complete that computation.

in practice

So enough with the theory. Guile, being a Scheme interpreter designed to embed well with C, shares C's stack. It has no real idea how things are laid out on the stack itself, yet it somehow has to know which objects referenced on the stack are live Scheme objects (and not e.g. integers, C structures, etc.)

The strategy that Guile uses is known as "conservative" garbage collection. This means that it scans the memory region between the base of the stack and the current stack pointer, treating every word as if it were a Scheme object. If that word has the right tag, and it points to other scheme objects in the heap, then it is deemed a part of the root set.

Obviously, other arrangements of bits, for example certain long integers, will also have this property. That is why this scheme (ha!) is known as "conservative", because "set X" is known to be a superset of the set of live objects. The worst that could happen would be a memory leak, not memory corruption.

The alternative to conservative marking is precise marking, like Python does with its refcounts, or like Emacs does with explicitly adding values to the root set (via the GCPRO macros). But it's a lot of bookkeeping, and really easy to make mistakes. Combine that with continuations and dynamic-wind and you have yourself a right mess.

a pause

And with that, I'll rest for a bit, and then see if I can continue tomorrow. Happy hacking!