on the impossibility of composing finalizers and ffi
While poking the other day at making a Guile binding for Harfbuzz, I remembered why I don’t much do this any more: it is impossible to compose GC with explicit ownership.
Allow me to illustrate with an example. Harfbuzz has a concept of blobs, which are refcounted sequences of bytes. It uses these in a number of places, for example when loading OpenType fonts. You can peek at a blob’s contents with hb_blob_get_data, which gives you a pointer and a length.
Say you are in LuaJIT. (To think that for a couple years, I wrote LuaJIT all day long; now I can hardly remember.) You get a blob from somewhere and want to get its data. You define a wrapper for hb_blob_get_data:
    local ffi = require("ffi")
    local hb = ffi.load("harfbuzz")
    ffi.cdef [[
    typedef struct hb_blob_t hb_blob_t;
    const char *hb_blob_get_data (hb_blob_t *blob, unsigned int *length);
    ]]
Presumably you then arrange to release LuaJIT’s reference on the blob when GC collects a Lua wrapper for a blob:
    ffi.cdef [[
    void hb_blob_destroy (hb_blob_t *blob);
    ]]

    function adopt_blob(ptr)
      return ffi.gc(ptr, hb.hb_blob_destroy)
    end
OK, so let’s say we get a blob from somewhere, and want to copy out its contents as a byte string.
    function blob_contents(blob)
      -- Out-parameter: a one-element array converts to unsigned int *.
      local len_out = ffi.new('unsigned int[1]')
      local contents = hb.hb_blob_get_data(blob, len_out)
      local len = len_out[0]
      return ffi.string(contents, len)
    end
The thing is, this code is as correct as you can get it, but it’s not correct enough. In between the call to hb_blob_get_data and, well, anything else, GC could run, and if blob is not used in the future of the program execution (the continuation), then it could be collected, causing the hb_blob_destroy finalizer to release the last reference on the blob, freeing contents: we would then be accessing invalid memory.
Among GC implementors, it is a truth universally acknowledged that a program containing finalizers must be in want of a segfault. The semantics of LuaJIT do not prescribe when GC can happen and what values will be live, so the GC and the compiler are not constrained to extend the liveness of blob to, say, the entirety of its lexical scope. It is perfectly valid to collect blob after its last use, and so at some point a GC will evolve to do just that.
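The order of events is easier to see in a model with no FFI at all. The sketch below is a JavaScript simulation, not LuaJIT code: NativeBlob, getData and destroy are hypothetical stand-ins for hb_blob_t, hb_blob_get_data and hb_blob_destroy, and the finalizer is called by hand at the point where a collector would be allowed to run it.

```javascript
// Hypothetical stand-in for a refcounted native blob (not the HarfBuzz API).
class NativeBlob {
  constructor(bytes) {
    this.bytes = bytes; // the "native" allocation
    this.refcount = 1;  // the single reference owned by the wrapper
  }
}

// Models hb_blob_destroy: drop one reference, free on the last one.
function destroy(blob) {
  blob.refcount -= 1;
  if (blob.refcount === 0) blob.bytes = null;
}

// Models hb_blob_get_data: returns a borrowed view, not a copy.
function getData(blob) {
  return {
    read() {
      if (blob.bytes === null) throw new Error("use after free");
      return blob.bytes;
    },
  };
}

// `gcRunsAfterLastUse` selects whether the simulated collector runs the
// finalizer right after the last mention of `blob`.
function blobContents(blob, gcRunsAfterLastUse) {
  const contents = getData(blob);        // last use of `blob`
  if (gcRunsAfterLastUse) destroy(blob); // the finalizer, run by hand
  return contents.read();                // read through the borrowed view
}

const lucky = blobContents(new NativeBlob("abc"), false);

let unlucky;
try {
  unlucky = blobContents(new NativeBlob("abc"), true);
} catch (err) {
  unlucky = err.message;
}
```

Both calls run the same source text. The only difference is whether the collector happened to run in the window after the last use of blob, which is exactly what the program cannot control.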
I chose LuaJIT not to pick on it, but rather because its FFI is very straightforward. All other languages with GC that I am aware of have this same issue. There are but two work-arounds, and neither is satisfactory: either develop a deep and correct knowledge of what the compiler and run-time will do for a given piece of code, and then pray that knowledge does not go out of date, or attempt to manually extend the lifetime of a finalizable object, and then pray the compiler and GC don’t learn new tricks to invalidate your trick.
This latter strategy takes the form of “remember-this” procedures that are designed to outsmart the compiler. They have mostly worked for the last few decades, but I wouldn’t bet on them in the future.
Another way to look at the problem is that once you have a system working—though, how would you know it’s correct?—then you either never update the compiler and run-time, or you become fast friends with whoever maintains your GC, and probably your compiler too.
For more on this topic, as always Hans Boehm has the first and last word; see for example the 2002 Destructors, finalizers, and synchronization. These considerations don’t really apply to destructors, which are used in languages with ownership and generally run synchronously.
Happy hacking, and be safe out there!
9 responses
Surely the language semantics can just specify a definite construct for lifetime extension? If it’s in the spec, no need to assume the absence of “tricks”.
More than anything, I think this exposes the weakness of a native FFI interface with raw pointers. In the LuaJIT example, the runtime has no idea of the relationship between “blob” and “contents”, so the GC can run wild and free “blob” first.
By contrast: in gobject-introspection, the return here would be tagged (transfer none) - and a copy would be made before it’s ever exposed to the GC’ed language.
If you don’t want that copy to be made - if you want to map hb_blob_t to a language-native buffer interface in some way - then I think the binding has to be hand-crafted in a language with explicit, GC visible references.
Maybe in this case, you could make hb_blob_t a single chunk of memory without a finalizer, so that the GC knows that contents is a pointer inside the block of memory pointed to by blob, but that’s in no way general.
In the LuaJIT example, blob is reachable throughout the function by the blob argument, so it can not be collected until the function returns. ffi.string copies the data. What is the issue?
https://luajit.org/ext_ffi_semantics.html#gc Here blob is “on the Lua stack”, as there is no difference between a function argument and a local.
Andy, you’ve made some nice points here. If I can add just a bit more:
You wrote that “there are but two workarounds”, but I think you’ve missed what is often the only real workaround:
Your wrapper objects need a ‘valid?’ flag, and you have to arrange to mark all of the right objects as invalid when one is freed. In your HarfBuzz example, when a font object is freed, all blobs which were embedded in the font must be invalidated as well. So you need to do some bookkeeping behind the scenes.
(Your bookkeeping code may need to make careful use of weak references to avoid holding all the HarfBuzz objects in memory forever.)
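A minimal sketch of that bookkeeping, in JavaScript. FontWrapper, BlobWrapper and embeddedBlob are hypothetical names, and the strings stand in for native memory. The font tracks its blobs through WeakRef so that the tracking itself does not keep them alive.

```javascript
// Hypothetical wrapper for a blob that lives inside a font's native memory.
class BlobWrapper {
  constructor(bytes) {
    this.bytes = bytes;
    this.valid = true; // the 'valid?' flag
  }
  contents() {
    if (!this.valid) throw new Error("blob is no longer valid");
    return this.bytes;
  }
}

// Hypothetical font wrapper that tracks the blobs it has handed out.
class FontWrapper {
  constructor() {
    this.valid = true;
    this.blobs = new Set(); // Set of WeakRef: tracking must not keep blobs alive
  }
  embeddedBlob(bytes) {
    if (!this.valid) throw new Error("font is no longer valid");
    const blob = new BlobWrapper(bytes);
    this.blobs.add(new WeakRef(blob));
    return blob;
  }
  free() {
    for (const ref of this.blobs) {
      const blob = ref.deref(); // undefined if the wrapper was already collected
      if (blob !== undefined) blob.valid = false;
    }
    this.blobs.clear();
    this.valid = false; // a real binding would release the native font here
  }
}

const font = new FontWrapper();
const blob = font.embeddedBlob("glyf table bytes");
const before = blob.contents();
font.free();
```

After font.free(), blob.contents() throws instead of reading freed memory. In this sketch the dead WeakRef entries stay in the set until free is called. A FinalizationRegistry could prune them earlier.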
Sorry! Clicked the submit button a bit too fast there!
You also need to make sure that your font wrapper is not freed before your blob wrapper. Right.
I think the general solution is to give the font wrapper object a ‘free’ method, and ensure you call it in the right places. In other words, you can’t rely on GC finalizers to free the external resource.
What sucks about this is that if your font object is encapsulated in some other object, that object also needs a ‘free’ method, and so on (recursively). In the worst case, you can end up managing the lifetimes of almost all your objects manually, à la C. The slap in the face is that your GC must duplicate the same work to ensure that managed heap memory is also freed!
This does solve the problem reliably, though, and you don’t have to worry about “the compiler and GC learning a new trick”. You do have to worry about getting the calls to ‘free’ wrong.
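As a sketch of how the free calls propagate, again in JavaScript: FontHandle, Shaper and withShaper are hypothetical names, and the released flag stands in for a real native destroy call.

```javascript
// Hypothetical native resource: `free` stands in for the library's destroy call.
class FontHandle {
  constructor() { this.released = false; }
  free() { this.released = true; }
}

// An object that encapsulates a font needs its own `free` that forwards the call.
class Shaper {
  constructor() { this.font = new FontHandle(); }
  shape(text) {
    if (this.font.released) throw new Error("shaper used after free");
    return text.length; // placeholder for real shaping work
  }
  free() { this.font.free(); }
}

// Scopes the resource: `free` runs on every exit path, including exceptions.
function withShaper(fn) {
  const shaper = new Shaper();
  try {
    return fn(shaper);
  } finally {
    shaper.free();
  }
}

const glyphCount = withShaper((shaper) => shaper.shape("abc"));

// Getting the calls wrong: the shaper escapes its scope and is used afterwards.
let escaped;
withShaper((shaper) => { escaped = shaper; });
let escapedError = null;
try {
  escaped.shape("abc");
} catch (err) {
  escapedError = err.message;
}
```

The scoped helper keeps the free call in one place, but nothing stops a reference from escaping. The check inside shape turns that mistake into an error rather than a crash.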
For a safe FFI interface over raw pointers (capabilities...) CHERI (and Cornucopia) offers enough architectural capability to give untrusted unmanaged code a raw pointer without being able to read through it after free; one can hope an enterprising cloud provider ships it!
In the .NET world, we have GCHandle, which is simply a wrapper around a reference to a GC-managed object that has explicit lifetime semantics (i.e. you need to explicitly destroy it for GC to consider it to not be a root anymore).
Very interesting read. I’ve been working with FFI a lot recently, with libffi and also the native interface the Deno runtime exposes, and can confirm finalization of native objects is a difficult problem indeed. But it is possible to solve, in my opinion.
Here’s what I think about it:
For one, FFI is inherently unsafe and one needs to be well-versed in manual memory management. If you are using FFI to expose a low-level API to a garbage-collected language, you need to be familiar with the GC semantics as well, so that you can expose a nicely wrapped module in the target language whose users do not have to consider manual memory management. The goal in that case is just to integrate very well.
Garbage collectors for sure aren’t always reliable, and your object may not even ever get released. That’s why it’s important to offer something like a free or release method as well.
Quoting this:
> attempt to manually extend the lifetime of a finalizable object, and then pray the compiler and GC don’t learn new tricks to invalidate your trick.
I don’t think this is necessarily a workaround. This may be the correct solution. If another object (say, blob data) obtained from the original object wrapper (the blob itself) depends on the lifetime of the original object, we can extend that lifetime by having the new object hold a strong reference to the original. So even if you lose your reference to the original object, it will stay alive at least as long as the new object is in use. There are nicer ways to do something like this in JavaScript: we have WeakMaps! The key is the garbage-collectable object (the blob contents), and the value (the blob, which the contents depend on) is held for at least as long as the key is alive.
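A minimal sketch of the WeakMap arrangement. BlobWrapper and blobData are hypothetical names, and a Uint8Array stands in for native memory that a real binding would get from the library.

```javascript
// contents view -> blob wrapper it was derived from. A WeakMap keeps a value
// reachable only while its key is reachable.
const owners = new WeakMap();

// Hypothetical wrapper; `bytes` stands in for native memory owned by the blob.
class BlobWrapper {
  constructor(bytes) { this.bytes = bytes; }
}

// Hypothetical zero-copy accessor: the view shares storage with the blob
// but holds no reference to the wrapper itself.
function blobData(blob) {
  const view = blob.bytes.subarray(0); // same underlying buffer, no copy
  owners.set(view, blob);              // the wrapper now lives as long as the view
  return view;
}

// The wrapper is not stored anywhere except as a WeakMap value.
const view = blobData(new BlobWrapper(Uint8Array.of(1, 2, 3)));
```

Here owners.get(view) returns the wrapper for as long as view is reachable, so a finalizer attached to the wrapper could not run before the view is dropped. This only helps while the view is a GC-visible object. Once it is reduced to a raw pointer, the liveness question from the post comes back.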
Also, I do not see a problem with the blob_contents function. If it returned a zero-copy string, it would make sense: we would be returning an object that depends on another object while holding no strong reference to it, which is incorrect. We should explicitly extend the lifetime as mentioned above. But ffi.string copies the string, so there is no problem there. The contents pointer is obtained from the blob object and immediately used; how would it get collected in between? Unless you are passing around the blob pointer instead of the wrapper in user-land, which invites all sorts of memory-related problems.
I’d like to mention an interesting problem we faced using Deno FFI related to garbage collection. So Deno FFI allows you to pass Uint8Arrays (as uint8_t *) with zero-copy, essentially the underlying pointer. But then the responsibility is on us to not use that pointer after the array is collected.
What we’d done in a user-land module (to expose SQLite3 bindings, very high performance thanks to the V8 fast call API and turbocall JIT in the Deno FFI implementation!) is, when passing a string, we first encode it to a Uint8Array and then pass it to FFI. So an intermediate buffer is created, to which no reference is really held anywhere. Now there are two ways to pass buffers to FFI in Deno. One is to set the receiver type to “buffer”, where you can pass Uint8Arrays (fast path) or other typed arrays (non-fast), but if you use the “pointer” type, you have to first explicitly convert the typed array to a pointer using Deno.UnsafePointer.of.
The first way to pass it is the best one. V8 would pass the underlying pointer to the native API directly for us while also keeping the buffer alive at least until the FFI call returns. But in the other approach, the buffer could be collected, as it was not even stored in a local variable but simply inlined like this: ffifunction(..., Deno.UnsafePointer.of(encodeCString(str)), ...). I discovered that even this can crash if the code runs repeatedly, when doing benchmarks for example! We simply started doing ffifunction(..., encodeCString(str), ...) and the crash was fixed.