abusing the c compiler
code reading
Today I found something really neat in Larceny's foreign function interface. The deal is that often times you need to parse a C structure or a preprocessor definition, and man, parsing C makes a body feel lazy. What's a hacker to do?
Larceny has an amusing take on this problem. The code looks straightforward enough:
;; parse out ent->d_name as a string (define (dirent->name ent) (define-c-info (include<> "dirent.h") (struct "dirent" (name-offs "d_name"))) (%peek-string (+ ent name-offs)))
The define-c-info block calculates name-offs, which is the offset of d_name in the dirent structure. %peek-string is something internal to Larceny that takes a memory address of a NUL-terminated C string and returns a Scheme string.
I had imagined, looking at this, that they had some kind of database of the headers and such, and in a sense they do -- in the form of the C compiler. define-c-info is a macro that runs the C compiler at macro expansion time, compiling and running a generated C program that spits out the relevant information as an s-expression on its stdout.
some people like diagrams
So in this case, if the d_name field starts 11 bytes into the structure, the generated C program will print out (11) on its stdout, and that number gets read in and inserted into the program. In that way dirent->name expands to something like:
(define (dirent->name ent) (define name-offs 11) (%peek-string (+ ent name-offs)))
Cool, no? The C compiler is only needed at compile-time, not at run-time.
Further details can be seen at Felix Klock's 2008 paper on Larceny's FFI.
2 responses
Comments are closed.
The first time I have seen the idea to use the C compiler like that was in Scheme->C [1] compiler (cdecl/sizeof.c). But to be precise it did it only once and afterwards used the collected sizeof+padding info.
There is also a very neat idea like this in the Go implementation. [2] They parse the debugging information from the assembly output of the host compiler.
[1]: https://alioth.debian.org/plugins/scmgit/cgi-bin/gitweb.cgi?p=scheme2c/scheme2c.git;a=blob;f=cdecl/sizeof.c
[2]: http://golang.org/cmd/godefs/
Although in Scheme->C that's a bit pointless and one day when I have time it will get simplified. Since we can inline C code we can just use the offsetof macro.
http://0xab.com/code/matlab-scheme.git/blob/HEAD:/c-macros.sch#l87