documenting language bindings

18 July 2007 7:39 PM (scheme | gnome | guile | texinfo | documentation | cairo | haskell)

I had lunch with Jao before he left for Zurich, throwing around ideas over copious wine. He made the observation that a lack of good documentation is a limit to software's potential impact, with particular reference to the pile of code that I've written or maintained. Point taken! With that thought in mind, I've been trying to focus more on documentation. In this blagpost I will summarize work on documenting language bindings.


Duncan Coutts, the gtk2hs maintainer, called me out at FOSDEM earlier this year. I was presenting about Guile-GNOME, and sheepishly mentioned that the entire project was undocumented. Duncan was kind enough to point me to how gtk2hs does things.

Their bindings are automatically generated at the beginning, then tweaked and maintained by hand. The Haskell folks have developed a documentation system that involves specially formatted comments in the code, similar to Javadoc or gtk-doc. When the bindings are first autogenerated, their generator produces the documentation as well.

The documentation actually comes from the docbook files produced by gtk-doc; that is to say, from the upstream C documentation. They run some basic search and replace operations on the output so that it is more "Haskelly". Seems to be a reasonable way to bootstrap the documentation, and the HTML output certainly looks good.
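
To give a flavor of that kind of search and replace, here is a minimal sketch in Python. It is not gtk2hs's actual code; the regular expression and the function names `haskellify_name` and `haskellify_doc` are my own assumptions, and it only handles the simple case of turning `gtk_foo_bar` into `fooBar` inside documentation text.

```python
import re

# Matches C-style GTK function names such as gtk_window_set_title.
# (Assumption: only the "gtk_" prefix is handled in this sketch.)
C_FUNC = re.compile(r"\bgtk_([a-z0-9]+(?:_[a-z0-9]+)*)\b")


def haskellify_name(match):
    """Turn the matched gtk_window_set_title into windowSetTitle."""
    first, *rest = match.group(1).split("_")
    return first + "".join(word.capitalize() for word in rest)


def haskellify_doc(text):
    """Rewrite every gtk_* function name found in a piece of doc text."""
    return C_FUNC.sub(haskellify_name, text)


print(haskellify_doc("Call gtk_window_set_title() before showing the window."))
```

A real converter would also have to deal with types, constants and cross-references, which is where most of the munging pain lives.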


I asked Murray what gtkmm does, and it seems that they do something similar. The difference is that they combine the C documentation from the docbook files into one XML API-cum-documentation file, then generate their documentation from that.

Also, presumably since C++ is similar enough to C, gtkmm bindings regenerate their documentation all the time, customizing the documentation for only about 5% of the functions. Customizations are maintained in a separate overrides file. The output documentation is made with Doxygen, which looks OK but not as nice as Haddock.
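
The overrides idea is simple enough to sketch. This is not gtkmm's tooling or file format, just an illustration in Python with made-up names (`merge_docs`, and plain dictionaries standing in for the generated docs and the overrides file): generated text is used unless a hand-written entry exists, and overrides that no longer match any function are reported so they don't rot silently.

```python
def merge_docs(generated, overrides):
    """Combine regenerated docs with hand-maintained overrides.

    Both arguments map a function name to its documentation text.
    Returns (merged, stale), where stale lists override names that
    no longer correspond to any generated function.
    """
    merged = {name: overrides.get(name, doc) for name, doc in generated.items()}
    stale = sorted(set(overrides) - set(generated))
    return merged, stale


# Placeholder documentation text, purely for illustration.
generated = {
    "gtk_window_set_title": "Generated text for set_title.",
    "gtk_window_get_title": "Generated text for get_title.",
}
overrides = {
    "gtk_window_set_title": "Hand-written text for set_title.",
    "gtk_window_frobnicate": "Override for a function that no longer exists.",
}

merged, stale = merge_docs(generated, overrides)
print(merged["gtk_window_set_title"])
print(stale)
```

Because the generated side can be thrown away and rebuilt at any time, only the small overrides table needs to live in version control.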


A bit tipsily last night I prompted Andrew Cowie to opine about the same topic. Java-gnome people are apparently awash with contributors, as they decided at some point that they would hand-write all of their language bindings. They do the same with their documentation -- all written by hand, and processed with javadoc. The HTML documentation looks OK, better than Doxygen, but it seems that the wrapper itself is incomplete.


As far as I know, the most excellent pygtk documentation is written completely by hand.


Guile-GNOME itself is still undocumented, as whichever way I might go, it will be a lot of work. It pays to invest a few weeks figuring out the right way to go.

As a test case, I looked at how difficult it would be to automatically generate documentation for Guile-Cairo. I cannot write all documentation by hand; it is too much. Instead I looked at reusing the technique from gtkmm/gtk2hs, munging the docbook generated by gtk-doc.
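
Roughly, the munging looks like the following sketch. It is written in Python for illustration rather than being my actual code, and it makes several assumptions: the `SAMPLE` fragment is a heavily simplified stand-in for what gtk-doc emits (real output carries a DOCTYPE, entities and much more markup), the description text is a placeholder, and `extract_functions` and `c_name_to_scheme` are hypothetical names.

```python
import xml.etree.ElementTree as ET

# Simplified, hand-written stand-in for a gtk-doc docbook section.
SAMPLE = """
<refsect1>
  <refsect2 id="cairo-get-extents">
    <title>cairo_get_extents ()</title>
    <para>Placeholder description of <parameter>cr</parameter>.</para>
  </refsect2>
</refsect1>
"""


def c_name_to_scheme(name):
    """cairo_get_extents -> cairo-get-extents."""
    return name.replace("_", "-")


def extract_functions(docbook_xml):
    """Return one {'name', 'doc'} entry per refsect2 in the fragment."""
    root = ET.fromstring(docbook_xml)
    entries = []
    for sect in root.iter("refsect2"):
        # Titles look like "cairo_get_extents ()"; drop the parentheses.
        c_name = sect.findtext("title", default="").replace("()", "").strip()
        # Flatten inline markup and normalize whitespace in each paragraph.
        paras = [" ".join("".join(p.itertext()).split())
                 for p in sect.findall("para")]
        entries.append({"name": c_name_to_scheme(c_name),
                        "doc": "\n\n".join(paras)})
    return entries


for entry in extract_functions(SAMPLE):
    print(entry["name"], "::", entry["doc"])
```

Flattening the inline markup throws away exactly the semantic information one would want to keep, so a serious version needs to translate elements like `<parameter>` instead of discarding them.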

There are three documentation formats that are equally important to me, and one that is less important.

The first one is HTML, so that someone browsing the project's web page can see the status of the binding.

The second one is "online" documentation, so that when I am at the Guile listener I can type (help cairo-get-extents) and get good documentation. (This is also the case when I hack in Emacs with Guile-Debugging, and I type C-h g. More on that later.)

Thirdly we have local searchable documentation, either via Info or via devhelp. Lastly, we have hardcopy output as PDF.

For me and for Scheme users, these requirements point to texinfo as the intermediate format. I can generate good-looking PDF output with indexes, HTML output, and Info, which is actually quite OK. In addition I can write out a text representation of the texinfo into a docstring file, which allows the documentation to be available at runtime without incurring memory use penalties.
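
As a sketch of that last step, assuming each function has already been reduced to a name, an argument list and some descriptive text, emitting texinfo is mostly a matter of escaping and wrapping in `@deffn`. The Python below is illustrative only; `to_texinfo` and `to_docstring` are hypothetical names, and the plain-text layout is not the format of Guile's docstring files.

```python
import re


def escape_texinfo(text):
    """Prefix the characters that are special in Texinfo (@, { and }) with @."""
    return re.sub(r"([@{}])", r"@\1", text)


def to_texinfo(name, args, doc):
    """Render one function as a @deffn block."""
    header = "@deffn Function {} {}".format(name, " ".join(args)).rstrip()
    return "\n".join([header, escape_texinfo(doc), "@end deffn"])


def to_docstring(name, args, doc):
    """Render a plain-text version suitable for showing at a REPL."""
    signature = "({})".format(" ".join([name] + list(args)))
    return signature + "\n\n" + doc


# Placeholder argument list and description, purely for illustration.
print(to_texinfo("cairo-get-extents", ["cr"], "Placeholder text with {braces}."))
print(to_docstring("cairo-get-extents", ["cr"], "Placeholder text with {braces}."))
```

The point of going through texinfo is that the same source then feeds makeinfo for Info and HTML and TeX for the PDF, while the plain-text rendering is what a `help` command can print.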

I don't have the documentation-generation code nicely packaged yet, but I'm pushing it into other projects. I think I need to let it sit for a while, to see if I actually want to undergo the pain of documenting the bindings for the GNOME stack, given that even for Guile-Cairo some work remains. Anyway, that's the hack of the last few weeks. If you are a bindings author, or less likely, are a would-be Scheme hacker, and are at GUADEC, pull me aside and we can chat about such things. You will know me by the extra-large chops.

8 responses

  1. Duncan Coutts says:

    Thanks for bringing this up again Andy.

    Seems to me that the ideal thing would be for gtk-doc to be able to produce some nice xml/lisp format output (not docbook! - it's almost a write-only format, it doesn't preserve nearly enough semantic information).

    So much like we have these standardised .def files which contain the gtk api info, we should have something similar for the gtk-doc output. Then each project could take that and translate it into whatever format they use, be it markup in source code comments, docbook, texinfo or whatever.

    The worst part of the current system is munging the gtk-doc docbook output files to reconstruct the information into a sane format. It's not quite as bad as html screen scraping, but it's pretty bad.

  2. Peter Russell says:

    Very interesting post. You're quite right that the PyGTK documentation is excellent. I don't think I've ever seen API docs so nicely and neatly done.

  3. brought to you by torres viña sol -- andy wingo says:

    [...] I used some techniques I wrote about previously to generate schemey texinfo from upstream’s docbook for C, with a twist: when generating function docs, we load up the wrapset metadata, and use that to determine which functions are actually in the wrapset, what their arguments are, and if they have a generic function associated with them. [...]

