We got some new books in at work, among them Peter Norvig's classic Paradigms of Artificial Intelligence Programming (PAIP). It might sound strange to order a decade-old book about AI programming in Lisp for a company that does its media work in Python and C, but the book has a staying power that outlasts the passage of time.
PAIP is not interesting from an AI perspective: almost all of its methods and strategies are outdated at this point. Its strength is in teaching how to optimize programs in a high-level language. See the PAIP retrospective page for how -- skip past the Lisp melancholy at the top and go directly to "What lessons are in PAIP?" at the bottom.
Lesson 26 on that page is a delightful Alan Perlis quote: "A programming language that doesn't change the way you think about programming is not worth knowing", originally published as part of Epigrams on Programming.
I spent the weekend in Paris.
(I'm now basking in the glow of that sentence. Excuse me.)
Went to see the city, and to visit a girl. The former was pretty interesting, the latter left a bit to be desired. So it goes.
My patch made it into 2.6.14-rc5. Neat.
In the shipment that brought us PAIP also came Corbet et al's Linux Device Drivers, third edition. From a brief skim, I find it's a nice complement to Love's Linux Kernel Development -- LDD has more specifics on writing drivers, but LKD is good for an overview of the kernel as a whole.
Unexpected BBQ last night. Up drinking and saying stupid things until 4 in the morning. The weather's still nice enough to sit out on the terrace in the evening -- in terms of temperature, Barcelona wins handily over the cold, wet French capital.
Quite pleased to see that Billy Biggs is writing. Yay.
Brian Mastenbrook ruminates on Apple's shift to Intel (via planet lisp). Rather interesting. My initial reaction was that targeting a single architecture was bad for computer science, but Brian sees freedom in Apple's JIT compiler (supplied by Transitive).
I occasionally find myself wondering what hardware deficiencies Alan Kay is ranting about -- Neither Intel nor Motorola nor any other chip company understands the first thing about why [the Burroughs B5000] architecture was a good idea, etc. Wha?
A new report on free and proprietary software in public computer labs in Africa is out from the good folks at bridges.org. Skip down to the "Key ground-level findings" for the buzzword-laden summary.
From my experience in Namibia, I think their findings are mostly accurate. Free software has historically worked well only in well-planned, well-supported installations. You can always find random people to administer an isolated installation of Windows 98 boxes; Linux expertise is much harder to come by. And it is unfortunately true that most computer labs are not sustainable. Do-gooders from $RICH_COUNTRY drop 20 computers in a room, say "go", and then wonder why the lab no longer exists two years later.
On the other hand, if well done, free software can be a liberating force in the developing world. Namibia was lucky to have Schoolnet.na, a home-grown organization that focused as much on the human side of computing as on the actual hardware.
Went to Madrid last weekend to see my sister Ellen and friend Erinn. Good folks, pleasant town. Hadn't been there for five years, but it's more the same than different.
And, El made me banana bread and gumbo. Taste-o-home in a foreign city.
By star-alignment I met up with some friends from Namibia there in Madrid. They had been travelling four months through west Africa, and had loads of crazy stories to tell. We came back to Barcelona Sunday night, and they flew back to the states this morning. We had a great time listening to all my music from there, our memories of being crunched in bush taxis growing somehow fonder in the distance.
The hardest part of the trip is stopping, though. Two and a half years is a lot of momentum.
Ran across this article on intentional types in dynamic languages the other day. It seems relevant to the D-BUS bindings discussion under way on p.g.o, especially the part at the end about input and output. Old is new again.
Fixed a couple bugs in the crossposter, added some regexps to deal with the images.
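For the curious, the image fixups are roughly of this flavor -- rewriting relative image URLs to absolute ones so they survive crossposting. The base URL and the exact pattern here are illustrative guesses, not the actual crossposter code:

```python
# Sketch of an image-fixing regexp of the kind described above: make
# relative <img> URLs absolute.  Base URL and pattern are illustrative,
# not the real crossposter code.
import re

BASE = "http://example.net/"  # wherever the images really live

def absolutize_images(html):
    return re.sub(r'(<img\b[^>]*\bsrc=")(?!https?://)/?([^"]+)',
                  lambda m: m.group(1) + BASE + m.group(2),
                  html)

print(absolutize_images('<img src="/images/foo.png">'))
# -> <img src="http://example.net/images/foo.png">
```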
In a past life I had the, um, experience of working with Monte Carlo N-Particle (MCNP), Los Alamos' nuclear particle interaction simulator. It's used to model the absorption, refraction, and reflection of particles in materials, and also has a component for those interactions to produce new particles. So you can predict the distribution and kind of scattered radiation resulting from bombarding a surface with electrons, for instance. Or you can model a nuclear reactor core. Or a nuclear bomb, a purpose for which it continues to be used.
It's a bit peculiar from a software engineering perspective:
MCNP is written in the style of Dr. Thomas N. K. Godfrey, the principal MCNP programmer from 1975-1989 ... All variables local to a routine are no more than two characters in length, and all COMMON variables are between three and six characters in length ... The principal characteristic of Tom Godfrey's style is its terseness. Everything is accomplished in as few lines of code as possible. Thus MCNP does more than some other codes that are more than ten times larger. It was Godfrey's philosophy that anyone can understand code at the highest level by making a flow chart and anyone can understand code at the lowest level (one FORTRAN line); it is the intermediate level that is most difficult. Consequently, by using a terse programming style, subroutines could fit within a few pages and be most easily understood. Tom Godfrey's style is clearly counter to modern computer science programming philosophies, but it has served MCNP well and is preserved to provide stylistic consistency throughout.
It's from the X division. As in, "Where do you work? The X division. So, what do you do? Sorry, can't tell you that. Please don't ask anything else."
Advogato people, think back to when you first learned how to hack free software. Getting all the auto* confused, figuring out how those macros worked (don't put a space before the paren), making dlopen-able libraries, navigating `info', etc. Bizarre. The extent of the strangeness is evident if we consider the "new project": Who, upon creating a new project, would write the auto* files from scratch? Deep voodoo is best done once and for all.
Above all, though, was CVS. Import, checkin, add, remove, weird options to get new directories into your copy, CVS_RSH, CVSROOT, CVSEDITOR, ssh-agent, etc. More than that was the idea of CVS, that you might record, document, and preserve your changes. Although the first-time CVS user has been creating documents for quite some time, the idea that one could version them was completely new. It was a revolution in the way I worked.
Over time, I grew to know CVS like second nature. I still couldn't tag or branch without looking at the redbean book, but with everything else I was a champ. It's like commuting on a bicycle -- after a while you don't even notice it, you just think about where you're going.
I survived like that for five years. Last year, though, I got the itch to hack gstreamer from within guile. I found the old guile-gobject project, took it over as my own, and started making releases. It was a bit strange because it was split over a CVS archive on gnome.org and one on savannah, but there was nothing I could do about that.
That's about when Andreas Rottmann came in. He took over the FFI generation sub-project and started to hack guile-gnome, adding a couple of wrappers. His changes were on a branch of the FFI package, and the needed changes in guile-gnome on another branch. Then the GNOME platform bindings process began, which demands the whole platform in one tarball. Andreas worked out a way to split guile-gnome into multiple packages, while retaining the ability to release many from within the same tarball.
The only catch was, to do this I needed to use GNU arch for source management. I didn't want to change. I was happy and productive with CVS, and I didn't even have arch installed on my machine.
I feel like I've been living with Tom Lord for the last nine months. It's like he's over my shoulder, telling me what to name my files, how to build my software, and how to deal with revision control. With apologies to the man himself, whom I've never met, this post chronicles our life together.
(Ha! I tripped your "this is weird" meter!)
Arch wants to know about everything that is contained in your "project tree". It divides files into five types, akin to a manifest type system in which regexps are the type predicates. Some regexps are predefined, like those for files beginning with = or ,. Files that don't match one of the predefined types are treated as source by default.
However, if a file looks like source, but has not been added to the inventory (akin to cvs add), arch will barf. It's as if CVS would stop if you forgot to add a .o file to .cvsignore.
This isn't a huge problem for Tom, because he wants you to build in a separate directory. I won't do that for various reasons. The whole system makes me think of the puritan Protestant approach to economics -- "You've had it good for a long time, we need a contraction to cleanse out the inflation demons. Then we will be reborn and free from sin. This is going to hurt a bit, but we've had it coming." Your project tree should be picked clean as a bone, or else.
Fortunately, one can programmatically construct .arch-inventory files (like .cvsignore) in each directory. I have a script to do that here. In fact, if there is an arch operation that you are lacking, it's usually possible to implement it in sh -- either built on more primitive tla commands or, as a last-ditch effort, by reading the archive or control files directly. tla-tools implements a number of them, fix-changelog-conflicts and commit-merge being two of the more useful ones.
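The idea behind such a script is simple enough to sketch. The junk/backup categories are arch's inventory vocabulary; the particular patterns below are just examples, not my actual script:

```python
# Sketch of a .arch-inventory generator: drop a file in each directory
# telling arch which names are junk or backups, so they aren't taken
# for unadded source.  The patterns are examples; tune them per tree.
import os

PATTERNS = [
    ("junk", r"^.*\.(o|lo|la)$"),   # build products
    ("backup", r"^.*~$"),           # editor backups
]

def write_inventories(root="."):
    for dirpath, dirnames, _ in os.walk(root):
        if "{arch}" in dirnames:
            dirnames.remove("{arch}")  # leave arch's control dir alone
        path = os.path.join(dirpath, ".arch-inventory")
        with open(path, "w") as f:
            for category, regexp in PATTERNS:
                f.write("%s %s\n" % (category, regexp))
```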
I live in the bush of Africa. I carry every byte on my back, walking to my internet connection. I need an offline solution.
Arch provides this in two ways. First, each project tree contains a "pristine" copy of the upstream revision being hacked. That means you can diff and patch your tree without accessing the archive.
Secondly, arch has extensive mirroring capabilities. If you mirror a remote archive locally, you can do any read-only operation on your copy that you could on the remote copy. It is also possible to publish your private archive by mirroring it to a remote location. People can get revisions and make branches from my archive even as I sit here on the homestead.
Without the ability to branch between archives, all of this would be of little value. Arch's ability to branch and merge takes some getting used to, but it's pretty good. Each tree knows the set of patches that have been applied to it, which makes it easy to determine the patches needed from other branches. The star-merge operator automates this process.
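The patch-log bookkeeping is simple at heart. A toy version of the calculation star-merge automates, with made-up patch names:

```python
# Toy version of the merge bookkeeping described above: each tree
# knows which patches it has, so what it's missing from another branch
# is a set difference.  Patch names are made up for illustration.
mainline = ["base-0", "patch-1", "patch-2", "patch-3"]
branch   = ["base-0", "patch-1", "branch-patch-1"]

def missing_patches(target_log, source_log):
    applied = set(target_log)
    return [p for p in source_log if p not in applied]

print(missing_patches(branch, mainline))  # -> ['patch-2', 'patch-3']
```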
I've been pretty happy with all of this. I can do experimental hacking in my branches, only merging them back upstream when the code is in a good state.
I hate being a loser, but I like Scheme too much. My illusions that the rest of the world will one day see the light are dim. I'm almost resigned to hacking in a marginal language, where the pool of available developers is small. The success of projects in such an environment is dependent on the dedication and interest of solitary programmers, because there really isn't anyone else to do it.
That's my rationalization, anyway. How else would one explain guile-gnome? Martin Baulig laid some firm groundwork, making lots of progress with GObject and ORBit wrappers, but the hacking stopped when he did in 2001.
It took two years for development to really start again. In this environment, we need to lower the barriers to development, so that people can take over and hack, even releasing their own versions, all with the full benefits of revision control right from the start. It allows a lone hacker to easily pick up a fallow project without knowing its old maintainer, who may have even dropped off the net.
I'm comfortable with arch now. I wouldn't recommend it to a happy CVS user, but I don't think I will ever go back.
As a final note, I should mention that I can't compare arch to subversion, because I've never used SVN. So don't take this as a polemic. Thanks.
A web log, like this one, is somewhere between the advogato diary entry and the advogato article. I hope advogato folks don't mind my verbosity.
The count: 75 days until airplane.
Namblish (Namibian broken english) uses the gerund a lot: I am having, I was feeling, etc. I am infected. You are affected.
My learners have found IRC. It's hilarious. I worry about pedophiles a bit -- everyone wants to exchange photos, but I wonder why 30 year old men would want to talk to ninth grade boys. There's not much they could do, though, and I do check up on the kids. Strange to be in this kind of supervisory role at 24.
luis: Yikes, I feel naked! I'll have to get on the administrator's case, who by the way is a KDE contributor. Small world.
Today I worked on making a printer that a local school bought work with Linux, specifically SuSE 7.3 (yeah, we're a bit behind). It was ridiculous, and it's totally not the fault of the printer manufacturer.
The printer is a Samsung ML-1210, a 600 dpi laser printer that's specifically advertised to work with Linux. Although you'll get a 404 if you try to find the drivers on the manufacturer's web site, Samsung did release them under the GPL, so they're also archived and patched elsewhere. But look at the page I was pointed to!
After a while, I realized that I had to recompile ghostscript. I couldn't believe it. I can't even get a source RPM for 7.3 these days. You have to manually patch makefiles, and with the GNU ghostscript that I downloaded, that's not a straightforward process (the directions were wrong).
But let's step back a bit: why the fuck should I have to recompile something to install a printer? Truly boggles the mind, that one.
Anyway, I finally got it working, manually setting the --prefix so that it overwrites the previous installation. Then it turns out SuSE uses some arcane setup for their printing that mandates YaST usage. Let's get it straight: that thing sucks, no matter what Nat Friedman says. But even after fixing Ghostscript, the printer wouldn't show up in the list of Ghostscript printers!
To cut a long story short, I had to manually edit a strangely-formatted YaST printer database. Why they maintain a list of Ghostscript printers outside of Ghostscript itself, I don't know. But it works now. They're happy. And I'm happy too -- I wasn't expecting to be paid, but I couldn't really refuse at the end, this being on my own time and all.
I see no reason why I shouldn't be the number one wingo on the web. I'm going to start working on making this site relevant, starting with the software page. In the meantime I just need to regain lost ground. You can help by linking to the main page.
the return of advogato
Iyaloo! (An expression of delight in Oshiwambo.) Speaking of which, I just spent the whole week trying to finish a book on Oshiwambo. The next step is to see if I can get it published. That would be pretty nifty -- my first book.
a python snuck up on me
I wanted to kill offlineimap, which wasn't responding to Ctrl-C, so I went to another console and typed "killall python". Ay, there went my blog entry, never to be seen again! From now on, I'll always think of python as part of my desktop.
I had to do some profiling recently, and figured I'd blog about it. I wrote this a couple of weeks ago, and in the meantime people have started blogging about profiling. Funny how free software has a kind of milieu.
I started off investigating gprof. I had this idea in my head that profiling requires recompilation, so I checked the info pages and recompiled a whole stack of libraries with -g -pg -fprofile-arcs. Then I ran the program, ran gprof on the output and... bloody worthless. All it told me was that main() takes 100% of the time.
Turns out gprof can't handle shared libraries. What is this, 1994?
I found a post about qprof, and it sounded like a good idea. Uses the LD_PRELOAD hack to set up some statistical profiling, no recompilation needed. Worked out pretty well.
By default, qprof only does flat profiles (counting the time spent in functions on top of the stack). However, if you configure it with libunwind, also from HP, it will do a version of call profiling. You have to fiddle with the build, manually copying things over and fiddling with the LD_PRELOAD scripts, but it does work.
qprof keeps an internal buffer of program counter (pc) locations. In the flat case it just records the pc in a slot and moves on. In the call case, it records the pc for each frame and moves on. Then at the end it counts up occurrences for each pc, runs addr2line on it, and prints it out. However, you don't know where functions are called from. A little hack to store which pc records correspond to a single tick could make that possible, though -- I'll hack it up if I don't lose interest. That way you can get nice tracebacks like valgrind's.
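The scheme is easy to imitate in any language with profiling timers. Here's a toy flat profiler in the same spirit -- a timer interrupts the program and we record whatever is on top of the stack. This is an illustration of the idea, not qprof's actual code:

```python
# Toy flat profiler in the spirit described above: SIGPROF fires every
# millisecond of CPU time and we tally the function on top of the
# stack.  An illustration of the technique, not qprof's actual code.
import collections
import signal

samples = collections.Counter()

def take_sample(signum, frame):
    samples[frame.f_code.co_name] += 1  # the "pc" of the topmost frame

signal.signal(signal.SIGPROF, take_sample)
signal.setitimer(signal.ITIMER_PROF, 0.001, 0.001)  # sample each 1ms of CPU

def busy():
    total = 0
    for i in range(3_000_000):
        total += i * i
    return total

busy()
signal.setitimer(signal.ITIMER_PROF, 0, 0)  # stop sampling

for name, count in samples.most_common(5):
    print(name, count)
```

Like qprof's flat mode, this tells you where time is spent but not where the functions were called from; for that you'd walk the whole stack at each tick.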
A further annoyance is that functions called recursively are multiply counted. For instance, it reports that scm_ceval (from the guile evaluator) gets called 3246% of the time. Took me a little while to figure that out.
In summary, gprof blows, but has great documentation -- except that it doesn't mention that the program blows. qprof is really useful and easy, but takes some fiddling, and really needs some more loving. Also libunwind on x86-32 is a little buggy, it seems. (Its main target is x86-64.)
Everyone will use valgrind, because it's easy. No one profiles, because traditionally it means recompiling. qprof (and others like it, oprofile for example) will hopefully change that situation.
(gnome gtk) loads 7 times faster than it used to. 2 seconds is still not good, but it's good enough that maybe I won't notice that gnome-blog loads up in 1.2 seconds. jamesh did a damn good job with that library.
To get that performance, I delayed the creation of scheme classes and methods until they are first used in the source, because programs will only ever use a small part of the gtk api. Incidentally gtk2-perl does the same thing. I think that says something about both languages: they can be elegant (although I think that's harder with perl ;) and they can get dirty. By dirty, I mean really low level. For instance, I can define an allocate-instance method on a class such that the instance actually doesn't belong to that class. It's useful, but damn, it's ugly.
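The delay-until-first-use trick translates to any dynamic language. A toy version, with invented names:

```python
# Toy version of the lazy-binding trick described above: don't pay for
# building a wrapper class until somebody first asks for it.  The
# names here are invented for illustration.
class LazyNamespace:
    def __init__(self, factories):
        self._factories = factories  # name -> thunk building the class
        self._cache = {}

    def __getattr__(self, name):
        # only called when normal attribute lookup fails
        try:
            factory = self._factories[name]
        except KeyError:
            raise AttributeError(name)
        if name not in self._cache:
            self._cache[name] = factory()
        return self._cache[name]

built = []

def make_window():
    built.append("Window")  # stands in for expensive wrapper generation
    return type("Window", (), {})

gtk = LazyNamespace({"Window": make_window})
assert built == []          # importing cost nothing
w = gtk.Window()            # first use builds the class
assert built == ["Window"]
```

Subsequent accesses hit the cache, so each class is built exactly once.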
I really enjoy coding while tipsy, but I hate spilling beer on my keyboard. Something's gotta give. I think it's the position of my glass.