My camera broke; I have only words.
the golden horn
Istanbul is a town full of wonder. And wander: around the cobbled streets of the old town, in the morning, in the evening, alive at all hours.
Last night I shared a water-pipe under the bridge, looking back on the night silhouette of the old town, smoke rings dissipating over the water. As we were walking back to the hostel some hours later, Dimitris noted the waft of baking bread, the start of a new day.
I stepped out on Sunday morning to the GStreamer mini-summit, but was waylaid by the Blue Mosque. Outside it is grey, hard stone and spired; inside it is lush and tactile, the carpet creeping up between your toes. I believe that space can have rhythm. In that place there was a rich visual soundscape, tiled motifs repeating on the macro level, fractally recursing into micro-vegetation, a symphony of space and lines. I stumbled out into the blue sky.
The GStreamer summit was pretty good. We decided to switch to git (from CVS), once some issues are ironed out regarding history and our "common" submodule. We also decided that at some point we should do a new development cycle, but that we needed reasons for doing so; the idea would be to develop a number of features that cannot be done with 0.10 in experimental branches, and once there are enough branches, we pull them together in a quick 0.11 and from there to 0.12 or 1.0. This would be a process that could take a year or two.
In that regard, some interesting points were brought up regarding GLib and GTK+'s plan to break ABI for version 3. The problem is that any library that depends on GLib will break ABI as well, and that includes GStreamer. So given that we will need to break ABI to depend on GLib 3, that gives us a good timetable for a next development series, corresponding to GLib 3's release in about 16 months. I suspect that many projects will want to do the same.
One of my first tasks at Oblong was to migrate their code for playing video from some very tricky, threaded ffmpeg + portaudio code to using GStreamer. The playback interface is fairly standard, but thorough: seeks to time, segment seeks, variable speed and reverse playback, frame stepping, etc. There were some twists: we do colorspace conversion on the GPU, and there's a strange concept of "masks", which is useful for operating on rotoscoped video, and there's integration with the GL main loop.
But anyway, I felt finished with all of that a while ago. The only problem was a lingering memory leak, especially egregious in the context of the art installation, which has yet to switch to my code.
So it was with a queasy, helpless feeling that I sat down and tried to systematize the problem, come up with a test case, and see if I could track down where the leak was. I tried code inspection at first, and I proved my code correct. (Foreshadowing, that.) I at least narrowed down the situations under which it occurred. I then despaired for a while, before I hit on the way to make memory leak detection fun: turn it into a tools problem. Now instead of finding the leak, all I needed to do was to find the leak detector!
So I checked out valgrind from CVS; it crashed on me. Then I decided to see if libc had anything to offer me; indeed it does, mtrace. But alack, deadlocks. I even went so far as to include mtrace in my code, and applied Jambor's patch from the bug report on my sources, but lost because the ELF symbol resolution is intertwingled with libc's build system.
So back to valgrind, this time the 3.2.3 version packaged with Fedora, and lo and behold, exhibit A:
My test is adding a video, waiting a while, removing the video, waiting again, then repeating. You can see the obvious video-playing versus video-removed phases.
Fortunately, you can also see the leak, and where it is: in the green part, corresponding to something that's calling g_try_malloc. This was comforting to find. It could have been something involving GL contexts or whatnot, and I'm using the babymunching nvidia drivers. So g_try_malloc was where it was coming from. But what was calling g_try_malloc?
For that, you have to dive into the textual output produced by massif. And sure enough, following things back far enough, you find that it is a GStreamer video buffer:
Context accounted for 7.2% of measured spacetime
  0x5F27E60: g_try_malloc (gmem.c:196)
  0x52E1487: gst_buffer_try_new_and_alloc (gstbuffer.c:359)
  0x530367B: gst_pad_alloc_buffer_full (gstpad.c:2702)
  0x53039FA: gst_pad_alloc_buffer (gstpad.c:2823)
  0xB9C11BF: gst_queue_bufferalloc (gstqueue.c:502)
  0x53034C0: gst_pad_alloc_buffer_full (gstpad.c:2668)
  0x53039E6: gst_pad_alloc_buffer_and_set_caps (gstpad.c:2850)
  0xBBECB4F: gst_base_transform_buffer_alloc (gstbasetransform.c:1112)
  0x53034C0: gst_pad_alloc_buffer_full (gstpad.c:2668)
  0x53039E6: gst_pad_alloc_buffer_and_set_caps (gstpad.c:2850)
  0xBBECB4F: gst_base_transform_buffer_alloc (gstbasetransform.c:1112)
  0x53034C0: gst_pad_alloc_buffer_full (gstpad.c:2668)
  0x53039FA: gst_pad_alloc_buffer (gstpad.c:2823)
  0x52F6C68: gst_proxy_pad_do_bufferalloc (gstghostpad.c:182)
  0x53034C0: gst_pad_alloc_buffer_full (gstpad.c:2668)
  0x53039E6: gst_pad_alloc_buffer_and_set_caps (gstpad.c:2850)
  0xC452722: alloc_output_buffer (gstffmpegdec.c:764)
  0xC454504: gst_ffmpegdec_frame (gstffmpegdec.c:1331)
  0xC45635D: gst_ffmpegdec_chain (gstffmpegdec.c:2236)
  0x5303C06: gst_pad_chain_unchecked (gstpad.c:3527)
  0x530421C: gst_pad_push (gstpad.c:3695)
  0xB9C0A3E: gst_queue_loop (gstqueue.c:1024)
  0x531E418: gst_task_func (gsttask.c:192)
  0x5F4338E: g_thread_pool_thread_proxy (gthreadpool.c:265)
  0x5F41C4F: g_thread_create_proxy (gthread.c:635)
For this level of information, you have to run massif with special options. I ran my test like this:
G_SLICE=always-malloc valgrind --tool=massif --depth=30 ./.libs/lt-vid-player
So now that I knew what was leaking, I decided to run with fewer, longer cycles to see what the allocation characteristics were. And thus, exhibit B:
You can see that after the video was removed from the scene the cyan part representing g_try_malloc allocation does not drop down to zero; indeed it starts to "fill up the trough", getting larger at each iteration.
Of course at this point I realized that I probably wasn't freeing the buffer that I kept as a queue between the GStreamer and GL threads on teardown. Indeed, indeed. Two lines later and we have the much more agreeable long-term plot:
Moral of the story: "proof of correctness" is not proof of correctness.
Valgrind turned out to be much more useful to me in this instance than it was when I looked at it before, when hacking Python. But again, the CVS/3.3 version was of no use, yet. Since then, 3.3 does indeed do graphs again, but in ascii. As a palliative, the textual output appears to have improved. Still, ascii graphs?
I would like to note an underexplored territory that is ripe for the munching, the "southern buttermilk biscuit -- mallorcan sobrassada" interface. Spreadable sausage on a biscuit. God did truly shine his face on my kitchen this morning.
I got guile-gstreamer working today, which is hot. Version 0.9.90 is totally released. The last bit that I had to do was to leave "guile mode" when the wrapper calls C functions, and to enter guile mode when GClosures are invoked. That is what it takes to be thread-friendly with Guile; while multiple guile threads can be running at the same time, they have to not block, because GC or thread joining requires cooperation from all threads.
So, multithreaded GStreamer and Guile. I'm pleased, this is already more than the old 0.8 wrapper did. Also the process of porting to 0.10 was mostly removing crufty code, which says nice things about the state of GStreamer-the-library.
Greetings gentle reader, I offer these poorly connected vignettes for your eyes' consumption.
My friend Colin just came out with a new album, Soukha. He gave me that link a couple weeks ago, but I still haven't been able to listen to it on the web site because I don't have a Flash player. Today I realized that I have his mp3's from somewhere else, put them on, and was more than duly impressed. Hotness! People should tell him how awesome he is. I think stylistically it's closest to Gotan Project. Very diggable.
I started updating the GStreamer bindings as well. They are available from bzr only, at the moment, pending a release when things are working OK. Already caps and structures are fine, including all of the valued types like int ranges, fractions, fraction ranges, and fourcc's. Today I got miniobjects working, a new fundamental classed type. Now that Guile is fully multithreaded, except for GC, I have a fighting chance of getting callbacks from threads to work as they should. Then I release and the world of scheme+gstreamer hackers rejoices. (Currently when you drive through this world the sign reads "Population: 1".)
gnome foundation elections
The GNOME foundation board elections are upon us, and after renewing my membership I cast my ballot, for a mix of people. The candidates were pretty good this year. May the most voted for persons win?
The cold snapped about a week ago. Brr!
Thanksgiving came and went this year again, and two turkey carcasses were carried out my door. Things went pretty well, with about 35 people showing up in my flat, with a continuous eat-drink-eat-again cycle going on for about 10 hours. Good times! Also this was the first year that I wasn't scared of the turkey. On the flip side I see the tendons in my arm in a different light.
Went to Mallorca a couple of weekends ago for an Aikido seminar with Yamada Sensei of New York. I'd have liked to have seen more of the island; as it was it was a loop of train-eat-drink-sleep. Also good times, that seems to be the theme of this burst-o-blog.
Upcoming: Christmas in Belfast/somewheres around there, a new year turning. GOOD TIMES
Because I egocentrically syndicate myself on advogato, which only shows your most recent post, I've shied away from multiple postings at a time. The result is mashing a bunch of topics into one writing product. This practice has its, um, aspects.
The biggest minus for me is that I have to have all of my thoughts finished at the same time. I'm going to experiment with shorter writings on marginally more focused topics. We'll see how it goes; advo folks might want to look at my last entry, which is more advo-related.
hack hack hack!
Today was Friday at Fluendo. Although I do bitch a lot about various things work-related, the fact that we have a day a week to hack on what we like is hot. It's cunning though, being at the end of the week, as the momentum of the week's projects bleeds a bit onto the last day, but still. Hack hack.
So today my goal was to figure out what was up with the GUADEC video and audio archives. People are all up on my case about this, and wha? I just don't have the time I used to. (Reasons for this are in a book I'm working on.)
Anyway, so this is a prelude to the sequel to my last post on conference streaming. What I wanted to say was something about ensuring that you get archives on disk, and then to edit them later.
However I was running into problems. Sometimes totem was having trouble playing the files, for example. Wha? Also there were some sound level issues. Amusingly, in the sala d'actes, an unbalanced cable converter was picking up the radio, for example.
More disturbingly, all players (gstreamer, xine, mplayer, vlc) were playing unsynchronized audio and video for some talks. How could this be? I was getting proper timestamps from the DV feed, but somehow in the encoding process we are producing bad ogg timestamps (granulepos values)? Wha? The italics totally indicate internal dialog.
A little background: the 2005 GUADEC video archives suck. I say this having been a part of the process of their creation. Most do not pass oggz-validate. This is because of problems in the GStreamer ogg muxer back then.
Oh man we were pissed, in the American sense. How could we produce bad ogg. Us the exponents of ogg. Gar. So Thomas fixed the ogg muxer For Once And For All, and the world was happy.
Another background: the way that we produced those videos was by watching the talks, then when a new speaker started, we pressed a button in a flumotion-admin client that we had running, which would tell the "disker" flumotion component to start a new file. Because we didn't have any decent cutting tools, we had to rely on this to produce files per-talk. It was a bit of work, but it produced decent results. At least we could do it from anywhere with network access.
Fast forward to 2006. I had streamed a couple of conferences, and thought that it was a pain in the ass to have to have someone rotate the conference archives manually. This was a selfish desire, that although I was the person responsible for streaming, I wanted to enjoy the conference too -- always having to rotate the videos is a drag. So I wrote a lossless cutter for ogg/theora+vorbis, the intention being to let the video capture to disk all the time, then just cut out the talks you want.
So I cut the talks. I get some segfaults, patch some code, update to latest CVS, have to patch it some more, but in the end I get cuts which I believe to be correct. Only problem is, the audio is completely off. As in, 10 seconds out of sync. This should not be possible. I mean, my GStreamer talk (not yet posted) was about synchronization. I know how to do this. What was up?
Well. Long story short, after despairing to Thomas, we figure that if the CPU usage spikes, such that the theora encoder takes too long and we get behind, that it could be that we have to drop frames. This will happen on the capture end of things, if what is processing the raw data is not reading fast enough. In theory this is fine. The encoder will still receive correctly-timestamped data. However, GStreamer's Vorbis and Theora encoders were internally assuming perfect streams (no dropped data), so internally they disregarded the timestamps they were given, choosing instead to produce continuous streams.
The end result is that our GUADEC archives are perfect, in one sense: they present no problems for oggz-validate or for ogginfo (the two programs you should use to validate ogg files). However they are incorrect. In the event of dropped data, the audio and video become unsynchronized.
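To make the failure concrete, here is a toy model of it in Python. It is only a sketch under my own assumptions (a fixed 25 fps, a two-second hole in the capture, made-up function names), not the actual encoder code: one "encoder" counts frames and ignores the timestamps it is handed, and we measure how far its idea of time has slipped by the end.

```python
from fractions import Fraction

FPS = Fraction(25)  # assumed frame rate, purely for illustration


def counting_positions(capture_times):
    """Position each frame by counting, ignoring the capture timestamps.

    This models an encoder that assumes a perfect, gap-free stream.
    """
    return [Fraction(i) / FPS for i in range(len(capture_times))]


def drift(capture_times):
    """Seconds by which the last counted position lags its real capture time."""
    return capture_times[-1] - counting_positions(capture_times)[-1]


# Ten seconds of capture at 25 fps...
all_times = [Fraction(i) / FPS for i in range(250)]
# ...and the same capture with everything between t=4s and t=6s dropped.
kept = [t for t in all_times if not (4 <= t < 6)]

print(drift(all_times))  # no drops: counting and timestamps agree
print(drift(kept))       # with drops: the counted stream has fallen behind
```

Every stream produced this way is perfectly continuous, which is why a validator has nothing to complain about; the error only shows up when you compare it against another stream that dropped a different amount of data.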
This is even more of a problem for the long files I chose to record. What a PITA. I have broken files that I will need to manually patch at certain points to resync the vorbis and theora ideas of granulepos. This means we need even more custom tools. Ug.
So the end is that seekers of GUADEC archives will have to wait a bit.
what I meant to say
Writecode. Indeed what I meant to say was that after recognizing the bogus behavior of GStreamer's vorbis and theora encoders, Mike Smith and I set out to fix them. He hacked up patches while I hacked unit tests. All I wanted to say was that it was really nice to hack GStreamer, after so long away, and that C is fun sometimes.
also wha is the new what
Links I have enjoyed
Richard Stallman interviewed by Z magazine (it's about time). The soul-probing Torture's Long Shadow by Bill Moyers (via titus, whose journal I am enjoying these days). The hyperbolically delicious fuckchristmas.org, via Miguel, who is the first google hit for his first name. There should be a word for that.
Worst plane trip ever
So I realized about half an hour before landing in Washington that I didn't have my tickets any more. I must have lost them in one of the two security searches in Heathrow (security searches in connecting flights?).
I figured I'd just head up to the Virgin Atlantic offices and get them to reprint my paper tickets. After all, this must happen occasionally, and all the information is in the computer anyway. But no. Long story short, I have to buy a new ticket. I can't adequately express my impotent anger at this situation.
On top of it they lost my bag. They called me the next day asking what was in the bag, as in, "what was the deceased wearing?". Two days later I am wearing the same clothes.
(If anyone has any advice about this situation, I'd like to hear it -- wingo at pobox dot com.)
It seems Ronald had a nice semester, which is pleasant. However he seems to be under the mistaken impression that GStreamer is a Fluendo project. By my count 37 people contributed to GStreamer over the 0.10 cycle, which easily swamps the half a dozen people that Fluendo occasionally devotes to the project. I'm sure it was an honest mistake.
American toilets are fascinating. So much water! Toilet paper made of clouds! What an odd place.
It seems I haven't written for a while. I only have the desire to write when I'm walking the streets though. Sitting down in front of a machine turns me into a consumer. Consume p.g.o. Consume random news media. Consume all of my mailing lists. Refresh, refresh. I need therapy or something.
Back to the real world, we definitely cooked a fine thanksgiving dinner a couple of weeks ago. Week and a half, I suppose, and when I say we I mean the Australians. Thanks Mike, Jaime, and Jan. Although if it weren't for Mike I wouldn't have been so hung over Saturday morning and wouldn't have sounded so pathetic on the phone calling for help "Jaime Jan please come bird it still has feathers". Jaime has a good play-by-play of the event, replete with pictures.
Good though. Thanksgiving makes me feel good. Also the gumbo made from the turkey carcass makes me feel good, both while cooking and consuming. Consume consume. (There were 5 pies to consume at this event -- 2 apple-walnut from the lovely tiffany, one pumpkin pie by the lovely me, one apple by the lovely Mike, and one pecan bastard-child pie from all hands in the kitchen. Tasty pies, each and every one.)
After the Thanksgiving weekend there was a krazy week at work, putting final polishes on, and putting final dabs of putty in the holes of, GStreamer 0.10. I link to Christian Schaller's hyped-up description of the release, but it's true -- I've been more proud of software before, but not by much. GStreamer 0.10 is quite an accomplishment.
After an odd weekend (intended to be low-key, went out until 7 on Friday night), it was back to work.
(An odd thing to think, as if work defines existence? Work is fun and all (see footnote 1) but I think 25 hours a week would allow me many more possibilities for personal growth.)
Did somebody say work? Because I was just thinking that! Lately I've been hacking the clock synchronization stuff into flumotion. It is looking most excellent. It's tough to do a job properly, such that no one will have to come and clean up after you later, but it's also satisfying. I am pleased with work right now.
That's the week kids, a bit of hacking, a bit of aikido, a bit of random vacation days on a Thursday. A bit of cold outside that penetrates the bones, given enough time. I give to almost all beggars in this season. They do not have a kickass chili waiting for them at home as I did today.
Footnote 1: Julien (fluendo chief and also nice fellow) put in a new policy of 20% time, for example like they have at Google, and about which NITI folks often speak. Most excellent in my opinion; tomorrow is the first one. Rock.
Footnote 2: Google links thrive off of descriptive link text. I hope NITI folks appreciate the referrals for "often" and "speak".
Footnote 3: This writing product finishes here. Pending topics: an upcoming trip home, musical discoveries of 2005.
A few quick notes before my battery dies.
For me Thanksgiving will fall on this Saturday. I'm used to it at this point -- it will be my fourth consecutive turkey day out of the states. The market in the center of town has turkeys, and I have a fellow there looking out for a 7-kilo specimen. We'll see what he finds tomorrow. Hopefully he doesn't decide to run it through the meat cutter.
alice and bob notes
John Borwick (sysadmin extraordinaire as well as friend from high school) wrote in to tell me that the network time protocol uses the same strategy as the one I settled on. Of course I read a lot of papers before setting down to do simulations, but I never found a decent explanation of NTP. The FAQ entry he linked to summarizes it nicely. Thanks John!
So yes Dave, it is a (somewhat embarrassing) real life problem -- ask any user of Flumotion that wants to synchronize capture from a webcam and a sound card. Also the rates of the clocks in question can vary (for example with temperature), and have a certain amount of jitter. Finally we're solving that issue now.
I signed up for Catalan classes, finally. It's only been 10 months. How the time passes!
Imagine Alice and Bob are in two buildings overlooking the same street. Each of them has a camera and occasionally takes polaroid pictures of the street. When Alice takes a picture, she writes the current time, according to her watch, on the back of the photo. Bob does the same with his photos and his watch. Later on, Alice wants to get the pictures made by both of them and put them in real time order, so she can see what is happening in the street from the two perspectives. Artsy types, wanting the zaniest things.
However, Alice knows that their watches are wrong -- they are hours apart, and one of them is running faster than the other. The other problem is that she doesn't know which one is faster, or by how much, and Bob is in another building so they can't see or hear each other. How then to put the pictures in order?
Given this problem, the solution I settled on was to treat one of the watches as the "correct" time, and to have the other one try to match its time to the "correct" time. For example, if Bob is the one with the correct time, Alice could send messengers to him to ask him the time. Whenever a messenger reaches Bob, he could write the time on a piece of paper and send it back.
Sending a message takes time though, so Alice doesn't know what her time is when Bob records his time. She then settles on a different plan: she writes down her time on a message before sending it to Bob. Bob writes down his time when he gets the message, and then Alice writes down the time when she gets the message back. Then she guesses that Bob wrote down his time in the middle of her times, if the message took the same time to get there as it did to come back.
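Alice's guess for a single messenger fits in a few lines of Python. This is a sketch with names of my own choosing; the one assumption it leans on is the one stated above, that the trip out takes as long as the trip back.

```python
def estimate_offset(alice_sent, bob_time, alice_received):
    """Estimate how far Bob's watch is ahead of Alice's from one round trip.

    Assumes the messenger took equally long in each direction, so Bob
    wrote his time at the midpoint of Alice's two readings.
    Returns (offset, uncertainty); the true offset lies within
    +/- uncertainty of the estimate if the trips were not symmetric.
    """
    round_trip = alice_received - alice_sent
    midpoint = alice_sent + round_trip / 2
    return bob_time - midpoint, round_trip / 2


# Alice sends at 10.0 by her watch and gets the note back at 12.0;
# Bob wrote 3611.0 on it.
offset, uncertainty = estimate_offset(10.0, 3611.0, 12.0)
print(offset, uncertainty)
```

The uncertainty is worth keeping around: a messenger who dawdled gives a wide round trip, and so a guess that deserves less trust.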
Alice sends a few messages, and can check whether the difference between her time and Bob's time is increasing or decreasing, to see which clock is going fast. Also that way she can average out the different messengers' trips, because some of them will be faster, some slower, and maybe some messenger finds something more interesting to do than run around with pieces of paper.
In the end she can adjust her time with a rate difference, to account for the different speeds, and an absolute difference, to account for the different starting times.
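One way to get both numbers out of a pile of messages is to fit a straight line through the (Alice's midpoint, Bob's time) pairs. The slope is the rate difference and the intercept is the absolute difference. The least-squares fit below is my choice for this sketch, not necessarily what any particular implementation does, and the function names are made up.

```python
def fit_clock(observations):
    """Fit bob ~= rate * alice + offset by ordinary least squares.

    observations: list of (alice_midpoint, bob_time) pairs, one per
    round trip. Needs at least two pairs taken at different times.
    """
    n = len(observations)
    if n < 2:
        raise ValueError("need at least two observations")
    mean_a = sum(a for a, _ in observations) / n
    mean_b = sum(b for _, b in observations) / n
    var = sum((a - mean_a) ** 2 for a, _ in observations)
    if var == 0:
        raise ValueError("observations must be taken at different times")
    cov = sum((a - mean_a) * (b - mean_b) for a, b in observations)
    rate = cov / var
    offset = mean_b - rate * mean_a
    return rate, offset


def to_bob_time(alice_time, rate, offset):
    """Convert a reading of Alice's watch into her estimate of Bob's time."""
    return rate * alice_time + offset


# Bob's watch is an hour ahead and runs 0.1% fast.
obs = [(a, 1.001 * a + 3600.0) for a in (0.0, 10.0, 20.0, 30.0)]
rate, offset = fit_clock(obs)
print(rate, offset, to_bob_time(40.0, rate, offset))
```

Fitting over many messages is also what averages out the fast and slow messengers; throwing away the pairs with the longest round trips before fitting would help further.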
Alice's problem is the same as capturing video from different machines and trying to mix them so they are in sync. One machine should have the master clock, and the others should try to analyze the difference between their own clock and the master via sending packets over the network. Conversely, the same issues come up if you are sending video to 9 computers with monitors, trying to form a gigantic screen -- all parts of the video should be shown at exactly the same time, which means the clocks must be synchronized, even if they run at slightly different rates.
In the end you end up with something like this, where "Local time" is Alice's time, "Remote time" is Bob's time, and "Network time" is what Alice thinks Bob's time is.
I worked out this synchronization algorithm a while ago, but just got around to implementing it in GStreamer last week. It rocks. You make one pipeline's clock export a NetTimeProvider interface, which does Bob's job: receiving UDP packets with one time value, appending Bob's time to them and sending them back.
Then on the other pipeline you instantiate a network client clock and set it on the pipeline. When your webcam captures frames, it timestamps them using the network clock. Very neat. I'll integrate this into Flumotion later this week.
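Bob's side of the exchange is small enough to sketch in Python. Note that this is not the GStreamer code or its wire format: the packet layout here (one unsigned 64-bit big-endian nanosecond value in the request, a second one appended in the reply) and all the names are assumptions made up for illustration.

```python
import socket
import struct
import time

TIME_FIELD = struct.Struct("!Q")  # hypothetical: one 64-bit nanosecond value


def make_request(local_time_ns):
    """Alice's message: just her own time."""
    return TIME_FIELD.pack(local_time_ns)


def make_reply(request, provider_time_ns):
    """Bob's job: append his time to whatever Alice sent."""
    if len(request) != TIME_FIELD.size:
        raise ValueError("unexpected request size")
    return request + TIME_FIELD.pack(provider_time_ns)


def parse_reply(reply):
    """Split a reply back into (alice_sent_ns, bob_time_ns)."""
    sent, = TIME_FIELD.unpack_from(reply, 0)
    bob, = TIME_FIELD.unpack_from(reply, TIME_FIELD.size)
    return sent, bob


def serve_once(sock, clock=time.monotonic_ns):
    """Answer a single request on an already-bound UDP socket."""
    request, sender = sock.recvfrom(64)
    sock.sendto(make_reply(request, clock()), sender)
```

To try it, bind a `socket.socket(socket.AF_INET, socket.SOCK_DGRAM)` to a local port and call `serve_once` in a loop; the client notes its own time when the reply arrives and then has the three readings Alice needs.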
Having settled a number of issues with Flumotion and GStreamer 0.9, I sat down on Monday to check on the condition of DV capture over firewire. I already knew a bit of what to expect: kernel deadlocks. Ever since I got this SMP machine, that's what I see. Turns out DV capture deadlocks on all SMP machines I could find. I suppose I have a talent for breaking machines :-)
But, Mr. Love's book in hand, I set out to figure out what was going on so I could submit a decent bug report, or at least figure out if someone else had already solved the issue. Ended up learning all kinds of neat things about spinlocks and interrupts, eventually (two days later) coming up with a patch that Works For Me (tm).
On the GStreamer side of things, it's the time of the week when the new summary comes out. Catch it on the mailing list, or over at planet gnome news.
Speaking of planets, hello p.g.o!