Tuesday 29 January 2013

dpkg MD5 checksums

My OpenOffice installation stopped working a few days ago after I Changed Nothing (tm) [1], so one of my avenues in investigating the breakage was to check for any unexplained changes to installed files. That happened to me once before [2], back when I worked at Prism, so "obviously" I felt I should check out that possibility again:

$ cd / && md5sum -c --quiet /var/lib/dpkg/info/*.md5sums

After much disk grinding (sometimes I'm sure I'm about to see a puff of hard disk powder come out of the fan exhaust), what seems to be a smoking gun:

usr/bin/gnuplot: FAILED
md5sum: WARNING: 1 of 45 computed checksums did NOT match

This is interesting! So I downloaded the deb for gnuplot-x11, unpacked it manually (with binutils' ar), and found the same "wrong" checksum. A friend repeated the procedure and got the same "wrong" checksum too, so I no longer suspected a fancy worm/virus that infects new gnuplot binaries as they appear on the filesystem.
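For anyone without ar at hand, the outer layer of a deb is simple enough to pick apart by hand. What follows is only a minimal sketch, assuming the common ar layout (an 8-byte magic string, then a 60-byte header per member, with payloads padded to even offsets); the function name ar_members is made up for this example and is not part of any Debian tool.

```python
import sys


def ar_members(blob):
    """Yield (name, payload) for each member of an ar archive held in bytes.

    Assumes the common ar layout: 8-byte magic, then per member a 60-byte
    header (name 16, mtime 12, uid 6, gid 6, mode 8, size 10, end marker 2).
    """
    magic = b"!<arch>\n"
    if not blob.startswith(magic):
        raise ValueError("not an ar archive")
    pos = len(magic)
    while pos + 60 <= len(blob):
        header = blob[pos:pos + 60]
        if header[58:60] != b"`\n":
            raise ValueError("bad member header at offset %d" % pos)
        # Names are space-padded; GNU ar may also append a trailing slash.
        name = header[0:16].decode("ascii").rstrip().rstrip("/")
        size = int(header[48:58].decode("ascii"))
        start = pos + 60
        yield name, blob[start:start + size]
        # Each payload is padded with one byte if its size is odd.
        pos = start + size + (size % 2)


if __name__ == "__main__":
    # Usage: python ar_members.py some-package.deb
    with open(sys.argv[1], "rb") as handle:
        for member_name, payload in ar_members(handle.read()):
            print(member_name, len(payload))
```

A deb normally holds a debian-binary member plus control and data tarballs; the data tarball can then be handed to Python's tarfile module to get at the file whose checksum is in question.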

It turns out that these mismatching packages have preinst scripts that "divert" files, invalidating the naive checksum. The diverted files are still around, but their names no longer match what's in the lists of MD5 checksums.

And that's where laziness bit me in the behind: by the time I started on my wild goose chase I already knew that debsums(1) verifies checksums, but since I didn't have it installed and felt too lazy to install it, I decided to just run the checksum files through md5sum(1). After all that effort to get an explanation for these mismatched checksums, I installed debsums(1) anyway and discovered that it knows how to follow diversions!
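To make "following diversions" a bit more concrete, here is a rough sketch of the idea. It is not how debsums is implemented. It assumes that `dpkg-divert --list` prints lines shaped like "diversion of /a to /b by pkg" or "local diversion of /a to /b", and that a diversion applies to every package except the one named after "by". The helper names (parse_diversions, failed_checksums) and the root parameter are invented for this example.

```python
import hashlib
import os
import re

# Assumed line shapes from `dpkg-divert --list`:
#   diversion of /usr/bin/x to /usr/bin/x.distrib by some-package
#   local diversion of /usr/bin/x to /usr/bin/x.real
_DIVERSION = re.compile(r"^(?:local )?diversion of (\S+) to (\S+)(?: by (\S+))?$")


def parse_diversions(listing):
    """Map original path -> (diverted path, diverting package or None)."""
    diversions = {}
    for line in listing.splitlines():
        match = _DIVERSION.match(line.strip())
        if match:
            original, diverted, owner = match.groups()
            diversions[original] = (diverted, owner)
    return diversions


def md5_of(path):
    """Hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def failed_checksums(md5sums_text, package, diversions, root="/"):
    """Return the relative paths from one .md5sums list that do not verify.

    A file diverted by some other package is looked up under its diverted
    name; the diverting package's own file stays at the original path.
    """
    failures = []
    for line in md5sums_text.splitlines():
        if not line.strip():
            continue
        expected, relpath = line.split(None, 1)
        installed = "/" + relpath
        diverted, owner = diversions.get(installed, (installed, package))
        if owner != package:
            installed = diverted
        try:
            actual = md5_of(os.path.join(root, installed.lstrip("/")))
        except OSError:
            actual = None  # a missing or unreadable file counts as a failure
        if actual != expected:
            failures.append(relpath)
    return failures
```

With an empty diversion map this behaves like the naive md5sum run and reports the diverted file as a failure; with the parsed diversions it checks the renamed copy instead.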

Now I'm back to wanting to know why OpenOffice stopped working.

[1] I upgraded google-chrome, but that update involved only its own package. ooffice seemed to stop working after I tried to open some document that caused it to crash, but I no longer recall the exact sequence of events.

[2] It was almost ten years ago when gethostbyname(3) or some nearby interface seemed to stop working. Suddenly no programs could connect to anything on the Internet anymore. After a bit of bug-chasing I noticed that libc's contents had changed. I don't remember what led me to check that with rpm, but I did. I must have suspected cosmic rays, because I made a copy of libc before rebooting, in order to freeze the corrupted memory contents onto stable storage. Sure enough, after the reboot libc was fine (clearly having been reloaded from the uncorrupted copy on disk), and a diff of a hexdump showed some six bytes that differed, right inside gethostbyname(3). To this day I don't know how I might have forced the kernel to re-read what must have been a very frequently-accessed page.

Monday 14 January 2013

Waterfall spectrogram

A few days ago, while I was visiting my dad, he got a call from Leon, who was transmitting at 137kHz at the time and wanted my dad to listen for the signal. We didn't hear anything convincing, but it got me thinking: with some DSP we could dig the signal out from under the noise floor.

I quickly hacked up a pipeline on my laptop involving PulseAudio's parec, my own FFT tool, and gnuplot. Sure enough, an (inaudible) audio signal at about 390Hz was clearly visible in an approximately 10 second integration, which gives sub-Hz resolution (roughly 0.1Hz). But I wasn't satisfied; I wanted to see the short-term spectra scrolling across the screen in real time. I ended up hacking into the night, but was ultimately frustrated by some GTK+ weirdness. (One needs to set a file descriptor to non-blocking mode when calling g_io_add_watch.) I wasn't able to finish anything useful during my visit.
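Why a long integration helps can be shown with a toy example. This is not the FFT tool mentioned above: it is a plain-Python sketch that evaluates the discrete Fourier transform directly at a handful of frequencies instead of running a full FFT, on a synthetic signal. The 1kHz sample rate, the tone amplitude of 0.2 and the unit-variance noise are all made-up values chosen for illustration. The point is that T seconds of samples give bins about 1/T Hz wide, so the tone's energy piles up in one bin while the noise is spread across all of them.

```python
import cmath
import math
import random


def noisy_tone(freq_hz, amplitude, sample_rate, seconds, seed=0):
    """Synthesize a sine wave buried in unit-variance Gaussian noise."""
    rng = random.Random(seed)
    count = int(sample_rate * seconds)
    return [amplitude * math.sin(2 * math.pi * freq_hz * n / sample_rate)
            + rng.gauss(0.0, 1.0) for n in range(count)]


def dft_magnitude(samples, sample_rate, freq_hz):
    """Magnitude of the discrete Fourier transform evaluated at one frequency."""
    step = cmath.exp(-2j * math.pi * freq_hz / sample_rate)
    phasor = 1 + 0j
    total = 0j
    for sample in samples:
        total += sample * phasor
        phasor *= step  # advance the reference oscillator by one sample
    return abs(total)


def strongest_frequency(samples, sample_rate, f_lo, f_hi):
    """Scan f_lo..f_hi in steps of 1/T Hz; return the frequency with most energy."""
    seconds = len(samples) / sample_rate
    resolution = 1.0 / seconds
    steps = int(round((f_hi - f_lo) / resolution))
    candidates = [f_lo + k * resolution for k in range(steps + 1)]
    return max(candidates, key=lambda f: dft_magnitude(samples, sample_rate, f))


if __name__ == "__main__":
    rate = 1000  # Hz, illustrative only
    samples = noisy_tone(390.0, 0.2, rate, 10.0)
    print("bin width: %.2f Hz" % (rate / len(samples)))
    print("peak near %.1f Hz" % strongest_frequency(samples, rate, 385.0, 395.0))
```

Shortening the integration widens the bins and lets the noise in each bin catch up with the tone, which is roughly why nothing convincing could be heard by ear.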

Last night I achieved victory:

This image is just a looped GIF showing two 1-second slices where I said something or my dog bumped the table. The vertical axis spans DC to 22kHz, which obscures much of the interesting stuff in the lower-frequency band where most of the information in speech lives. I'm not finished with this tool; it needs at least some zooming functionality to adjust the range of frequencies shown, and a way to expand and contract the colour range I use to indicate spectral intensity. You can grab a copy of the code from repo.or.cz and help out, if you like.
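The two missing features could look something like the sketch below. It is not taken from the code on repo.or.cz. It assumes each spectrum row holds bins spread evenly from DC to half the sample rate (44.1kHz would match the 22kHz axis) and that intensity is mapped to a palette index on a decibel scale; the names band_slice and colour_index and the default of 256 levels are invented here.

```python
import math


def band_slice(num_bins, sample_rate, f_lo, f_hi):
    """Return (start, stop) bin indices covering f_lo..f_hi Hz.

    Assumes num_bins bins evenly spanning DC to sample_rate / 2.
    """
    hz_per_bin = (sample_rate / 2) / num_bins
    start = max(0, int(f_lo / hz_per_bin))
    stop = min(num_bins, int(math.ceil(f_hi / hz_per_bin)))
    return start, stop


def colour_index(magnitude, floor_db, ceiling_db, levels=256):
    """Map a linear magnitude onto 0..levels-1, clamping outside the dB window."""
    db = 20 * math.log10(max(magnitude, 1e-12))  # guard against log of zero
    fraction = (db - floor_db) / (ceiling_db - floor_db)
    fraction = min(1.0, max(0.0, fraction))
    return int(round(fraction * (levels - 1)))


if __name__ == "__main__":
    # Zoom a 1024-bin row in on the band where most speech energy lives.
    start, stop = band_slice(1024, 44100, 100.0, 4000.0)
    print("draw bins", start, "to", stop)
    # Narrowing the dB window stretches the palette over quieter signals.
    print(colour_index(0.01, -60.0, 0.0), colour_index(0.01, -60.0, -20.0))
```

Zooming then amounts to drawing only row[start:stop], and expanding or contracting the colour range amounts to moving floor_db and ceiling_db.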