Monday, 14 January 2013

Waterfall spectrogram

A few days ago, while I was visiting my dad, he got a call from Leon, who was transmitting at 137 kHz at the time and wanted my dad to listen for the signal. We didn't hear anything convincing, but it got me thinking: with some DSP we could dig the signal out from under the noise floor.

I quickly hacked up a pipeline on my laptop involving PulseAudio's parec, my own FFT tool, and gnuplot. Sure enough, there seemed to be an (inaudible) audio signal at about 390 Hz, clearly visible in an approximately 10-second integration, which gives sub-Hz resolution (roughly 0.1 Hz per bin). But I wasn't satisfied; I wanted to see the short-term spectra scrolling across the screen in real time. I ended up hacking into the night, but was ultimately frustrated by some GTK+ weirdness. (One needs to set a file descriptor to non-blocking mode when calling g_io_add_watch.) I wasn't able to finish anything useful during my visit.
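To make the resolution argument concrete, here is a rough Python sketch (not my actual pipeline, which used parec, my FFT tool, and gnuplot). It builds a synthetic 10-second recording with a weak 390 Hz tone buried in noise, then scans the DFT bins around it. The sample rate, tone amplitude, noise level, and all function names are assumptions made up for the illustration; only the 10-second integration and the roughly 390 Hz tone come from what I described above.

```python
import cmath
import math
import random

SAMPLE_RATE_HZ = 1000   # assumed; just high enough to represent a 390 Hz tone
INTEGRATION_S = 10.0    # integration time from the text
TONE_HZ = 390.0         # approximate tone frequency from the text
TONE_AMPLITUDE = 0.2    # assumed; well below the noise (noise std dev is 1.0)


def bin_width_hz(integration_s):
    """Spacing between DFT bins for a given integration time."""
    return 1.0 / integration_s


def make_signal(seed=1):
    """Synthetic test input: a weak sine tone plus Gaussian noise."""
    rng = random.Random(seed)
    n_total = int(SAMPLE_RATE_HZ * INTEGRATION_S)
    return [
        TONE_AMPLITUDE * math.sin(2 * math.pi * TONE_HZ * n / SAMPLE_RATE_HZ)
        + rng.gauss(0.0, 1.0)
        for n in range(n_total)
    ]


def dft_bin_power(samples, k, twiddles):
    """Squared magnitude of DFT bin k, computed directly (no FFT)."""
    n_total = len(samples)
    acc = 0j
    for n, x in enumerate(samples):
        acc += x * twiddles[(k * n) % n_total]
    return abs(acc) ** 2


def find_peak_hz(samples, lo_hz, hi_hz):
    """Scan the bins between lo_hz and hi_hz and return the strongest one."""
    n_total = len(samples)
    width = SAMPLE_RATE_HZ / n_total
    # Precompute exp(-2*pi*i*m/N) once so every bin can reuse the table.
    twiddles = [cmath.exp(-2j * math.pi * m / n_total) for m in range(n_total)]
    k_lo, k_hi = round(lo_hz / width), round(hi_hz / width)
    best_k = max(range(k_lo, k_hi + 1),
                 key=lambda k: dft_bin_power(samples, k, twiddles))
    return best_k * width


if __name__ == "__main__":
    print("bin width: %.2f Hz" % bin_width_hz(INTEGRATION_S))
    print("strongest bin: %.1f Hz" % find_peak_hz(make_signal(), 385.0, 395.0))
```

The point is that the tone's energy piles up in a single 0.1 Hz bin while the noise is spread over all of them, so a tone far too weak to hear can still stand out after a long enough integration. A real tool would use an FFT rather than evaluating bins one by one; the direct sum just keeps the sketch dependency-free.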

Last night I achieved victory:

This image is just a looped GIF showing two 1-second slices where I said something or my dog bumped the table. The vertical axis spans DC to 22 kHz, which obscures much of the interesting stuff in the lower-frequency band, where most of the information in speech lives. I'm not finished with this tool; it needs at least some zooming functionality to adjust the range of frequencies shown, and a way to expand and contract the colour range I use to indicate spectral intensity. You can grab a copy of the code from and help out, if you like.
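For anyone who wants to have a go at the zooming and colour-range features, here is one possible starting point, sketched in Python rather than taken from the tool itself. Everything in it is hypothetical: the function names, the black-red-yellow-white ramp, the dB limits, and the 44.1 kHz sample rate (which I am only inferring from the DC to 22 kHz axis).

```python
def visible_bins(f_lo_hz, f_hi_hz, sample_rate_hz, fft_size):
    """Indices (first, last) of the FFT bins covering the zoomed range."""
    bin_width = sample_rate_hz / fft_size
    first = max(0, round(f_lo_hz / bin_width))
    # Bins above fft_size // 2 mirror the lower half for real input.
    last = min(fft_size // 2, round(f_hi_hz / bin_width))
    return first, last


def intensity_to_rgb(level_db, floor_db, ceil_db):
    """Map a spectral level to a colour, clamped to [floor_db, ceil_db].

    Narrowing the floor/ceiling window stretches the colour range over
    fewer dB; widening it compresses the range.
    """
    t = (level_db - floor_db) / (ceil_db - floor_db)
    t = min(1.0, max(0.0, t))
    # Simple "heat" ramp: black -> red -> yellow -> white.
    red = min(1.0, 3.0 * t)
    green = min(1.0, max(0.0, 3.0 * t - 1.0))
    blue = min(1.0, max(0.0, 3.0 * t - 2.0))
    return tuple(int(255 * c) for c in (red, green, blue))


if __name__ == "__main__":
    # Zoom in on the speech band, assuming 1 Hz bins at 44.1 kHz.
    print(visible_bins(0, 4000, 44100, 44100))
    # Colour for a level halfway between an assumed -90 dB floor and -30 dB ceiling.
    print(intensity_to_rgb(-60.0, -90.0, -30.0))
```

Zooming then amounts to drawing only the bins between `first` and `last`, and the colour adjustment amounts to letting the user move `floor_db` and `ceil_db`.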
