July 2024 – FreeDV

This month I’ve been working on a real time implementation of the Radio Autoencoder (RADAE), suitable for Push To Talk (PTT) use over the air.

One big step was refactoring the core Machine Learning (ML) encoder and decoder to a “stateful” design, that can be run on short (120ms) sequences of data, preserving state each time it is called. The result is a set of command line utilities that can work with streaming audio from a headset or radio. This example demonstrates the full receiver stack: the rx.f32 file (off-air float IQ samples) is decoded to audio samples that are played through your speakers:

cat rx.f32 | python3 radae_rx.py model17/checkpoints/checkpoint_epoch_100.pth -v 1 | ./build/src/lpcnet_demo -fargan-synthesis - - | aplay -f S16_LE -r 16000
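To illustrate what "stateful" means here, the following is a minimal Python sketch, not the RADAE code. A toy one-pole filter stands in for the ML decoder, and the 8000 Hz sample rate, the class name and the helper names are all assumptions made for the example. It also treats the input as real float32 samples, whereas rx.f32 holds IQ samples. The point is only that a processor which keeps its state between calls gives the same output on 120ms chunks as it does on the whole signal at once.

```python
import struct

SAMPLE_RATE_HZ = 8000                          # assumed for this sketch only
CHUNK_SAMPLES = SAMPLE_RATE_HZ * 120 // 1000   # 120 ms of samples


class StatefulSmoother:
    """Toy stand-in for a stateful decoder: a one-pole low-pass filter."""

    def __init__(self, alpha=0.1):
        self.alpha = alpha
        self.y = 0.0   # state that survives between calls to process()

    def process(self, chunk):
        out = []
        for x in chunk:
            self.y += self.alpha * (x - self.y)
            out.append(self.y)
        return out


def chunks_from_bytes(raw, chunk_samples=CHUNK_SAMPLES):
    """Split little-endian float32 bytes into fixed-size lists of samples."""
    n_bytes = 4 * chunk_samples
    for start in range(0, len(raw) - n_bytes + 1, n_bytes):
        block = raw[start:start + n_bytes]
        yield list(struct.unpack(f"<{chunk_samples}f", block))


def run_streaming(raw):
    """Process the stream chunk by chunk, reusing one processor object."""
    proc = StatefulSmoother()
    out = []
    for chunk in chunks_from_bytes(raw):
        out.extend(proc.process(chunk))
    return out
```

In a real streaming utility the bytes would be read a chunk at a time from sys.stdin.buffer rather than from one bytes object, which is what lets it sit in a shell pipeline like the one above.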

I spent some time profiling and, with a little optimisation, we now have a RADAE Tx and Rx that encode and decode in real time on desktop and laptop PCs. That is quite surprising given it’s still Python code (with the heavy lifting performed in PyTorch and NumPy). With a little more work, we could use these streaming utilities to build a network based RADAE server, a sound card plug in, or a “headless” RADAE system like the ezDV/SM1000.

Our end goal for a RADAE implementation is a C callable library. While low technical risk, a C port is time consuming, and would delay testing the big unknowns in a new speech communication system such as RADAE. There is also the risk of significant rework of the C code if (when) there are any problems with the waveform. So our priority is to test the RADAE waveform against our requirements, and fortunately the Python version is fast enough for that already.

Over the years we’ve discovered many ways to break digital voice systems. These issues are much easier to fix in simulation so I’ve developed many intricate automated tests, for example tests that simulate slowly varying, stationary channels, and other tests that simulate fast fading like the northern European winter. Do carriers (sine waves) in the middle of a RADAE signal cause it to fall over or make it sync by accident? What happens if the Tx and Rx stations have slightly different sample clock frequencies? I won’t bore you with the details here, but a lot of work goes into this stuff.
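Two of the impairments mentioned above are easy to sketch in a few lines of Python. This is an illustration under stated assumptions, not the actual RADAE test code: the function names are hypothetical, linear interpolation is a deliberately crude resampler, and the samples are treated as real numbers.

```python
import math


def resample_with_clock_offset(samples, ppm):
    """Read the input at a rate (1 + ppm * 1e-6) times nominal.

    Linear interpolation between neighbouring samples approximates a
    receiver whose sample clock differs slightly from the transmitter's.
    """
    ratio = 1.0 + ppm * 1e-6
    out = []
    n = 0
    while True:
        t = n * ratio          # fractional read position in the input
        i = int(t)
        if i + 1 >= len(samples):
            break
        frac = t - i
        out.append((1.0 - frac) * samples[i] + frac * samples[i + 1])
        n += 1
    return out


def add_carrier(samples, freq_hz, amplitude, sample_rate_hz):
    """Add a sine wave interferer on top of the signal."""
    w = 2.0 * math.pi * freq_hz / sample_rate_hz
    return [x + amplitude * math.sin(w * n) for n, x in enumerate(samples)]
```

For example, resample_with_clock_offset(rx, 100) would model a 100 ppm clock difference, and add_carrier(rx, 1500, 0.5, 8000) would drop a 1500 Hz tone into a signal sampled at an assumed 8000 Hz.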

While giving RADAE a hard time in simulation I tried the multipath disturbed (MPD) channel. This has 2 Hz fading and 4 ms delay spread, and is encountered in Winter at high latitudes (e.g. NVIS communications during the UK Winter). It’s tough on HF modems. The mission here is “do not fall over with fast fading” – it’s OK if a few more dB of SNR is required. Here is a sample of what the off air received signal sounds like at 3 dB SNR, followed by the decoded audio.
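For readers who want a feel for this kind of channel, here is a rough Python sketch of a two-path fading channel with noise. It is a simplification built on assumptions, not the simulator used for these tests: the gains are a sum of random low-frequency tones rather than properly filtered Gaussian noise, "2 Hz fading" is read as gain components within ±1 Hz, the 8000 Hz sample rate is assumed, and SNR is measured over the full sample bandwidth rather than a standard noise bandwidth.

```python
import cmath
import math
import random


def fading_gain(n_samples, sample_rate_hz, spread_hz, rng, n_tones=16):
    """Slowly varying complex gain: a sum of random low-frequency tones."""
    tones = [(rng.uniform(-spread_hz / 2, spread_hz / 2),
              rng.uniform(0.0, 2.0 * math.pi)) for _ in range(n_tones)]
    scale = 1.0 / math.sqrt(n_tones)   # roughly unit average power
    gains = []
    for n in range(n_samples):
        t = n / sample_rate_hz
        g = sum(cmath.exp(1j * (2.0 * math.pi * f * t + ph)) for f, ph in tones)
        gains.append(scale * g)
    return gains


def two_path_channel(tx, sample_rate_hz=8000, spread_hz=2.0,
                     delay_s=0.004, seed=1):
    """Direct path plus one delayed path, each with its own fading gain."""
    rng = random.Random(seed)
    delay = round(delay_s * sample_rate_hz)   # 4 ms -> 32 samples at 8 kHz
    g1 = fading_gain(len(tx), sample_rate_hz, spread_hz, rng)
    g2 = fading_gain(len(tx), sample_rate_hz, spread_hz, rng)
    out = []
    for n, x in enumerate(tx):
        delayed = tx[n - delay] if n >= delay else 0j
        out.append(g1[n] * x + g2[n] * delayed)
    return out


def add_noise(rx, snr_db, seed=2):
    """Add complex Gaussian noise at the requested SNR (full bandwidth)."""
    rng = random.Random(seed)
    power = sum(abs(x) ** 2 for x in rx) / len(rx)
    sigma = math.sqrt(power / (10.0 ** (snr_db / 10.0)) / 2.0)
    return [x + complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
            for x in rx]
```

Calling add_noise(two_path_channel(tx), 3.0) on a list of complex baseband samples gives a faded signal at roughly 3 dB SNR. When the two path gains happen to cancel, the signal drops into the noise.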

Despite the received signal dipping into the noise at times, RADAE seems to handle it OK. I designed the DSP equalization to handle fast fading, but only trained the ML network with a simulation of 1 Hz fading. So I was concerned the ML might fall over but this time we got lucky! Here is the spectrogram of the same signal – at times the fading completely wipes it out.

One innovation is an “End of Over” system. When a transmission ends, an “end of over” frame is sent and the Rx cleanly “squelches” the receive audio. Previous FreeDV modes would run on for a few seconds making R2D2 sounds, as from the receiver’s perspective it’s hard to know if the transmitter has finished or you are just in a fade.
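The receive-side logic can be pictured as a tiny state machine. The Python sketch below is hypothetical: the frame types, the class name and the way an "end of over" frame is recognised are invented for illustration and say nothing about how RADAE actually encodes or detects that frame.

```python
from enum import Enum


class FrameType(Enum):
    VOICE = 1
    END_OF_OVER = 2   # hypothetical marker sent once when the Tx unkeys


class Receiver:
    """Passes decoded audio through until an end of over frame arrives."""

    def __init__(self):
        self.squelched = True   # stay muted until voice is received

    def on_frame(self, frame_type, audio):
        if frame_type is FrameType.END_OF_OVER:
            self.squelched = True    # mute at once instead of running on
            return []
        self.squelched = False
        return audio
```

The design point is that the mute is triggered by an explicit signal from the transmitter, so the receiver does not have to guess between "over finished" and "deep fade".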

On another topic this month I also set up a new WordPress host for this site, and spruced up the content a little. I’m more at home with DSP than SPF and MX records but with the kind support from VentraIP I got there eventually. Thanks Bruce Perens for hosting this site for the last few years.

If you are interested in helping out with the RADAE work I have been building up a list of small chunks of work that need doing using the GitHub Issues system. Many of them require general GitHub/C coding/Linux skills, and not hard core DSP or ML. I’ve listed the skills required in each Issue. Please (please!) discuss them with me first (using the Issue comment system) before kicking off your own PR – I have a really good idea what needs to be done and we need to stay focused.

I have written a test plan for the next phase of over the air (OTA) RADAE testing. The goals will be (a) crowd sourced testing of the latest PAPR-optimised waveform over a variety of channels using the stored file system, and (b) testing real time PTT conversations over real radio channels using RADAE. This will build our experience and no doubt uncover bugs that will require some rework. I’m on track to start this test campaign in August.

This month, FreeDV 1.9.9.2 was officially released, which contained the following bug fixes:

Remove TX attenuation and squelch tooltips. (PR #717)
Disable 800XA radio button when in RX Only mode. (PR #716)

Additional development work was performed after the release of 1.9.9.2. This consisted of the following:

Rendered a green or orange line (depending on sync) on the Waterfall and Plot views to help indicate the frequency offset of the received signal.
Allowed the version tag (i.e. “-devel”) to be overridden during the build process.
Reworked the behavior of the Voice Keyer feature to make it easier to use with multiple voice keyer files.
Fixed a bug preventing saving of the previously used path when playing back files.

ezDV also got the following changes:

Added support for reporting to PSK Reporter (similar to the existing support in freedv-gui).
Began work on Ethernet support for ezDV.

More information can be found in the commit history below:

(Note that all commit logs above were generated with the following command line:)

git log --author="member@email" --after "Month 1, 2024" --before "Month 31, 2024" --all > commit.log

David’s FreeDV Update – July 2024

Mooneer’s FreeDV Update – June 2024