David’s FreeDV Update – May 2024

The last few months have been focused on building up the DSP code required to try the Radio Auto-encoder (RADAE) over the air. In order to answer the big question of “does it really work” as quickly as possible, I had to skim over many intriguing topics. Now that we have a qualified “yes” to the big question, I’ve returned to some Machine Learning (ML) R&D to explore some intriguing ideas:

  • Reduction of the “latent dimension” and hence RF bandwidth of the RADAE signal.
  • Encouraging the network to train 2 dimensional constellations rather than 1D.
  • Training for low Peak to Average Power Ratio (PAPR) – a potential 6dB improvement.

To date RADAE has used a “latent dimension” of 80 symbols every 40ms, which are mapped to 20 OFDM carriers at 50 symbols/s, resulting in an RF bandwidth of 1000 Hz. I spent some time exploring how to reduce this to dimension 40, i.e. a 10 carrier, 500 Hz bandwidth signal. This would result in more efficient use of spectrum. With fewer carriers our pilot based equalization should work better, as there would be more power per pilot symbol. Fewer carriers also helps reduce PAPR. On the negative side, classical communications theory predicts a narrower bandwidth signal will perform worse on HF channels, and may be less power efficient (e.g. BER performance of 8PSK versus QPSK).
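The arithmetic behind those numbers can be sketched as below. This is only an illustration, not code from the RADAE repository: the function name is hypothetical, and it assumes that two real latent values are carried by each complex carrier symbol and that the carrier spacing equals the per-carrier symbol rate (so OFDM overheads such as pilots and cyclic prefix are ignored).

```python
def ofdm_params(latent_dim, frame_period_ms=40, carrier_symbol_rate=50):
    """Return (number of carriers, approximate bandwidth in Hz).

    Assumptions (not stated in the post): two real latent values per
    complex carrier symbol, and carrier spacing equal to the symbol rate.
    """
    real_values_per_s = latent_dim * 1000 / frame_period_ms
    complex_symbols_per_s = real_values_per_s / 2
    n_carriers = round(complex_symbols_per_s / carrier_symbol_rate)
    bandwidth_hz = n_carriers * carrier_symbol_rate
    return n_carriers, bandwidth_hz


for dim in (80, 40):
    carriers, bw = ofdm_params(dim)
    print(f"dim={dim}: {carriers} carriers, about {bw} Hz")
```

Under these assumptions dim 80 gives 20 carriers and 1000 Hz, and dim 40 gives 10 carriers and 500 Hz, matching the figures in the text. The bandwidth actually occupied on air will be somewhat larger once OFDM overheads are included.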

The original RADAE design has a one dimensional bottleneck that limits the amplitude of real valued symbols to +/-1. Given additive noise, the network would always place constellation points at +/-1 in order to minimize the effect of noise. As the dimension reduced, distortion increased as there was nowhere in 1D space to place additional constellation points without being unduly affected by noise. I reasoned that encouraging the network to train two dimensional constellations would help. For example, in classical digital systems we can use an 8PSK constellation, where each point is the same distance from the origin. If the SNR is high enough, this can send more information per symbol than QPSK.

So I arranged the elements of the latent vector in complex number pairs (e.g. 20 complex valued symbols for a 40 element latent vector), and set up a two dimensional bottleneck that constrained the magnitude of the complex symbols trained by the network. This worked: I can now obtain good performance from a dimension 40 system. Curiously, the resulting constellations are circles, rather than discrete points.
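A minimal sketch of what such a magnitude bottleneck could look like is shown below. The post does not say which function is used to constrain the magnitude, so the tanh() limiter here is an assumption, as is the pairing of adjacent latent elements; the function names are hypothetical.

```python
import cmath
import math


def pair_latents(z):
    """Group a real latent vector [a0, b0, a1, b1, ...] into complex symbols.

    Pairing adjacent elements is an assumption made for illustration.
    """
    return [complex(z[i], z[i + 1]) for i in range(0, len(z), 2)]


def magnitude_bottleneck(symbols):
    """Limit each symbol's magnitude to at most 1 while keeping its phase.

    tanh() on the magnitude is an assumed choice of limiter, not
    necessarily the one used in RADAE.
    """
    return [cmath.rect(math.tanh(abs(s)), cmath.phase(s)) for s in symbols]


latent = [0.1 * k for k in range(-20, 20)]  # a made-up 40 element latent vector
tx_symbols = magnitude_bottleneck(pair_latents(latent))
print(len(tx_symbols), max(abs(s) for s in tx_symbols))
```

Because only the magnitude is limited, the network is free to use any phase. A network that pushes every symbol to the magnitude limit will place its symbols on a circle, which is at least consistent with the circular constellations described above.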

Constellation of PSK symbols when trained with a 2D bottleneck on the symbol magnitude.

Also this month I developed a method for comparing ML models objectively. The method runs the training database through a trained model at a range of SNRs, and produces curves of “loss against Eq/No” for the model (Eq is the energy of one PSK symbol). I feel there is a reasonable match between these curves and the subjective speech quality. Having an objective method of measuring a model’s performance lets me know if I’m on the right track with an ML model design without tedious listening tests.

Loss v Eq/No curves for 4 models. model05 (m5) is the control: it was used for the recent OTA test campaign, and is a dim=80 1D bottleneck. m17 looks comparable (PAPR optimised 2D bottleneck), however m14 & m18 are not so great.
As above, but loss v C/No. This normalizes for the different symbol rates. Now m18 is dim=40, so only has half as many symbols to send across the channel. Given the same Tx power, we therefore have twice the energy per symbol. It now looks competitive with m5 and m17.
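The normalization from Eq/No to C/No can be sketched as follows. The carrier power is the symbol energy multiplied by the symbol rate, so in dB the two differ by 10*log10(Rs). The symbol rates used here (1000 PSK symbols/s for dim=80 and 500 for dim=40) are inferred from the carrier counts quoted earlier and ignore pilots and other overheads, so treat them as assumptions.

```python
import math


def c_no_db(eq_no_db, symbol_rate_hz):
    """Convert Eq/No (dB) to C/No (dB-Hz) using C = Eq * Rs."""
    return eq_no_db + 10 * math.log10(symbol_rate_hz)


# Assumed PSK symbol rates: 20 carriers x 50 symbols/s and 10 carriers x 50 symbols/s
RS_DIM80 = 1000
RS_DIM40 = 500

eq_no = 0.0  # an arbitrary operating point, in dB
shift = c_no_db(eq_no, RS_DIM80) - c_no_db(eq_no, RS_DIM40)
print(f"dim=40 curve moves left by {shift:.2f} dB relative to dim=80")
```

At the same Eq/No the dim=40 model needs about 3 dB less C/No, which is the "twice the energy per symbol" effect described in the caption above.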

OK, so now we have an objective measure for comparing models, a way of training lower dimensional models, and some understanding of 2D constellations: i.e. how to train them, and what to expect from the 2D constellations developed by training.

Using these tools, I attempted to build a PAPR optimised ML model. I estimate a low PAPR waveform has the potential to provide a further 6dB improvement at the receiver compared to a classical DSP OFDM waveform – so this is definitely worth exploring. This requires a “time domain” 2D bottleneck that simulates the way a power amplifier saturates. Combining this with multipath training is tricky, and I have tried several different approaches. At the time of writing I believe I have a way forward with a hybrid time-frequency domain model, and am currently evaluating the results. The design uses OFDM and classical DSP for equalisation, and ML for PAPR optimisation, and achieves a PAPR of less than 1 dB.
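For readers unfamiliar with PAPR, here is a small sketch of how it can be measured on a block of complex baseband samples. It is not the training method described above, just the measurement. The test signal (equal amplitude, phase aligned carriers) is a deliberately bad case chosen for illustration, and all names are hypothetical.

```python
import cmath
import math


def papr_db(samples):
    """Peak to Average Power Ratio of a block of complex samples, in dB."""
    powers = [abs(x) ** 2 for x in samples]
    return 10 * math.log10(max(powers) / (sum(powers) / len(powers)))


def aligned_carriers(n_carriers, n_samples=256):
    """Sum of equal amplitude carriers that all start in phase (a bad case)."""
    return [
        sum(cmath.exp(2j * math.pi * k * n / n_samples)
            for k in range(1, n_carriers + 1))
        for n in range(n_samples)
    ]


single_tone = aligned_carriers(1)
print(f"1 carrier:   {papr_db(single_tone):.2f} dB")
print(f"10 carriers: {papr_db(aligned_carriers(10)):.2f} dB")
print(f"20 carriers: {papr_db(aligned_carriers(20)):.2f} dB")
```

A single constant envelope carrier has a PAPR of 0 dB, while N phase aligned carriers reach 10*log10(N) dB, which is one way to see why fewer carriers makes a low PAPR easier to achieve.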

Here are some samples that show the PAPR optimised waveform over a simulated multipath poor (MPP) fast fading channel. They both have the same “peak power to noise” P/No ratio. Imagine them both being transmitted from the same radio with 100W peak power, over the same (really bad) HF radio channel, to the same receiver.

Peter, VK5APR, using SSB at a P/No of 39dB (Rx SNR -2.4dB)
Peter, VK5APR, using RADAE model18 also at a P/No of 39dB (Rx SNR 3.4dB)

Note the difference in the receiver SNR. The “S” in S/N is the RMS power at the receiver, which is lower for SSB as the SSB PAPR is higher (around 6dB, after compression). The goal of most radio systems is to maximise the RMS power at the receiver. So with the same transmitter, we have achieved around 6dB higher SNR at the Rx by carefully minimising the PAPR of the RADAE waveform.
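The relationship between P/No, PAPR and receiver SNR can be written as a short calculation. The post does not state the noise bandwidth used for the SNR figures, so the 3000 Hz value below is an assumption, and the function names are hypothetical.

```python
import math


def rx_snr_db(p_no_db, papr_db, noise_bw_hz=3000.0):
    """Rx SNR (dB) from peak power to noise density, PAPR and noise bandwidth.

    RMS power is peak power minus PAPR (in dB); the 3000 Hz noise
    bandwidth is an assumption, not a figure from the post.
    """
    return p_no_db - papr_db - 10 * math.log10(noise_bw_hz)


def implied_papr_db(p_no_db, snr_db, noise_bw_hz=3000.0):
    """PAPR (dB) implied by a given P/No and Rx SNR, same assumptions."""
    return p_no_db - snr_db - 10 * math.log10(noise_bw_hz)


P_NO = 39.0  # dB, from the two samples above
print(f"SSB   implied PAPR: {implied_papr_db(P_NO, -2.4):.1f} dB")
print(f"RADAE implied PAPR: {implied_papr_db(P_NO, 3.4):.1f} dB")
print(f"SNR gain: {rx_snr_db(P_NO, 1.0) - rx_snr_db(P_NO, 6.0):.1f} dB")
```

Whatever noise bandwidth is chosen, at a fixed P/No the difference in Rx SNR between two waveforms equals the difference in their PAPR. With the 3000 Hz assumption, the SNRs quoted for the two samples work out to PAPRs of roughly 6.6 dB and 0.8 dB, in line with the "around 6dB" and "less than 1 dB" figures in the text.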

Here are the spectrograms. Note that the model18 dim 40 RADAE signal uses only about 750 Hz of RF bandwidth (500 Hz for the ML PSK symbols plus some bandwidth for OFDM overheads). The moth-eaten effect is the multipath channel wiping out chunks of the signal.

There are many other areas we could explore (e.g. ML based equalization), but as we don’t have infinite time, I’m choosing to time box the ML R&D before we lock in a V1.0 design, and proceed to real time implementation.

Next month I will round out the ML design work, address a few other bugs, and attempt to arrive at a RADAE design suitable for our first real time implementation.

The Right to Innovate in the HF Data Space

On the HF data front, I’ve been working with Simon DJ2LS to test and merge several libcodec2 PRs to support FreeDATA. This work has improved protocol efficiency and enabled Simon to “homebrew” his own custom OFDM waveforms. His first attempt at a new waveform has roughly doubled the highest data transfer speed of FreeDATA. Simon is working on a new FreeDATA release that includes these improvements. We also have a 16QAM prototype waveform under development which, in high SNR channels, will double the speed again.

One of the PRs supports custom configuration of the OFDM modem. For example, you can plug in the number of carriers, symbol rate, and number of bits per frame at “init time” without writing any C code. Empowering Hams (and indeed anyone) to build their own HF data waveforms is important. This work “preserves the right to innovate” in the HF data space, a key value of the ARDC.

One Reply to “David’s FreeDV Update – May 2024”

  1. Hi David, the second audio demonstration, was FANTASTIC, with no band noise in the background, compared with the first one, which was on SSB, however, as you said, the audio in the second demonstration was not understandable, however, it was wonderful that there was no band noise in the background, so when you get the audio more understandable, it will be one AMAZING, audio demonstration!!
