FreeDV will have a booth at the North Hall of Orlando HamCation this weekend (near the HamCation prize booth). If you happen to be in the area, come check us out! More information about HamCation can be found at https://www.hamcation.com/.
Additionally, Mooneer K6AQ will be giving a talk about/demoing FreeDV’s new RADE mode tomorrow (February 7) at 3:30pm in CS II (the set of popup tents in the back of the fairgrounds). Hope to see you there too!
This month was spent firming up the preview release that just came out. This involved fixing bugs discovered by the RADE test team, such as crashes introduced during the implementation of SNR and RADE text support (required for FreeDV Reporter to fully work with RADE). Audio quality was also improved, and transmission of the End Of Over (EOO) block was made more reliable, so that received signals are reported to FreeDV Reporter and PSK Reporter more consistently.
Speaking of PSK Reporter, this is the worldwide map of activity over the last 24 hours as a result of this work, indicating heavy interest in RADE and FreeDV more generally:
For February, the focus is going to be on promoting the work we’ve done as a project. We’ll be at Orlando HamCation in just a few days (North Hall, booth 119, as well as a talk given by me on Friday February 7th at 3:30pm). Hope to see you all there!
More information can be found in the commit history below:
At the start of this month I did battle with the problem of SNR estimation on the RADE V1 signal. As I have mentioned previously, this is challenging due to the lack of structure in the RADE constellation. After a few false starts I managed to get something viable running using the properties of the pilot symbols. The plot below shows estimated versus actual SNR for a range of channels. In the -5 to 10dB range (of most interest to us) it’s within 1dB for all but the MPP (fast fading) channel, where the estimate reads a few dB lower than the actual value (note that Es/No is roughly the same as SNR in this example).
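To illustrate the general idea (this is a hedged sketch, not the actual RADE implementation), a pilot-based SNR estimate can derotate the received pilots by the known pilot sequence, take the coherent mean as the signal component, and treat the scatter about that mean as noise:

```python
import numpy as np

def estimate_snr_from_pilots(rx_pilots, tx_pilots):
    """Estimate SNR (dB) from received pilot symbols.

    rx_pilots: complex received pilot symbols (after channel correction)
    tx_pilots: known transmitted pilot symbols (unit magnitude)
    """
    # Remove the known modulation so all pilots collapse to a single point
    derotated = rx_pilots * np.conj(tx_pilots)
    # The coherent mean estimates the signal component ...
    s = np.mean(derotated)
    signal_power = np.abs(s) ** 2
    # ... and the scatter about that mean estimates the noise power
    noise_power = np.mean(np.abs(derotated - s) ** 2)
    return 10 * np.log10(signal_power / noise_power)
```

On a static channel this tracks the true SNR closely; fast fading (as on the MPP channel) spreads the derotated pilots and biases the estimate low, consistent with the behaviour described above.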
I’ve started work on RADE V2, where we hope to use lessons learned from RADE V1 to make some improvements and develop a “stable” waveform for general Ham use. This month I have made some progress in jointly optimising the PAPR and bandwidth of the RADE signals. For regulatory purposes, the bandwidth of signals like OFDM is often specified in terms of the “occupied bandwidth” (OBW) that contains 99% of the power. The figure below shows the spectrum of a 1000 symbols/s signal, with its 1235 Hz OBW marked in red.
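For readers unfamiliar with the OBW measurement, here is a minimal sketch (not the project's actual measurement code) that computes it from the FFT power spectrum by trimming 0.5% of the total power from each spectral tail:

```python
import numpy as np

def occupied_bandwidth(x, fs, fraction=0.99):
    """Occupied bandwidth (Hz) containing `fraction` of the signal power."""
    X = np.fft.fftshift(np.fft.fft(x))
    psd = np.abs(X) ** 2
    freqs = np.fft.fftshift(np.fft.fftfreq(len(x), 1.0 / fs))
    # Trim (1 - fraction)/2 of the power from each tail of the spectrum
    tail = (1.0 - fraction) / 2.0
    cum = np.cumsum(psd) / psd.sum()
    lo = np.searchsorted(cum, tail)
    hi = np.searchsorted(cum, 1.0 - tail)
    return freqs[hi] - freqs[lo]
```

For example, a signal with power spread evenly over a 1000 Hz band measures a 99% OBW of about 990 Hz with this method.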
Machine Learning Equalisation
Also for RADE V2, I have been prototyping ML-based equalisation, and have obtained good results for some examples using the BER of QPSK symbols as a metric. The plot below shows the BER against Eb/No for the classical DSP (blue), and two candidate ML equalisers (red and green, distinguished by different loss functions). The channel had random phase offsets for every frame, which the equaliser had to correct. The three equalisers have more or less the same performance.
These results show the equalisation function can be performed by ML networks, with equivalent performance to classical DSP.
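To make the comparison concrete, here is a minimal sketch of a classical DSP baseline of the kind the ML equalisers are measured against (the frame size, pilot count, and channel model here are illustrative assumptions, not the actual simulation setup): estimate the per-frame phase offset from known pilots, derotate, and count QPSK bit errors.

```python
import numpy as np

rng = np.random.default_rng(1)

def qpsk_mod(bits):
    # Gray-mapped QPSK, unit energy per symbol
    return ((1 - 2 * bits[0::2]) + 1j * (1 - 2 * bits[1::2])) / np.sqrt(2)

def qpsk_demod(sym):
    bits = np.empty(2 * len(sym), dtype=int)
    bits[0::2] = (sym.real < 0).astype(int)
    bits[1::2] = (sym.imag < 0).astype(int)
    return bits

def run_frame(ebno_db, n_sym=100, n_pilots=10):
    bits = rng.integers(0, 2, 2 * n_sym)
    frame = np.concatenate([np.ones(n_pilots, dtype=complex),  # known pilots
                            qpsk_mod(bits)])
    # Random phase offset per frame, plus AWGN (Es/N0 = 2 * Eb/N0 for QPSK)
    esno = 2 * 10 ** (ebno_db / 10)
    noise = (rng.standard_normal(len(frame)) +
             1j * rng.standard_normal(len(frame))) * np.sqrt(1 / (2 * esno))
    rx = frame * np.exp(1j * 2 * np.pi * rng.random()) + noise
    # Classical equaliser: estimate the phase from the pilots, then derotate
    phase_est = np.angle(np.mean(rx[:n_pilots]))
    eq = rx[n_pilots:] * np.exp(-1j * phase_est)
    return np.mean(qpsk_demod(eq) != bits)

ber = np.mean([run_frame(4.0) for _ in range(200)])
```

An ML equaliser replaces the phase-estimate-and-derotate step with a trained network; "equivalent performance" means its BER curve sits on top of this baseline.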
Project Management
Quite a bit of admin this month, including time spent recruiting prospective new PLT members, updating budgets, and our annual report. Not as much fun as playing with machine learning, but necessary to keep the project running smoothly.
It was time to write our annual report for the ARDC who have kindly funded this project for the last two years. Writing this report underlined what a good year we had in 2024, some highlights:
The development and Beta release of the Radio Autoencoder RADE V1, which is well on the way to meeting our goal of being competitive with SSB at both high and low SNRs. Special thanks to Jean-Marc Valin for your mentoring and vision on this project!
The BBFM project, paving the way for high quality speech in VHF/UHF land mobile radio (LMR) applications, in collaboration with Tibor Bece and George Karan.
New data modes to support FreeDATA, in collaboration with Simon DJ2LS.
The release of ezDV and continued maintenance of freedv-gui largely by Mooneer’s efforts.
Peter Marks joining our Project Leadership Team. He’s already making a big impact – thanks Peter!
This is the second preview release of FreeDV containing the new RADE mode. For more information about RADE’s development, check out the blog posts on the FreeDV website:
* Signal to noise ratio (SNR) is now displayed while receiving RADE signals.
* Received signals are now reported to FreeDV Reporter (without callsigns) once per second. Once a callsign is received (at the end of the transmission), the callsign is reported to both FreeDV Reporter and PSK Reporter.
* Fixed bug preventing sync indicator from turning green with RADE.
* Visual Studio Redistributable is now installed if your PC does not already have it. (This is required for the Python packages FreeDV uses.)
* Fixed bug preventing Request QSY button from being enabled in RADE mode.
* RADE has been renamed to RADEV1 in the UI and FreeDV Reporter.
* macOS binaries are now signed and notarized, avoiding the need for the workaround in the previous build.
* Fixed issue causing FreeDV to segfault on exit when RADE is running.
* Python files are now precompiled to improve startup time.
* Core RADE code is now in C (versus Python).
* Uninstaller now fully cleans up after Python.
* Audio chain is cleaned up to improve audio quality.
* README has been updated to clarify Linux instructions and to provide a link to a script to auto-build with RADE support. (Thanks @barjac!)
* Maximum SNR displayed in the main window is now 40 dB to reflect real-world testing.
* “devel” in the version string is shortened to “dev” and incremented to “dev2” to reflect the second preview build.
Limitations:
* Multiple RX mode is not supported. If you choose RADE and push Start, that’s the only mode you can work; you’ll need to stop, choose another mode, and start again to work FreeDV with the existing modes.
* Squelch cannot currently be disabled with RADE. It’s unknown at this time whether disabling squelch is possible.
* Due to compilation problems, 2020/2020B modes are disabled.
* There is currently no Windows ARM build; this will hopefully be included in a future preview build. You may be able to use the 64-bit Intel/AMD Windows build in the meantime.
* Minimum hardware requirements haven’t been fully outlined, so your system currently may not be able to use RADE. Future planned optimizations may improve this.
Other notes:
* The below builds are significantly bigger than previous releases. This is due to needing to include Python and the modules that RADE requires. Planned porting to C/C++ will eventually negate the need for Python.
* The Windows build includes Python but not the modules that RADE requires. As part of the install process, the version of Python built into FreeDV will go out to the internet to download the needed modules.
* As development is expected to happen quickly, these preview builds have a six month expiry date (currently July 30, 2025).
* 32-bit Windows is no longer supported due to its likely inability to work with RADE.
An important goal of our project is improved speech quality over SSB at both low and high SNRs. We have anecdotal reports of good performance of RADE compared to SSB, but need an objective, controlled way of comparing performance. For speech systems this generally means subjective testing based on the ITU-T P.800 or P.808 standards. However, this is complex and requires skills, experience, and resources not available to our team.
A few months ago Simon DJ2LS suggested the use of Automatic Speech Recognition (ASR). More recently, when discussing the issue of subjective testing, Jean-Marc Valin also suggested ASR and provided suggestions for a practical test system. So I spent much of December building up a framework for ASR tests.
The general idea is to take a dataset of speech samples, pass them through simulations of SSB and RADE over HF radio channels, then use an ASR engine to detect the words in the received speech. A post processing system then compares the detected words to the original words and determines the Word Error Rate (WER) as a performance metric. Our work uses the Librispeech dataset, and the Whisper ASR system.
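The WER metric itself is a standard edit-distance calculation over word tokens; a minimal sketch (our framework wraps an ASR engine around this, but the metric is the same):

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + insertions + deletions) / reference word count,
    computed via edit distance over word tokens."""
    ref, hyp = reference.lower().split(), hypothesis.lower().split()
    # d[i][j] = minimum edits to turn ref[:i] into hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution / match
    return d[len(ref)][len(hyp)] / len(ref)
```

For example, dropping one word from a six-word reference sentence gives a WER of 1/6, or about 17%.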
These sentences are complex English sentences, spoken quickly with no contextual cues. I have trouble understanding many of them on the first listen. This is a much tougher test than the typical low SNR Amateur Radio contact where someone shouts their callsign 5 times then reports “5 by 9”. For example, here is one sample from the Librispeech dataset processed with SSB/RADE/original (listen to the original last); SSB and RADE were at about 6dB SNR on a MPP (fading) channel.
The plot below shows some initial results over 500 sentences. The x-axis is receiver SNR measured in a 3kHz noise bandwidth. The y-axis is the word error rate (WER). Green is RADE, and blue SSB. The solid lines are for an AWGN channel, the dashed lines for the multi-path poor (MPP) fading channel. The dots (placed arbitrarily on the x-axis) in the lower right are controls, e.g. the FARGAN synthesizer used by RADE with no encoding, 4kHz band limited speech, and the original, clean speech.
A low word error rate (WER), say 5%, would correspond to an effortless “armchair copy”; a 30% WER could be the limits of practical voice communication (1 in 3 words can’t be understood). The distance between the RADE and SSB curves shows the benefits of RADE, at least using this test.
For example, if you draw a line across the 10% WER level, RADE achieves this (dashed MPP curves) at 3dB, SSB at 12dB. The x-axis doesn’t include the PAPR advantage of RADE, which is roughly an additional 5dB when using a transmitter with the same peak power output (depending on how hard the SSB is compressed).
Also this month I have been working on SNR measurement of received RADE signals. This is quite challenging, due to the lack of structure in the ML-generated RADE constellation. At present I’m attempting a classical DSP approach using the pilot symbols. This will be the last feature we add to RADE V1, as we’d like to use the lessons learned to start designing RADE V2.
This month involved more improvements to the FreeDV GUI application. One improvement involved the unit test framework; it’s now possible to capture the features decoded by RADE (prior to being fed into the FARGAN codec). This is useful for quantifying changes in the receive pipeline and ensuring that what’s encoded by RADE is also mostly returned by the decoder on a clean channel.
The biggest improvement, however, is the implementation of the same LDPC based callsign encoding and decoding system that’s used in the legacy FreeDV modes. This data is placed in what’s known as the End Of Over (EOO) block at the end of the RADE transmission and allows the application to report received callsigns to FreeDV Reporter and PSK Reporter, albeit only at the end of the transmission. FreeDV Reporter specific logic was added to mitigate this by reporting that a RADE signal is being received once a second while still in sync (just with no callsign), hopefully still allowing people to see that someone’s possibly decoding them in real time.
Since we’re touching the FreeDV Reporter logic, it was also a good opportunity to make some significant changes to the FreeDV Reporter service and website. First, the “left the chat”/”entered the chat” messages were removed by PLT request in order to make it easier to see actual chat messages. Next, the separate popup window for viewing who’s in the chat was removed in favor of an always-visible bar at the bottom of the chat tab containing the callsigns of the users that are logged into chat. The message backlog was also extended to 30 days (from 7 days) and preserved into a database so that the chat messages aren’t lost in the event that the FreeDV Reporter server needs to be restarted.
Besides the above, there were some other minor fixes with the Windows installer/uninstaller along with logic added to detect whether microphone permissions have been granted. RADE is also now called RADEV1 in the FreeDV application to differentiate it versus a future version 2 of RADE. Some infrastructure was also added to be able to sign macOS builds (required to avoid errors involving “damaged” applications in newer versions of macOS).
In any case, we’re now going to focus on additional testing prior to releasing a new preview build of FreeDV for general usage. Hopefully we’ll have additional updates on that soon.
More information can be found in the commit history below:
Tibor Bece and George Karan are collaborating with me on the baseband FM (BBFM) project. Tibor and George are veterans of the land mobile radio (LMR) industry, having worked together for many years and helped develop commercial VHF and UHF radio hardware with over 2 million units manufactured. They are pretty excited about the Radio Autoencoder work and what it could mean for LMR.
George has managed to build the RADE V1 stack and run the ctests on a variety of embedded platforms, including the AM625, a high-end embedded processor with enough power to run RADE (including the FARGAN stack), and a Librem 5 phone!
Tibor has been interfacing the BBFM ML stack to a COTS LMR radio, using a modified conventional digital voice frame structure to carry the “analog” BBFM symbols. Unlike my passband demo, this implementation has direct access to the FM modulator and discriminator, so it’s a “DC coupled” arrangement, closer to what a real-world commercial implementation would look like.
Like me, Tibor initially thought the speech quality and low SNR performance of this technology was in the “too good to be true” category. However, he has now performed controlled experiments on his (very well equipped) RF work bench, and was quite surprised to be getting high quality speech at RX signal levels down to -125dBm, several dB lower than analog FM or digital LMR systems like P25 would allow. At this low RF level the cutoff is due to the framing of the RADE symbols (not BBFM), as he never dreamed it would be necessary to operate at such a low SNR.
Tibor writes:
The 11dB SINAD point (around -121dBm) is where the squelch would normally fail to open, and a P25 frame would start dropping out. The RADE decoder munches through this with great ease, there is some barely perceptible degradation.
All I can say – WOW!
Here are samples (over the same radios) of analog FM and BBFM at various RF input levels from Tibor’s workbench:
* FM at -124dBm
* BBFM at -124dBm
* FM at -121dBm
* BBFM at -121dBm
* FM at -117dBm
* BBFM at -117dBm
* FM at -110dBm
* BBFM at -110dBm
This month was focused on improving the integration of RADE with the freedv-gui application. One way this was done is through the creation of an automated test framework in the latter. This framework allows for the injection of audio into the receive or transmit chain and analysis of the result. Currently, we can retrieve the number of times FreeDV goes in and out of sync as well as analyze the loss between the result decoded by freedv-gui and the loss from the RADE reference decoder.
Another benefit of this automated test framework is that we can now automate testing of the FreeDV receive and transmit chain as part of our Continuous Integration process (CI). CI allows FreeDV developers to get immediate feedback when a change breaks existing functionality versus waiting until a user reports breakage after a release, improving the user experience. That said, there was significant initial effort involved in getting virtual audio devices working in our CI environment (and in the case of Linux testing, getting a working virtual GUI environment running).
On the RADE side, some minor work was done as part of the C port to ensure that freedv-gui could still compile. This involved ensuring that files weren’t defined more than once, as well as removing the version of libopus built by FreeDV in favor of the RADE version.
Further improvements will be made in our testing process over the next few months to ensure that freedv-gui produces the best result from RADE and integrates functionality currently missing from RADE (such as reporting of received callsigns).
More information can be found in the commit history below:
This month I conducted a successful test of the Baseband FM (BBFM) waveform, over a short UHF radio link on my bench. This demonstrates high quality, 8000 Hz audio bandwidth speech being transmitted over the air (OTA) using commodity FM UHF radios and machine learning. It’s early days, but the speech quality already appears competitive with analog FM and any VHF/UHF digital voice system I am aware of.
Here is a sample of the “ideal” BBFM audio (a perfect channel), and the audio through the UHF radio link. The initial word “G” is missing due to a sync issue that will be cleaned up soon.
The experimental system was a Yaesu FT-817 with a Rigblaster USB sound interface as the transmitter into a dummy load, and a Yaesu VX3 handheld with a USB dongle sound card as the receiver. I used the Python “passband” modem developed last month so the signal could be sent over the regular 300-3000 Hz audio bandwidth that commodity FM radios provide (i.e. no DC coupling to the FM modulator or special mods).
To test the modem I can send BPSK symbols instead of ML symbols; in this case I could see a bit of distortion on the scatter diagram. However, when I plug the ML symbols back in, the audio sounds just fine, indicating the system is quite robust to this distortion, as expected. It’s early days so I haven’t set the deviation carefully or fine tuned the system, but this is a fine start.
C Port of Core ML
The next chunk of work from November was a C port of the Python core encoder and decoder at the heart of the RADE system. Fortunately, this is very close to RDOVAE that is now part of Opus, so much of the Opus ML code could be re-used, with the main change being a new set of weights. The C port resulted in a significant drop in CPU load, in fact it’s now hard to measure on my laptop.
Profiling suggests the remaining receiver Python DSP now dominates the CPU load. However I am reluctant to port this to C as (a) it’s complicated so this would take months and (b) I have some improvements planned for RADE V2 which, if successful, will make much of this DSP unnecessary.
End of Over Text
Unlike earlier FreeDV modes, RADE V1 does not, at present, have a way of sending small amounts of text over the channel (alongside the voice). This is particularly useful for “spotting” RADE signals, e.g. on FreeDV Reporter and PSK Reporter. We have plans for a text system in RADE V2, but this is several months away. As an interim solution for RADE V1, we are building up a text system that uses the currently empty “End of Over” frame to send digital data. It turns out we have room for 180 bits there, so every time an over ends, a chunk of text can be sent. I have developed the modem DSP side of this, and it seems to work OK on simulated fading channels at moderate SNRs (e.g. 6dB).
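As a purely illustrative sketch of what a 180 bit payload can carry (the actual RADE EOO bit layout, character set, and error protection are not specified here and will differ), a 6-bit character code packs 30 characters into the frame:

```python
# Hypothetical 6-bit character set: space, A-Z, 0-9, and a few symbols.
CHARSET = " ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789/-."

def pack_eoo_text(text, n_bits=180, bits_per_char=6):
    """Pack a short text string into an n_bits payload, MSB first."""
    max_chars = n_bits // bits_per_char  # 30 characters in 180 bits
    bits = []
    for ch in text.upper()[:max_chars]:
        idx = CHARSET.find(ch)
        if idx < 0:
            idx = 0  # map unknown characters to space
        bits.extend((idx >> b) & 1 for b in reversed(range(bits_per_char)))
    bits.extend([0] * (n_bits - len(bits)))  # zero-pad to the full payload
    return bits

def unpack_eoo_text(bits, bits_per_char=6):
    """Inverse of pack_eoo_text; trailing padding decodes to spaces."""
    chars = []
    for i in range(0, len(bits), bits_per_char):
        idx = 0
        for b in bits[i:i + bits_per_char]:
            idx = (idx << 1) | b
        chars.append(CHARSET[idx] if idx < len(CHARSET) else " ")
    return "".join(chars).rstrip()
```

That is comfortably enough for a callsign plus a short free-text field per over.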
Conference Paper
Finally, I have been working on a conference paper on the HF RADE system. This is new technology for HF speech communications, and combines several disparate technologies, e.g. machine learning, speech coding, OFDM, and HF radio. So I am putting in some effort to document and publish the work in a paper, hopefully at a conference (TBD) in 2025.
This month, additional work was done to clean up bugs encountered in the C API for the RADE library. One bug in particular involved an interaction between the threading already present in freedv-gui and the threads Python itself creates (i.e. for PyTorch); this bug resulted in extremely slow operation and even deadlocks in some cases.
Once this work was completed, it was time to integrate it into RADE. This work culminated in the first preview release of FreeDV with RADE support. Initial feedback thus far has been extremely positive, indicating that we’re on the right track with meeting the goals set out in the ARDC grant.
Further work over the next few months will involve fixing bugs discovered by users of this preview release as well as work on adding missing functionality (such as received callsign reporting) and a port of the main logic in the library to C to reduce/eliminate the need for Python.
More information can be found in the commit history below: