Floating-Point Audio Library Extension

I can't answer regarding these forked copies, but I can tell you that just yesterday I got I2S slave mode working on Teensy 4.0 with the original audio library. The code is on github now and will be in the 1.49 release. As you can see in this commit, the code was mostly correct; the thorny parts were small details like configuring the input select registers and getting the TCR4 FSD bit correct on the receive vs. transmit side (one side syncs with the other).

When (if) anyone does decide to support I2S slave mode on any forked copy, hopefully this known good code will help.
 
Just some thoughts after browsing github.

It would be a good start for the 16-bit to whatever-bit upgrade if the lower-level drivers supported only a 32-bit word size and DMA, with 16-bit or 32-bit floating-point libraries built on top of the 32-bit DMA buffers to handle the conversion from 32-bit to 16-bit or to 32-bit floating point. (The drivers in question are I2S and TDM; PCM is not supported, meaning MSB left-justified, positive frame sync, and no 1 BCLK delay at the beginning of the frame?) Even though the 16-bit libraries would still be used for audio processing, the summing (mixing) could be implemented as a conversion from 16-bit to 32-bit integers, which would increase the headroom a lot. Likewise, synthesizer voices could be summed into 32-bit words even though the individual voices would still be processed with the 16-bit algorithms.
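
To make the headroom point concrete, here is a minimal sketch of summing 16-bit voices into a 32-bit buffer. The function name and signature are made up for illustration and are not part of any existing library.

```cpp
#include <cstddef>
#include <cstdint>

// Sum several 16-bit voices into a 32-bit output buffer.
// Each int16_t sample is widened to int32_t before adding, so the sum of a
// realistic number of full-scale voices cannot overflow.
inline void mix_i16_into_i32(const int16_t *const *voices, std::size_t nVoices,
                             int32_t *sum, std::size_t len) {
    for (std::size_t i = 0; i < len; ++i) {
        int32_t acc = 0;
        for (std::size_t v = 0; v < nVoices; ++v) {
            acc += voices[v][i];
        }
        sum[i] = acc;
    }
}
```

Two voices at 30000 each give 60000 here, where a 16-bit sum would have to clip or wrap.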

The core audio should have no dependencies on the audio library; at present audio_block_t is used everywhere. It would be easy to change the I2S drivers etc. to use their own 32-bit data definitions and leave the 16-bit audio library using the current 16-bit audio_block_t (which could be renamed to audio_block_i16_t, though).

Btw,
It seems the floating point library still converts everything back to i16 format, and also configures the I2S word length as 16-bit:

I2S0_TCR5 = I2S_TCR5_WNW(15) | I2S_TCR5_W0W(15) | I2S_TCR5_FBT(15);

void AudioOutputI2S_F32::convert_f32_to_i16(float32_t *p_f32, int16_t *p_i16, int len) {
    for (int i=0; i<len; i++) {
        *p_i16++ = max(-32768,min(32768,(int16_t)((*p_f32++) * 32768.f)));
    }
}
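
A side note on the quoted conversion: the cast to int16_t happens before min/max, so the clamp never sees an out-of-range value, and the upper bound 32768 does not fit in an int16_t anyway. One possible fix, as a sketch only (the function names below are mine, not the library's), is to clamp the float first and cast afterwards:

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Convert one float sample (nominal range -1.0 to +1.0) to int16_t.
// The clamp runs on the float value, before the narrowing cast.
inline int16_t f32_to_i16_sat(float x) {
    float scaled = x * 32768.0f;
    scaled = std::min(32767.0f, std::max(-32768.0f, scaled));
    return static_cast<int16_t>(std::lrint(scaled));
}

// Convert a whole buffer with the saturating helper above.
inline void convert_f32_to_i16_sat(const float *src, int16_t *dst, std::size_t len) {
    for (std::size_t i = 0; i < len; ++i) {
        dst[i] = f32_to_i16_sat(src[i]);
    }
}
```

With this version an input of +1.0 or more saturates at 32767 and -1.0 or less at -32768.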
 
I am exploring the rewrite of an old project, the DSP-10 radio. I have had good experience with the T3.6 floating point radio receiver that Frank DD4WH put together. But I would also like to build radio-/signal-related float objects, such as Chip Audette has done per this thread. The combination would seem to be a winner. I am currently playing with a couple of new float blocks that I am constructing to Chip's format. That seems to be a viable approach, but it is not really integrated with the Teensy Audio Library.

Where does this stand? Should I manually combine the two library pieces, as in Chip's examples? Chip, what is the best way to combine radio pieces with your work? And then there is T4 compatibility; any thoughts out there? Thanks to all for the efforts on Teensy floating point. Bob Larkin W7PUA
 
The combination would seem to be a winner.

Totally agreed! Yes, that would be great! Some thoughts:

* I am puzzled how to use the floating point lib by Chip in combination with the frequent updates of Teensyduino. How could one switch between the fixed and floating point versions of the audio lib? For this, the floating point fork would have to be somehow merged/integrated into the official audio lib
* there are some nice modules by DerekR, I think he called them "Audio SDR", maybe those can be integrated too?
* we definitely need a newer CMSIS for T3.6 in the official audio lib. Without that, we will not proceed. For T4, no problem, the audio lib already has a newer version which works perfectly.
* T4 compatibility: There would have to be some changes/#ifdefs etc., but that should not be a major problem. To make my SDR code compatible with T4, it took "only" a few days of work
* and, more focused on SDR topics: we should think about using fast convolution for the major filtering tasks: FIR filtering in the time domain is much slower even for short filter lengths. But first of all: we would need decimation and interpolation modules in floating point, because with the CMSIS q15 fixed point decimation/interpolation routines, it seems we cannot reach the required audio quality.

Just my quick thought on your plans!
 
Well, there is a life beyond the Audio lib :)
In such projects (SDR) there is not much need for the lib - You just use the in/out parts (->no need for float there) and filters from it - right?
The lib has so much more that is not used. I'd imagine it wouldn't be much hassle to create a minimal, specialized lib.
You could even omit the whole "audio block system" in such a minimal lib and use a different approach.
How does it work on STM32?
 
on the STM32 we do not use such things as "audio libs" at all. It is just one interrupt/DMA system that gathers the audio blocks and all the processing is done inside a loop.

The Convolution SDR uses only the queue objects from the audio lib and everything else (including all the filters) is done by hand or with CMSIS routines.

So you think integration into the standard audio lib would be too much hassle and one should create a specialized floating point SDR lib instead, which builds on Chip's lib?
 
on the STM32 we do not use such things as "audio libs" at all. It is just one interrupt/DMA system that gathers the audio blocks and all the processing is done inside a loop.
You can do the same with a Teensy.

So you think integration into the standard audio lib would be too much hassle and one should create a specialized floating point SDR lib instead, which builds on Chip's lib?
No, not an integration. A new lib without dependencies on the audio library. But you could just copy the I2S code.
Or, no lib at all and just use the I2S code in your program.
 
...this way you can use any datatype, block size and sample frequency you want, without any fiddling.
On T4, you can even use different frequencies for input and output (by using two I2S interfaces), e.g. 192 kHz input and 22 kHz audio output (plenty for radio; might free some resources?).
Of course, you'll lose all the effects, filters, synths, mixers etc., but you don't need them for an SDR.
 
Hello to both Franks!! Good to be in touch again.

The DD4WH receiver seems to do somewhat what Frank B is suggesting? I have worked my way through that code and I think Frank DD4WH would agree that there is a non-trivial learning curve. It works and can be efficient in resources. This basic approach works, but is tough for casual experimentation.

I find not having a system to start with the big picture really overwhelms somebody getting started. Being able to have the Design Tool graphical interface, along with small functional blocks, makes a big difference in understanding. It places a burden on the block designer, especially if you want flexibility in block size, sample rate, decimation, etc. But it would be so neat to use.

I went and looked at DerekR's receiver. He has useful code, and a clean implementation, it would seem. But again he has the big SDR block that doesn't invite one to experiment and rebuild things.

We haven't mentioned it, but GNU radio does capture the spirit of small blocks and experimentation. It would seem great for simulation but it is not a drop in for Teensy.

Chip started to add a parameter object that had block size and sample rate control at compile time. This doesn't seem to be finished, but I wonder if this method could be useful? Maybe one could add decimation and the ability to track decimation time slots. But all this, if integrated into the 16-bit library, would need a way to hide it for regular audio applications, I think.

I'm trying to understand the possibilities, at this point. So thanks for giving this some thought and sharing. Bob
 
Elaborating a bit on decimation. I always seem to get involved with narrowing bandwidths to improve S/N, and this runs out of processor without decimation. Decimation has blocks running at sub-rates, which can be handled nicely with a variable that keeps track of which sub-block is being processed. So, I think you need a couple of variables in audio_block_f32_t to say what the decimation level is and which sub-block is being processed next.

But, this leads to a hodge-podge of objects that can handle some collection of changes in block size, sample rates, decimation and whatever. How might this be handled? Is it unrelated to Teensy Audio and belongs somewhere else? Back to the original question of some unifying plan, or not?

Bob
 
Bob, just some thoughts on this decimation block question:

* it could become even more complicated, because FrankB mentioned somewhere that it could also be possible and maybe helpful to have different ADC and DAC sample rates (only Teensy 4). So, maybe sample incoming wideband FM radio IQ signals at 256kHz and have the DAC output the processed audio at 16/32kHz -> we can get rid of the output interpolation block in that case, so saving some CPU cycles [I think the DAC often is a sigma-delta DAC with oversampling, so we do not need an analog reconstruction filter!? not sure about this]
* as far as I understand, the decimation blocks would also do the lowpass-filtering, not only the down-sampling. So, the sample rate would also have to be passed to the block in order for the block to be able to dynamically compute the filter coeffs for the lowpass filter, which are dependent on the sample rate and the desired bandwidth
* the decimation block could/should also use two-stage decimation (depends on the decimation factor) in order to save CPU cycles for the lowpass filters which can have fewer taps then

Trying to summarize the parameters to pass to a decimation block:
* input Block size
* decimation factor (do we allow for non-power-of-two factors???)
* desired sample rate in
* desired sample rate out (somewhat redundant...)
* desired bandwidth
* ...
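
As a rough sketch of how the parameter list above could be bundled and sanity-checked, assuming an integer decimation factor: the struct and all of its member names are hypothetical and do not come from any existing library.

```cpp
#include <cstdint>

// Hypothetical parameter set for a decimation block (illustration only).
struct DecimatorConfig {
    uint32_t blockSizeIn;   // samples per input block
    uint32_t factor;        // integer decimation factor, need not be a power of two
    float    sampleRateIn;  // input sample rate in Hz
    float    bandwidth;     // desired passband edge in Hz

    // Output rate follows from input rate and factor, so it is not stored.
    constexpr float sampleRateOut() const { return sampleRateIn / factor; }
    constexpr uint32_t blockSizeOut() const { return blockSizeIn / factor; }

    // True when the block divides evenly by the factor and the requested
    // bandwidth lies below the Nyquist frequency of the output rate.
    constexpr bool valid() const {
        return factor >= 1 && blockSizeIn % factor == 0 &&
               bandwidth > 0.0f && bandwidth < 0.5f * sampleRateOut();
    }
};
```

Because everything is constexpr, a configuration that is fixed in the sketch can be rejected by the compiler, e.g. `constexpr DecimatorConfig cfg{4096, 8, 96000.0f, 5000.0f}; static_assert(cfg.valid(), "bad decimator settings");`.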

Hmm, maybe this is not really compatible with a standard audio lib.

And: I really like your idea of having a GNU radio like style with small blocks that can be put together!
 
Bob, just some thoughts on this decimation block question:

* it could become even more complicated, because FrankB mentioned somewhere that it could also be possible and maybe helpful to have different ADC and DAC sample rates (only Teensy 4).

Yes, on different I2S interfaces (or mqs as output, maybe )
If you say that this would be useful, I could take a closer look and see what exactly is possible.
If it makes any sense, I can tune an I2S input for 24 or 32 bit (I don't believe that it would be useful, but you are the experts :)
 
...as I don't see much chance for this proposal here, I need something to play with :)
I can do a "SDR specialized audio-lib" - or better "SDR lib" with in-and outputs only as a first step.
Then, later, we can add more functionality and use a modified AUDIO-GUI to put things together.

I think that, without the needed massive changes, the audio lib has more or less reached the end of its development anyway. There will not be much progress in the next 1-2 years. Maybe there will be one or two new features. That's it.
 
It would be great to have you join the SDR efforts initiated by Bob! However, it would be necessary to very clearly structure things and be very clear in the goals we want to achieve and the target Teensys we want to cover (only 3.6 & T4.x ?).

Bob, also, I could imagine you have some specific requirements for DSP-10 ?
 
I'd say we can concentrate on the Teensy 4 only - There is a Teensy 4.1 with more pins & features in the pipeline. If I understand Paul correctly it's a matter of weeks/months.
This way we wouldn't have to worry about the limits (memory/speed) of Teensy 3.6, and maybe we can use double for some parts.
 
hmm, using double would make it necessary to rewrite large parts of the SDR code, because we would not be able to use CMSIS functions any more, is that true? At least I could not find routines for FFT, decimation etc. for double precision in the CMSIS lib. It may well be worth thinking about switching to the fftw lib for doing FFT, because it would allow using double and using more flexible FFT sizes (if that is necessary). However, as I said, a major rewrite.
Also, two aspects when thinking about double:
* what would we gain? Would the audio quality then be better in any respect? A year ago I was convinced, but now I am not so sure anymore that this is the case. For IIR filter coeffs, yes, it would make sense, but we would not use IIR filtering a lot, or would we?
* On T4: are double calculations faster than doing float calculations with optimized functions (sinf, for example?). I do not think so.
 
No, double is slower than float!
I don't know the CMSIS very well - if you say it does not support doubles, it may be easier not to use them.
Edit: You have mail.
 
Good morning! You two have been busy while I slept!

Why not just support 4.x? For me this would involve re-wiring the PCB for the Control Box
http://www.janbob.com/electron/SDR_Ctrl1/SDR_Ctrl1.html
That would be well worth it for the improvements in performance. I would wait for 4.1 for hardware changes, but 4.0 allows software playing.

Frank DD4WH-type, your object definitions with
Trying to summarize the parameters to pass to a decimation block:
* input Block size
* decimation factor (do we allow for non-power-of-two factors???)
* desired sample rate in
* desired sample rate out (somewhat redundant...)
* desired bandwidth
* ...
could be really powerful for somebody getting started, especially if it could be integrated into a Design Tool. You get an error when you try to defy physics before you even get to compile time.

One other thought: you can have an implementation of an IIR filter in double without needing more than floats for ins and outs. Almost nobody could need more than the 24 bits of dynamic range floats give for ins and outs. And then, can't you have an IIR filter with compile-time selection of the internal size? That makes experimenting easy, and we all think that is fun :)
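
A minimal sketch of that idea, assuming a plain direct-form-I biquad: the class below is hypothetical (not from CMSIS or any Teensy library) and takes the internal type as a template parameter, while input and output stay float.

```cpp
// Direct-form-I biquad with float input/output and a selectable internal type.
// Acc is the type used for coefficients and state, e.g. float or double.
template <typename Acc>
class Biquad {
public:
    Biquad(Acc b0, Acc b1, Acc b2, Acc a1, Acc a2)
        : b0_(b0), b1_(b1), b2_(b2), a1_(a1), a2_(a2) {}

    // y[n] = b0*x[n] + b1*x[n-1] + b2*x[n-2] - a1*y[n-1] - a2*y[n-2]
    float process(float in) {
        const Acc x = static_cast<Acc>(in);
        const Acc y = b0_ * x + b1_ * x1_ + b2_ * x2_ - a1_ * y1_ - a2_ * y2_;
        x2_ = x1_;
        x1_ = x;
        y2_ = y1_;
        y1_ = y;
        return static_cast<float>(y);
    }

private:
    Acc b0_, b1_, b2_, a1_, a2_;
    Acc x1_ = 0, x2_ = 0, y1_ = 0, y2_ = 0;
};

// Compile-time choice of the internal precision.
using BiquadF32 = Biquad<float>;
using BiquadF64 = Biquad<double>;
```

Swapping `BiquadF32` for `BiquadF64` in a sketch changes only the internal arithmetic, so the two can be compared on the same signal.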

Time for morning coffee... Bob
 
Yes, when I wrote "Teensy 4" I really meant "Teensy 4.x" :)
I'd use a FIFO for the data - something like the Audio Libs "Audioblocks", with inbuilt queue. And larger blocks.
DD4WH if I read your code correctly, you use 32*128 samples - is this correct? Does this have a mathematical reason?
 
the reason for using 32 * 128 samples is the following:
* I use fast convolution for complex filtering of the I&Q signals
* I would like to have a steep filter skirt
* thus I need a large FFT/iFFT size for brickwall filtering, I decided to go for 1024
* in order to lower the CPU load I decided to decimate the data BEFORE the FFT/iFFT fast convolution filtering with a decimation-by-8 [this also makes the filter skirts 8 times steeper :)]
* input sample rate: 96ksps -> decimated to 12ksps -> maximum audio bandwidth: 5kHz
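* input samples 32 * 128 decimated by 8 means 4 * 128 = 512 new samples per pass, i.e. half of the 1024-point FFT/iFFT (the other half presumably being the overlap that fast convolution needs)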
* if I want higher audio bandwidth for AM reception, I just switch the sample rate to a higher rate, decimation factor and everything else stays the same
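
The block arithmetic in the list above can be checked at compile time. This is only a sketch with made-up constant names; it shows that 32 blocks of 128 samples decimated by 8 leave 512 samples, which is half of a 1024-point FFT. Reading the other half as fast-convolution overlap is my assumption.

```cpp
#include <cstdint>

constexpr uint32_t kBlockLen     = 128;    // samples per audio block
constexpr uint32_t kBlocksIn     = 32;     // blocks gathered per pass
constexpr uint32_t kDecimation   = 8;      // decimation factor
constexpr uint32_t kFftLen       = 1024;   // FFT/iFFT size
constexpr uint32_t kSampleRateIn = 96000;  // input rate, samples per second

constexpr uint32_t kSamplesIn     = kBlocksIn * kBlockLen;        // 4096
constexpr uint32_t kSamplesDec    = kSamplesIn / kDecimation;     // 512
constexpr uint32_t kSampleRateDec = kSampleRateIn / kDecimation;  // 12000

static_assert(kSamplesDec == 512, "decimated samples per pass");
static_assert(kSamplesDec * 2 == kFftLen, "decimated block is half the FFT length");
static_assert(kSampleRateDec / 2 > 5000, "5 kHz audio fits below Nyquist");
```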

BTW, I wonder, if it is OK for you, Chip, to hijack your thread, or if we should open up a new thread!?
 
A question. This relates to being able to have multiple signal flow paths and being able to turn them on and off depending on things like AM, SSB, FM, Transmit/Receive. And, of course, we don't want to spend resources on blocks that are not being used.

For instance, each object could have a method "enable," or something, that could set/unset a private variable "enabled." The question is, if at the top of the update(), there was a statement, "if (!enabled) return;" would that still allow the rest of the audio streams to proceed? This would not execute transmit of data at the bottom. Is that an issue? Or is there a better way, altogether? Bob
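
To make the question concrete, here is a sketch of the pattern in plain C++. The base class is a stand-in written for this example and is NOT the real AudioStream API; all names are hypothetical. It shows the early return skipping both the processing and the transmit, so a downstream block sees "no data" for that cycle.

```cpp
#include <cstddef>

// Stand-in base class for illustration only; not the real AudioStream API.
struct BlockBase {
    virtual ~BlockBase() = default;
    virtual void update() = 0;
};

// Hypothetical gain block that can be switched off at run time.
class GainBlock : public BlockBase {
public:
    static constexpr std::size_t kBlockLen = 128;

    void enable(bool on) { enabled_ = on; }
    void setGain(float g) { gain_ = g; }
    void setInput(const float *in) { in_ = in; }

    // nullptr means "nothing was transmitted this cycle".
    const float *output() const { return produced_ ? out_ : nullptr; }

    void update() override {
        produced_ = false;
        if (!enabled_ || in_ == nullptr) return;  // skip both the work and the transmit
        for (std::size_t i = 0; i < kBlockLen; ++i) {
            out_[i] = gain_ * in_[i];
        }
        produced_ = true;
    }

private:
    bool enabled_ = true;
    bool produced_ = false;
    float gain_ = 1.0f;
    const float *in_ = nullptr;
    float out_[kBlockLen] = {};
};
```

Whether skipping the transmit is harmless then depends on every downstream block checking for a missing input before using it.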
 
Referencing back to the original topic of this thread... just wanted to share that most of the floating point audio library extension is working for me on a Teensy 4.0. I just removed the hardware-specific parts, e.g. I2C I/O. Since the ADCs and DACs I'm using are 16 bit, I just use the standard integer library for I/O, then convert to/from float for processing. Floating point filters and analysis are essential for the low-frequency biometrics projects I'm working on these days. Thanks, Chip!
 
It has been a dormant period for this topic, but not for the library. The floating point Teensy Audio-like library, aka "OpenAudio_ArduinoLibrary" has seen a lot of fixing and additions. I think it is mature enough to be really useful for audio projects. The library is at https://github.com/chipaudette/OpenAudio_ArduinoLibrary More is being added, but my original goal of being able to do projects like the DD4WH SDR type is about there.

* All (I hope) the classes are compatible with T3.x and T4.x.
* Support for varying sample rates via the Settings and I2S I/O is working.
* A collection of SDR "radio" classes has been added.
* A group of complex input FFTs has been added that, when used with I-Q SDRs, double the frequency range.
* The complex FFTs are now 256, 1024, 2048 and 4096 in size, at least for the T4.x.
* Inter-mixing of conventional Teensy I16 classes and these F32 ones works well. Very useful.
* There are several classes, like the sine-cosine generator, that use the F32 precision to lower distortion.
* Use of data block sizes other than 128 is in part of the library. I recommend using 128 if possible.
* There is no support for decimation (multiple sample rates) between objects. I wish this was simple, but it isn't for me!

Then, there is a Design Tool, http://www.janbob.com/electron/OpenAudio_Design_Tool/index.html This has a lot of documentation included in the right side help panels. Jannik is helping by providing an enhanced Design Tool that would track data paths by data type, allowing mixes of I16 and F32 objects. When finished, that will replace the one in the link.

Also, there is a group, https://groups.io/g/keithsdr that is doing a Teensy 4.1 radio project that is using the library.

Importantly, note that this library reflects the help from many contributors. Huge thanks for all the help.

Have fun. Bob
 
An update on an addition to the F32 library. An FM detector, called "RadioFMDetector_F32", has a single low I-F input, such as at 15 kHz. There are two outputs, one being continuous audio to drive tone decoders and such, and the other output being gated by the squelch. The algorithm uses a precision arc-tangent phase detector followed by a differentiator. This is very linear and accurate. A programmable low-pass filter follows the detector and includes standard NBFM de-emphasis. The squelch is noise derived, with a band-pass filter above the voice range feeding a full-wave squelch detector. The squelch low-pass is programmable to adjust squelch tails.
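
For readers new to the technique, here is a generic sketch of an arc-tangent phase detector followed by a differentiator. It assumes complex (I/Q) samples are already available and is NOT the code of RadioFMDetector_F32; the class name is made up.

```cpp
#include <cmath>

constexpr float kPiF = 3.14159265358979f;

// Generic FM discriminator: phase from atan2, then a first difference.
class FmDiscriminator {
public:
    // Returns the phase step between successive samples in radians,
    // which is proportional to the instantaneous frequency.
    float process(float i, float q) {
        const float phase = std::atan2(q, i);
        float d = phase - prevPhase_;
        prevPhase_ = phase;
        // Unwrap so the difference stays within -pi..+pi.
        if (d > kPiF) {
            d -= 2.0f * kPiF;
        } else if (d < -kPiF) {
            d += 2.0f * kPiF;
        }
        return d;
    }

private:
    float prevPhase_ = 0.0f;
};
```

Feeding it a constant-frequency complex tone gives a constant output, which is the linearity property mentioned above.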

There is a ReceiverFM.ino example that shows how to use all this.

A companion phase/frequency modulator is in the works as is a tone and DTMF detector.

Information on use and functions is at the F32 design tool http://www.janbob.com/electron/OpenAudio_Design_Tool/index.html?info=RadioFMDetector_F32
 
Hi to All - I have been away from most of the electronic stuff for the last 6-months. I'm happy to report that my wife is happy with the repairs for her broken hip and I am equally happy with my new pacemaker to fix a broken AV connection. We are both back 100% and ready to get with it.

I was able to get some time to play with the radio related floating-point blocks. Most traditional radio functions are in the library at this point and it is being used for several SDR projects. But for now, I just wanted to report on a new sub-audible tone detector. This CTCSS (see Wikipedia) system is very common on analog FM systems and this goes with the AM-FM generator and the FM detector classes that make the needed functions for a full (narrow band) FM system.

The elements of the CTCSS detector block/class include
* It works for all frequencies from 67 to 254 Hz, i.e., all 50 channels.
* Decimation by 16 allows good accuracy and minimal processor load.
* The Goertzel filter is narrow enough to only include one channel.
* The 23/24 bits of F32 have not been an issue, at least with the decimation that is used.
* A somewhat different approach is the use of a reference channel covering the 67-254 Hz band. This includes four staggered notches on the tone frequency.
* The Goertzel algorithm includes a "trick" so that fractional cycles of the tone give the proper answer.
* Built-in filtering supports 44.1, 48, 96 and 100 kHz sampling rates. Other rates need a few .ino supplied filter coefficients.
* Reliable detection of tones works fine on signals that are too weak for voice to be heard.
* For convenience, signals can be gated by the tone presence through the detector block.
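
For reference, a textbook Goertzel power estimate looks roughly like the sketch below. This is the plain algorithm only; it does not include the fractional-cycle "trick", the reference channel or the decimation of analyze_CTCSS_F32, and the function name is made up.

```cpp
#include <cmath>
#include <cstddef>

// Textbook Goertzel: squared magnitude of the component at freqHz
// in the n samples x[0..n-1], sampled at sampleRateHz.
inline float goertzel_power(const float *x, std::size_t n,
                            float freqHz, float sampleRateHz) {
    const float w = 2.0f * 3.14159265358979f * freqHz / sampleRateHz;
    const float coeff = 2.0f * std::cos(w);
    float s1 = 0.0f;  // s[n-1]
    float s2 = 0.0f;  // s[n-2]
    for (std::size_t i = 0; i < n; ++i) {
        const float s0 = x[i] + coeff * s1 - s2;
        s2 = s1;
        s1 = s0;
    }
    return s1 * s1 + s2 * s2 - coeff * s1 * s2;
}
```

For a unit-amplitude tone that completes a whole number of cycles in the window, the result is close to (n/2) squared at the tone frequency and close to zero at other whole-cycle frequencies.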

The class is analyze_CTCSS_F32 and the files .h and .cpp have the same name. These are at Chip's OpenAudio_ArduinoLibrary and the support Design Tool is http://www.janbob.com/electron/OpenAudio_Design_Tool/index.html. Look for Analyze/toneCTCSS on the left side.

Cheers, Bob
 