Fast Convolution Filtering with Teensy 4.0 and audio board

Status
Not open for further replies.
You can mount the audio board below or above the Teensy.
The only thing to pay attention to is that the pin labels (Gnd, 0, 1 ... 12, etc.) correspond exactly. That is why they are printed on the PCBs.

I myself prefer the Teensy above the audio board, even if the uSD slot is hard to reach.
 
@DD4WH:
Could you post which changes have to be made to the code
"Fast Convolution Filtering with Teensy 4.0 and audio board (Github)"
for using the T4-audio shield rev D instead of the separate DAC and ADC?
 
Please read post #91.
Your question is also answered in the comments in the code itself.
 
DD4WH:
Compiling the code as it is I get this error message:
....
c:/program files (x86)/arduino/hardware/tools/arm/bin/../lib/gcc/arm-none-eabi/5.4.1/../../../../arm-none-eabi/bin/ld.exe: o:\TEMPOR~1\arduino_build_502961/Teensy_Convolution_filtering.ino.elf section `.bss.dma' will not fit in region `RAM'
c:/program files (x86)/arduino/hardware/tools/arm/bin/../lib/gcc/arm-none-eabi/5.4.1/../../../../arm-none-eabi/bin/ld.exe: region `RAM' overflowed by 10400 bytes
collect2.exe: error: ld returned 1 exit status
 
Which Teensy version did you choose in Teensyduino? Which Teensyduino version do you use? Which sketch did you use? Give complete code or EXACT link to the code.
 
Try the code inside the folder guitar_cabinet_impulse: copy all the files into a folder, not only the .ino-file
 
Yes, guitar compiles.
The not-compiling uniformly... file is moved by the IDE to the sketch folder (which is on a separate drive).
 
@DD4WH, I'm wondering if you could help me clarify a few things about the arm_cfft_f32() function. Its documentation seems to assume much more context than I have. I'm looking to modify the code to work with a single channel (so that I can apply two different IRs to the two channels separately).

I'm looking at the code and it goes like:
Code:
// calculation is performed in-place the FFT_buffer [re, im, re, im, re, im . . .]
arm_cfft_f32(S, fftin, 0, 1);

My question is about those real and imaginary parts. Is that function designed to work on stereo only? Should I always have the data for both channels in all the structures being passed to it?

Thanks!
 
@hoho: sorry, but you would have to explain more precisely what you are trying to do. The original code applies an impulse response to the left audio channel and the same impulse response is applied to the right audio channel.
So, what do you want to do? Apply one IR to the left channel and a different IR to the right channel? You would have to rework the code for that and of course you would need additional memory, because you need the memory for TWO IRs.
The current code uses a complex-to-complex FFT (arm_cfft_f32), so it is possible to get two channels IN and two channels OUT in order to minimize processing load. If you use a real-to-complex FFT (arm_rfft_f32), you could theoretically use an FFT of half the size, if you only want one channel IN, one channel OUT. However, you do not save half of the processor cycles, but only 30%, and you do not save memory at all, because the buffers for the real-to-complex FFT have to be the same size . . . so I do not know whether it is worth applying a real-to-complex FFT in this case.
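
A minimal sketch of the "two channels IN, two channels OUT" idea, under some loud assumptions: a naive O(N^2) DFT stands in for arm_cfft_f32 so that it runs on a PC, double precision replaces f32, and the names dft, convolve_stereo and circular_convolve are purely illustrative (they are not taken from the repository). The left channel goes into the real part and the right channel into the imaginary part; because the impulse response is real, the two results come back separated in the real and imaginary parts.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;

// Naive DFT used as a stand-in for a real FFT routine (assumption: PC test only).
// inverse == true uses the conjugate kernel and divides by N.
std::vector<cplx> dft(const std::vector<cplx>& x, bool inverse) {
    const std::size_t n = x.size();
    const double pi = std::acos(-1.0);
    const double sign = inverse ? 1.0 : -1.0;
    std::vector<cplx> out(n);
    for (std::size_t k = 0; k < n; ++k) {
        cplx acc(0.0, 0.0);
        for (std::size_t t = 0; t < n; ++t) {
            const double phi = sign * 2.0 * pi * static_cast<double>((k * t) % n)
                               / static_cast<double>(n);
            acc += x[t] * std::polar(1.0, phi);
        }
        out[k] = inverse ? acc / static_cast<double>(n) : acc;
    }
    return out;
}

// Direct circular convolution, only used as a reference for checking.
std::vector<double> circular_convolve(const std::vector<double>& x,
                                      const std::vector<double>& h) {
    const std::size_t n = x.size();
    assert(h.size() == n);
    std::vector<double> y(n, 0.0);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t t = 0; t < n; ++t)
            y[(i + t) % n] += x[i] * h[t];
    return y;
}

// Filter both channels with the SAME real impulse response h using one
// complex transform: left -> real part, right -> imaginary part.
void convolve_stereo(const std::vector<double>& left, const std::vector<double>& right,
                     const std::vector<double>& h,
                     std::vector<double>& out_left, std::vector<double>& out_right) {
    const std::size_t n = left.size();
    assert(right.size() == n && h.size() == n);
    std::vector<cplx> x(n), hc(n);
    for (std::size_t i = 0; i < n; ++i) {
        x[i] = cplx(left[i], right[i]);
        hc[i] = cplx(h[i], 0.0);
    }
    std::vector<cplx> X = dft(x, false);
    const std::vector<cplx> H = dft(hc, false);  // would be precomputed in practice
    for (std::size_t k = 0; k < n; ++k) X[k] *= H[k];
    const std::vector<cplx> y = dft(X, true);
    out_left.resize(n);
    out_right.resize(n);
    for (std::size_t i = 0; i < n; ++i) {
        out_left[i] = y[i].real();   // left channel convolved with h
        out_right[i] = y[i].imag();  // right channel convolved with h
    }
}
```

Calling convolve_stereo(left, right, h, out_left, out_right) with equal-length vectors should give the same result as two separate circular convolutions. Note that this is circular convolution on one block; the partitioning and overlap handling of the real sketch are left out.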

@ItsMe: hmm, there is no code with that name in the given github repo . . . so I do not know what you mean. Does the code in one of the repositories compile for your Teensy 4.0?
If you only want to filter audio with Fast Convolution, you only need to take this code from the first post #1 (https://forum.pjrc.com/threads/5726...nd-audio-board?p=212845&viewfull=1#post212845)
However, it will only compile for Teensy 4.0, not for any other Teensy model!
 
@DD4WH: I am referring to this code from the repo: "uniformly_partitioned_convolution.ino", which gives an error message about not enough memory. The guitar .ino compiles.


@DD4WH: I will give this a try.

Thank you very much.
 
@DD4WH, thanks, I think arm_rfft_f32() is the droid I was looking for! :).

I connect one instrument to one channel and another instrument to the second channel, and I want different IRs for the connected instruments (say, one for the guitar and another for the ukulele). I realize that the memory limits will constrain the length of the IRs if I want to apply both simultaneously. I'm not looking to optimize processor usage (given that 22016 taps take 93.48% of the memory and only 50.62% of the processor). But I certainly don't want to process stereo only to pick one channel in the end. So, I will try to rework the code for arm_rfft_f32().

Thanks!
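
For the two-instrument case described above, one possible alternative to switching to a real FFT is to keep the single complex transform and separate the two channel spectra using the conjugate symmetry of real signals. This is a standard DFT identity, not what the original sketch does, and everything below is a hedged illustration: a naive DFT stands in for arm_cfft_f32, double precision is used, and the names dft, circular_convolve and convolve_two_irs are hypothetical. It still needs memory for two IR spectra, as noted earlier in the thread.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;

// Naive DFT used as a stand-in for a real FFT routine (assumption: PC test only).
std::vector<cplx> dft(const std::vector<cplx>& x, bool inverse) {
    const std::size_t n = x.size();
    const double pi = std::acos(-1.0);
    const double sign = inverse ? 1.0 : -1.0;
    std::vector<cplx> out(n);
    for (std::size_t k = 0; k < n; ++k) {
        cplx acc(0.0, 0.0);
        for (std::size_t t = 0; t < n; ++t) {
            const double phi = sign * 2.0 * pi * static_cast<double>((k * t) % n)
                               / static_cast<double>(n);
            acc += x[t] * std::polar(1.0, phi);
        }
        out[k] = inverse ? acc / static_cast<double>(n) : acc;
    }
    return out;
}

// Direct circular convolution, only used as a reference for checking.
std::vector<double> circular_convolve(const std::vector<double>& x,
                                      const std::vector<double>& h) {
    const std::size_t n = x.size();
    assert(h.size() == n);
    std::vector<double> y(n, 0.0);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t t = 0; t < n; ++t)
            y[(i + t) % n] += x[i] * h[t];
    return y;
}

// Apply h_left to the left channel and h_right to the right channel with one
// forward and one inverse complex transform of the packed signal left + j*right.
void convolve_two_irs(const std::vector<double>& left, const std::vector<double>& right,
                      const std::vector<double>& h_left, const std::vector<double>& h_right,
                      std::vector<double>& out_left, std::vector<double>& out_right) {
    const std::size_t n = left.size();
    assert(right.size() == n && h_left.size() == n && h_right.size() == n);
    std::vector<cplx> x(n), h1(n), h2(n);
    for (std::size_t i = 0; i < n; ++i) {
        x[i] = cplx(left[i], right[i]);
        h1[i] = cplx(h_left[i], 0.0);
        h2[i] = cplx(h_right[i], 0.0);
    }
    const std::vector<cplx> X = dft(x, false);
    const std::vector<cplx> H1 = dft(h1, false);  // IR spectra would be precomputed
    const std::vector<cplx> H2 = dft(h2, false);
    const cplx j(0.0, 1.0);
    std::vector<cplx> Y(n);
    for (std::size_t k = 0; k < n; ++k) {
        const cplx xc = std::conj(X[(n - k) % n]);
        const cplx L = 0.5 * (X[k] + xc);        // spectrum of the left channel
        const cplx R = (X[k] - xc) / (2.0 * j);  // spectrum of the right channel
        Y[k] = L * H1[k] + j * R * H2[k];        // repack the filtered channels
    }
    const std::vector<cplx> y = dft(Y, true);
    out_left.resize(n);
    out_right.resize(n);
    for (std::size_t i = 0; i < n; ++i) {
        out_left[i] = y[i].real();
        out_right[i] = y[i].imag();
    }
}
```

The price is a few extra complex operations per bin for the separation step, in exchange for not running two separate transforms. Whether that trade-off pays off on the Teensy 4.0 would have to be measured.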
 
As I mentioned, using a real-to-complex FFT with the CMSIS functions will not save memory, if I remember correctly from my experiments with the arm_rfft function: the output buffer has to be two times the length of the FFT and you need a separate input buffer, whereas in a complex-to-complex FFT the processing is done in place in one buffer of size 2 * FFT length. And, as you mentioned, memory is the limiting factor here, not processing speed. So it might not be worth playing with rfft, except of course for didactic reasons.
 
May I put forward a slightly different interpretation? If one has only real-valued samples, then rfft is always the better choice. You can do the FFT roughly 2 times faster and you only get the frequencies that matter (e.g. for a spectrogram). No need to waste an imaginary vector and discard the negative frequencies after the FFT.
 

Fully agreed, Walter. But if you want to use the CMSIS functions for real FFT, you do not save memory because the CMSIS function still wants a buffer of the same size as in the complex FFT (even worse, you need an additional buffer of 1*FFT length). But maybe I am doing something wrong in using these functions?

I have a sketch here comparing real and complex FFT, where you can see the different buffer requirements:

https://forum.pjrc.com/threads/58054-CMSIS-5-3-0-and-CMSIS_DSP-1-7-0-on-teensy-4?p=219628&viewfull=1#post219628

Processing speed will probably be faster with the real FFT, that's right.
 
But maybe I am doing something wrong in using these functions?
I'm pretty sure of that :) Sorry.
The CMSIS rfft implementation complies with standard usage.
E.g. a 1024-point real data vector is treated as a 512-point complex vector, and a 512-point complex FFT is executed. The result is then unscrambled to generate the correct 512 complex spectral values.
The trick is the unscrambling.
The rfft instance structure, which holds info on the 512-point cFFT and the unscrambling operation, may be larger than the 1024-point cFFT instance structure. I have not checked the sizes.
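
The "half-length complex FFT plus unscrambling" idea can be sketched as follows. This is only an illustration of the textbook technique, not the CMSIS source: a naive DFT stands in for the complex FFT, double precision is used, the names dft and rfft_via_half_length_cfft are made up for this example, and the output layout (bins 0 .. N/2-1, Nyquist bin left out) is an assumption that does not claim to match the CMSIS buffer format.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;

// Naive DFT used as a stand-in for a real FFT routine (assumption: PC test only).
std::vector<cplx> dft(const std::vector<cplx>& x, bool inverse) {
    const std::size_t n = x.size();
    const double pi = std::acos(-1.0);
    const double sign = inverse ? 1.0 : -1.0;
    std::vector<cplx> out(n);
    for (std::size_t k = 0; k < n; ++k) {
        cplx acc(0.0, 0.0);
        for (std::size_t t = 0; t < n; ++t) {
            const double phi = sign * 2.0 * pi * static_cast<double>((k * t) % n)
                               / static_cast<double>(n);
            acc += x[t] * std::polar(1.0, phi);
        }
        out[k] = inverse ? acc / static_cast<double>(n) : acc;
    }
    return out;
}

// Spectrum of N real samples from one N/2-point complex transform.
// Returns bins 0 .. N/2-1 (the Nyquist bin k = N/2 is omitted in this sketch).
std::vector<cplx> rfft_via_half_length_cfft(const std::vector<double>& x) {
    const std::size_t n = x.size();
    assert(n >= 2 && n % 2 == 0);
    const std::size_t m = n / 2;
    const double pi = std::acos(-1.0);

    // Step 1: treat the real vector as m complex values (even -> re, odd -> im).
    std::vector<cplx> z(m);
    for (std::size_t i = 0; i < m; ++i) z[i] = cplx(x[2 * i], x[2 * i + 1]);

    // Step 2: one complex transform of half the length.
    const std::vector<cplx> Z = dft(z, false);

    // Step 3: unscramble. Split Z into the spectra of the even and odd samples,
    // then combine them with the twiddle factor exp(-2*pi*j*k/N).
    const cplx j(0.0, 1.0);
    std::vector<cplx> X(m);
    for (std::size_t k = 0; k < m; ++k) {
        const cplx zc = std::conj(Z[(m - k) % m]);
        const cplx even = 0.5 * (Z[k] + zc);
        const cplx odd = (Z[k] - zc) / (2.0 * j);
        const cplx twiddle = std::polar(1.0, -2.0 * pi * static_cast<double>(k)
                                                 / static_cast<double>(n));
        X[k] = even + twiddle * odd;
    }
    return X;
}
```

For a 1024-sample real input this runs a 512-point complex transform and returns 512 complex bins, which matches the sizes mentioned above.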

BTW, for fast convolution, the unscrambling may not be needed if the spectral multiplication is replaced by a somewhat more involved operation.
 
The CMSIS documentation says (https://www.keil.com/pack/doc/CMSIS/DSP/html/group__RealFFT.html) the real length N FFT uses a buffer of size N (because it holds the complex output values of a length N/2 internal FFT). That is something I can understand.
However, the CMSIS arm_rfft_f32 routine wants an output buffer of size N * 2, otherwise it refuses to work (the code compiles, but gets stuck when run, see the code linked to in post #119). That is something I do not understand, but apart from that everything seems to work fine, if the buffer is large enough. And it is for this peculiarity that I think it is not worth doing real FFT with the CMSIS function, if you want to save memory.
 
I was curious: I see that you can generate your filter parameters with a function call/method. Can the code be modified to produce Hilbert filters with +/- 45 deg phase offset? It would be a nice feature to produce SDR front-end Hilbert filters on the fly. I'm processing in the time domain on my SDR right now, and I understand processing in the frequency domain conceptually, but I'm still reviewing the code. Great job by everyone on here, and thanks for putting all these code examples up.
 
OK, Frank.
The RFFT really calls an N/2 complex FFT (generating N/2 frequencies). To go back to a real time series, you take the N/2 frequencies, after manipulation, and come back to N real numbers.
 
@Keith: My part in all of this was to port DD4WH's work from his inline-sketch code into a PJRC Audio library function. I'm not an expert on filter math, so I can't help with the Hilbert question. But for the segmented FFT, you have to tell Matlab (for instance) to generate minimum-phase FIR coefficients, or the low latency of this filter will be lost. I would guess that Matlab could generate the Hilbert filter coefficients, but I'm retired now and no longer have the Matlab program I used to have available at the university where I worked.
 
Thanks for the insight. Yes, Matlab is in the home here (my daughter is in engineering at the University of Waterloo). If I remember correctly you are in Halifax; I'm west of Toronto. Cheers, and thanks. My DSP skills are lacking, as I have not worked with DSP for several years, so I'm playing catch-up and grabbing some online courses.
 