Fast Convolution filtering in floating point with Teensy 3.6

Status
Not open for further replies.
wow, Walter, that´s a lot of code to digest! Will take me a while to think through it and test on my machine.

Just two quick Q:

- the lowpass filter cutoff freq is 0.6 * sample rate / 2, right?

- the IQ signal frequency (within the noise) is 48 /128 * sample rate = 0.375 * sample rate


Q1 So, the signal should show up well in the spectrum at 0.375, but it shows up at 0.75 (factor 2!?) . . .

Q2 Did you also test whether a signal of say 0.8 would be suppressed by the filter? Because that would resemble the problem with the performance at higher freqs

Frank,
Following Matlab convention I normalize the filter to Nyquist so fc = 1 is Nyquist (in your case it is 0.5)
On figure 1 is Nyquist

A signal 48/128 has 48 cycles in 128 samples so the frequency is about 16.5 kHz or 0.75 Nyquist as you see on spectrogram.
In fact the figures show the original data with signal and the filtered data (signal suppressed)
 
I uploaded my files onto GitHub https://github.com/WMXZ-EU/SDR
I actual version uses
- periodic timer to simulate acquisition
- software interrupt to process data

Even if the repository is called SDR the actual program is only a filter.

next step could be to interface to audio card
 
next step could be to interface to audio card
The limit of the audioshield, MIC, is ~230000 sampels/sec. (did not test line-in)

But we have the way faster ADC - i'm working on a "fast_adc" input for the audio.lib.
On T3.6, according to the manual, the limit is 1.2MSamples/sec - (we can try overclocking - wonder how much is possible without loosing too much precision)
 
I uploaded my files onto GitHub https://github.com/WMXZ-EU/SDR
I actual version uses
- periodic timer to simulate acquisition
- software interrupt to process data

added feature:
Can now specify high-pass filter (with Kaiser window)

noise_48o128._128_10_10_HP.jpg

will next think about band-pass filter
 
Last edited:
The limit of the audioshield, MIC, is ~230000 sampels/sec. (did not test line-in)

But we have the way faster ADC - i'm working on a "fast_adc" input for the audio.lib.
On T3.6, according to the manual, the limit is 1.2MSamples/sec - (we can try overclocking - wonder how much is possible without loosing too much precision)

on-board SAR will need either Anti-aliasing filter,
or could try a decimating low/bandpass filter (poor-man's sigma-delta replacement), but then time-domain FIR may be more efficient.
 
Walter, thats great! As I interpret it, you are no longer seeing the effect of unsufficient stopband attenuation in the high frequencies? Very good! That would mean, that there still is a bug in my code, that produces this. Will continue looking for that!

It is however very hard for me to grasp what you do in your code. I have no idea, what all this interrupt stuff does and how it works. And also I must admit, my knowledge of working with pointers is rather limited, so I will need some days to understand how your pointers work and which variables are dealt with in the FFT filtering . . .
 
OK, learned a lot about pointers while looking at your code! ;-) Also I now understood well your FFT filtering subroutine.

And: I found my bug! It was connected with copying the FIR coeffs into the filter mask (i+=2 instead of i++ in the loop)! Now stopband attenuation works very nicely! (I have corrected it in one of my first posts)

@Walter: Thanks a lot for all the good coding and MATLABing, that kept my motivation on a high level, so I could finally find my bug!

I am not yet decided whether I should really try to reduce the overlap, which is 50% in my code and 25% in your code. It surely depends on the application. If I want the steepest filter for my radio, a 50% overlap would allow me to use a 2048tap-Filter with a 4096 point FFT: with 25% overlap the max no. of taps would be 1024 with a 4096point FFT. However, if I have additional noise reduction algorithms etc. running in the background, 25% overlap would be more efficient with the same Filter strength. Maybe I will make the code flexible to allow both . . . let´s see.

My next steps would now be to:

- add your awesome highpass, bandpass, notch possibilities to the code
- add sideband selection (which means simply stuffing zeroes into the FFT output buffer before the complex multiply)
- add IF, ie. selecting the frequency to demodulate in the 44.1k frequency band --> shifting bins to the left and the right
- add AM demodulation (= sqrt(I*I+Q*Q) of the iFFT output)

BTW.: I rediscovered this nice paper which describes a lot if these things for an SDR (also nice to see that the Teensy 3.6 has the same processing power as a PC in 2002 ;-))
http://www.arrl.org/files/file/Technology/tis/info/pdf/021112qex027.pdf
 
Yes, but the only reason why they were so simple was that there were no Teensys available in those times ;-).

And the selectivity/sensitivity was a "bit" lower. And no SSB demodulation ;-). But no battery needed . . .
 
OK, learned a lot about pointers while looking at your code! ;-) Also I now understood well your FFT filtering subroutine.

And: I found my bug! It was connected with copying the FIR coeffs into the filter mask (i+=2 instead of i++ in the loop)! Now stopband attenuation works very nicely! (I have corrected it in one of my first posts)

@Walter: Thanks a lot for all the good coding and MATLABing, that kept my motivation on a high level, so I could finally find my bug!
Thanks, I had a moment of motivation.
I am not yet decided whether I should really try to reduce the overlap, which is 50% in my code and 25% in your code. It surely depends on the application. If I want the steepest filter for my radio, a 50% overlap would allow me to use a 2048tap-Filter with a 4096 point FFT: with 25% overlap the max no. of taps would be 1024 with a 4096point FFT. However, if I have additional noise reduction algorithms etc. running in the background, 25% overlap would be more efficient with the same Filter strength. Maybe I will make the code flexible to allow both . . . let´s see.
This sounds reasonable
My next steps would now be to:

- add your awesome highpass, bandpass, notch possibilities to the code
- add sideband selection (which means simply stuffing zeroes into the FFT output buffer before the complex multiply)
- add IF, ie. selecting the frequency to demodulate in the 44.1k frequency band --> shifting bins to the left and the right
- add AM demodulation (= sqrt(I*I+Q*Q) of the iFFT output)

BTW.: I rediscovered this nice paper which describes a lot if these things for an SDR (also nice to see that the Teensy 3.6 has the same processing power as a PC in 2002 ;-))
http://www.arrl.org/files/file/Technology/tis/info/pdf/021112qex027.pdf

I guess, you could combine SSB selection with filter mask (edit: should say with modulator).

BTW have you seen the 5W SRD tranceiver
https://www.kickstarter.com/projects/hobbypcb/rs-hfiq-5w-software-defined-radio-sdr-tranceiver
 
Last edited:
Frank (DD4WH),
I added a Hilbert transform (type = 4)
I generates a complex filter where real part is simply a delay and the imaginary part is the Hilbert transform (with Kaiser window)
(window length is now twice the number of coefficients)

So you could do for

USB modulation (Matlab language)
Code:
f1 = fftfilt(ht, f0);
f2= real(f1).*cos(2*pi*fc*t) - imag(f1).*sin(2*pi*fc*t);

where
ht are the (complex) Hilbert transform coefficients
fo is the real audio signal
fc is the modulation (IF) frequency
f2 is the real USB signal

and for USB demodulation
Code:
f3 = fftfilt(ht, f2);
f4= real(f3).*cos(2*pi*fc*t) + imag(f3).*sin(2*pi*fc*t);

where
f4 is again the real audio signal

to change to LSB you have to change to signs before ' imag(.)'

Caveat: It worked on matlab but have not tried with real I/Q signal

Question? what is the typical IF for SDR?

The QRP article by VE3MKC talks about 11 kHz. Is that typical or only driven by the 44.1 kHz sampling of the audio card?

Note:
an alternative approach to do Hilbert transforms would be to simply put all negative frequencies to zero and multiply positive frequencies by 2, letting DC unchanged. This may be useful if you wanted to combine Hilbert transform with band-pass filter.

edit: some errors corrected in Hilbert filter
 
Last edited:
Walter, thanks for that, I can hardly keep up with all your new code and good hints!

Yes, Hilbert transforms are very nice for SDRs. The alternative approach for Hilbert transfom sounds very tricky. It seems in the frequency domain some things are quite easy to perform!

I am not sure, I probably have to read a little more, but as far as I have understood it, for demodulation of USB/LSB with the Fast convolution approach, no more Hilbert transform is necessary. You would just set the bins of the unwanted sideband to zero. I will experiment with that a bit and see how/if it works and report back here . . .

Thanks also for the link to the Transceiver, which I did not know yet!

I built an mcHF transceiver last year (which I can recommend to anyone interested, the RX is excellent and beats many (if not most ;-)) commercial transceivers on the market), which also is an I/Q-SDR, but works as a standalone radio with an STM32F4 and does not need a PC anymore. http://www.m0nka.co.uk/
We have been optimizing/extending the software of the mcHF for quite a while now. https://github.com/df8oe/mchf-github
The mcHF uses the phasing approach with Hilbert transforms for opposite sideband rejection.

My impression is, that there is no standard IF for SDRs. The IF for SDRs is often sampling frequency / 4 or by / 8, because then you can take a simple and efficient approach for frequency translation to baseband (complex quadrator oscillator with a precalculated sin/cos wave and without cos/sin calculation on the fly --> for the difference see my code in the Weaver demodulation thread). in the mcHF, for example, the user can choose between -6,+6,-12,+12k (sample rate 48k). In my Teensy SDR I use sampling rate / 8, that is 5025Hz. it really does not matter much for reception, what IF you have, the noise you want to avoid with the IF approach is mainly below 2kHz. Sometimes with IFs of sample rate / 8, you can run into trouble with AM reception when you have a very wide bandwidth you listen with (wider than samplerate/8) and your carrier falls exactly into your IF. Then the reception gets distorted (but offtuning a few hundred Hz helps and everything is fine again). Another trade-off is that you would like an IF small enough so that on the spectrum display you have enough room up and below your receive frequency and the receive frequency is not too near to the edges of your spectrum display.
 
just a quick report on preliminary tests on real IQ SSB data:

Frequency tuning by bin shifting AND LSB demodulation works !

both are implemented using only two lines inserted after the FFT before the complex multiply (I begin to like frequency domain):

Code:
for(i = 0; i < 2048; i++)
     {
        FFT_buffer[3072 * 2 + i] = FFT_buffer[i + FFT_shift]; // shift LSB 
        FFT_buffer[i] = 0.0; // fill USB with zero
     }

the result is "demodulated" LSB with suppressed opposite sideband and lowpass filtered by the FIR filter with 2049 taps, using 7696 microseconds for 16 x 2 audio blocks, meaning 16.5% processor load (7696/16/2900) and using 66 audio buffers. More in the next days.
 
Why dont you use a standard DDC architecture? A software DDS will give you tons of flexibility. Plus decimating the output stream will further help with compute efficiency.
 
Hi Jim,

not quite sure what you mean with DDC. As far as I know, fast convolution filtering is being used in direct sampling architectures (where the antenna signal is directly sampled with a very high sample rate; thus these mostly use FPGAs and high-speed ADCs; an example is the Perseus SDR) AND also in IQ-SDRs that use a hardware quadrature sampling detector to produce "narrowband" (typically 48k, 96k, or 192k) I&Q signals from the antenna signal which are then sampled by the PC or a microprocessor like the Teensy.

In my view, if we wanted to use DDC (as I would like to receive signals from DC to 30MHz), we would need a much faster processor (and ADC) than the Teensy, because it is restricted to sampling up to 1.2MHz, as far as I know (and it would be good to have at least 12bit, better 14bit resolution at that sample rate), which restricts our upper receive frequency to 600kHz. But maybe you had another idea and I misunderstood your proposal of the DDC?

And also I did not get the idea with the software DDS and decimation. What would I need the software DDS for and how could I use decimation of the output stream?

Sorry for so many questions, seemingly I have to learn a lot more about SDR.

All the best,

Frank
 
just a short update on my progress:

- spent a considerable amount of time on finetuning the spectrum display for smoothness and nice look and feel
- spent some time on trying to understand why demodulation is so easy in the frequency domain.

I used this paper and found it very helpful:

http://www.3db-labs.com/01598092_MultibandFilterbank.pdf

Lessons learned:

- overlap of 25% is probably best in terms of efficiency, but for the steepest filters I use 50% overlap and an FFT of 4096 points with a 2049 tap filter, which is the maximum of taps you can use with a 4096 point FFT

- frequency shifting is very easy in frequency domain: you just "rotate" the output of the FFT in the buffer and feed the result to the iFFT. However, the resolution is limited by the number of bins in the FFT (4096 bins for 44.1kHz is 10.7Hz per bin) AND the overlap factor (which is 2 in my case and 4 in Walters case). I discovered that when trying different shifts (and having strange audio when using odd bin numbers as the shift!) and read about that problem afterwards in the paper ;-). So, it is very easy to use an IF for the SDR and do the frequency translation to baseband by rotating the bins in the buffer. So, no complicated quadrature oscillator necessary anymore for frequency translation!

- I was very astonished that demodulation of LSB/USB in frequency domain is not necessary! You just feed the real output of the iFFT to your output audio and that´s it. To choose between LSB/USB you fill the lower/upper bins with zeros!

- AM demodulation can be done in different ways: take everything that´s in all bins and calculate magnitude by sqrt(I*I+Q*Q) OR choose the sideband (by filling the other sideband with zeros) and calculating magnitude. Very nice to be flexible when receiving stations with interference in one sideband

- the FIR filter that is calculated on-the-fly with the desired properties by the Teensy itself can very easily be visualised by just plotting the FFT_filter_mask (which is the FFT of the filter coefficients) with the spectrum display function. So when you change your filter bandwidth/type, you can verify your filter by visually inspecting it --> quite sexy

- I built in the sample rate function by Frank B: that enables the user to choose the bandwidth of the spectrum display (up to 96kHz possible with 96k sample rate; yes, Nyquist is == sample rate, because we use I&Q as input, which effectively doubles the sample rate!). However, at 96k sample rate, the processor is at 45%, so that is probably the limit, but at 96k we can use the full 2049 tap brickwall filtering

Will post the code in a few days, have to clean up a lot ;-). I will open up a new SDR thread, so this thread can still focus on questions of convolution specifically.
 
- I was very astonished that demodulation of LSB/USB in frequency domain is not necessary! You just feed the real output of the iFFT to your output audio and that´s it. To choose between LSB/USB you fill the lower/upper bins with zeros!
Someone said, this is waste of processing, (as you estimate also a 90 deg shifted audio signal, that you throw away)
But this is then handy for AM:
- AM demodulation can be done in different ways: take everything that´s in all bins and calculate magnitude by sqrt(I*I+Q*Q) OR choose the sideband (by filling the other sideband with zeros) and calculating magnitude. Very nice to be flexible when receiving stations with interference in one sideband

I will open up a new SDR thread

Yes, let's do that. I'm getting the mcHF boards right now and plan to replace the STM processor with a teensy. Not sure yet, how to do that. Or even replace the whole ui board with an AK4558 + Teensy3.6.
 
Someone said, this is waste of processing, (as you estimate also a 90 deg shifted audio signal, that you throw away)
I don´t think so, it´s not thrown away, but -as far as I understand- it serves to suppress the mirror.

Yes, let's do that. I'm getting the mcHF boards right now and plan to replace the STM processor with a teensy. Not sure yet, how to do that. Or even replace the whole ui board with an AK4558 + Teensy3.6.
Yes, would be great to have a highend codec board! But that´s probably overkill for a shortwave radio ;-). Even with the Teensy Convolution SDR you will probably hear no difference in sound quality. The digital noise from the Teensy will alway be louder than the 80dB SNR you can reach with the Teensy audio board codec or even 100dB that you can reach with the AK4558. But for other purposes it sounds great!
The mcHF is a great board, but it has so many features (including transmitting), that it´s not the best board to start for an SDR. but it´s a great thing, if you want to develop something an a high level, although the power HF amp needs some improvement.

Will open the SDR thread in a minute (also with some recommendation for rather "simple" SDR hardware to start with for the Teensy Convolution SDR).

Here is the Teensy Convolution SDR thread:

https://forum.pjrc.com/threads/40590-Teensy-Convolution-SDR-(Software-Defined-Radio)
 
Last edited:
Is that possible to do Fast Convolution filtering with Fixed point devices something like Teensy 3.2 ?? Maybe one of you guys can modify the code for Teensy 3.2
 
Hi Wicky21,

if you want code for fast convolution, you better write it yourself ;-) [or have a look at the code that is already there, try to understand it and modify it for your purposes, 2nd option: buy a Teensy 3.6 and run the code that is available on github].

Come on, you spend so much time in reading and writing in this forum in order to get others to write code for you: just write it yourself, that is time better spent . . .

Have fun with the Teensy,

Frank
 
Yes. Should start to write a code. Anyway is their proper document that explain how FFT/Convolution algorithm works?
 
Last edited:
Status
Not open for further replies.
Back
Top