Teensy Convolution SDR (Software Defined Radio)

Yes, but double is much slower than float. Ok, don't know if that matter much with 600MHz (or way more, overclocked.. needs a heatsink! )
 
Hi people,

I've just converted my SI5351 tuner clock to direct quadrature generation without need for a divide by 4 Johnson counter. The I and Q outputs are on CLK0 and CLK1. It generates a very clean quadrature pair BUT the downside is that it only reaches down to 80m (with the PLL running below the 600MHz specified limit).

I've built a library for it, and it's working great on my SDR. If anyone is interested, we can work out a way to get the code to you.

My current project may be of interest to this thread. My goal is to create a single Audio Library block that is a complete "SDR-in-a-block". Let's call it AudioSDR. It will take the IQ input at the standard 44.1kHz sample rate and produce the demodulated audio. In the first pass I will limit it to SSB and CW modes. I am using frequency domain processing throughout.

So far the block 1) takes the complex FFT of the IQ input, 2) band-pass pre-filters (with selectable bandwidth) centered on 8kHz (to avoid the crap around dc), 3) uses a complex frequency-domain frequency-shifter from the 8kHz down to dc, 4) does a frequency-domain Hilbert transform (90 deg phase-shift), 5) does a sum/difference for USB/LSB selection, and 6) does audio post-filtering. Different pre-filter bandwidths are provided for SSB and CW. Everything is done in the standard 128 length Audio block, in a length 256 complex FFT, using float32 arithmetic. I've written the code, looked at the performance on the work bench, but have not yet connected it to the SDR front-end. It's working, but not quite ready for prime-time...

Today I've been looking at a new idea for a 90 deg phase shifter based on manipulating the spectrum according to the strict frequency domain definition of the Hilbert transform. I've simulated it in MATLAB, and the first pass at Teensy code looks to be quite promising. There's more work to be done...
 
Hi DerekR
The 74c74 does a clean job too and so cheap, but well thought out. What's 80m though?

More interesting is your "SDR-in-a-block" comments. For people like myself who go as far as writing code using existing library's for various items like TFT's, encoders, i2c and so on, this could be the way to go. I look forward to your developments.

Cheers, Rob
 
@Frank: OK, did not know that, I thought the FPU of the new T4 would handle double as fast as float ;-)

@Rob: if you use the Si5351 directly, your need for 4x higher LO frequency is avoided, so you can directly receive up to 290MHz with the Si5351 :) Your Johnson counter would allow for 290/4 only, so thats 72.5MHz upper limit . . .
80m: have a look here --> https://en.wikipedia.org/wiki/Amateur_radio_frequency_allocations

@Derek: sounds extremely exciting! The Si5351 has been used that way (using two 90-degree-phase-shifted outputs for SDR applications, thus avoiding the need for a 4x higher LO frequency) by Hans Summers for the QCX and by Danilo DB4PLE for our UHSDR:

https://www.qrp-labs.com/qcx.html

https://github.com/df8oe/UHSDR/blob/active-devel/mchf-eclipse/drivers/ui/oscillator/osc_si5351a.c

However, as you already said, the hardware needs to account for a switch in the IQ inputs, if you wish to receive lower than about 3.5 MHz (switch to 4xLO with Johnson counter).


I am really looking forward to your SDR-lib! Seems you take some different approaches at some parts in the audio path, I have some problems in understanding though:

How do you perfom your bandpass filter at step 2) ? Do you perform a complex multiply with the FFT results of FIR bandpass filter coeffs?
If yes, you could probably design your filter coeffs so that you do not need a Hilbert transform any more, because the filtering automatically selects the desired sideband.

How do you perform the frequency translation? Do you just shift the bins in the output of the FFT?

Sound very nice, however I have problems in understanding the need for the Hilbert transform.

All the best wishes,

Frank DD4WH
 
Frank - quick reply:
1) Yes - Hans Summer at QRP Labs was the inspiration for the SI5131 mod, but I couldn't find any published code at the time I wrote my own. I certainly confirm Hans' conclusion that the SI5131 data sheet is a disaster!
2) SDR processing: Yes I use complex arithmetic as far. as I can. Assume that the input f(t) = f_i(t) + jf_q(t) (where j = sqrt(-1)). Take the complex FFT using arm_cfft_f32(...) and we're on our way...

(a) FIR filter by multiplying by the complex FFT of the filter coefficients
(b) Frequency shift by a circular rotation of the filter coefficients. Note this only works for shifts that are an integer multiple of the line spacing, but that is not a problem - we just design the filter appropriately.
(c) The Hilbert transform using a FIR filtering method poses a bit of problem because the real and imaginary parts of the waveform need different filters. So the complex FFT must be decomposed into two separate FFTs of the I and Q components. But that's easy to do - simple arithmetic.
(d) Sideband selection is done by adding/subtracting the two resulting spectra.

Your question about the need for the Hilbert transform is a good one! In the good old days we had two methods of SSB generation/reception - the "filtering" method and the "phasing" method (then the "third" or "Weaver" method came along). Without the Hilbert transform you have a "filtering" receiver, while with it you have a "phasing" receiver. But here's the deal - the filtering method doesn't need quadrature processing at all! In the good old analog days we just used a real signal in a superheterodyne receiver with a sharp IF filter (a xtal lattice filter or a mechanical filter). You could throw out your quadrature signal processing completely and still have great SDR provided you have a good digital filter.

With both the transform and filter in our SDR we have a hybrid approach, with the best of both worlds and can expect much better sideband suppression. I need more space to explain the complex Fourier theory of how the Hilbert transform improves things... I think I'll build a little SDR that uses a single real input as a demo.
 
Derek, thanks a lot for the detailed explanations!

(b) Frequency shift by a circular rotation of the filter coefficients. Note this only works for shifts that are an integer multiple of the line spacing, but that is not a problem - we just design the filter appropriately.

excellent! Thats totally new to me, I thought you had to rotate the frequency bins of the FFT output (and not the coefficients) for frequency translation!? And it seems this is an excellent method how to implement "passband tuning" without retuning the LO, I have to think about that a little more.

Very exciting to hear about your "hybrid approach"! We experience about 60dB of opposite sideband suppression by filtering just using Fast Convolution. Would be excellent if you find a way to improve this by using Hilbert transform and merging filtering and phasing. But then I & Q signals have to be even better calibrated for phase & amplitude. I originally thought that the imperfect sideband suppression would be caused by the imperfection of I & Q phase and amplitude and not by the imperfection of the filtering . . .

Yes, a demo would be superb!
 
... I thought you had to rotate the frequency bins of the FFT output (and not the coefficients) for frequency translation!?
You are absolutely correct. My bad! Of course you rotate the FFT data. Brain fart!
I should say that because we are working on a length N data buffers in a zero padded length 2N FFT, there is a phase shift in the output data which either 0 or 180 deg on adjacent output buffers. It took some head scratching to figure out what was going on. If you do it in a length N FFT everything is fine. The reason for the 2N FFT is that the data is already there for use in convolutions.

I think for passband tuning you'd be better of in the time domain:

F(w-w_0) = Fourier{e^(jw_o t) f(t))}

In other words, complex multiply multiply your signal f(t) by (cos (w_0 t) + j sign(w_0 t )) because you will not be limited to the line spacing in the FFT and can choose any w_0.
 
@Frank
Have you ever considered using ADCs and DACs instead of SGTL5000. At least there should be no twinpeaks problem.
 
Martin,
yes, we thought about this. I am not sure whether the actual 12bit resolution of the ADCs (even if you set them to 16bit) will be sufficient for a high quality radio, but who knows whether the SGTL5000 is so much better . . . it would make the radio hardware even less complex.

I do not consider twinpeaks a real problem any more, because it can easily be detected (either by looking at the spectrum display, by automatically detecting with Bob Larkins cross correlation calculation or with my Moseley & Slump implementation of the automatic I & Q correction) and then you just reset the codec until its gone.
 
I should say that because we are working on a length N data buffers in a zero padded length 2N FFT, there is a phase shift in the output data which either 0 or 180 deg on adjacent output buffers.

You mind a reference for phase shift if you zero pad data?
 
You mind a reference for phase shift if you zero pad data?

I have never seen a reference to what happens after a spectral rotation in the FFT of a zero padded data set. What happened was that when I tried it in hardware I noted a sign change in successive data blocks under certain conditions, so I went back to the formal math definition of the DFT and followed the operations of the forward transform with zero padding, rotation and inverse transform, and sure enough the sign (phase) changes are there!

I'll write it up as a little pdf memo because the math is too involved to describe in words...

BTW - the single AudioBlock SDR is working great, and I'm now using it as my "goto" SDR kernel. I hope to be posting it on the forum real soon. The execution time for the block is 770 usecs from IQ input to demodulated SSB/CW/AM out, which leaves oodles of time for other processing.
 
Last edited:
Hi Derek,

sounds very interesting, looking forward to it!

Sorry for my potentially silly question, my approach to DSP is more a way of trial and error than really having available any formal math . . .

Why do you use zero padding? Wouldnt it be more efficient to use some kind of overlap-and-save or overlap-and-add and fill the FFT buffer with the overlapping data?

But maybe that is already what you are doing.
 
Yes I'm doing overlap-and-add, and the zero padding is for that very purpose. You are probably aware that when doing frequency domain convolution (via FFT) of two time functions of lengths P and Q samples you must use a zero padded FFT of at least P+Q+1 samples to avoid wrap-around errors. With the audio block length of 128, I limit my FIR filters to length 62 to allow sufficient room, and do the FFT in length 256 zero padded buffers.

@WMXZ -- Just going back to the 180 deg phase shift (sign change) between successive buffers when frequency shifting by rotation of the FFT of a zero padded signal, it only occurs when the rotation is an odd number of lines, ie to a frequency that does not correspond to a line in the spectrum of the original length N waveform.
 
Hi Derek,

thanks for your clarifying reply!

I had to read over that again in my favourite book (Rick Lyons "Understanding Digital Processing") and one of my favourite papers on Fast Convolution Filtering by Borgerding:

https://pdfs.semanticscholar.org/9f10/d8c9237f1c383862225dec0d715c6f63961c.pdf

It seems I misunderstood your statement on zero padding: you do not need zero padding in overlap-and-save, you just discard the portions of the circular convolution that wrap around and use the last part of the previous input block as the beginning of the next block. Thats what the Teensy Convolution SDR does:
* take 16 blocks of 128 samples = 2048 samples
* decimate-by-4
* decimate-by-2
* take these 256 samples plus the last 256 and put them into the FFT buffer (--> 50% overlap)
* calculate 512-point-complex FFT
* complex-multiply with FFT result of the complex Kaiser windowed bandpass coefficients of a 257-tap FIR filter
* calculate 512-point-complex inverse FFT
* take the 256 samples from the 2nd part of the iFFT output (discard first 256 samples)
* interpolate-by-2
* interpolate-by-4
* take 256 * 8 = 2048 samples and play them

As your approach uses overlap-and-add, you have to use zero-padding, I see that. I missed that detail, because I never used overlap-and-add, only overlap-and-save.

One question would be whether you could use a 128 tap FIR filter with your 256 point FFT? That would give steeper filter skirts. However, I might be overstressing the DSP rules with an FFT length only 2 times as long as the FIR filter length compared to the rule of thumb of Borgerding "FFT length = 4 * FIR filter length" . . .

All the best,

Frank DD4WH
 
Hi Frank,
I am not familiar with any "rule of thumb" regarding length of FFT buffers, and am not familiar with any Brogerding reference. (FWIW -I might just mention that I taught Signal Processing/DSP at the graduate level for 50+ years...) As far as I know, as long as the criterion N>P+Q+1 is satisfied there will be no errors in either the "ovelap-add" or the "overlap-save" methods (and either one will give the same result). One problem to be aware of is that when you cascade (connect in series) linear filters, the duration of the impulse response of the pair is the sum of the individual durations (the FIR filter coefficients are the impulse response). If you are processing the cascaded filters as a single FFT multiplication you have to be careful to make sure that your zero padded record can hold the combined impulse response. In my AudioSDR I have two filters - the bandpass pre-filter and the Hilbert transformer, each of which is limited to length < 64, so we just sneak in with length 256 FFT's.

The AudioSDR is basically ready to go for beta testing but during my nightly insomnia, last night I had the idea of using "frequency-sampling" FIR filters for the design of the pre-filter instead of my current fixed designs. This way the user could vary the bandwidth with a public function for example, mySDR.setBandwidh(bandwidth). There will be defaults for SSB, CW, and AM but these can be changed in real-time. I'm currently evaluating this in MATLAB and I hope to have it working on the Teensy today.
 
Frank,

Here's a good way to deal with the EEPROM. Main changes:
1. Only define the structure once as global and put default values in at the definition.
2. Added 'char version_of_settings[4];' to the end of the structure and set it = CONFIG_VERSION #define that is a string of 3 chars that are used to identify the structure. The EEPROM_LOAD() will look for these first to determine if the data structure matches. If you want to add more settings to the structure you add them before 'version_of_settings' and increment the CONFIG_VERSION. It will automatically update the settings.
3. If EEPROM_LOAD() doesn't find the correct CONFIG_VERSION at the correct place during setup() it will use the defaults and also save the defaults to the EEPROM with the correct CONFIG_VERSION.

Also in the structure definition deleted some redundant data that doesn't change, (in the comments).

Also a slight error in the first member of the E structure that caused me a little trouble:

unsigned long long calibration_factor; has long twice This gets past the Arduino compiler but makes the structure the wrong size and my code correctly wouldn't work. Took a while to find.

Also, there is another place you write to the EEPROM that should be put in the settings.

So, If you replace EEPROM_LOAD() and EEPROM_SAVE() with the code below the first time you run it a message will display about the config but everything will be fine. The second time you run it everything will be up to date.


Code:
#define CONFIG_VERSION "md4"  // ID of the E settings block, change if structure changes

struct config_t {
    unsigned long calibration_factor;
    long calibration_constant;
    //unsigned long long freq[NUM_BANDS];   //these don't need to be in eeprom as they are in rom and don't change
    //int mode[NUM_BANDS];
    //int bwu[NUM_BANDS];
    //int bwl[NUM_BANDS];
    //int rfg[NUM_BANDS];
    int band;
    float32_t LPFcoeff;
    int audio_volume;
    int8_t AGC_mode;
    float32_t pll_fmax;
    float32_t omegaN;
    int zeta_help;
    uint8_t rate;
    float32_t bass;
    float32_t treble;
    int agc_thresh;
    int agc_decay;
    int agc_slope;
    uint8_t auto_IQ_correction;
    float32_t midbass;
    float32_t mid;
    float32_t midtreble;
    int8_t RF_attenuation;
    uint8_t show_spectrum_flag;
    float32_t stereo_factor;
    float32_t spectrum_display_scale;
    int32_t spectrum_zoom;
    uint8_t NR_use_X;
    float32_t NR_PSI;
    float32_t NR_alpha;
    float32_t NR_beta;
    char version_of_settings[4];  // validation string
  } E ={  //these will be the default settings when saved values are not found
  calibration_factor,
  calibration_constant,
  band,
  LPF_spectrum,
  audio_volume,
  AGC_mode,
  pll_fmax,
  omegaN,
  zeta_help,
  SAMPLE_RATE,
  bass,
  treble,
  agc_thresh,
  agc_decay,
  agc_slope,
  auto_IQ_correction,
  midbass,
  mid,
  midtreble,
  RF_attenuation,
  show_spectrum_flag,
  stereo_factor,
  spectrum_display_scale,
  spectrum_zoom,
  NR_use_X,
  NR_PSI,
  NR_alpha,
  NR_beta,
  CONFIG_VERSION
};

void EEPROM_LOAD() {
  // To make sure there are settings, and they are YOURS!
  if (EEPROM.read(0 + sizeof(E) - 1) == E.version_of_settings[3] &&// this is '\0'
      EEPROM.read(0 + sizeof(E) - 2) == E.version_of_settings[2] &&
      EEPROM.read(0 + sizeof(E) - 3) == E.version_of_settings[1] &&
      EEPROM.read(0 + sizeof(E) - 4) == E.version_of_settings[0])
   { // ID correct, use settings from EEPROM
    eeprom_read_block(&E, 0, sizeof(E));
    delay(100);
    Serial.println("loaded settings from EEPROM");
    calibration_factor = E.calibration_factor;
    calibration_constant = E.calibration_constant;
    //for (int i = 0; i < (NUM_BANDS); i++)   //these don't need to be in eeprom as they are in rom and don't change
    //  bands[i].freq = E.freq[i];
    //for (int i = 0; i < (NUM_BANDS); i++)
    //  bands[i].mode = E.mode[i];
    //for (int i = 0; i < (NUM_BANDS); i++)
    //  bands[i].FHiCut = E.bwu[i];
    //for (int i = 0; i < (NUM_BANDS); i++)
    //  bands[i].FLoCut = E.bwl[i];
    //for (int i = 0; i < (NUM_BANDS); i++)
    //  bands[i].RFgain = E.rfg[i];
    band = E.band;
    LPF_spectrum = E.LPFcoeff;
    audio_volume = E.audio_volume;
    AGC_mode = E.AGC_mode;
    pll_fmax = E.pll_fmax;
    omegaN = E.omegaN;
    zeta_help = E.zeta_help;
    zeta = (float32_t) zeta_help / 100.0;
    SAMPLE_RATE = E.rate;
    bass = E.bass;
    treble = E.treble;
    agc_thresh = E.agc_thresh;
    agc_decay = E.agc_decay;
    agc_slope = E.agc_slope;
    auto_IQ_correction = E.auto_IQ_correction;
    midbass = E.midbass;
    mid = E.mid;
    midtreble = E.midtreble;
    RF_attenuation = E.RF_attenuation;
    show_spectrum_flag = E.show_spectrum_flag;
    stereo_factor = E.stereo_factor;
    spectrum_display_scale = E.spectrum_display_scale;
    spectrum_zoom = E.spectrum_zoom;
    NR_use_X = E.NR_use_X;
    NR_PSI = E.NR_PSI;
    NR_alpha = E.NR_alpha;
    NR_beta = E.NR_beta;
  } else {
    // settings aren't valid! will overwrite with default settings
    Serial.println("Couldn't find saved settings in EEprom, will use defaults");
    EEPROM_SAVE(); //save with default values read during setup
  }
} // end void eeProm LOAD

void EEPROM_SAVE() {
  E.calibration_factor = calibration_factor;
  E.band = band;
  E.calibration_constant = calibration_constant;
  E.LPFcoeff = LPF_spectrum;
  E.audio_volume = audio_volume;
  E.AGC_mode = AGC_mode;
  E.pll_fmax = pll_fmax;
  E.omegaN = omegaN;
  E.zeta_help = zeta_help;
  E.rate = SAMPLE_RATE;
  E.bass = bass;
  E.treble = treble;
  E.agc_thresh = agc_thresh;
  E.agc_decay = agc_decay;
  E.agc_slope = agc_slope;
  E.auto_IQ_correction = auto_IQ_correction;
  E.midbass = midbass;
  E.mid = mid;
  E.midtreble = midtreble;
  E.RF_attenuation = RF_attenuation;
  E.show_spectrum_flag = show_spectrum_flag;
  E.stereo_factor = stereo_factor;
  E.spectrum_display_scale = spectrum_display_scale;
  E.spectrum_zoom = spectrum_zoom;
  E.NR_use_X = NR_use_X;
  E.NR_PSI = NR_PSI;
  E.NR_alpha = NR_alpha;
  E.NR_beta = NR_beta;
  eeprom_write_block (&E, 0, sizeof(E));
  Serial.print("saved settings version: ");Serial.print(E.version_of_settings);Serial.println(" in EEPROM");
} // end void eeProm SAVE
 
Last edited:
In my AudioSDR I have two filters - the bandpass pre-filter and the Hilbert transformer, each of which is limited to length < 64, so we just sneak in with length 256 FFT's.
One trick I regularly apply is to Hilbert transform the (FIR) Filter coefficients and to apply the now complex Filter to the real data, giving me the desired complex result with one filter delay.

Edit: if Filtering is done by FFT, explicit Hilbert is not needed and can be done in Frequency domain.
 
Last edited:
Hi bicycleguy,

thanks for the suggestion!

I do not understand the code fully yet, so I will take some more time to digest it. However, a couple of things to comment on:

- calibration factor does seem to be used consistently in the whole code: it is defined as unsigned long long everywhere, so no need to change this to long!
//unsigned long long freq[NUM_BANDS]; //these don't need to be in eeprom as they are in rom and don't change
- that is not true: if you comment these out, your radio will not remember all the changes you made in the different HAM radio and broadcast radio bands! So, we do need these variables in the EEPROM, because these are changed by the user

The problem of having to comment the EEPROM_load during the very first loading of the software is a minor one, I think. Also it has a didactical component of forcing the user to have a detailed look at the code ;-). No, you are right, this can be frustrating for the first-time-user, but I think -if we change this- it should be made more bullet-proof, as Frank B suggested.

Frank DD4WH
 
Hi Franks,
Thinking overnight my perspective is definitely as a first time user. I'm always loading different code into the same Teensy so expecting to have something in eeprom is a problem. However, I realized that my current suggestion will loose any data that a person would have saved in the eeprom when the structure changes which is a bummer. So that means you need to handle revisions, ugh more complication.

Frank B is correct about the checksum. I didn't want to add the complexity to my first suggestion. In reality I have my code dealing with program prefs in a separate file that is reusable for any program with minor changes. It doesn't do the block EEPROM reads and writes. It uses bytes and only writes if the data changes and checks each write.

Not sure why you don't use separate files but since you don't I didn't.

Mike
 
Frank B,
Thanks for the code tip. Wasn't sure which one to use so used the _crc_ibutton_update(). Had to put #include <util/crc16.h> to make it work. Should I have copied the function or used it like this?
Seemed to work ok. Tried a few tests.
 
- calibration factor does seem to be used consistently in the whole code: it is defined as unsigned long long everywhere, so no need to change this to long!
Frank DD4WH
When defined as:
Code:
struct config_t{
  unsigned long long calibration_factor;
  long calibration_constant;
The compiler (mistakenly? )makes members:
long
calibration_factor
calibration_constant

It doesn't make any difference except the wasted 4 bytes and if your checking the struct size. I changed my code to not use the size of the structure directly when referencing it since the crc code handles those kinds of errors.

All the other places you mentioned it's defined also make an extra variable 'long' so nothing wrong with fixing it (I think)

- that is not true: if you comment these out, your radio will not remember all the changes you made in the different HAM radio and broadcast radio bands! So, we do need these variables in the EEPROM, because these are changed by the user
Your right of course about that. I must have had a brain fart. If I actually had the rest of the hardware I would have noticed !

...forcing the user to have a detailed look at the code ;-). No, you are right, this can be frustrating for the first-time-user, but I think -if we change this- it should be made more bullet-proof, as Frank B suggested.

I have added the CRC and other changes to make it work correctly. This required a global variable the load routine can set on failure to load, so that a ref to the save routine at the end of the setup can save all the changes that setup() does to the variables. This kind of works like your current comment out the load and then save and un-comment, but automatically and no restart required.

The save routine only writes bytes that have changed and tells you how many, which can be kind of interesting. I have tested corrupting the EEPROM and it seems to catch it.

If your interested I can post the code, should comment it more.
 
Hi bicycleguy,

sounds good, would like to try to integrate your code into the script, if you allow!

However, calibration_factor has to be type unsigned long long, please change that in your code. --> the Si5351 library that I use has a resolution of 1 cHz (0.01Hz), so I think we would need a 64bit type here considering all the multiplications and divisions with the frequency variables and calibration handling.

calibration_constant is type long, that is correct.

I have not seen any problems or warnings with that using the standard Arduino IDE and Teensyduino as the compliler, so that must be a problem of your specific compiler, I think!

Frank DD4WH
 
Back
Top