Fast Convolution filtering in floating point with Teensy 3.6

Status
Not open for further replies.

DD4WH

Well-known member
My longterm plan is to realize a Software Defined Radio which implements different filtering and demodulation strategies: phasing, Weaver demodulation, and Fast Convolution Filtering. The latter was the most challenging for me. With a hint by Alberto, finally today I managed to get audio filtered through a 513 (!!!) tap FIR filter by fast convolution.
Unlike standard filtering techniques, fast convolution works in the frequency domain like this (taken from Lyons 2011: Understanding Digital Signal Processing):
- FFT of the FIR filter coefficients = Filter mask
- fill FFT buffer with audio samples of last time and of now
- perform FFT
- complex multiply the output of the FFT with the filter mask
- perform inverse FFT
- discard half of the output samples and take other half of the output samples of the iFFT as the filtered audio samples

This technique is best done in 32bit integer math or floating point because otherwise precision of the calculations is not sufficient

In order to demonstrate the power of this filter technique I attach here a plot of the implemented filter response of the FIR filter with 512 taps:

FIR_filter_512_coeffs.png

This is achieved with a 1024point FFT/iFFT sequence. However, my script also runs with 2048 and 4096 point FFT/iFFTs, thus it should be able to achieve even steeper filter responses. Here is a list of the processor time that I measured for different FFT/iFFT sizes in microseconds:
point FFT/iFFT
FFT / iFFT pointsaudio samples processedTime for processing in usec
256128472
512256839
10245121966
204810244214
409620487574

To be able to process audio data in realtime at a sample rate of 44118sps, we have 2900 microseconds available for 128 samples, 5800usec for 256 samples etc.
That means this filtering technique needs only about 16% of the processor time to achieve this filtering! And it is the same processor load regardless if I use a 256point FFT/iFFT (equivalent to 129tap FIR filter) or a 4096 point FFT/iFFT (equivalent to a 2049tap FIR filter!!! which really is a brickwall filter . . .).

However, my favourite Filter coefficient program (Iowa Hills FIR Filter designer) only calculates coefficients for a FIR filter of up to 512 taps. I haven´t got a copy of MATLAB, but I have Octave (which I have not used much).

Has anybody got a hint how to calulate FIR filter coefficients with OCTAVE for filter larger than 512 taps? I would really like to push the filtering to the limit, just to try what is possible with the Teensy. [BUT: Maximum number of taps for the FIR filter is FFT_length / 2 +1]

Here is the script for Fast Convolution Filtering (STEREO, because finally I need it to treat both, I & Q):
(It runs ONLY on Teensy 3.5/3.6 AND when using the new version of the CMSIS library, which is really easy to achieve, see the code for the thread link where this is explained.)


Code:
/*********************************************************************************************
 * (c) Frank DD4WH 2016_11_23
 * 
 * FFT Fast Convolution = Digital Convolution
 * 
 * in floating point 32bit 
 * 
 * tested on Teensy 3.6
 * 
 * audio queue optimized by Pete El Supremo 2016_10_27, thanks Pete!
 * final hint on the implementation came from Alberto I2PHD, thanks for that!
 * audio processing in float32_t with the NEW ARM CMSIS lib (see https://forum.pjrc.com/threads/38325-Excellent-results-with-Floating-Point-FFT-IFFT-Processing-and-Teensy-3-6?p=119797&viewfull=1#post119797),
 * and see here: https://forum.pjrc.com/threads/38325-Excellent-results-with-Floating-Point-FFT-IFFT-Processing-and-Teensy-3-6?p=120177&viewfull=1#post120177 
 * MIT license
 */

#include <Audio.h>
#include <Wire.h>
#include <SPI.h>
#include <SD.h>
#include <Metro.h>
#include "font_Arial.h"
#include <ILI9341_t3.h>
#include <arm_math.h>
#include <arm_const_structs.h>

#include "Filter_coeffs.h"

#define BACKLIGHT_PIN 0
#define TFT_DC      20
#define TFT_CS      21
#define TFT_RST     32  // 255 = unused. connect to 3.3V
#define TFT_MOSI     7
#define TFT_SCLK    14
#define TFT_MISO    12

ILI9341_t3 tft = ILI9341_t3(TFT_CS, TFT_DC, TFT_RST, TFT_MOSI, TFT_SCLK, TFT_MISO);

Metro five_sec=Metro(2000); // Set up a Metro

// this audio comes from the codec by I2S2
AudioInputI2S            i2s_in; 
           
AudioRecordQueue         Q_in_L;    
AudioRecordQueue         Q_in_R;    

AudioPlayQueue           Q_out_L; 
AudioPlayQueue           Q_out_R; 
AudioAnalyzeFFT256       myFFT;
AudioOutputI2S           i2s_out;           

AudioConnection          patchCord1(i2s_in, 0, Q_in_L, 0);
AudioConnection          patchCord2(i2s_in, 1, Q_in_R, 0);

AudioConnection          patchCord5(Q_out_R,0,myFFT,0); 

AudioConnection          patchCord3(Q_out_L, 0, i2s_out, 1);
AudioConnection          patchCord4(Q_out_R, 0, i2s_out, 0);

AudioControlSGTL5000     sgtl5000_1;     //xy=265.212

int idx_t = 0;
int idx = 0;
int64_t sum;
float32_t mean;
int n_L;
int n_R;
long int n_clear;

int peak[512];
int barm[512];

ulong samp_ptr = 0;
// bool FFT_state = false;

const int myInput = AUDIO_INPUT_LINEIN;

#define BUFFER_SIZE 128

// complex FFT with the new library CMSIS V4.5
const static arm_cfft_instance_f32 *S;
/***************************************************
 * Define your FFT_size HERE, possible values below
 * change filter coeffs accordingly
 ***************************************************/
//const uint32_t FFT_length = 256; 
//const uint32_t FFT_length = 512; 
const uint32_t FFT_length = 1024;
//const uint32_t FFT_length = 2048;
//const uint32_t FFT_length = 4096;

const uint32_t N_BLOCKS = FFT_length / 2 / BUFFER_SIZE; 
float32_t float_buffer_L [BUFFER_SIZE * N_BLOCKS];
float32_t float_buffer_R [BUFFER_SIZE * N_BLOCKS];
float32_t FFT_buffer [FFT_length * 2] __attribute__ ((aligned (4)));
float32_t last_sample_buffer_L [BUFFER_SIZE * N_BLOCKS];
float32_t last_sample_buffer_R [BUFFER_SIZE * N_BLOCKS];

// complex iFFT with the new library CMSIS V4.5
const static arm_cfft_instance_f32 *iS;
float32_t iFFT_buffer [FFT_length * 2] __attribute__ ((aligned (4)));

// FFT instance for direct calculation of the filter mask
// from the impulse response of the FIR - the coefficients
const static arm_cfft_instance_f32 *maskS;
float32_t FIR_filter_mask [FFT_length * 2] __attribute__ ((aligned (4))); 

int8_t first_block = 1;
uint32_t i = 0;


void setup() {
  Serial.begin(115200);
  delay(1000);

  // Audio connections require memory. and the record queue
  // uses this memory to buffer incoming audio.
  // for the large FFT sizes we need a lot of buffers
  AudioMemory(60);

  // Enable the audio shield. select input. and enable output
  sgtl5000_1.enable();
  sgtl5000_1.inputSelect(myInput);
  sgtl5000_1.volume(0.5);
  sgtl5000_1.adcHighPassFilterDisable(); // does not help too much!

  pinMode( BACKLIGHT_PIN, OUTPUT );
  analogWrite( BACKLIGHT_PIN, 1023 );

  tft.begin();
  tft.setRotation( 3 );
  tft.fillScreen(ILI9341_BLACK);
  tft.setCursor(10, 1);
  tft.setTextSize(2);
  tft.setTextColor(ILI9341_WHITE);
  tft.setFont(Arial_14);
  tft.print("Floating point audio processing");

 /****************************************************************************************
 *  init complex FFT
 ****************************************************************************************/
   switch (FFT_length) {
    case 16:
      S = &arm_cfft_sR_f32_len16;
      iS = &arm_cfft_sR_f32_len16;
      break;
    case 32:
      S = &arm_cfft_sR_f32_len32;      
      iS = &arm_cfft_sR_f32_len32;
      break;
    case 64:
      S = &arm_cfft_sR_f32_len64;
      iS = &arm_cfft_sR_f32_len64;
      break;
    case 128:
      S = &arm_cfft_sR_f32_len128;      
      iS = &arm_cfft_sR_f32_len128;
      break;
    case 256:
      S = &arm_cfft_sR_f32_len256;
      iS = &arm_cfft_sR_f32_len256;
      maskS = &arm_cfft_sR_f32_len256;
      break;
    case 512:
      S = &arm_cfft_sR_f32_len512;
      iS = &arm_cfft_sR_f32_len512;
      maskS = &arm_cfft_sR_f32_len512;
      break;
    case 1024:
      S = &arm_cfft_sR_f32_len1024;
      iS = &arm_cfft_sR_f32_len1024;
      maskS = &arm_cfft_sR_f32_len1024;
      break;
    case 2048:
      S = &arm_cfft_sR_f32_len2048;
      iS = &arm_cfft_sR_f32_len2048;
      maskS = &arm_cfft_sR_f32_len2048;
      break;
    case 4096:
      S = &arm_cfft_sR_f32_len4096;
      iS = &arm_cfft_sR_f32_len4096;
      maskS = &arm_cfft_sR_f32_len4096;
      break;
  }

 /****************************************************************************************
 *  Calculate the FFT of the FIR filter coefficients once to produce the FIR filter mask
 ****************************************************************************************/

// the FIR has (FFT_length / 2) + 1 taps = coefficients, so we have to add (FFT_length / 2) -1 zeros before the FFT
// in order to produce a FFT_length point input buffer for the FFT 
    // copy coefficients into real values of first part of buffer, rest is zero
    for (i = 0; i < (FFT_length / 2) + 1; i+=2)
    {
        FIR_filter_mask[i * 2] = FIR1 [i];
        FIR_filter_mask[i * 2 + 1] =  0.0; 
    }
// FFT of FIR_filter_mask
// perform FFT (in-place), needs only to be done once (or every time the filter coeffs change)
    arm_cfft_f32(maskS, FIR_filter_mask, 0, 1);    
    Serial.print("Filter mask FFT done ");

///////////////////////////////////////////////////////////////77
// PASS-THRU only for TESTING
/////////////////////////////////////////////////////////////77
/*
// hmmm, unclear, whether [1,1] or [1,0] or [0,1] are pass-thru filter-mask-multipliers??
// empirically, [1,0] sounds most natural = pass-thru
  for(i = 0; i < FFT_length * 2; i+=2)
  {
        FIR_filter_mask [i] = 1.0; // real
        FIR_filter_mask [i + 1] = 0.0; // imaginary
  }
*/
 /****************************************************************************************
 *  begin to queue the audio from the audio library
 ****************************************************************************************/
    delay(100);
    Q_in_L.begin();
    Q_in_R.begin();    
} // END SETUP

int16_t *sp_L;
int16_t *sp_R;

void loop() {
 
 elapsedMicros usec = 0;
/**********************************************************************************
 *  Get samples from queue buffers
 **********************************************************************************/
    // we have to ensure that we have enough audio samples: we need
    // N_BLOCKS = fft_length / 2 / BUFFER_SIZE
    // FFT1024 point --> n_blocks = 1024 / 2 / 128 = 4 
    // when these buffers are available, read them in and perform
    // the FFT - cmplx-mult - iFFT
    //
    // are there at least N_BLOCKS buffers in each channel available ?
    if (Q_in_L.available() > N_BLOCKS && Q_in_R.available() > N_BLOCKS)
    {   
// get audio samples from the audio  buffers and convert them to float
    for (i = 0; i < N_BLOCKS; i++)
    {  
    sp_L = Q_in_L.readBuffer();
    sp_R = Q_in_R.readBuffer();

      // convert to float one buffer_size
     arm_q15_to_float (sp_L, &float_buffer_L[BUFFER_SIZE * i], BUFFER_SIZE); // convert int_buffer to float 32bit
     arm_q15_to_float (sp_R, &float_buffer_R[BUFFER_SIZE * i], BUFFER_SIZE); // convert int_buffer to float 32bit
     Q_in_L.freeBuffer();
     Q_in_R.freeBuffer();
    }

    // this is supposed to prevent overfilled queue buffers
    // rarely the Teensy audio queue gets a hickup
    // in that case this keeps the whole audio chain running smoothly 
    if (Q_in_L.available() > N_BLOCKS * 4 || Q_in_R.available() > N_BLOCKS * 4) 
    {
      Q_in_L.clear();
      Q_in_R.clear();
      n_clear ++; // just for debugging to check how often this occurs [about once in an hour of playing . . .]
    }
  
/**********************************************************************************
 *  Digital convolution
 **********************************************************************************/
//  basis for this was Lyons, R. (2011): Understanding Digital Processing.
//  "Fast FIR Filtering using the FFT", pages 688 - 694
//  numbers for the steps taken from that source
//  Method used here: overlap-and-save

// 4.) ONLY FOR the VERY FIRST FFT: fill first samples with zeros 
      if(first_block) // fill real & imaginaries with zeros for the first BLOCKSIZE samples
      {
        for(i = 0; i < BUFFER_SIZE * 2 * N_BLOCKS; i++)
        {
          FFT_buffer[i] = 0.0;
        }
        first_block = 0;
      }
      else
      
// HERE IT STARTS for all other instances
// 6 a.) fill FFT_buffer with last events audio samples
      for(i = 0; i < BUFFER_SIZE * N_BLOCKS; i++)
      {
        FFT_buffer[i * 2] = last_sample_buffer_L[i]; // real
        FFT_buffer[i * 2 + 1] = last_sample_buffer_R[i]; // imaginary
      }

    // copy recent samples to last_sample_buffer for next time!
    arm_copy_f32(float_buffer_L, last_sample_buffer_L, BUFFER_SIZE * N_BLOCKS);
    arm_copy_f32(float_buffer_R, last_sample_buffer_R, BUFFER_SIZE * N_BLOCKS);

// 6. b) now fill audio samples into FFT_buffer (left channel: re, right channel: im)
      for(i = 0; i < BUFFER_SIZE * N_BLOCKS; i++)
      {
        FFT_buffer[FFT_length + i * 2] = float_buffer_L[i]; // real
        FFT_buffer[FFT_length + i * 2 + 1] = float_buffer_R[i]; // imaginary
      }

// perform complex FFT
// calculation is performed in-place the FFT_buffer [re, im, re, im, re, im . . .]
        arm_cfft_f32(S, FFT_buffer, 0, 1);

// complex multiply with filter mask
     arm_cmplx_mult_cmplx_f32 (FFT_buffer, FIR_filter_mask, iFFT_buffer, FFT_length);

/*  // just for testing: pass-thru !  
     for(i = 0; i < FFT_length * 2; i++)
    {
      iFFT_buffer[i] = FFT_buffer[i];
    }
*/

// perform iFFT (in-place)
        arm_cfft_f32(iS, iFFT_buffer, 1, 1);

// our desired output is the real part (left channel) AND the imaginary part (right channel) of the second half of the FFT_buffer
      for(i = 0; i < BUFFER_SIZE * N_BLOCKS; i++)
      {
        float_buffer_L[i] = iFFT_buffer[FFT_length + (i * 2)]; 
        float_buffer_R[i] = iFFT_buffer[FFT_length + (i * 2) + 1];
//        float_buffer_L[i] = iFFT_buffer[(i * 2)]; 
//        float_buffer_R[i] = iFFT_buffer[(i * 2) + 1];
      }

    for (i = 0; i < N_BLOCKS; i++)
    {
      sp_L = Q_out_L.getBuffer();
      sp_R = Q_out_R.getBuffer();
      arm_float_to_q15 (&float_buffer_L[BUFFER_SIZE * i], sp_L, BUFFER_SIZE); 
      arm_float_to_q15 (&float_buffer_R[BUFFER_SIZE * i], sp_R, BUFFER_SIZE); 
      Q_out_L.playBuffer(); // play it !
      Q_out_R.playBuffer(); // play it !
    }  

/**********************************************************************************
 *  PRINT ROUTINE FOR ELAPSED MICROSECONDS
 **********************************************************************************/
 
      sum = sum + usec;
      idx_t++;
      if (idx_t > 1000) {
          tft.fillRect(240,50,90,20,ILI9341_BLACK);   
          tft.setCursor(240, 50);
          mean = sum / idx_t;
          tft.print (mean);
          Serial.print (mean);
          Serial.print (" microsec for 2 stereo blocks    ");
          Serial.println();
          Serial.print (" n_clear    ");
          Serial.println(n_clear);
          idx_t = 0;
          sum = 0;
         
      }

     }
/**********************************************************************************
 *  PRINT ROUTINE FOR AUDIO LIBRARY PROCESSOR AND MEMORY USAGE
 **********************************************************************************/
          if (five_sec.check() == 1)
    {
      Serial.print("Proc = ");
      Serial.print(AudioProcessorUsage());
      Serial.print(" (");    
      Serial.print(AudioProcessorUsageMax());
      Serial.print("),  Mem = ");
      Serial.print(AudioMemoryUsage());
      Serial.print(" (");    
      Serial.print(AudioMemoryUsageMax());
      Serial.println(")");
/*      tft.fillRect(100,120,200,80,ILI9341_BLACK);
      tft.setCursor(10, 120);
      tft.setTextSize(2);
      tft.setTextColor(ILI9341_WHITE);
      tft.setFont(Arial_14);
      tft.print ("Proc = ");
      tft.setCursor(100, 120);
      tft.print (AudioProcessorUsage());
      tft.setCursor(180, 120);
      tft.print (AudioProcessorUsageMax());
      tft.setCursor(10, 150);
      tft.print ("Mem  = ");
      tft.setCursor(100, 150);
      tft.print (AudioMemoryUsage());
      tft.setCursor(180, 150);
      tft.print (AudioMemoryUsageMax());
     */ 
      AudioProcessorUsageMaxReset();
      AudioMemoryUsageMaxReset();
    }
   spectrum();
}

 void spectrum() { // spectrum analyser code by rheslip - modified
     if (myFFT.available()) {
    int scale;
    scale = 2;
  for (int16_t x=2; x < 100; x+=2) {

     int bar = (abs(myFFT.output[x]) * scale);
     if (bar >180) bar=180;
     // this is a very simple IIR filter to smooth the reaction of the bars
     bar = 0.2 * bar + 0.8 * barm[x]; 
     if (bar > peak[x]) peak[x]=bar;
//     tft.drawFastVLine(x, 210-bar,bar, ILI9341_PURPLE);
     tft.drawFastVLine(x*2+10, 210-bar,bar, ILI9341_PINK);

     tft.drawFastVLine(x*2+10, 20, 210-bar-20, ILI9341_BLACK);    

     tft.drawPixel(x*2+10,209-peak[x], ILI9341_YELLOW);

     if(peak[x]>0) peak[x]-=1;
     barm[x] = bar;
  }
  } //end if

   } // end void spectrum

Filter_coeffs.h

Code:
// LOWPASS, Fs = 44118sps, Fc = 4.65kHz, Raised Cos, Kaiser Window, KaiserBeta = 6.6, 513 taps, 80dB stopband attenuation
float32_t FIR1 [513] = {6.871453131273526080E-6, 3.445208686058514050E-6,-2.267576792967956220E-6,-7.919106902676396050E-6,-10.78330803716723630E-6,-9.023454130073321980E-6,-2.727716424898928960E-6, 5.830806219449149450E-6, 12.95263921038207760E-6, 15.07004322439275690E-6, 10.46645638312625340E-6, 352.9895673672170920E-9,-11.30136043391740590E-6,-19.27117070103877340E-6,-19.36773512201770590E-6,-10.53810844057831630E-6, 4.246565645390340650E-6, 18.88131923910259150E-6, 26.59255574530914860E-6, 23.00867797048227100E-6, 8.469591168820601370E-6,-11.59072016534943120E-6,-28.58389907785851850E-6,-34.38412012066380900E-6,-25.12542511874780260E-6,-3.438442404986254970E-6, 22.07369790699343070E-6, 40.16548142864498060E-6, 41.83631164663676570E-6, 24.68029431209209430E-6,-5.356921369655561090E-6,-35.87676946745603600E-6,-53.06496452714677050E-6,-47.85805435255078780E-6,-20.52159581641252470E-6, 18.60301812321676710E-6, 52.87758953002247610E-6, 66.35847481556579910E-6, 51.09982205584921640E-6, 11.46773656911063460E-6,-36.76187456915871370E-6,-72.56446574739206310E-6,-78.73777304608509550E-6,-50.00909485652042010E-6, 3.582085651938113460E-6, 59.95323217943641220E-6, 93.96507511644892930E-6, 88.52025560782911380E-6, 42.92080484068521430E-6,-25.51467591852549430E-6,-87.83752761919105010E-6,-115.5998385779402470E-6,-93.69719863854982120E-6,-28.18268580960215530E-6, 54.85414729897830450E-6, 119.5106512276131810E-6, 135.4700545070515430E-6, 92.02479796172386270E-6, 4.312256142316567060E-6,-91.60761750746233640E-6,-153.4226992285940700E-6,-151.0898538008062530E-6,-81.15967262677578730E-6, 29.82131007206200830E-6, 135.1176733260618620E-6, 187.3332849135738910E-6, 159.5690163151246280E-6, 58.83695341793738720E-6,-74.80066892174815510E-6,-183.9358373761230610E-6,-218.3152940660573900E-6,-157.7507032381709560E-6,-23.08507710114890090E-6, 130.4626205385996510E-6, 235.7322319962519830E-6, 242.8171610526223670E-6, 142.4043234780449440E-6,-27.53277196456661000E-6,-195.7153542023962700E-6,-287.2564798196244170E-6,-256.7907900508363350E-6,-110.4692644381636340E-6, 93.66466053090076120E-6, 268.3884569597703380E-6, 334.3634604657972890E-6, 255.8882334451354270E-6, 59.34036139065401730E-6,-174.9209539475816650E-6,-345.1340708370582890E-6,-372.1147967561435620E-6,-235.7253504313817700E-6, 12.81889441169864520E-6, 269.6506718361622460E-6, 421.3967525005456310E-6, 394.9629158680992300E-6, 192.2051845858361500E-6,-106.7539047042511700E-6,-374.7648593146286660E-6,-491.4672567532300600E-6,-397.0193777286223170E-6,-121.8880852669094990E-6, 221.8094711193295150E-6, 485.6286529121493340E-6, 548.6316252709169700E-6, 372.4031540954385950E-6, 22.39008286279281280E-6,-355.6579784851013530E-6,-596.0420324019763710E-6,-585.4217320307434420E-6,-315.6580489304485010E-6, 107.2138253653659630E-6, 504.0909578730086200E-6, 698.3258335996976030E-6, 593.9670621475752340E-6, 222.2219282155563180E-6,-266.0085519273678190E-6,-660.8989251159424610E-6,-783.5245170801475750E-6,-566.4403476402001160E-6,-88.92438393084313480E-6, 450.9092143633432100E-6, 817.8616459956699600E-6, 841.7306498093511210E-6, 495.5822225788455740E-6,-85.51557986657856250E-6,-656.4123945110995920E-6,-964.8662924347729590E-6,-862.5284067179455860E-6,-375.2828393541101950E-6, 300.0237739563860370E-6, 874.4560667028388250E-6, 0.001090164472359596, 835.5451098502492190E-6, 201.1919988203605380E-6,-550.7704621765309410E-6,-0.001094411832878682,-0.001180771142940600,-751.0914695446640510E-6, 28.67561545477996530E-6, 830.8681654758361220E-6, 0.001303226870386641, 0.001222999403169315, 600.8634212405887640E-6,-313.3764878151378070E-6,-0.001130197118226046,-0.001485724826421536,-0.001203115683866600,-378.6719410188165970E-6, 648.5525101941651660E-6, 0.001435382003559500, 0.001625065164448771, 0.001108090582798773, 81.16264269073093370E-6,-0.001026057991404070,-0.001729935230903342,-0.001703349699989170,-926.4122571275300970E-6, 291.5150652380575020E-6, 0.001433728599806572, 0.001994571378751822, 0.001702353869617076, 648.9226260857874420E-6,-735.1283955788730960E-6,-0.001855312749965784,-0.002207684835071205,-0.001604349313729314,-269.6323687549075880E-6, 0.001240905613202777, 0.002270573976250003, 0.002345968474751946, 0.001392974246376057,-213.5305137380872220E-6,-0.001795230168128267,-0.002655557437678700,-0.002385135660152189,-0.001054099336284485, 798.0813615715164820E-6, 0.002379496857325687, 0.002982994513348000, 0.002300690858936965, 576.6296833914587980E-6,-0.001476484847955344,-0.002970123506655021,-0.003222795508393209,-0.002068674975946207, 46.82224341581670050E-6, 0.002235830760167846, 0.003538685683553156, 0.003342549504198230, 0.001666287960147844,-819.4630819009767040E-6,-0.003057686303493348,-0.004052105369877611,-0.003307907326939082,-0.001072258550084767, 0.001740121893785564, 0.003918105633284077, 0.004472770064543100, 0.003082657772322536, 266.7788691729235780E-6,-0.002803421498178969,-0.004787738231008889,-0.004758370868813245,-0.002628195722905225, 769.2719925567854490E-6, 0.004000439690934184, 0.005631914927179969, 0.004861092447298931, 0.001901873229013341,-0.002056274728030315,-0.005320052262134744,-0.006410479055585143,-0.004725485495793865,-853.3044223474688580E-6, 0.003619832446757073, 0.006751311278615875, 0.007076909840564227, 0.004283710631210142,-583.2260378596162130E-6,-0.005498466619796607,-0.008287597049579185,-0.007575796990001497,-0.003445350620907208, 0.002504300176841653, 0.007758897336365033, 0.009934258644397245, 0.007836513216583964, 0.002075147714846438,-0.005070705639957433,-0.010528539554898025,-0.011724185678847697,-0.007757512119751574, 59.16993475261030260E-6, 0.008585873029180020, 0.014072760273724966, 0.013754502383653571, 0.007164384929012854,-0.003425237154328352,-0.013709030445404981,-0.019011254332136619,-0.016291757581157761,-0.005678473688921063, 0.009173391211196489, 0.022179432200976727, 0.027096212934009833, 0.020173702289179009, 0.002167007193506445,-0.021164172207011101,-0.040435286753901352,-0.045576331708279419,-0.029395106851093578, 0.009478483248810876, 0.065374553209471109, 0.126649403304741365, 0.178813587395760448, 0.208766865575148675, 0.208766865575148675, 0.178813587395760448, 0.126649403304741365, 0.065374553209471109, 0.009478483248810876,-0.029395106851093578,-0.045576331708279419,-0.040435286753901352,-0.021164172207011101, 0.002167007193506445, 0.020173702289179009, 0.027096212934009833, 0.022179432200976727, 0.009173391211196489,-0.005678473688921063,-0.016291757581157761,-0.019011254332136619,-0.013709030445404981,-0.003425237154328352, 0.007164384929012854, 0.013754502383653571, 0.014072760273724966, 0.008585873029180020, 59.16993475261030260E-6,-0.007757512119751574,-0.011724185678847697,-0.010528539554898025,-0.005070705639957433, 0.002075147714846438, 0.007836513216583964, 0.009934258644397245, 0.007758897336365033, 0.002504300176841653,-0.003445350620907208,-0.007575796990001497,-0.008287597049579185,-0.005498466619796607,-583.2260378596162130E-6, 0.004283710631210142, 0.007076909840564227, 0.006751311278615875, 0.003619832446757073,-853.3044223474688580E-6,-0.004725485495793865,-0.006410479055585143,-0.005320052262134744,-0.002056274728030315, 0.001901873229013341, 0.004861092447298931, 0.005631914927179969, 0.004000439690934184, 769.2719925567854490E-6,-0.002628195722905225,-0.004758370868813245,-0.004787738231008889,-0.002803421498178969, 266.7788691729235780E-6, 0.003082657772322536, 0.004472770064543100, 0.003918105633284077, 0.001740121893785564,-0.001072258550084767,-0.003307907326939082,-0.004052105369877611,-0.003057686303493348,-819.4630819009767040E-6, 0.001666287960147844, 0.003342549504198230, 0.003538685683553156, 0.002235830760167846, 46.82224341581670050E-6,-0.002068674975946207,-0.003222795508393209,-0.002970123506655021,-0.001476484847955344, 576.6296833914587980E-6, 0.002300690858936965, 0.002982994513348000, 0.002379496857325687, 798.0813615715164820E-6,-0.001054099336284485,-0.002385135660152189,-0.002655557437678700,-0.001795230168128267,-213.5305137380872220E-6, 0.001392974246376057, 0.002345968474751946, 0.002270573976250003, 0.001240905613202777,-269.6323687549075880E-6,-0.001604349313729314,-0.002207684835071205,-0.001855312749965784,-735.1283955788730960E-6, 648.9226260857874420E-6, 0.001702353869617076, 0.001994571378751822, 0.001433728599806572, 291.5150652380575020E-6,-926.4122571275300970E-6,-0.001703349699989170,-0.001729935230903342,-0.001026057991404070, 81.16264269073093370E-6, 0.001108090582798773, 0.001625065164448771, 0.001435382003559500, 648.5525101941651660E-6,-378.6719410188165970E-6,-0.001203115683866600,-0.001485724826421536,-0.001130197118226046,-313.3764878151378070E-6, 600.8634212405887640E-6, 0.001222999403169315, 0.001303226870386641, 830.8681654758361220E-6, 28.67561545477996530E-6,-751.0914695446640510E-6,-0.001180771142940600,-0.001094411832878682,-550.7704621765309410E-6, 201.1919988203605380E-6, 835.5451098502492190E-6, 0.001090164472359596, 874.4560667028388250E-6, 300.0237739563860370E-6,-375.2828393541101950E-6,-862.5284067179455860E-6,-964.8662924347729590E-6,-656.4123945110995920E-6,-85.51557986657856250E-6, 495.5822225788455740E-6, 841.7306498093511210E-6, 817.8616459956699600E-6, 450.9092143633432100E-6,-88.92438393084313480E-6,-566.4403476402001160E-6,-783.5245170801475750E-6,-660.8989251159424610E-6,-266.0085519273678190E-6, 222.2219282155563180E-6, 593.9670621475752340E-6, 698.3258335996976030E-6, 504.0909578730086200E-6, 107.2138253653659630E-6,-315.6580489304485010E-6,-585.4217320307434420E-6,-596.0420324019763710E-6,-355.6579784851013530E-6, 22.39008286279281280E-6, 372.4031540954385950E-6, 548.6316252709169700E-6, 485.6286529121493340E-6, 221.8094711193295150E-6,-121.8880852669094990E-6,-397.0193777286223170E-6,-491.4672567532300600E-6,-374.7648593146286660E-6,-106.7539047042511700E-6, 192.2051845858361500E-6, 394.9629158680992300E-6, 421.3967525005456310E-6, 269.6506718361622460E-6, 12.81889441169864520E-6,-235.7253504313817700E-6,-372.1147967561435620E-6,-345.1340708370582890E-6,-174.9209539475816650E-6, 59.34036139065401730E-6, 255.8882334451354270E-6, 334.3634604657972890E-6, 268.3884569597703380E-6, 93.66466053090076120E-6,-110.4692644381636340E-6,-256.7907900508363350E-6,-287.2564798196244170E-6,-195.7153542023962700E-6,-27.53277196456661000E-6, 142.4043234780449440E-6, 242.8171610526223670E-6, 235.7322319962519830E-6, 130.4626205385996510E-6,-23.08507710114890090E-6,-157.7507032381709560E-6,-218.3152940660573900E-6,-183.9358373761230610E-6,-74.80066892174815510E-6, 58.83695341793738720E-6, 159.5690163151246280E-6, 187.3332849135738910E-6, 135.1176733260618620E-6, 29.82131007206200830E-6,-81.15967262677578730E-6,-151.0898538008062530E-6,-153.4226992285940700E-6,-91.60761750746233640E-6, 4.312256142316567060E-6, 92.02479796172386270E-6, 135.4700545070515430E-6, 119.5106512276131810E-6, 54.85414729897830450E-6,-28.18268580960215530E-6,-93.69719863854982120E-6,-115.5998385779402470E-6,-87.83752761919105010E-6,-25.51467591852549430E-6, 42.92080484068521430E-6, 88.52025560782911380E-6, 93.96507511644892930E-6, 59.95323217943641220E-6, 3.582085651938113460E-6,-50.00909485652042010E-6,-78.73777304608509550E-6,-72.56446574739206310E-6,-36.76187456915871370E-6, 11.46773656911063460E-6, 51.09982205584921640E-6, 66.35847481556579910E-6, 52.87758953002247610E-6, 18.60301812321676710E-6,-20.52159581641252470E-6,-47.85805435255078780E-6,-53.06496452714677050E-6,-35.87676946745603600E-6,-5.356921369655561090E-6, 24.68029431209209430E-6, 41.83631164663676570E-6, 40.16548142864498060E-6, 22.07369790699343070E-6,-3.438442404986254970E-6,-25.12542511874780260E-6,-34.38412012066380900E-6,-28.58389907785851850E-6,-11.59072016534943120E-6, 8.469591168820601370E-6, 23.00867797048227100E-6, 26.59255574530914860E-6, 18.88131923910259150E-6, 4.246565645390340650E-6,-10.53810844057831630E-6,-19.36773512201770590E-6,-19.27117070103877340E-6,-11.30136043391740590E-6, 352.9895673672170920E-9, 10.46645638312625340E-6, 15.07004322439275690E-6, 12.95263921038207760E-6, 5.830806219449149450E-6,-2.727716424898928960E-6,-9.023454130073321980E-6,-10.78330803716723630E-6,-7.919106902676396050E-6,-2.267576792967956220E-6, 3.445208686058514050E-6, 6.871453131273526080E-6, 0.0f};

// LOWPASS, Fs = 44118sps, Fc = 4.4kHz, Kaiser Window, KaiserBeta = 7.6, 257 taps, 80dB stopband attenuation
// Iowa Hills FIR Filter Designer
//float32_t FIR1 [257] = {  7.328762614559201970E-6, 1.699530910501760860E-6,-7.475206154865301040E-6,-16.58864707313134800E-6,-20.71725038234058400E-6,-15.89322634590453380E-6,-1.546623827977725350E-6, 18.15229207888054930E-6, 35.05906151777181630E-6, 40.08343621023521310E-6, 27.49895263826912740E-6,-1.338406569559799310E-6,-36.87236974177827160E-6,-63.99381985769387170E-6,-68.00815642169322930E-6,-41.74761656253530620E-6, 9.365463746343056780E-6, 67.16686132592890600E-6, 106.5897713856251840E-6, 105.9269752259186110E-6, 57.54588018547838150E-6,-25.87661803103766900E-6,-113.3408031635732270E-6,-166.3106155289682140E-6,-154.8903042568662730E-6,-72.89559803429361300E-6, 55.27165709883256990E-6, 180.4567630373264820E-6, 246.7312913678270830E-6, 215.3307374164887450E-6, 84.68164945983102140E-6,-103.1082426170837890E-6,-274.2770110461190140E-6,-351.3425059712318440E-6,-286.8093405666396620E-6,-88.46579759213780390E-6, 176.1727105475958690E-6, 401.1666294078363540E-6, 483.3235639121454600E-6, 367.7504095060265290E-6, 78.29232257440359890E-6,-282.5230317340425470E-6,-567.9662444819534810E-6,-645.2953634062116630E-6,-455.1735661234249620E-6,-46.50593910926322390E-6, 431.5134058566782190E-6, 781.8503636979066870E-6, 839.0685661736512200E-6, 544.4290076521535870E-6,-16.42621160044870620E-6,-633.8211595687887440E-6,-0.001050196222342095,-0.001065404193656754,-628.9352480453173940E-6, 122.1103596179346910E-6, 901.5124886093840360E-6, 0.001380499415738101, 0.001323804968983299, 699.9072083139226380E-6,-284.4827416383485570E-6,-0.001248206918152512,-0.001780388256273617,-0.001612355412927966,-746.0432374336776320E-6, 520.2037805257599530E-6, 0.001689436729121183, 0.002257812645409072, 0.001927626922710254, 753.1072482572329820E-6,-849.3225009519289870E-6,-0.002243358362155522,-0.002821523408162229,-0.002264660848988334,-703.2857940274367370E-6, 0.001296442617129755, 0.002932081798002042, 0.003482031752368026, 0.002617038102056222, 574.0963347876687520E-6,-0.001892807564106649,-0.003784093355115586,-0.004253383954450271,-0.002977038347256079,-336.4193933759605670E-6, 0.002680096306067828, 0.004838678364053398, 0.005156393049966772, 0.003335885777371288,-49.20399049568850810E-6,-0.003717534447192724,-0.006154204345667625,-0.006224663422812675,-0.003684072232652828, 639.8623649234300500E-6, 0.005095841740168318, 0.007824422768965554, 0.007516452924108570, 0.004011742566743573,-0.001525533556200337,-0.006966531592685411,-0.010013078932329953,-0.009140093724559383,-0.004309122109956730, 0.002863960302125023, 0.009609845644003705, 0.013035802144374114, 0.011315405143426487, 0.004566962290015101,-0.004971252790332698,-0.013616299025791751,-0.017586400829978907,-0.014549640096248489,-0.004776978274764704, 0.008607429932232010, 0.020487455293014531, 0.025528436622625647, 0.020292174349273998, 0.004932252120862281,-0.016214726742582217,-0.035489595022107710,-0.044077991770366636,-0.034860358428049180,-0.005027576424117068, 0.042444580267464491, 0.099042763400386813, 0.152722135412216936, 0.191126786440214774, 0.205059716796875013, 0.191126786440214774, 0.152722135412216936, 0.099042763400386813, 0.042444580267464491,-0.005027576424117068,-0.034860358428049180,-0.044077991770366636,-0.035489595022107710,-0.016214726742582217, 0.004932252120862281, 0.020292174349273998, 0.025528436622625647, 0.020487455293014531, 0.008607429932232010,-0.004776978274764704,-0.014549640096248489,-0.017586400829978907,-0.013616299025791751,-0.004971252790332698, 0.004566962290015101, 0.011315405143426487, 0.013035802144374114, 0.009609845644003705, 0.002863960302125023,-0.004309122109956730,-0.009140093724559383,-0.010013078932329953,-0.006966531592685411,-0.001525533556200337, 0.004011742566743573, 0.007516452924108570, 0.007824422768965554, 0.005095841740168318, 639.8623649234300500E-6,-0.003684072232652828,-0.006224663422812675,-0.006154204345667625,-0.003717534447192724,-49.20399049568850810E-6, 0.003335885777371288, 0.005156393049966772, 0.004838678364053398, 0.002680096306067828,-336.4193933759605670E-6,-0.002977038347256079,-0.004253383954450271,-0.003784093355115586,-0.001892807564106649, 574.0963347876687520E-6, 0.002617038102056222, 0.003482031752368026, 0.002932081798002042, 0.001296442617129755,-703.2857940274367370E-6,-0.002264660848988334,-0.002821523408162229,-0.002243358362155522,-849.3225009519289870E-6, 753.1072482572329820E-6, 0.001927626922710254, 0.002257812645409072, 0.001689436729121183, 520.2037805257599530E-6,-746.0432374336776320E-6,-0.001612355412927966,-0.001780388256273617,-0.001248206918152512,-284.4827416383485570E-6, 699.9072083139226380E-6, 0.001323804968983299, 0.001380499415738101, 901.5124886093840360E-6, 122.1103596179346910E-6,-628.9352480453173940E-6,-0.001065404193656754,-0.001050196222342095,-633.8211595687887440E-6,-16.42621160044870620E-6, 544.4290076521535870E-6, 839.0685661736512200E-6, 781.8503636979066870E-6, 431.5134058566782190E-6,-46.50593910926322390E-6,-455.1735661234249620E-6,-645.2953634062116630E-6,-567.9662444819534810E-6,-282.5230317340425470E-6, 78.29232257440359890E-6, 367.7504095060265290E-6, 483.3235639121454600E-6, 401.1666294078363540E-6, 176.1727105475958690E-6,-88.46579759213780390E-6,-286.8093405666396620E-6,-351.3425059712318440E-6,-274.2770110461190140E-6,-103.1082426170837890E-6, 84.68164945983102140E-6, 215.3307374164887450E-6, 246.7312913678270830E-6, 180.4567630373264820E-6, 55.27165709883256990E-6,-72.89559803429361300E-6,-154.8903042568662730E-6,-166.3106155289682140E-6,-113.3408031635732270E-6,-25.87661803103766900E-6, 57.54588018547838150E-6, 105.9269752259186110E-6, 106.5897713856251840E-6, 67.16686132592890600E-6, 9.365463746343056780E-6,-41.74761656253530620E-6,-68.00815642169322930E-6,-63.99381985769387170E-6,-36.87236974177827160E-6,-1.338406569559799310E-6, 27.49895263826912740E-6, 40.08343621023521310E-6, 35.05906151777181630E-6, 18.15229207888054930E-6,-1.546623827977725350E-6,-15.89322634590453380E-6,-20.71725038234058400E-6,-16.58864707313134800E-6,-7.475206154865301040E-6, 1.699530910501760860E-6, 7.328762614559201970E-6};

// dto., LOWPASS, but Fstop = 2.4kHz
//float32_t FIR1 [257] = { 9.478463659369410270E-6, 12.70714431137183450E-6, 14.68103429127791950E-6, 14.49479956796871960E-6, 11.41776493385259000E-6, 5.089908387289337810E-6,-4.309783664626112730E-6,-15.96101513067834610E-6,-28.39919448816325340E-6,-39.63275865624029850E-6,-47.38350090008037800E-6,-49.43060901359754670E-6,-44.01681098337634520E-6,-30.25633471196833570E-6,-8.473212210829382580E-6, 19.60204134783623790E-6, 50.83739644240948510E-6, 80.94167884055040930E-6, 104.9457640360280950E-6, 117.8839462069065290E-6, 115.5969258699817460E-6, 95.54485687295468210E-6, 57.49932403509221500E-6, 3.982199741563685610E-6,-59.66009953683904850E-6,-125.6166648245185230E-6,-184.4170010490886970E-6,-226.1219357052953000E-6,-241.7718032967744650E-6,-224.9039573656878590E-6,-172.9139623166668120E-6,-88.02747811810078820E-6, 22.32198221733446530E-6, 145.8503861051778760E-6, 266.5624313454762840E-6, 366.6059349745512460E-6, 428.6566860210803040E-6, 438.5445607213444530E-6, 387.7582441874295110E-6, 275.4385174216719180E-6, 109.4984246530900410E-6,-93.40500699602969800E-6,-309.1847879979926570E-6,-508.8861885377647810E-6,-662.2701068971941820E-6,-742.0130864527443460E-6,-727.9819761102048690E-6,-610.9685330377087670E-6,-395.2792265263719860E-6,-99.67553074565914530E-6, 243.6550989706558430E-6, 592.1658834020230420E-6, 897.8352547039144160E-6, 0.001113363272414600, 0.001198931500159666, 0.001128619044471772, 895.5235359829191570E-6, 514.7245801221930610E-6, 23.44875125042007370E-6,-521.8695534083930170E-6,-0.001051525485351555,-0.001490937302885498,-0.001770553565044442,-0.001836027031766926,-0.001657252324891159,-0.001234892469535187,-603.2442372959892510E-6, 171.3004493121681830E-6, 996.3912453261834800E-6, 0.001764023560227131, 0.002363959489187634, 0.002698687062332011, 0.002697985676222414, 0.002331050101614698, 0.001614284387375592, 613.3111436446814650E-6,-561.5863128156847780E-6,-0.001766535064740374,-0.002840704319673461,-0.003626702778115438,-0.003992281005830110,-0.003850494531688887,-0.003175456564540999,-0.002011170915866102,-471.6670509964064310E-6, 0.001268310404008350, 0.002991592296925078, 0.004464144275072356, 0.005465069441068737, 0.005817406532078847, 0.005415676565785563, 0.004246239431140068, 0.002397161183427105, 55.41698862280428270E-6,-0.002509249355730281,-0.004972790417330560,-0.006994933531054830,-0.008262695927882365,-0.008534126892031858,-0.007676629965273138,-0.005694473397163862,-0.002741058487521368, 886.8925359691697850E-6, 0.004774005560083403, 0.008428254993829033, 0.011336501373060305, 0.013027821672442942, 0.013137516758874564, 0.011464091607285475, 0.008011814198809461, 0.003012680456452188,-0.003076353975003412,-0.009602371789443273,-0.015770253315109159,-0.020716778313256249,-0.023598066990079193,-0.023681458694612422,-0.020431966058645848,-0.013583568141761258,-0.003186844116896969, 0.010373305309214511, 0.026393073960790544, 0.043898042774577158, 0.061720528009800393, 0.078598932800310356, 0.093289519444314517, 0.104679489125996686, 0.111889913052296328, 0.114357958984375008, 0.111889913052296328, 0.104679489125996686, 0.093289519444314517, 0.078598932800310356, 0.061720528009800393, 0.043898042774577158, 0.026393073960790544, 0.010373305309214511,-0.003186844116896969,-0.013583568141761258,-0.020431966058645848,-0.023681458694612422,-0.023598066990079193,-0.020716778313256249,-0.015770253315109159,-0.009602371789443273,-0.003076353975003412, 0.003012680456452188, 0.008011814198809461, 0.011464091607285475, 0.013137516758874564, 0.013027821672442942, 0.011336501373060305, 0.008428254993829033, 0.004774005560083403, 886.8925359691697850E-6,-0.002741058487521368,-0.005694473397163862,-0.007676629965273138,-0.008534126892031858,-0.008262695927882365,-0.006994933531054830,-0.004972790417330560,-0.002509249355730281, 55.41698862280428270E-6, 0.002397161183427105, 0.004246239431140068, 0.005415676565785563, 0.005817406532078847, 0.005465069441068737, 0.004464144275072356, 0.002991592296925078, 0.001268310404008350,-471.6670509964064310E-6,-0.002011170915866102,-0.003175456564540999,-0.003850494531688887,-0.003992281005830110,-0.003626702778115438,-0.002840704319673461,-0.001766535064740374,-561.5863128156847780E-6, 613.3111436446814650E-6, 0.001614284387375592, 0.002331050101614698, 0.002697985676222414, 0.002698687062332011, 0.002363959489187634, 0.001764023560227131, 996.3912453261834800E-6, 171.3004493121681830E-6,-603.2442372959892510E-6,-0.001234892469535187,-0.001657252324891159,-0.001836027031766926,-0.001770553565044442,-0.001490937302885498,-0.001051525485351555,-521.8695534083930170E-6, 23.44875125042007370E-6, 514.7245801221930610E-6, 895.5235359829191570E-6, 0.001128619044471772, 0.001198931500159666, 0.001113363272414600, 897.8352547039144160E-6, 592.1658834020230420E-6, 243.6550989706558430E-6,-99.67553074565914530E-6,-395.2792265263719860E-6,-610.9685330377087670E-6,-727.9819761102048690E-6,-742.0130864527443460E-6,-662.2701068971941820E-6,-508.8861885377647810E-6,-309.1847879979926570E-6,-93.40500699602969800E-6, 109.4984246530900410E-6, 275.4385174216719180E-6, 387.7582441874295110E-6, 438.5445607213444530E-6, 428.6566860210803040E-6, 366.6059349745512460E-6, 266.5624313454762840E-6, 145.8503861051778760E-6, 22.32198221733446530E-6,-88.02747811810078820E-6,-172.9139623166668120E-6,-224.9039573656878590E-6,-241.7718032967744650E-6,-226.1219357052953000E-6,-184.4170010490886970E-6,-125.6166648245185230E-6,-59.66009953683904850E-6, 3.982199741563685610E-6, 57.49932403509221500E-6, 95.54485687295468210E-6, 115.5969258699817460E-6, 117.8839462069065290E-6, 104.9457640360280950E-6, 80.94167884055040930E-6, 50.83739644240948510E-6, 19.60204134783623790E-6,-8.473212210829382580E-6,-30.25633471196833570E-6,-44.01681098337634520E-6,-49.43060901359754670E-6,-47.38350090008037800E-6,-39.63275865624029850E-6,-28.39919448816325340E-6,-15.96101513067834610E-6,-4.309783664626112730E-6, 5.089908387289337810E-6, 11.41776493385259000E-6, 14.49479956796871960E-6, 14.68103429127791950E-6, 12.70714431137183450E-6, 9.478463659369410270E-6 };

// BANDPASS Fs = 44118sps, Fc = 2.0kHz, bandwidth = 1.324kHz Kaiser Window, KaiserBeta = 7.6, 257 taps, 80dB stopband attenuation
// Iowa Hills FIR Filter Designer
//float32_t FIR1 [257] = {16.61866773479811950E-6, 19.94613337054439480E-6, 21.29563655473358710E-6, 20.12425300443531740E-6, 16.39366853431405600E-6, 10.68919544786741230E-6, 4.215411625141377350E-6,-1.356969432477939330E-6,-4.190108410209962390E-6,-2.688138188638207640E-6, 4.050701286956446270E-6, 15.82862741394291280E-6, 31.08717219851485680E-6, 46.87678423399312780E-6, 59.11256947541127000E-6, 63.12360636272744330E-6, 54.44533420394996880E-6, 29.74661243139744120E-6,-12.26326443768168950E-6,-70.12601713393657120E-6,-139.2955894056957220E-6,-212.2990212746223050E-6,-279.4539930704328300E-6,-330.1048511403636210E-6,-354.2362782939503060E-6,-344.2320430481552190E-6,-296.4843303376538300E-6,-212.5428895255537900E-6,-99.53208858067959850E-6, 30.34164237528758080E-6, 161.2294257459753060E-6, 275.8979513215931550E-6, 358.5120995029542430E-6, 397.5145321226719940E-6, 388.1004940400010240E-6, 333.7928218129888480E-6, 246.7188501340256150E-6, 146.3714441825771080E-6, 56.87879371166833660E-6, 3.076130405200208530E-6, 5.922469021692320370E-6, 77.99015280909182480E-6, 219.8338842995363790E-6, 417.9925918761219350E-6, 645.1858098218745000E-6, 862.9538006704217420E-6, 0.001026598160733021, 0.001091866377316668, 0.001022459116078967, 797.1913328353425640E-6, 415.5630802610967290E-6,-99.37538268467884000E-6,-701.6472414193301570E-6,-0.001326744086469984,-0.001899473896474262,-0.002344334452274706,-0.002596969281511279,-0.002614938474580782,-0.002385976109729689,-0.001932145511897164,-0.001308831760545092,-598.2665719417420860E-6, 101.8413126900205160E-6, 693.1323229161806690E-6, 0.001092735110396656, 0.001248160232496881, 0.001148362674464450, 828.7863730957278680E-6, 369.0014935166746000E-6,-117.3551478836799050E-6,-499.3795146741921370E-6,-650.2818931890610660E-6,-470.2397434546688260E-6, 92.91611279438340890E-6, 0.001028797943446396, 0.002258341478397023, 0.003638313919204863, 0.004976735440146872, 0.006057761326623174, 0.006672919004720886, 0.006654301372932098, 0.005904622972425995, 0.004419108933794485, 0.002295059200475405,-273.4627786303617540E-6,-0.003016225235905580,-0.005620736386853022,-0.007775238832768322,-0.009214796723312999,-0.009763522764877863,-0.009366052136621639,-0.008102517647962031,-0.006183416855904221,-0.003923627103313620,-0.001698025785265647, 115.7623982397020510E-6, 0.001199526967000850, 0.001352179986100310, 533.5872473138101670E-6,-0.001113521893048559,-0.003268896033113763,-0.005468953566903118,-0.007167168560789308,-0.007813818712157576,-0.006944797857711526,-0.004267004042541636, 272.6938139379725500E-6, 0.006446547135154211, 0.013741563508818224, 0.021395580618215986, 0.028472210640447422, 0.033967491028566665, 0.036936136606129023, 0.036621739315567302, 0.032573680133572355, 0.024734224764116537, 0.013482284145852581,-374.6769295480607980E-6,-0.015662705314293146,-0.030952551858455212,-0.044708524900117598,-0.055458180159149446,-0.061962686197633114,-0.063366899796061693,-0.059310088680181249,-0.049982663161841835,-0.036120691753600054,-0.018937569349441633, 0.000000000000000000, 0.018937569349441633, 0.036120691753600054, 0.049982663161841835, 0.059310088680181249, 0.063366899796061693, 0.061962686197633114, 0.055458180159149446, 0.044708524900117598, 0.030952551858455212, 0.015662705314293146, 374.6769295480607980E-6,-0.013482284145852581,-0.024734224764116537,-0.032573680133572355,-0.036621739315567302,-0.036936136606129023,-0.033967491028566665,-0.028472210640447422,-0.021395580618215986,-0.013741563508818224,-0.006446547135154211,-272.6938139379725500E-6, 0.004267004042541636, 0.006944797857711526, 0.007813818712157576, 0.007167168560789308, 0.005468953566903118, 0.003268896033113763, 0.001113521893048559,-533.5872473138101670E-6,-0.001352179986100310,-0.001199526967000850,-115.7623982397020510E-6, 0.001698025785265647, 0.003923627103313620, 0.006183416855904221, 0.008102517647962031, 0.009366052136621639, 0.009763522764877863, 0.009214796723312999, 0.007775238832768322, 0.005620736386853022, 0.003016225235905580, 273.4627786303617540E-6,-0.002295059200475405,-0.004419108933794485,-0.005904622972425995,-0.006654301372932098,-0.006672919004720886,-0.006057761326623174,-0.004976735440146872,-0.003638313919204863,-0.002258341478397023,-0.001028797943446396,-92.91611279438340890E-6, 470.2397434546688260E-6, 650.2818931890610660E-6, 499.3795146741921370E-6, 117.3551478836799050E-6,-369.0014935166746000E-6,-828.7863730957278680E-6,-0.001148362674464450,-0.001248160232496881,-0.001092735110396656,-693.1323229161806690E-6,-101.8413126900205160E-6, 598.2665719417420860E-6, 0.001308831760545092, 0.001932145511897164, 0.002385976109729689, 0.002614938474580782, 0.002596969281511279, 0.002344334452274706, 0.001899473896474262, 0.001326744086469984, 701.6472414193301570E-6, 99.37538268467884000E-6,-415.5630802610967290E-6,-797.1913328353425640E-6,-0.001022459116078967,-0.001091866377316668,-0.001026598160733021,-862.9538006704217420E-6,-645.1858098218745000E-6,-417.9925918761219350E-6,-219.8338842995363790E-6,-77.99015280909182480E-6,-5.922469021692320370E-6,-3.076130405200208530E-6,-56.87879371166833660E-6,-146.3714441825771080E-6,-246.7188501340256150E-6,-333.7928218129888480E-6,-388.1004940400010240E-6,-397.5145321226719940E-6,-358.5120995029542430E-6,-275.8979513215931550E-6,-161.2294257459753060E-6,-30.34164237528758080E-6, 99.53208858067959850E-6, 212.5428895255537900E-6, 296.4843303376538300E-6, 344.2320430481552190E-6, 354.2362782939503060E-6, 330.1048511403636210E-6, 279.4539930704328300E-6, 212.2990212746223050E-6, 139.2955894056957220E-6, 70.12601713393657120E-6, 12.26326443768168950E-6,-29.74661243139744120E-6,-54.44533420394996880E-6,-63.12360636272744330E-6,-59.11256947541127000E-6,-46.87678423399312780E-6,-31.08717219851485680E-6,-15.82862741394291280E-6,-4.050701286956446270E-6, 2.688138188638207640E-6, 4.190108410209962390E-6, 1.356969432477939330E-6,-4.215411625141377350E-6,-10.68919544786741230E-6,-16.39366853431405600E-6,-20.12425300443531740E-6,-21.29563655473358710E-6,-19.94613337054439480E-6,-16.61866773479811950E-6};
 
Last edited:
Frank,
you would use 'fir1', which is also available for octave (signal package).
Also best performance would be achieved by FFT size that is much larger than FIR window length (say 2048 FFT with 128 point FIR)
IMO, a FIR window length > 128 is for single precision float arithmetic already sufficient.
The overlap-save method you are implementing, works also for less overlap than 50%.
 
Thanks Walter!

fir1 runs here now on Octave and produces coefficients, thanks for the hint! However, the stopband attenuation is not that good, at the moment I am trying to use a Kaiser window, but that does not seem to work here on my machine because
g = firkaiser(N+1,6.6);
f = [f1 ]/(Fs/2)
hc = fir1(N, f,'low',g)
does not produce a nice Kaiser window FIR in Octave . . .
I will have a look on that.

Hmmm, so what I am trying to do is overkill, you mean? ;-) From your message I interpret, that the single precision of the FPU of the Teensy 3.6 will make it impossible to get better attenuation/steeper slope than a 129 point FIR. So the steep slopes of the 1025tap FIR are only theoretical and not achievable, because of lacking precision? Is it possible to estimate that on the basis of the number of precision bits of the FPU?

And, maybe I haven´t understood fast convolution right: is the filter steepness / stopband attenuation dependent on the FFT length? I thought it was only dependent on the number of taps /coefficients of the FIR and the FFT length in combination with the FIR no. taps determines the "effectiveness" in terms of processor use.

How would you save calculations, if you use a large FFT and a small FIR? With a 2048-point-FFT and a 129-tap-FIR you would calculate the FFT every 8 sample blocks (128 * 2 * 8 = 2048) and complex multiply with 2048 coeffs (2048 - 129 of them are zero). With a 2048-point FFT and a 1025-tap FIR you would also calculate the FFT every 8 sample blocks (128 * 2 * 8 = 2048) and complex multiply with 2048 coeffs (2048 - 1025 of them are zero), but have a much better filter effect. --> but maybe I am wrong here . . .

All the best,
Frank
 
If you really NEED the steep transition, then you need longer FIR.
So, for a SDR, to separate a weak signal from a strong nearby signal, you may need a lot of coefficients, that is correct.

According to http://www.allaboutcircuits.com/technical-articles/design-of-fir-filters-design-octave-matlab/
the formula for estimating the number of required FIR coefficients is
N = AdB*FS/(22*DF)
where AdB is stopband attenuation in dB
FS is sampling frequency
DF is transition bandwidith

Now for saving calculation, you can interpret you 50% overlap method (say, 1024 point FFT with 512 point FIR) that for each FFT you throw away 50 % of data, now if you were to do a 1024 point FFT with 256 point FIR, you need to throw away only 256 points (+-1, I have to look up the right number). So, you do not need to perform so many FFT's for the same number of audio samples.

FFTFILT is implementing an alternative, the overlap-add method.
You may look into the octave code and adapt to C. (That is what I did, but need to look for the code)
 
Last edited:
You may look into the octave code and adapt to C. (That is what I did, but need to look for the code)
If you had told me that before, I could have saved a lot of time ! ;-). No, just kidding, it was very educating for me to write the code from scratch and I learned how to do it :).

Aaah, the argument with the "thrown away samples" clarifies the situation. I was assuming a constant overlap of 50%, now that makes sense to think about reducing the overlap.

I am not sure, how switching to overlap-and-add would alter the situation (making the filter "better"/more efficient) . . . no idea, probably would have to try it.

I just did an experiment comparing FIR filters of 129, 257, and 513 taps with "exactly" the same parameters (tweaked for Fstop =4.8kHz and at least 90dB stopband attenuation:

// LOWPASS, Fs = 44118sps, Fc = 4.800kHz, Raised Cos 0.98, Kaiser Window, KaiserBeta = 5.4, 512 taps plus one 0.0f, 90dB stopband attenuation
// LOWPASS, Fs = 44118sps, Fc = 4.800kHz, Raised Cos 0.96, Kaiser Window, KaiserBeta = 5.4, 257 taps, 90dB stopband attenuation
// LOWPASS, Fs = 44118sps, Fc = 4.800kHz, Raised Cos 0.94, Kaiser Window, KaiserBeta = 7.0, 129 taps, 90dB stopband attenuation

Then I put a very loud sine tone of 4978Hz (E flat) to the input (which should be suppressed by a "perfect" lowpass with Fstop of 4.800kHz) and had a look at the graphical FFT output of the script (of course far from being a scientific quantitative measurement):

129 taps / 256-point FFT/iFFT: full amplitude
256 taps / 512-point FFT/iFFT: half amplitude
513 taps / 1024-point FFT/iFFT: very faint signal, smaller than 0.05 of the amplitude
[UPDATE:
1020 taps / 2048-point FFT/iFFT: no more signal audible, when I pump up the volume, the signal is well below 2% of the amplitude of the 513 tap filter --> that´s great to have the possibility to have 1020tap FIR filters on the Teensy (in fact: two simultaneous FIR filters, because we have two independent channels) with only 16% processor load . . .]

So, for me this is fine: the precision of the FPU of the Teensy 3.6 does not seem to be limiting the use of higher order FIRs in the Digital Convolution (IF you really need filters with steep transitions). Will test with 1025 tap and 2049 tap FIRs when I have figured out how to calculate nice filters with OCTAVE . . . [DONE: see above]

Frank
 
Last edited:
I am not sure, how switching to overlap-and-add would alter the situation (making the filter "better"/more efficient) . . . no idea, probably would have to try it.

At the moment I try to test a streaming overlap-save (similar to yours)
it is based on the fact that audio library delivers 128 samples
so I construct a FIR filter of 129 taps and do a filter operation every 3 audio buffers (resulting in 1024 point FFT's)
more later, so you can try.
 
Walter: yes that would be nice to be able to try 25% overlap instead of 50% overlap and also try overlap-and-add!

Now I have another question.

I modified the code again and took out all those precalculated coefficients. Now the Teensy calculates all the coefficients of a FIR lowpass filter WITHOUT external Software ;-).

Put in your favourite Fstopband, Fpassband and the stopband attenuation (line 91-93 in the script) and the script will calculate the lowpass FIR coeffs and choose an appropriate FFT size. Works very nicely.

The code for the calculation of the Kaiser-Bessel FIR coeffs was taken from the very helpful CuteSDR Manual by Moe Wheatley. http://ae4jy.software.informer.com/
https://code.google.com/p/cutesdr-se/downloads/detail?name=CuteSDR101.pdf

However, I realized that with frequencies larger than samplerate / 4 = 5515Hz, there is an "alias effect", meaning that the passband is widened at the upper end too, which it shouldn´t. At least I did not intend it to do so . . . Can anyone explain that to me or even better, fix that?

With Fpass and Fstop in the region of 10kHz, the whole thing begins to behave like a notch filter letting everything through except a small portion of audio between 10k and 11k . . .

Am I missing something here? Or do I have to alter the way how I am stuffing the calculated coeffs into the arrays for the FFT to produce the filter mask for the complex multiply? Maybe it´s because I built a stereo version, where two channels are treated simultaneously with one filter . . .

Frank


EDIT: bug in coeffs copying fixed now! That was the cause of the problem explained above [27.11.2016]
Code:
/*********************************************************************************************
 * (c) Frank DD4WH 2016_11_24
 * 
 * FFT Fast Convolution = Digital Convolution
 * with overlap - save = overlap-discard
 * dynamically creates FIR coefficients  
 * 
 * in floating point 32bit 
 * tested on Teensy 3.6
 * 
 * audio queue optimized by Pete El Supremo 2016_10_27, thanks Pete!
 * final hint on the implementation came from Alberto I2PHD, thanks for that!
 * audio processing in float32_t with the NEW ARM CMSIS lib (see https://forum.pjrc.com/threads/38325-Excellent-results-with-Floating-Point-FFT-IFFT-Processing-and-Teensy-3-6?p=119797&viewfull=1#post119797),
 * and see here: https://forum.pjrc.com/threads/38325-Excellent-results-with-Floating-Point-FFT-IFFT-Processing-and-Teensy-3-6?p=120177&viewfull=1#post120177 
 * MIT license
 */

#include <Audio.h>
#include <Wire.h>
#include <SPI.h>
#include <SD.h>
#include <Metro.h>
#include "font_Arial.h"
#include <ILI9341_t3.h>
#include <arm_math.h>
#include <arm_const_structs.h>

#include "Filter_coeffs.h"

#define BACKLIGHT_PIN 0
#define TFT_DC      20
#define TFT_CS      21
#define TFT_RST     32  // 255 = unused. connect to 3.3V
#define TFT_MOSI     7
#define TFT_SCLK    14
#define TFT_MISO    12

ILI9341_t3 tft = ILI9341_t3(TFT_CS, TFT_DC, TFT_RST, TFT_MOSI, TFT_SCLK, TFT_MISO);

Metro five_sec=Metro(2000); // Set up a Metro

// this audio comes from the codec by I2S2
AudioInputI2S            i2s_in; 
           
AudioRecordQueue         Q_in_L;    
AudioRecordQueue         Q_in_R;    

AudioPlayQueue           Q_out_L; 
AudioPlayQueue           Q_out_R; 
AudioAnalyzeFFT256       myFFT;
AudioOutputI2S           i2s_out;           

AudioConnection          patchCord1(i2s_in, 0, Q_in_L, 0);
AudioConnection          patchCord2(i2s_in, 1, Q_in_R, 0);

AudioConnection          patchCord5(Q_out_R,0,myFFT,0); 

AudioConnection          patchCord3(Q_out_L, 0, i2s_out, 1);
AudioConnection          patchCord4(Q_out_R, 0, i2s_out, 0);

AudioControlSGTL5000     sgtl5000_1;     //xy=265.212

int idx_t = 0;
int idx = 0;
int64_t sum;
float32_t mean;
int n_L;
int n_R;
long int n_clear;

int peak[512];
int barm[512];

ulong samp_ptr = 0;
// bool FFT_state = false;

const int myInput = AUDIO_INPUT_LINEIN;

#define BUFFER_SIZE 128

// complex FFT with the new library CMSIS V4.5
const static arm_cfft_instance_f32 *S;

// create coefficients on the fly for custom filters
// and let the algorithm define which FFT size you need
// input variables by the user
//   first experiment: LOWPASS
// Fpass, Fstop, Astop (stopband attenuation)  
// fixed: sample rate, scale = 1.0 ???
//
float32_t Fpass = 4860;
float32_t Fstop = 4980;
float32_t Astop = 90;
float32_t FIR_Coef[2049];
#define MAX_NUMCOEF 2049
#define K_2PI         2.0 * 3.14159265358979323846264338327950288419716939937510
#define K_PI          3.14159265358979323846264338327950288419716939937510
uint32_t m_NumTaps = 3;
const uint32_t FFT_L = 4096;
uint32_t FFT_length = FFT_L;
uint32_t N_BLOCKS;
const uint32_t N_B = FFT_L / 2 / BUFFER_SIZE; 
float32_t float_buffer_L [BUFFER_SIZE * N_B];
float32_t float_buffer_R [BUFFER_SIZE * N_B];
float32_t FFT_buffer [FFT_L * 2] __attribute__ ((aligned (4)));
float32_t last_sample_buffer_L [BUFFER_SIZE * N_B];
float32_t last_sample_buffer_R [BUFFER_SIZE * N_B];

// complex iFFT with the new library CMSIS V4.5
const static arm_cfft_instance_f32 *iS;
float32_t iFFT_buffer [FFT_L * 2] __attribute__ ((aligned (4)));

// FFT instance for direct calculation of the filter mask
// from the impulse response of the FIR - the coefficients
const static arm_cfft_instance_f32 *maskS;
float32_t FIR_filter_mask [FFT_L * 2] __attribute__ ((aligned (4))); 
 
int8_t first_block = 1;
uint32_t i = 0;

void setup() {
  Serial.begin(115200);
  delay(1000);

  // for the large FFT sizes we need a lot of buffers
  AudioMemory(60);

  // Enable the audio shield. select input. and enable output
  sgtl5000_1.enable();
  sgtl5000_1.inputSelect(myInput);
  sgtl5000_1.volume(0.65); // when I put this to 0.9, the digital noise of the Teensy is way too loud,
  // so 0.5 to 0.65 seems a good compromise
  sgtl5000_1.adcHighPassFilterDisable(); // does not help too much!

  pinMode( BACKLIGHT_PIN, OUTPUT );
  analogWrite( BACKLIGHT_PIN, 1023 );

  tft.begin();
  tft.setRotation( 3 );
  tft.fillScreen(ILI9341_BLACK);
  tft.setCursor(10, 1);
  tft.setTextSize(2);
  tft.setTextColor(ILI9341_WHITE);
  tft.setFont(Arial_14);
  tft.print("Fast convolution - FIR coeffs: ");
  
  // this routine does all the magic of calculating the FIR coeffs (Bessel-Kaiser window)
  calc_FIR_lowpass_coeffs (1.0, Astop, Fpass, Fstop, AUDIO_SAMPLE_RATE_EXACT); 
  
  // adjust length of FFT according to number of coefficients calculated
      if (m_NumTaps > 1025)
      {
          FFT_length = 4096;
      }
      else if (m_NumTaps > 513)
      {
          FFT_length = 2048;
      }
      else if (m_NumTaps > 257) 
      {
          FFT_length = 1024;
      }
      else if (m_NumTaps > 129)
      {
          FFT_length = 512;
      }
      else
      {
          FFT_length = 256;
      }
  // set no of BLOCKS
  N_BLOCKS = FFT_length / 2 / BUFFER_SIZE;

  // set to zero all other coefficients in coefficient array    
  for(i = 0; i < MAX_NUMCOEF; i++)
  {
    Serial.print (FIR_Coef[i]); Serial.print(", ");
      if (i >= m_NumTaps) FIR_Coef[i] = 0.0;
  }
    // print number of FIR taps to impress others and show-off
    tft.print((int)m_NumTaps);    
//    tft.print("  ");
//    tft.print(FFT_length);
    
 /****************************************************************************************
 *  init complex FFT
 ****************************************************************************************/
   switch (FFT_length) {
    case 256:
      S = &arm_cfft_sR_f32_len256;
      iS = &arm_cfft_sR_f32_len256;
      maskS = &arm_cfft_sR_f32_len256;
      break;
    case 512:
      S = &arm_cfft_sR_f32_len512;
      iS = &arm_cfft_sR_f32_len512;
      maskS = &arm_cfft_sR_f32_len512;
      break;
    case 1024:
      S = &arm_cfft_sR_f32_len1024;
      iS = &arm_cfft_sR_f32_len1024;
      maskS = &arm_cfft_sR_f32_len1024;
      break;
    case 2048:
      S = &arm_cfft_sR_f32_len2048;
      iS = &arm_cfft_sR_f32_len2048;
      maskS = &arm_cfft_sR_f32_len2048;
      break;
    case 4096:
      S = &arm_cfft_sR_f32_len4096;
      iS = &arm_cfft_sR_f32_len4096;
      maskS = &arm_cfft_sR_f32_len4096;
      break;
  }
 /****************************************************************************************
 *  Calculate the FFT of the FIR filter coefficients once to produce the FIR filter mask
 ****************************************************************************************/

// the FIR has exactly m_NumTaps and a maximum of (FFT_length / 2) + 1 taps = coefficients, so we have to add (FFT_length / 2) -1 zeros before the FFT
// in order to produce a FFT_length point input buffer for the FFT 
    // copy coefficients into real values of first part of buffer, rest is zero
    for (i = 0; i < (FFT_length / 2) + 1; i++)
    {
        FIR_filter_mask[i * 2] = FIR_Coef [i];
        FIR_filter_mask[i * 2 + 1] =  0.0; 
    }
// FFT of FIR_filter_mask
// perform FFT (in-place), needs only to be done once (or every time the filter coeffs change)
    arm_cfft_f32(maskS, FIR_filter_mask, 0, 1);    
    Serial.println("Filter mask FFT done ");

///////////////////////////////////////////////////////////////77
// PASS-THRU only for TESTING
/////////////////////////////////////////////////////////////77
/*
// hmmm, unclear, whether [1,1] or [1,0] or [0,1] are pass-thru filter-mask-multipliers??
// empirically, [1,0] sounds most natural = pass-thru
  for(i = 0; i < FFT_length * 2; i+=2)
  {
        FIR_filter_mask [i] = 1.0; // real
        FIR_filter_mask [i + 1] = 0.0; // imaginary
  }
*/
 /****************************************************************************************
 *  begin to queue the audio from the audio library
 ****************************************************************************************/
    delay(100);
    Q_in_L.begin();
    Q_in_R.begin();    
} // END SETUP

int16_t *sp_L;
int16_t *sp_R;

void loop() {
 
 elapsedMicros usec = 0;
/**********************************************************************************
 *  Get samples from queue buffers
 **********************************************************************************/
    // we have to ensure that we have enough audio samples: we need
    // N_BLOCKS = fft_length / 2 / BUFFER_SIZE
    // FFT1024 point --> n_blocks = 1024 / 2 / 128 = 4 
    // when these buffers are available, read them in and perform
    // the FFT - cmplx-mult - iFFT
    //
    // are there at least N_BLOCKS buffers in each channel available ?
    if (Q_in_L.available() > N_BLOCKS && Q_in_R.available() > N_BLOCKS)
    {   
// get audio samples from the audio  buffers and convert them to float
    for (i = 0; i < N_BLOCKS; i++)
    {  
    sp_L = Q_in_L.readBuffer();
    sp_R = Q_in_R.readBuffer();

      // convert to float one buffer_size
     arm_q15_to_float (sp_L, &float_buffer_L[BUFFER_SIZE * i], BUFFER_SIZE); // convert int_buffer to float 32bit
     arm_q15_to_float (sp_R, &float_buffer_R[BUFFER_SIZE * i], BUFFER_SIZE); // convert int_buffer to float 32bit
     Q_in_L.freeBuffer();
     Q_in_R.freeBuffer();
    }

    // this is supposed to prevent overfilled queue buffers
    // rarely the Teensy audio queue gets a hickup
    // in that case this keeps the whole audio chain running smoothly 
    if (Q_in_L.available() > N_BLOCKS + 4 || Q_in_R.available() > N_BLOCKS + 4) 
    {
      Q_in_L.clear();
      Q_in_R.clear();
      n_clear ++; // just for debugging to check how often this occurs [about once in an hour of playing . . .]
    }
  
/**********************************************************************************
 *  Digital convolution
 **********************************************************************************/
//  basis for this was Lyons, R. (2011): Understanding Digital Processing.
//  "Fast FIR Filtering using the FFT", pages 688 - 694
//  numbers for the steps taken from that source
//  Method used here: overlap-and-save

// 4.) ONLY FOR the VERY FIRST FFT: fill first samples with zeros 
      if(first_block) // fill real & imaginaries with zeros for the first BLOCKSIZE samples
      {
        for(i = 0; i < BUFFER_SIZE * 2 * N_BLOCKS; i++)
        {
          FFT_buffer[i] = 0.0;
        }
        first_block = 0;
      }
      else
      
// HERE IT STARTS for all other instances
// 6 a.) fill FFT_buffer with last events audio samples
      for(i = 0; i < BUFFER_SIZE * N_BLOCKS; i++)
      {
        FFT_buffer[i * 2] = last_sample_buffer_L[i]; // real
        FFT_buffer[i * 2 + 1] = last_sample_buffer_R[i]; // imaginary
      }

    // copy recent samples to last_sample_buffer for next time!
    arm_copy_f32(float_buffer_L, last_sample_buffer_L, BUFFER_SIZE * N_BLOCKS);
    arm_copy_f32(float_buffer_R, last_sample_buffer_R, BUFFER_SIZE * N_BLOCKS);

// 6. b) now fill audio samples into FFT_buffer (left channel: re, right channel: im)
      for(i = 0; i < BUFFER_SIZE * N_BLOCKS; i++)
      {
        FFT_buffer[FFT_length + i * 2] = float_buffer_L[i]; // real
        FFT_buffer[FFT_length + i * 2 + 1] = float_buffer_R[i]; // imaginary
      }

// perform complex FFT
// calculation is performed in-place the FFT_buffer [re, im, re, im, re, im . . .]
        arm_cfft_f32(S, FFT_buffer, 0, 1);

// complex multiply with filter mask
     arm_cmplx_mult_cmplx_f32 (FFT_buffer, FIR_filter_mask, iFFT_buffer, FFT_length);

/*  // just for testing: pass-thru !  
     for(i = 0; i < FFT_length * 2; i++)
    {
      iFFT_buffer[i] = FFT_buffer[i];
    }
*/

// perform iFFT (in-place)
        arm_cfft_f32(iS, iFFT_buffer, 1, 1);

// our desired output is the real part (left channel) AND the imaginary part (right channel) of the second half of the FFT_buffer
      for(i = 0; i < BUFFER_SIZE * N_BLOCKS; i++)
      {
        float_buffer_L[i] = iFFT_buffer[FFT_length + (i * 2)]; 
        float_buffer_R[i] = iFFT_buffer[FFT_length + (i * 2) + 1];
//        float_buffer_L[i] = iFFT_buffer[(i * 2)]; 
//        float_buffer_R[i] = iFFT_buffer[(i * 2) + 1];
      }

    for (i = 0; i < N_BLOCKS; i++)
    {
      sp_L = Q_out_L.getBuffer();
      sp_R = Q_out_R.getBuffer();
      arm_float_to_q15 (&float_buffer_L[BUFFER_SIZE * i], sp_L, BUFFER_SIZE); 
      arm_float_to_q15 (&float_buffer_R[BUFFER_SIZE * i], sp_R, BUFFER_SIZE); 
      Q_out_L.playBuffer(); // play it !
      Q_out_R.playBuffer(); // play it !
    }  

/**********************************************************************************
 *  PRINT ROUTINE FOR ELAPSED MICROSECONDS
 **********************************************************************************/
 
      sum = sum + usec;
      idx_t++;
      if (idx_t > 1000) {
          tft.fillRect(240,50,90,20,ILI9341_BLACK);   
          tft.setCursor(240, 50);
          mean = sum / idx_t;
          tft.print (mean);
          Serial.print (mean);
          Serial.print (" microsec for ");
          Serial.print (N_BLOCKS);
          Serial.print ("  stereo blocks    ");
          Serial.println();
          Serial.print (" n_clear    ");
          Serial.println(n_clear);
          idx_t = 0;
          sum = 0;
         
      }

     }
/**********************************************************************************
 *  PRINT ROUTINE FOR AUDIO LIBRARY PROCESSOR AND MEMORY USAGE
 **********************************************************************************/
          if (five_sec.check() == 1)
    {
      Serial.print("Proc = ");
      Serial.print(AudioProcessorUsage());
      Serial.print(" (");    
      Serial.print(AudioProcessorUsageMax());
      Serial.print("),  Mem = ");
      Serial.print(AudioMemoryUsage());
      Serial.print(" (");    
      Serial.print(AudioMemoryUsageMax());
      Serial.println(")");
/*      tft.fillRect(100,120,200,80,ILI9341_BLACK);
      tft.setCursor(10, 120);
      tft.setTextSize(2);
      tft.setTextColor(ILI9341_WHITE);
      tft.setFont(Arial_14);
      tft.print ("Proc = ");
      tft.setCursor(100, 120);
      tft.print (AudioProcessorUsage());
      tft.setCursor(180, 120);
      tft.print (AudioProcessorUsageMax());
      tft.setCursor(10, 150);
      tft.print ("Mem  = ");
      tft.setCursor(100, 150);
      tft.print (AudioMemoryUsage());
      tft.setCursor(180, 150);
      tft.print (AudioMemoryUsageMax());
     */ 
      AudioProcessorUsageMaxReset();
      AudioMemoryUsageMaxReset();
    }
   spectrum();
}

 void spectrum() { // spectrum analyser code by rheslip - modified
     if (myFFT.available()) {
    int scale;
    scale = 2;
  for (int16_t x=2; x < 100; x+=2) {

     int bar = (abs(myFFT.output[x]) * scale);
     if (bar >180) bar=180;
     // this is a very simple IIR filter to smooth the reaction of the bars
     bar = 0.2 * bar + 0.8 * barm[x]; 
     if (bar > peak[x]) peak[x]=bar;
//     tft.drawFastVLine(x, 210-bar,bar, ILI9341_PURPLE);
     tft.drawFastVLine(x*2+10, 210-bar,bar, ILI9341_PINK);

     tft.drawFastVLine(x*2+10, 20, 210-bar-20, ILI9341_BLACK);    

     tft.drawPixel(x*2+10,209-peak[x], ILI9341_YELLOW);

     if(peak[x]>0) peak[x]-=1;
     barm[x] = bar;
  }
  } //end if

   } // end void spectrum

void calc_FIR_lowpass_coeffs (float32_t Scale, float32_t Astop, float32_t Fpass, float32_t Fstop, float32_t Fsamprate) 
{ // Wheatley, M. (2011): CuteSDR Technical Manual. www.metronix.com, pages 118 - 120, FIR with Kaiser-Bessel Window
    int n;
    float32_t Beta;
    float32_t izb;
    // normalized F parameters
    float32_t normFpass = Fpass / Fsamprate;
    float32_t normFstop = Fstop / Fsamprate;
    float32_t normFcut = (normFstop + normFpass) / 2.0; // lowpass filter 6dB cutoff
    // calculate Kaiser-Bessel window shape factor beta from stopband attenuation
    if (Astop < 20.96)
    {
      Beta = 0.0;    
    }
    else if (Astop >= 50.0)
    {
      Beta = 0.1102 * (Astop - 8.71);
    }
    else
    {
      Beta = 0.5842 * powf((Astop - 20.96), 0.4) + 0.7886 * (Astop - 20.96);
    }
    // estimate number of taps
    m_NumTaps = (int32_t)((Astop - 8.0) / (2.285 * K_2PI * (normFstop - normFpass)) + 1.0);
    if (m_NumTaps > MAX_NUMCOEF) 
    { 
      m_NumTaps = MAX_NUMCOEF;
    }
    if (m_NumTaps < 3)
    {
      m_NumTaps = 3;
    }
    float32_t fCenter = 0.5 * (float32_t) (m_NumTaps - 1);
    izb = Izero (Beta);
    for (n = 0; n < m_NumTaps; n++)
    {
      float32_t x = (float32_t) n - fCenter;
      float32_t c;
      // create ideal Sinc() LP filter with normFcut
      if ((float32_t) n == fCenter)
      {
        c = 2.0 * normFcut;
      }
      else
      {
        c = (float32_t) sin(K_2PI * x * normFcut) / (K_PI * x);
      }
      x = ((float32_t) n - ((float32_t) m_NumTaps - 1.0) / 2.0) / (((float32_t) m_NumTaps - 1.0) / 2.0);
      FIR_Coef[n] = Scale * c * Izero (Beta * sqrt(1 - (x * x) )) / izb;
    }
} // END calc_lowpass_coeffs

float32_t Izero (float32_t x) 
{
    float32_t x2 = x / 2.0;
    float32_t summe = 1.0;
    float32_t ds = 1.0;
    float32_t di = 1.0;
    float32_t errorlimit = 1e-9;
    float32_t tmp;
    do
    {
        tmp = x2 / di;
        tmp *= tmp;
        ds *= tmp;
        summe += ds;
        di += 1.0; 
    }   while (ds >= errorlimit * summe);
    return (summe);
}  // END Izero
 
Last edited:
The second constant to estimate the beta should be 0.07886 (missing a zero)
I have not tested with your Izero but with besseli(0,x) and could not reproduce your observations.
Will port your Izero to Matlab and compare.

Edit: with Izero ported to Matlab, I could not see any difference and response looks good. So, I cannot see what is wrong with fcut >fsamp/4
 
Last edited:
yes that would be nice to be able to try 25% overlap instead of 50% overlap and also try overlap-and-add!

Here is an attempt,
It compiles and test function produces reasonable results
(I used Arduino style but did not compile on Arduino as I'm using latest CMSIS functions, and not the one that comes with Teensyduino)
Code:
extern "C"
{
	void fft_filt_init(float * filt, int nf);
	void fft_filt_exec(float *zr, float *zi, float *xr, float *xi, int nx, int MM);
}



void fft_test(void)
{
#define nf  129
#define nx (1024 -nf +1)
#define pi 3.14159f

#define fs  192.0f
#define fo  10.0f

#define FILT 1

	 int ii;
#if FILT == 0
	 float filt[nf];

	 // all-pass filter
	 for(ii = 0; ii< nf; ii++) filt[ii] = 0;;	 filt[64]=1;
#elif FILT==1
	 // LP fir1(128,0.2)
	 float filt[]={
	 2.335989e-04f, 3.866300e-04f, 4.009664e-04f, 2.603404e-04f, -6.918080e-19f, -2.970606e-04f, -5.203355e-04f, -5.670347e-04f,
	 -3.838211e-04f, 3.516892e-18f, 4.648315e-04f, 8.297952e-04f, 9.159923e-04f, 6.248526e-04f, -1.436084e-18f, -7.595779e-04f,
	 -1.352701e-03f, -1.486683e-03f, -1.008250e-03f, 2.071234e-18f, 1.208079e-03f, 2.134307e-03f, 2.326556e-03f, 1.564883e-03f,
	 -2.834284e-18f, -1.844980e-03f, -3.234390e-03f, -3.499637e-03f, -2.337331e-03f, 3.679497e-18f, 2.720258e-03f, 4.741167e-03f,
	 5.102609e-03f, 3.391381e-03f, -4.556214e-18f, -3.914770e-03f, -6.800687e-03f, -7.299274e-03f, -4.841111e-03f, 5.411886e-18f,
	 5.575496e-03f, 9.684937e-03f, 1.040235e-02f, 6.909966e-03f, -6.195227e-18f, -8.006510e-03f, -1.397352e-02f, -1.509958e-02f,
	 -1.010630e-02f, 6.859285e-18f, 1.195472e-02f, 2.115418e-02f, 2.324461e-02f, 1.587717e-02f, -7.364257e-18f, -1.984729e-02f,
	 -3.647362e-02f, -4.203514e-02f, -3.052959e-02f, 7.679878e-18f, 4.630684e-02f, 1.002904e-01f, 1.508535e-01f, 1.867756e-01f,
	 1.997662e-01f, 1.867756e-01f, 1.508535e-01f, 1.002904e-01f, 4.630684e-02f, 7.679878e-18f, -3.052959e-02f, -4.203514e-02f,
	 -3.647362e-02f, -1.984729e-02f, -7.364257e-18f, 1.587717e-02f, 2.324461e-02f, 2.115418e-02f, 1.195472e-02f, 6.859285e-18f,
	 -1.010630e-02f, -1.509958e-02f, -1.397352e-02f, -8.006510e-03f, -6.195227e-18f, 6.909966e-03f, 1.040235e-02f, 9.684937e-03f,
	 5.575496e-03f, 5.411886e-18f, -4.841111e-03f, -7.299274e-03f, -6.800687e-03f, -3.914770e-03f, -4.556214e-18f, 3.391381e-03f,
	 5.102609e-03f, 4.741167e-03f, 2.720258e-03f, 3.679497e-18f, -2.337331e-03f, -3.499637e-03f, -3.234390e-03f, -1.844980e-03f,
	 -2.834284e-18f, 1.564883e-03f, 2.326556e-03f, 2.134307e-03f, 1.208079e-03f, 2.071234e-18f, -1.008250e-03f, -1.486683e-03f,
	 -1.352701e-03f, -7.595779e-04f, -1.436084e-18f, 6.248526e-04f, 9.159923e-04f, 8.297952e-04f, 4.648315e-04f, 3.516892e-18f,
	 -3.838211e-04f, -5.670347e-04f, -5.203355e-04f, -2.970606e-04f, -6.918080e-19f, 2.603404e-04f, 4.009664e-04f, 3.866300e-04f,
	 2.335989e-04f};
#endif

	 float xr[nx], xi[nx], zr[nx], zi[nx];

	 for(ii=0; ii < nx; ii++)
	 {   float arg = 2.0*pi*fo*ii/fs;
		 xr[ii] = 0.0f*cosf(arg)+ 1.0f*(rand()/(float) RAND_MAX-0.5);
		 xi[ii] = 0.0f;
	 }

	 fft_filt_init(filt, nf);

	 fft_filt_exec(zr,zi,xr,xi,nx, nf);

	 for(ii=0; ii<nx;ii++)
	 { Serial.printf("%d, %d, %d, %d, %d\n\r", ii,	(int)(xr[ii]*1000000.0f), (int)(xi[ii]*1000000.0f),
			 	 	 	 	 	 	 	 	 	 	(int)(zr[ii]*1000000.0f), (int)(zi[ii]*1000000.0f));
	 }
}

void setup() {
  // put your setup code here, to run once:
  //
	while(!Serial);
	while(Serial.available()); // clear input buffer
	Serial.printf("Starting ESM\n\r");

	fft_test();
	delay(1000);
}

void loop()
{
}


and
Code:
/*
 * fft_filt.c
 *
 *  Created on: Nov 24, 2016
 *      Author: Walter
 */

/*
 * FFTlength N = (L + M - 1)
 * first M - 1 samples are from previous block, or zero for first block
 * discard the first M - 1 samples from FFT output
 * example
 * 		N = 1024 = 4*128
 * 		M = 129
 * 		L = 3*128
 *
 * 		y = {y[L-M+1].... y[L-1], x[0],   , x[L-1]}
 *
 * 		process y -> z
 *
 * 		discard z[0].... z[M-2], keep L samples z[M-1] .... Z[N]
 */
#include "arm_math.h"
#include "arm_const_structs.h"

#define NN 1024
static float bb[2*NN];
static float yy[2*NN];
static float zz[2*NN];


void fft_filt_init(float * filt, int nf)
{   int ii;
	for(ii=0; ii< 1024; ii++)
	{
		bb[2*ii]=0; bb[2*ii+1]=0;
		yy[2*ii]=0; yy[2*ii+1]=0;
	}
	for(ii=0; ii<nf; ii++) bb[2*ii]=filt[ii];

	arm_cfft_f32(&arm_cfft_sR_f32_len1024, bb, 0, 1);  // forward, without bit reverse (not needed for filter, I believe)
}

 void fft_filt_exec(float *zr, float *zi, float *xr, float *xi, int nx, int MM)
{	int ii;
	int LL = NN - MM + 1;

	if(LL != nx) return;

	// copy data to FFT buffer
	float *yp = &yy[2*(MM-1)];
	for(ii = 0; ii < LL; ii++) {yp[2*ii] = xr[ii]; yp[2*ii+1] = xi[ii];}

	// perform complex FFT (in-place)
	arm_cfft_f32(&arm_cfft_sR_f32_len1024, yy, 0, 1);

	// complex multiply with filter spectrum
	arm_cmplx_mult_cmplx_f32 (yy, bb, zz, NN);

	// perform iFFT (in-place)
	arm_cfft_f32(&arm_cfft_sR_f32_len1024, zz, 1, 1);

	// copy overlap to beginning
	float *xrp = &xr[LL-MM+1];
	float *xip = &xi[LL-MM+1];
	for(ii = 0; ii < MM-1; ii++) {yy[2*ii] = xrp[ii]; yy[2*ii+1] = xip[ii];}

	// copy data to result buffer
	float *zp = &zz[2*(MM-1)];
	for(ii = 0; ii < LL; ii++) { zr[ii] = zp[2*ii]; zi[ii] = zp[2*ii+1];}
}

I tested these on T3.6 and got
NoiseFilter.jpg

To repeat, the idea is to accumulate 3 audio buffer and call then the fft_filt_exec function.
I should mention, that the filtered output is delayed by 64 samples (half FIR window size). But this group delay is typical for all filter operations.
 
Edit: with Izero ported to Matlab, I could not see any difference and response looks good. So, I cannot see what is wrong with fcut >fsamp/4

That´s great! Thanks a lot for comparing with MATLAB, so I can be confident with the coefficient calculation using the subfunctions in the script.

[Great thanks also for the 25% overlap script, will test that in the next days! Not much time today]

Yes, that was my error, I missed a zero in beta calculation (did not realize, because I never calculated a filter with an Astop in that range). However, that cannot be the cause of the strange "alias effects".

But the "alias" effect remains to be solved . . . After thinking a bit more, I am pretty sure, that I am seeing an effect of my stereo processing.

Right input channel audio goes to the real part of the FFT input buffer,
Left input channel audio goes to the imaginary part of the FFT input buffer.
(As my SDR will process I&Q coming from a hardware quadrature sampling detector, this simulates my input data for the SDR. At the moment this is audio from my smartphone (music or single sine tones for testing the filter performance))

So, now the FFT buffer is filled with real and imaginaries representing the I&Q data.
The FFT is performed and the output is in the FFT buffer.

The Filter mask used for the complex multiply of the FFT buffer contains the results of the FFT of the FIR filter coefficients.
BUT: the FFT for the filter mask has been done on the FIR coeffs which have been stuffed into the real part only (all imaginary are zero).

Code:
    for (i = 0; i < (FFT_length / 2) + 1; i+=2)
    {
        FIR_filter_mask[i * 2] = FIR_Coef [i];
        FIR_filter_mask[i * 2 + 1] =  0.0; 
    }
    arm_cfft_f32(maskS, FIR_filter_mask, 0, 1);

Is that correct, if my aim is to filter both channels? If not, how would you do it correctly? [well, in fact, as it is, it correctly filters both channels! (but produces aliases, if filter stopband freq is above 5500Hz)]

One last thought, which however is a very vague hypothesis of mine:

- If I filter with a sample rate of 44118Hz, my highest freq I can use is 22059Hz because of Nyquist

- If I filter with a sample rate of 44118Hz, but use two filters at a time (one with positive, the other one with negative freqs; is that what I am doing in my script? not sure), then could it be that your Nyquist frequency is at half of the 22059, ie. 11029 Hz?

--> this would exactly explain my observations:
- everything is fine up to 5515Hz filter stopband
- with filter stopband freqs > 5515Hz I can see aliases in the upper part of the spectrum (around 22059Hz - 5515Hz)
- with filter stopband freqs approaching 11029Hz there is no more "filtering", because the aliases approach the stopband freqs, so with white noise input all the spectrum is filled with equal amplitude noise

These thoughts would raise another question, though ;-):

- Do I need an alias filter for filtering with Fast Convolution?
(in my case I would think a 10Khz filter or a bit lower would have the right dimension, because that would be the absolute maximum audio bandwidth an AM station would transmit)

This hypothesis would also tell me, that my code is ok, but my understanding of Digital Convolution is weak (which I find is better than the other way round ;-)).

If that does not make any sense at all, I apologize in advance! It´s just that I learn DSP by doing it and I do not have any technical/engineering background or similar . . .

Would be very interested to hear your opinion on that!

All the best,

Frank
 
Last edited:
Frank,
there is an error in the sinc function
it should read
Code:
c = normCut
and not
Code:
c = 2.0 * normCut

Reason is, that for vanishing x
Code:
sin(a*x)/x -> (a*x)/x = a
 
Hi Walter,

thanks! But it does not change, except that the input freqs are multiplied by 0.5. "Alias"-products stay the same.

And if I only change "c = normFcut", the stopband attenuation is much worse and the aliases are still there.

I changed this (part commented out with "//" was tested)

Code:
    for (n = 0; n < m_NumTaps; n++)
    {
      float32_t x = (float32_t) n - fCenter;
      float32_t c;
      // create ideal Sinc() LP filter with normFcut
      if ((float32_t) n == fCenter)
      { // deal with odd size filter singularity where sin(0) / 0 == 1
        c = 2.0 * normFcut;
//        c = normFcut;
        Serial.println("Hello, here at odd FIR filter size");
      }
      else
      {
        c = (float32_t) sinf(K_2PI * x * normFcut) / (K_PI * x);
//        c = (float32_t) sinf(K_PI * x * normFcut) / (K_PI * x);
      }
      x = ((float32_t) n - ((float32_t) m_NumTaps - 1.0) / 2.0) / (((float32_t) m_NumTaps - 1.0) / 2.0);
      FIR_Coef[n] = Scale * c * Izero (Beta * sqrt(1 - (x * x) )) / izb;
    }
 
Concerning FFT-filtering of stereo channels, Your approach should be OK.
I tried it in Matlab and got that the expected result (i.e. your approach is OK).
After all, FFT is a linear operation, so that the fact that in the spectral domain the data are scrambled does not influence the result. In fact you could do the whole process a little bit faster by setting the bit-reversal flag to zero (on all three complex ffts).

Caveat: Using Real and imaginary part of a Complex FFT to speed up processing only works when signals are similar. Otherwise, with simultaneous processing of strong and weak signals, the strong signal could leak into the weak signal, due to limited resolution of single float.
 
Hi Walter,

thanks! But it does not change, except that the input freqs are multiplied by 0.5. "Alias"-products stay the same.

It only influences cases where m_numTaps is odd, as then fCenter is an integer

I just realized, your check is not good: You compare two floats, they will not be equal
So better
Code:
if (abs((float)n-fCenter))<0.01)
{ c = 2.0f*normCut;
}

edit: I missed that you divide by K_PI and not K_2PI, so 2*normCut is OK.
 
Last edited:
So, now the FFT buffer is filled with real and imaginaries representing the I&Q data.

The I&Q sampling of the HW is nothing else than transforming the real time series into a complex one (called also analytic extension, real part and imaginary part are shifted by 90 degree, related term Hilbert transform).
If done correctly, the spectrum will have only positive frequencies.

I believe, your reasoning (vague hypothesis) is not describing the phenomenon.
The filters are good up to nyquist e.g. normCut = 0.49.

Best thing is to generate a complex test signal that is intrinsic I&Q and see if output has still aliased lobes.
 
Walter, thanks a lot for clarifying things and confirming the "stereo" approach with MATLAB, and also for the good hint on why not to compare floats with "=="!

I now have an idea where my aliasing came from: I did not use real IQ-data AND I looked at the spectrum display of an FFT of only ONE channel.

Tests with real IQ-data with single IQ tones revealed that the alias is not audible and the filters work very well (at least up to my upper hearing limit, which is at about 14kHz, that limit decreasing in frequency with the years . . .). But the alias still appears optically in the spectrum display of ONE channel, so it´s my fault of having looked only into part of the true story . . .

So I am very sorry for this confusion and I would like to quickly withdraw from my vague hypothesis ;-). Will have to check that more systematically though, and will report here. But my gut feeling is, that the system works really well now . . .


EDIT: BTW, setting bit reversal to 0 in all three FFTs does not work, but gives scrambled audio . . .
EDIT2: if anyone needs a simple IQ tone generator for testing, here is one: http://www.dxzone.com/cgi-bin/dir/jump2.cgi?ID=20548
 
Last edited:
I reproduced the 'aliasing' in Matlab
or better lack of performance in the stop band for fc > Fsamp/4

solution: try with
Code:
c = (float32_t) sinf(K_2PI * x * normFcut) / (K_PI * x * normFcut);
 
Thanks, Walter, good that you could reproduce it in MATLAB, meanwhile I am even more confused: seems that my hearing is even worse than I thought, the "aliasing"/lack of performance is still there.

I tried your code, but it seems to be treating the scaling of the coefficients only, thus only treats gain and not frequency response!?

When I use this,
Code:
c = (float32_t) sinf(K_2PI * x * normFcut) / (K_PI * x * normFcut);
the frequency response is sometimes ok, the gain is much higher, but stopband attenuation is very low.

And if you try it with these parameters:
Code:
float32_t LP_Fpass = 8400;
float32_t LP_Fstop = 8500;
float32_t LP_Astop = 90;

the frequency response is not even a lowpass anymore . . .

But maybe I put it in the wrong place . . . this is what I use in the coefficients function at the moment:

Code:
    for (n = 0; n < m_NumTaps; n++)
    {
      float32_t x = (float32_t) n - fCenter;
      float32_t c;
      // create ideal Sinc() LP filter with normFcut
        if ( abs((float)n - fCenter) < 0.01)
      { // deal with odd size filter singularity where sin(0) / 0 == 1
        c = 2.0 * normFcut;
        Serial.println("Hello, here at odd FIR filter size");
      }
      else
      {
//    I changed this: (but I think this does not work properly)
        c = (float32_t) sinf(K_2PI * x * normFcut) / (K_PI * x * normFcut);
//    before it was like this: 
//        c = (float32_t) sinf(K_2PI * x * normFcut) / (K_PI * x);
      }
      x = ((float32_t) n - ((float32_t) m_NumTaps - 1.0) / 2.0) / (((float32_t) m_NumTaps - 1.0) / 2.0);
      FIR_Coef[n] = Scale * c * Izero (Beta * sqrt(1 - (x * x) )) / izb;
    }

I think I will have to leave this for today and start thinking again in the next days how to fix the "aliasing".
 
one last thought for today:

The 50% overlap leads to the following situation:

256 audio samples in --> PROCESSING --> 128 audio samples out

Although this is processed with FFT-cmplx-mltpl-iFFT etc., functionally this resembles a decimation-by-2, or does it?

If this is functionally a decimation, wouldn´t that also mean, that I need a decimation filter at SAMPLERATE / 2 / DECIMATION_FACTOR = 44118 / 2 / 2 = 11029Hz BEFORE the Fast convolution process AND my maximum FIR filter stopband frequency would be 11029Hz?
 
one last thought for today:

The 50% overlap leads to the following situation:

256 audio samples in --> PROCESSING --> 128 audio samples out

Although this is processed with FFT-cmplx-mltpl-iFFT etc., functionally this resembles a decimation-by-2, or does it?

If this is functionally a decimation, wouldn´t that also mean, that I need a decimation filter at SAMPLERATE / 2 / DECIMATION_FACTOR = 44118 / 2 / 2 = 11029Hz BEFORE the Fast convolution process AND my maximum FIR filter stopband frequency would be 11029Hz?

No, it is not decimating by two. 50% overlap, yes, but you only take 1/2 of it. Are the 256 samples not L/R channel so you have 128 samples per data? Otherwise you would be doubling the data.
I'm making a alternative approach, which I will post soon.
So far, noise filtered data look good.
 
***forgive my intrusion, this thread is amazing!***

This thread has answered a ton of questions I've had over this weekend, thinking that the 3.6 might have the processing power for a project I've been considering for a while. I think your examples above are 90% what I'd like to achieve!

What if one wanted to use an existing matrix (or wav file) as the impulse rather than realtime audio data? Perhaps store it on an SD card?

Again, sorry to distract.

Jamie
 
***forgive my intrusion, this thread is amazing!***

This thread has answered a ton of questions I've had over this weekend, thinking that the 3.6 might have the processing power for a project I've been considering for a while. I think your examples above are 90% what I'd like to achieve!

What if one wanted to use an existing matrix (or wav file) as the impulse rather than realtime audio data? Perhaps store it on an SD card?

Again, sorry to distract.

Jamie

No problem, maybe we need some distraction to solve our problems.
Q: How big is your data? If it does not fit into flash of a T3.6, then, yes, you can read from disk.
 
I tried your code, but it seems to be treating the scaling of the coefficients only, thus only treats gain and not frequency response!?

Yes I realize, and must admit I was too fast.
What I also realize is that, if sinc function does not handle the sin(0)/0 situation correctly, you loose side lobe suppression.
 
I'm making a alternative approach, which I will post soon.

As promised, here my code
Code:
/* myApp.cpp
 *
 *  Created on: Aug 28, 2016
 *      Author: Walter
 */
//
#include <math.h>

#include "kinetis.h"
#include "core_pins.h"
#include "usb_serial.h"

extern "C" void setup(void);
extern "C" void loop(void);

extern "C"
{
	void fft_filt_init(float fc);
	void fft_filt_exec(float *zr, float *zi, float *xr, float *xi, int nx, int MM);
}


//-------------------------------------------------------------------
void doSimulate(void);
void doProcessing(void);

void timer2_init(void)
{
	SIM_SCGC6 |= SIM_SCGC6_PIT;
	// turn on PIT
	PIT_MCR = 0x00;
}

void timer2_start(uint32_t frequency)
{//
	PIT_LDVAL2 = F_BUS/frequency; // setup timer 0 for (F_BUS/frequency) cycles
	PIT_TCTRL2 = 2; // enable Timer 2 interrupts
	NVIC_SET_PRIORITY(IRQ_PIT_CH2, 7*16); // 8 is normal priority (set in mk20dx128.c)
	NVIC_ENABLE_IRQ(IRQ_PIT_CH2);
	PIT_TCTRL2 |= 1; // start Timer 2
}

void timer2_stop(void)
{
	PIT_TCTRL2 &= ~1; // stop Timer 2

}

void pit2_isr(void)
{ //
	PIT_TFLG2=1;
	doSimulate();
}


#define nf  129
#define mf (nf-1)
#define nx (1024 -mf)

short  sro[mf], sio[mf], nro[mf], nio[mf]; // for simulation
float sr[nx], si[nx];	// hold simulated data
float xr[nx], xi[nx], zr[nx], zi[nx];	// for processing

uint32_t t0,t1, t2,t3, t4,t5;
uint32_t procCount=0;
uint32_t dataCount=0;
uint16_t haveData=0;
void doSimulate(void)
{ 	int ii;

	t0 = micros();
	// copy data but change noise somewhat and scale signal and noise to reasonable values
	for(ii=0;ii<mf;ii++)
	{
#define shs 6
#define shn 10
		sr[dataCount*mf+ii] = (float) (sro[ii]>>shs) + (float)(nro[(dataCount+ii) % mf]>>shn);
		si[dataCount*mf+ii] = (float) (sio[ii]>>shs) + (float)(nio[(dataCount+ii) % mf]>>shn);
	}

	t1= micros();
	// if we have sufficient data, copy to working buffer and signal to loop() for (lower priority) processing
	if(++dataCount==(nx/mf))
	{
		for(int ii=0; ii < nx; ii++)
		{
			xr[ii]=sr[ii];
			xi[ii]=si[ii];
		}
		haveData=1;
		dataCount=0;
	}
}

void doProcessing(void)
{
	t2=micros();
	fft_filt_exec(zr,zi,xr,xi,nx, nf);
	procCount++;
    t3=micros();
}

void setup()
{
  // put your setup code here, to run once:
  //
	while(!Serial);
	while(Serial.available()); // clear input buffer
	Serial.printf("Starting SDR\n\r");

	fft_filt_init(0.6);

	// initialize simulation
#define pi 3.14159f
#define no 48 // number of cycles in 128 samples, or frequency = no * (fsamp/128)
#define Sc 32767.0f

	for(int ii=0; ii < mf; ii++)
	{	float arg = 2.0f*pi*no*ii/128.0f;
		 sro[ii] = (short)(Sc*cosf(arg));
		 sio[ii] = (short)(Sc*sinf(arg));
		 nro[ii] = (short)((Sc*(float)rand()/(float) RAND_MAX) - 0.5);
		 nio[ii] = (short)((Sc*(float)rand()/(float) RAND_MAX) - 0.5);
	}
	//
	// following should be replaced with I2S driver
	timer2_init();
#define NINT 345	// interrupts / second (345 = 44100/128)
	timer2_start(NINT);

    t4=millis();
}

uint32_t loopCount=0;
void loop()
{	int ii;
    if(haveData)
    {
    	haveData=0;
    	doProcessing();
    	loopCount++;
    }

    if(loopCount==50) // is about 1 sec of data
    {
    	t5=millis();
    	// stop processing
    	timer2_stop();
    	//
    	float ur[nx],ui[nx],vr[nx],vi[nx];

    	for(ii=0; ii<nx;ii++)
    	{	ur[ii]=xr[ii]; ui[ii]=xi[ii]; 	vr[ii]=zr[ii]; vi[ii]=zi[ii];
    	}

    	for(ii=0; ii<nx;ii++)
    	{	Serial.printf("%d, %d, %d, %d, %d\n\r", ii,	(int)(ur[ii]*1024.0f), (int)(ui[ii]*1024.0f),
    												(int)(vr[ii]*1024.0f), (int)(vi[ii]*1024.0f));
    		delay(10);
    	}
    	Serial.printf("\n\r%d: %d %d %d\n\r",procCount,t1-t0,t3-t2,t5-t4);
    	procCount=0;
    	loopCount=0;
    	while(1) delay(1000); // wait forever for copying the printed data
    }
}
using
Code:
/*
 * fft_filt.c
 *
 *  Created on: Nov 24, 2016
 *      Author: Walter
 */

/*
 * FFTlength N = (L + M - 1)
 * first M - 1 samples are from previous block, or zero for first block
 * discard the first M - 1 samples from FFT output
 * example
 * 		N = 1024 = 8*128
 * 		M = 129
 * 		L = 7*128
 *
 * 		y = {y[L-M+1].... y[L-1], x[0],   , x[L-1]}
 *
 * 		process y -> z
 *
 * 		discard z[0].... z[M-2], keep L samples z[M-1] .... Z[N]
 */
#include "arm_math.h"
#include "arm_const_structs.h"

#define NN 1024
#define NF 128
#define ASTOP 90.0f

static float bb[2*NN];
static float yy[2*NN];
static float zz[2*NN];

//#define PI 3.14159265358979f // already defined in arm_math.h
#define TPI	 6.28318530717959f
#define PIH	 1.57079632679490f

float m_sinc(int m, float fc)
{	// fc is f_cut/(Fsamp/2)
	// m is between -M and M step 2
	//
	float x = m*PIH;
	if(m == 0)
		return 1.0f;
	else
		return sinf(x*fc)/(fc*x);
}

float32_t Izero (float32_t x)
{
    float32_t x2 = x / 2.0;
    float32_t summe = 1.0;
    float32_t ds = 1.0;
    float32_t di = 1.0;
    float32_t errorlimit = 1e-9;
    float32_t tmp;
    do
    {   tmp = x2 / di;
        tmp *= tmp;
        ds *= tmp;
        summe += ds;
        di += 1.0;
    }   while (ds >= errorlimit * summe);
    return (summe);
}  // END Izero

void calc_FIR_lowpass_coeffs (float * coeffs, int numCoeffs, float32_t fc, float32_t Astop)
 {	// modified after
	// Wheatley, M. (2011): CuteSDR Technical Manual. www.metronix.com, pages 118 - 120, FIR with Kaiser-Bessel Window
	// assess required number of coefficients by
	//     numCoeffs = (Astop - 8.0) / (2.285 * TPI * normFtrans);

	 int ii,jj;
     float32_t Beta;
     float32_t izb;

     // calculate Kaiser-Bessel window shape factor beta from stop-band attenuation
     if (Astop < 20.96)
    	 Beta = 0.0;
     else if (Astop >= 50.0)
    	 Beta = 0.1102 * (Astop - 8.71);
     else
    	 Beta = 0.5842 * powf((Astop - 20.96), 0.4) + 0.7886 * (Astop - 20.96);

     izb = Izero (Beta);
     for(ii= - numCoeffs, jj=0; ii< numCoeffs; ii+=2,jj++)
     {
    	 float x =(float)ii/(float)numCoeffs;
    	 coeffs[jj] = fc * m_sinc(ii,fc) * Izero(Beta*sqrtf(1.0f - x*x))/izb;

     }
 } // END calc_lowpass_coeffs



void fft_filt_init(float fc)
{   int ii;
	for(ii=0; ii< 1024; ii++)
	{
		bb[2*ii]=0; bb[2*ii+1]=0;
		yy[2*ii]=0; yy[2*ii+1]=0;
	}
	float coeffs[NF];
	calc_FIR_lowpass_coeffs(coeffs, NF, fc, ASTOP);

	for(ii=0; ii<NF+1; ii++) bb[2*ii]=coeffs[ii];

	arm_cfft_f32(&arm_cfft_sR_f32_len1024, bb, 0, 1);
}

 void fft_filt_exec(float *zr, float *zi, float *xr, float *xi, int nx, int MM)
{	int ii;
	int LL = NN - MM + 1;

	if(LL != nx) return;

	// copy data to FFT buffer
	float *yp = &yy[2*(MM-1)];
	for(ii = 0; ii < LL; ii++) {yp[2*ii] = xr[ii]; yp[2*ii+1] = xi[ii];}

	// perform complex FFT (in-place)
	arm_cfft_f32(&arm_cfft_sR_f32_len1024, yy, 0, 1);

	// complex multiply with filter spectrum
	arm_cmplx_mult_cmplx_f32 (yy, bb, zz, NN);

	// perform iFFT (in-place)
	arm_cfft_f32(&arm_cfft_sR_f32_len1024, zz, 1, 1);

	// copy overlap to beginning
	float *xrp = &xr[LL-MM+1];
	float *xip = &xi[LL-MM+1];
	for(ii = 0; ii < MM-1; ii++) {yy[2*ii] = xrp[ii]; yy[2*ii+1] = xip[ii];}

	// copy data to result buffer
	float *zp = &zz[2*(MM-1)];
	for(ii = 0; ii < LL; ii++) { zr[ii] = zp[2*ii]; zi[ii] = zp[2*ii+1];}
}
and which generates
noise_48o128._128_6_10.jpg

Here, I'm not using Audio library, and simulate noisy I&Q tone.
some explanation:
- the timer fires at a regular interval (44100/128) times per second
- fills up a buffer
- if we have sufficient data, a flag is set, so that low priority loop can call data processing.

Alternatively, processing can be carried out using software interrupt that executes at a lower than the timer interrupt, so loop can do real background processing.

The filter estimation is done differently as suggested by Frank, as I wanted to limit the number of coefficients to a multiple of 128, better (n*128+1).
So the transition bandwidth is somewhat suffering.
 
Last edited:
wow, Walter, that´s a lot of code to digest! Will take me a while to think through it and test on my machine.

Just two quick Q:

- the lowpass filter cutoff freq is 0.6 * sample rate / 2, right?

- the IQ signal frequency (within the noise) is 48 /128 * sample rate = 0.375 * sample rate


Q1 So, the signal should show up well in the spectrum at 0.375, but it shows up at 0.75 (factor 2!?) . . .

Q2 Did you also test whether a signal of say 0.8 would be suppressed by the filter? Because that would resemble the problem with the performance at higher freqs
 
Status
Not open for further replies.
Back
Top