Thread: Fast Convolution Filtering with Teensy 4.0 and audio board

  1. #126
    Junior Member
    Join Date
    Oct 2020
    Posts
    2

    Convolution with an audio IR

    Where can I find the "filter_convolution" routine, and can it be used to convolve an audio IR?
    I need to convolve a continuous audio signal with an IR file (WAV), using a Teensy 4.0 + Audio board.

    Thanks, Luigi

    Quote: Originally posted by bmillier
    Hi Frank: Good to see you have tried the convolution filter on T4. You might recall from another, older thread that I had "wrapped" your convolution routine into an Audio library object (filter_convolution). I just received a couple of T4s and have updated Arduino IDE/Teensyduino/Visual Micro to the new versions that support T4. Once I added my filter_convolution files to the Audio library folder in the new Arduino 1.8.9 folder, I was able to compile my test program with no errors, so it is looking good. It's great that you don't have to change the arm_math.h file to use CMSIS4/5 like you used to have to do with the T3.5/6.
    I don't have an audio adapter board wired up to the T4 yet, so I can't actually test that the code is working, but I'm optimistic.
    Cheers

  2. #127
    Junior Member
    Join Date
    Oct 2020
    Posts
    2
    Sorry, I've found everything.

  3. #128
    Senior Member
    Join Date
    May 2015
    Location
    USA
    Posts
    811
    I made minimal changes to use USB input and spdif3 output, and I get lots of static (the setup works fine without convolution). I tried two different versions (the first was 09/11/2019). Any suggestions, or perhaps someone willing to review/test the code?
    Code:
    /*
     * (c) DD4WH 15/05/2020
     * 
     * Real Time PARTITIONED BLOCK CONVOLUTION FILTERING (STEREO)
     * 
     * thanks a lot to Brian Millier and Warren Pratt for your help!
     * 
     * using a guitar cabinet impulse response with up to about 20000 coefficients per channel
     * 
     * uses Teensy 4.1  and Teensy audio shield rev D, uses PSRAM chip(s) soldered to underside of T4.1
     * 
     * inspired by and uses code from wdsp library by Warren Pratt
     * https://github.com/g0orx/wdsp/blob/master/firmin.c
     * 
    *********************************************************************************
    
       GNU GPL LICENSE v3
    
       This program is free software: you can redistribute it and/or modify
       it under the terms of the GNU General Public License as published by
       the Free Software Foundation, either version 3 of the License, or
       (at your option) any later version.
    
       This program is distributed in the hope that it will be useful,
       but WITHOUT ANY WARRANTY; without even the implied warranty of
       MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
       GNU General Public License for more details.
    
       You should have received a copy of the GNU General Public License
       along with this program.  If not, see <http://www.gnu.org/licenses/>
    
     *********************************************************************************
     */
    
    #include <Audio.h>
    #include <Wire.h>
    #include <SPI.h>
    #include <arm_math.h>
    #include <arm_const_structs.h> // in the Teensy 4.0 audio library, the ARM CMSIS DSP lib is already a newer version
    #include <utility/imxrt_hw.h> // necessary for setting the sample rate, thanks FrankB !
    //#include <T4_PowerButton.h> // for flexRamInfo
    
    //////////////////////////////////////////////////////////////////////////////////////////////////////////////
    // USER DEFINES
    
    // Choose only one of these impulse responses !
    
    //#define IR1  // 512 taps // MG impulse response from bmillier github @44.1ksps
    //#define IR2  // 4096 taps // impulse response @44.1ksps
    //#define IR3  // 7552 taps // impulse response @44.1ksps
    //#define IR4    // 17920 taps // impulse response 400ms @44.1ksps
    //#define IR5    // 21632 taps // impulse response 490ms @44.1ksps
    //#define IR6 // 5760 taps, 18.72% load vs. 15.00%
    //#define IR7 // 22016 taps, 50.62% load, 93.48% RAM1, 32064 bytes free
    //#define IR8 // 25552 taps, too much !
    // about 25000 taps is MAXIMUM --> about 0.5 seconds
    //#define LPMINPHASE512 // 512 taps minimum phase 2.7kHz lowpass filter
    //#define LPMINPHASE1024 // 1024 taps minimum phase 2.7kHz lowpass filter
    #define LPMINPHASE2048PASSTHRU // 2048 taps minimum phase 19.0kHz lowpass filter
    //#define LPMINPHASE4096 // 4096 taps minimum phase 2.7kHz lowpass filter
    const float32_t PROGMEM audio_gain = 2.5; // has to be adjusted from 1.0 to 10.0 depending on the filter gain / impulse response gain
    ///////////////////////////////////////////////////////////////////////////////////////////////////////////////
    
    #if defined(IR1)
    #include "impulse_response_1.h"
    const int nc = 512; // number of taps for the FIR filter
    #elif defined(IR2)
    #include "impulse_response_2.h"
    const int nc = 4096; // number of taps for the FIR filter
    #elif defined(IR3)
    #include "impulse_response_3.h"
    const int nc = 7552; // number of taps for the FIR filter
    #elif defined(IR5)
    #include "impulse_response_5.h"
    const int nc = 21632; // number of taps for the FIR filter
    #elif defined(IR6)
    #include "impulse_response_6.h"
    const int nc = 5760; // number of taps for the FIR filter, 
    #elif defined(IR7)
    #include "impulse_response_7.h"
    const int nc = 22016; // number of taps for the FIR filter, 
    #elif defined(IR8)
    #include "impulse_response_8.h"
    const int nc = 25552; // number of taps for the FIR filter, 
    #elif defined(LPMINPHASE512)
    #include "lp_minphase_512.h"
    const int nc = 512;
    #elif defined(LPMINPHASE1024)
    #include "lp_minphase_1024.h"
    const int nc = 1024;
    #elif defined(LPMINPHASE2048PASSTHRU)
    #include "lp_minphase_2048passthru.h"
    const int nc = 2048;
    #elif defined(LPMINPHASE4096)
    #include "lp_minphase_4096.h"
    const int nc = 4096;
    #else
    #include "impulse_response_4.h"
    const int nc = 17920; // number of taps for the FIR filter
    #endif
    
    extern "C" uint32_t set_arm_clock(uint32_t frequency);
    
    //#define LATENCY_TEST
    const double PROGMEM FHiCut = 2500.0;
    const double PROGMEM FLoCut = -FHiCut;
    // define your sample rate
    const double PROGMEM SAMPLE_RATE = 44100;  
    // the latency of the filter is meant to be the same regardless of the number of taps for the filter
    // partition size of 128 translates to a latency of 128/sample rate, ie. to 2.9msec with 44.1ksps
    
    // latency can even be reduced by setting partitionsize to 64
    // however, this only works, if you set AUDIO_BLOCK_SAMPLES to 64 in AudioStream.h
    const int PROGMEM partitionsize = 128; 
    
    //#define DEBUG
    #define FOURPI  (2.0 * TWO_PI)
    #define SIXPI   (3.0 * TWO_PI)
    #define BUFFER_SIZE partitionsize
    int32_t sum;
    int idx_t = 0;
    int16_t *sp_L;
    int16_t *sp_R;
    uint8_t PROGMEM FIR_filter_window = 1;
    const uint32_t PROGMEM FFT_L = 2 * partitionsize; 
    float32_t mean = 1;
    uint8_t first_block = 1; 
    const uint32_t PROGMEM FFT_length = FFT_L;
    const int PROGMEM nfor = nc / partitionsize; // number of partition blocks --> nfor = nc / partitionsize
    //float DMAMEM cplxcoeffs[nc * 2]; // this holds the initial complex coefficients for the filter BEFORE partitioning
    float32_t DMAMEM maskgen[FFT_L * 2];
    //float32_t DMAMEM fmask[nfor][FFT_L * 2]; // 
    float32_t fmask[nfor][FFT_L * 2]; // 
    float32_t DMAMEM fftin[FFT_L * 2];
    float32_t accum[FFT_L * 2];
    float fftout[nfor][FFT_L * 2]; // 
    
    int buffidx = 0;
    int k = 0;
    //int idxmask = nfor - 1;
    
    uint32_t all_samples_counter = 0;
    uint8_t no_more_latency_test = 0;
    
    const uint32_t N_B = FFT_L / 2 / BUFFER_SIZE;
    uint32_t N_BLOCKS = N_B;
    float32_t DMAMEM float_buffer_L [BUFFER_SIZE * N_B];  
    float32_t DMAMEM float_buffer_R [BUFFER_SIZE * N_B]; 
    float32_t DMAMEM last_sample_buffer_L [BUFFER_SIZE * N_B];  
    float32_t DMAMEM last_sample_buffer_R [BUFFER_SIZE * N_B]; 
    // complex FFT with the new library CMSIS V4.5
    const static arm_cfft_instance_f32 *S;
    // complex iFFT with the new library CMSIS V4.5
    const static arm_cfft_instance_f32 *iS;
    // FFT instance for direct calculation of the filter mask
    // from the impulse response of the FIR - the coefficients
    const static arm_cfft_instance_f32 *maskS;
    
    // this audio comes from the codec by I2S
    //AudioInputI2S            i2s_in;
    AudioRecordQueue         Q_in_L;
    AudioRecordQueue         Q_in_R;
    AudioMixer4              mixleft;
    AudioMixer4              mixright;
    AudioPlayQueue           Q_out_L;
    AudioPlayQueue           Q_out_R;
    //AudioOutputI2S           i2s_out;
    AudioOutputSPDIF3        spdif3;         //xy=502,202
    AudioInputUSB            usb1;           //xy=143,207
    
    AudioConnection          patchCord1(usb1, 0, Q_in_L, 0);
    AudioConnection          patchCord2(usb1, 1, Q_in_R, 0);
    // convolution here
    AudioConnection          patchCord3(Q_out_L, 0, mixleft, 0);
    AudioConnection          patchCord4(Q_out_R, 0, mixright, 0);
    AudioConnection          patchCord9(mixleft, 0,  spdif3, 1);
    AudioConnection          patchCord10(mixright, 0, spdif3, 0);
    
    void setup() {
      //Serial.begin(115200);
      //while(!Serial);
    
      AudioMemory(10); 
      delay(100);
    
      // Enable the audio shield, select input, and enable output
      //codec.enable();
      //codec.adcHighPassFilterDisable(); // 
      //codec.inputSelect(AUDIO_INPUT_LINEIN); // AUDIO_INPUT_MIC
      //codec.volume(0.6);
    
      set_arm_clock(600000000);
    
      /****************************************************************************************
         Audio Setup
      ****************************************************************************************/
      mixleft.gain(0, 1.0);
      mixright.gain(0, 1.0);
    
      //setI2SFreq(SAMPLE_RATE);
    
      /****************************************************************************************
         properly initialise variables in DMAMEM
      ****************************************************************************************/
    
      for(unsigned jj = 0; jj < nfor; jj++)
      {
        for(unsigned ii = 0; ii < FFT_L * 2; ii++)
        {
          fftout[jj][ii] = 0.1;
          fmask[jj][ii] = 0.0;
        }
      }
    
    /*  for (unsigned i = 0; i < FFT_length * 2; i++)
      {
          cplxcoeffs[i] = 0.0;
      }
    */
    
      /****************************************************************************************
         init complex FFTs
      ****************************************************************************************/
      switch (FFT_length)
      {
        case 128:
          S = &arm_cfft_sR_f32_len128;
          iS = &arm_cfft_sR_f32_len128;
          maskS = &arm_cfft_sR_f32_len128;
          break;
        case 256:
          S = &arm_cfft_sR_f32_len256;
          iS = &arm_cfft_sR_f32_len256;
          maskS = &arm_cfft_sR_f32_len256;
          break;
      }
    
      /****************************************************************************************
         Calculate the FFT of the FIR filter coefficients once to produce the FIR filter mask
      ****************************************************************************************/
        init_partitioned_filter_masks();
    
    #ifdef HAVE_SERIAL
        flexRamInfo();
    
        Serial.println();
        Serial.print("AUDIO_BLOCK_SAMPLES:  ");     Serial.println(AUDIO_BLOCK_SAMPLES);
    
        Serial.println();
    #endif
        
      /****************************************************************************************
         begin to queue the audio from the audio library
      ****************************************************************************************/
      delay(100);
      Q_in_L.begin();
      Q_in_R.begin();
    
    } // END OF SETUP
      elapsedMillis msec = 0;
    void loop() {
      elapsedMicros usec = 0;
      // are there at least N_BLOCKS buffers in each channel available ?
        if (Q_in_L.available() > N_BLOCKS + 0 && Q_in_R.available() > N_BLOCKS + 0)
        {
    
          // get audio samples from the audio  buffers and convert them to float
          for (unsigned i = 0; i < N_BLOCKS; i++)
          {
            sp_L = Q_in_L.readBuffer();
            sp_R = Q_in_R.readBuffer();
    
            // convert to float one buffer_size
            // float_buffer samples are now standardized from > -1.0 to < 1.0
            arm_q15_to_float (sp_L, &float_buffer_L[BUFFER_SIZE * i], BUFFER_SIZE); // convert int_buffer to float 32bit
            arm_q15_to_float (sp_R, &float_buffer_R[BUFFER_SIZE * i], BUFFER_SIZE); // convert int_buffer to float 32bit
            Q_in_L.freeBuffer();
            Q_in_R.freeBuffer();
          }
     
          /**********************************************************************************
              Digital convolution
           **********************************************************************************/
          //  basis for this was Lyons, R. (2011): Understanding Digital Signal Processing.
          //  "Fast FIR Filtering using the FFT", pages 688 - 694
          //  numbers for the steps taken from that source
          //  Method used here: overlap-and-save
    
          // ONLY FOR the VERY FIRST FFT: fill first samples with zeros
          if (first_block) // fill real & imaginaries with zeros for the first BLOCKSIZE samples
          {
            for (unsigned i = 0; i < partitionsize * 4; i++)
            {
              fftin[i] = 0.0;
            }
            first_block = 0;
          }
          else
          {  // HERE IT STARTS for all other instances
            // fill FFT_buffer with last events audio samples
            for (unsigned i = 0; i < partitionsize; i++)
            {
              fftin[i * 2] = last_sample_buffer_L[i]; // real
              fftin[i * 2 + 1] = last_sample_buffer_R[i]; // imaginary
            }
          }
        
          // copy recent samples to last_sample_buffer for next time!
          for (unsigned i = 0; i < partitionsize; i++)
          {
            last_sample_buffer_L [i] = float_buffer_L[i];
            last_sample_buffer_R [i] = float_buffer_R[i];
          }
    
          // now fill recent audio samples into FFT_buffer (left channel: re, right channel: im)
          for (unsigned i = 0; i < partitionsize; i++)
          {
            fftin[FFT_length + i * 2] = float_buffer_L[i]; // real
            fftin[FFT_length + i * 2 + 1] = float_buffer_R[i]; // imaginary
          }
    
    #if defined(LATENCY_TEST)
            if(msec > 2000 && !no_more_latency_test)
            {
            // latency test
            fftin[42] = 10.0; fftin[44] = -10.0;
            fftin[43] = 10.0; fftin[45] = -10.0;
            no_more_latency_test = 1;
            }
            if(no_more_latency_test == 1) all_samples_counter += partitionsize; 
    #endif       
          /**********************************************************************************
              Complex Forward FFT
           **********************************************************************************/
          // calculation is performed in-place the FFT_buffer [re, im, re, im, re, im . . .]
          arm_cfft_f32(S, fftin, 0, 1);
          for(unsigned i = 0; i < partitionsize * 4; i++)
          {
              fftout[buffidx][i] = fftin[i];
          }
    
          /**********************************************************************************
              Complex multiplication with filter mask (precalculated coefficients subjected to an FFT)
              this is taken from wdsp library by Warren Pratt firmin.c
           **********************************************************************************/
          k = buffidx;
    
          for(unsigned i = 0; i < partitionsize * 4; i++)
          {
              accum[i] = 0.0;
          }
          
          for(unsigned j = 0; j < nfor; j++)
          { 
              for(unsigned i = 0; i < 2 * partitionsize; i= i + 4 )
              {
                // unrolling 4 complex multiplies (8 accumulator updates) per iteration saves a HUGE LOT of processor cycles
                  accum[2 * i + 0] +=  fftout[k][2 * i + 0] * fmask[j][2 * i + 0] -
                                       fftout[k][2 * i + 1] * fmask[j][2 * i + 1];
                  accum[2 * i + 1] +=  fftout[k][2 * i + 0] * fmask[j][2 * i + 1] +
                                       fftout[k][2 * i + 1] * fmask[j][2 * i + 0]; 
                  accum[2 * i + 2] +=  fftout[k][2 * i + 2] * fmask[j][2 * i + 2] -
                                       fftout[k][2 * i + 3] * fmask[j][2 * i + 3];
                  accum[2 * i + 3] +=  fftout[k][2 * i + 2] * fmask[j][2 * i + 3] +
                                       fftout[k][2 * i + 3] * fmask[j][2 * i + 2]; 
                  accum[2 * i + 4] +=  fftout[k][2 * i + 4] * fmask[j][2 * i + 4] -
                                       fftout[k][2 * i + 5] * fmask[j][2 * i + 5];
                  accum[2 * i + 5] +=  fftout[k][2 * i + 4] * fmask[j][2 * i + 5] +
                                       fftout[k][2 * i + 5] * fmask[j][2 * i + 4]; 
                  accum[2 * i + 6] +=  fftout[k][2 * i + 6] * fmask[j][2 * i + 6] -
                                       fftout[k][2 * i + 7] * fmask[j][2 * i + 7];
                  accum[2 * i + 7] +=  fftout[k][2 * i + 6] * fmask[j][2 * i + 7] +
                                       fftout[k][2 * i + 7] * fmask[j][2 * i + 6]; 
              }
              k = k - 1;
              if(k < 0)
              {
                k = nfor - 1;
              } 
    
          } // end nfor loop
    
          buffidx = buffidx + 1;
          if(buffidx >= nfor)
          {
              buffidx = 0;    
          } 
          /**********************************************************************************
              Complex inverse FFT
           **********************************************************************************/
          arm_cfft_f32(iS, accum, 1, 1);
    
          /**********************************************************************************
              Overlap and save algorithm, which simply means you take only half of the buffer
              and discard the other half (which contains unusable time-aliased audio).
              Whether you take the left or the right part is determined by the position
              of the zero-padding in the filter-mask-buffer before doing the FFT of the 
              impulse response coefficients          
           **********************************************************************************/
            for (unsigned i = 0; i < partitionsize; i++)
            {
              //float_buffer_L[i] = accum[partitionsize * 2 + i * 2 + 0];
              //float_buffer_R[i] = accum[partitionsize * 2 + i * 2 + 1];
              float_buffer_L[i] = accum[i * 2 + 0] * audio_gain;
              float_buffer_R[i] = accum[i * 2 + 1] * audio_gain;
            }
    
         /**********************************************************************************
              Serial print the first output samples in order to check for latency
           **********************************************************************************/
    #if defined(LATENCY_TEST)
            if(no_more_latency_test == 1)
            {
              no_more_latency_test = 2;
              for(unsigned i = 0; i < partitionsize; i++)
              {
                if(accum[i * 2 + 0] * audio_gain > 0.1 || accum[i * 2 + 0] * audio_gain < -0.1)
                {
                  Serial.print(i + all_samples_counter - 21); Serial.print("   left:   "); Serial.println(accum[i * 2 + 0]);
                  Serial.print(i + all_samples_counter - 21); Serial.print("  right:   "); Serial.println(accum[i * 2 + 1]);
                }
              }
            }
    #endif
    
           /**********************************************************************
              CONVERT TO INTEGER AND PLAY AUDIO - Push audio into I2S audio chain
           **********************************************************************/
          for (unsigned i = 0; i < N_BLOCKS; i++) // unsigned to match N_BLOCKS (uint32_t)
            {
              sp_L = Q_out_L.getBuffer();    
              sp_R = Q_out_R.getBuffer();
              arm_float_to_q15 (&float_buffer_L[BUFFER_SIZE * i], sp_L, BUFFER_SIZE); 
              arm_float_to_q15 (&float_buffer_R[BUFFER_SIZE * i], sp_R, BUFFER_SIZE);
              Q_out_L.playBuffer(); // play it !  
              Q_out_R.playBuffer(); // play it !
            }
    
           /**********************************************************************************
              PRINT ROUTINE FOR ELAPSED MICROSECONDS
           **********************************************************************************/
    #ifdef DEBUG
          sum = sum + usec;
          idx_t++;
          if (idx_t > 400) {
            mean = sum / idx_t;
            if (mean / 29.00 / N_BLOCKS * SAMPLE_RATE / AUDIO_SAMPLE_RATE_EXACT < 100.0)
            {
              Serial.print("processor load:  ");
              Serial.print (mean / 29.00 / N_BLOCKS * SAMPLE_RATE / AUDIO_SAMPLE_RATE_EXACT);
              Serial.println("%");
            }
            else
            {
              Serial.println("100%");
            }
            Serial.print (mean);
            Serial.print (" microsec for ");
            Serial.print (N_BLOCKS);
            Serial.print ("  stereo blocks    ");
            Serial.print("FFT-length = "); Serial.print(FFT_length);
            Serial.print(";   FIR filter length = "); Serial.println(nc);
            Serial.print("k = "); Serial.println(k);
            Serial.print("buffidx = "); Serial.println(buffidx);
            idx_t = 0;
            sum = 0;
          }
    #endif
        } // end of audio process loop
        
          /**********************************************************************************
              Add button check etc. here
           **********************************************************************************/
    
    }
    
    void init_partitioned_filter_masks()
    {
        for(unsigned j = 0; j < nfor;j++)
        {
          // fill with zeroes
          for (unsigned i = 0; i < partitionsize * 4; i++)
          {
              maskgen[i] = 0.0;  
          }
          // take part of impulse response and fill into maskgen
          for (unsigned i = 0; i < partitionsize; i++)
          {   
            // THIS IS FOR REAL IMPULSE RESPONSES OR FIR COEFFICIENTS
            // FOR COMPLEX USE THE PART BELOW
              // the position of the impulse response coeffs (right or left aligned)
              // determines the useable part of the audio in the overlap-and-save (left or right part of the iFFT buffer)
              maskgen[i * 2 + partitionsize * 2] = guitar_cabinet_impulse[i + j * partitionsize];  
          }
    //////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////      
    /*    
     *     THIS COMMENTED OUT PART IS FOR COMPLEX FILTER COEFFS
     *     
     *     // take part of impulse response and fill into maskgen
          for (unsigned i = 0; i < partitionsize * 2; i++)
          {
              // the position of the impulse response coeffs (right or left aligned)
              // determines the useable part of the audio in the overlap-and-save (left or right part of the iFFT buffer)
              maskgen[i + partitionsize * 2] = guitar_cabinet_impulse[i + j * partitionsize * 2];  
          }
          */
    //////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////      
          // perform complex FFT on maskgen
          arm_cfft_f32(maskS, maskgen, 0, 1);
          // fill into fmask array
          for (unsigned i = 0; i < partitionsize * 4; i++)
          {
              fmask[j][i] = maskgen[i];  
          }    
        }
    }
    
    void setI2SFreq(int freq) {
      // thanks FrankB !
      // PLL between 27*24 = 648MHz and 54*24 = 1296MHz
      int n1 = 4; //SAI prescaler 4 => (n1*n2) = multiple of 4
      int n2 = 1 + (24000000 * 27) / (freq * 256 * n1);
      double C = ((double)freq * 256 * n1 * n2) / 24000000;
      int c0 = C;
      int c2 = 10000;
      int c1 = C * c2 - (c0 * c2);
      set_audioClock(c0, c1, c2, true);
      CCM_CS1CDR = (CCM_CS1CDR & ~(CCM_CS1CDR_SAI1_CLK_PRED_MASK | CCM_CS1CDR_SAI1_CLK_PODF_MASK))
           | CCM_CS1CDR_SAI1_CLK_PRED(n1-1) // &0x07
           | CCM_CS1CDR_SAI1_CLK_PODF(n2-1); // &0x3f 
    //Serial.printf("SetI2SFreq(%d)\n",freq);
    }

  4. #129
    Senior Member DD4WH's Avatar
    Join Date
    Oct 2015
    Location
    Central Europe
    Posts
    688
    Hmm, strange.

    I have never used USB audio with the Teensy and have no hardware compatible with SPDIF, so I cannot test that.

    One idea is the sample rate: why did you comment out the line that sets I2S to 44100 sps? The filters are calculated for that sample rate. However, a different rate should not lead to crackles or the like; it would just sound a little different. But I am not sure whether USB / SPDIF needs a dedicated, fixed, and accurate sample rate.

    EDIT: Not sure at all, but I remember there was a bug in former Teensyduino versions that made it necessary to declare AudioControlSGTL5000 audio_codec; in the sketch, even if no SGTL5000 was used.
    Did you try the latest TD version, or does inserting that declaration improve the situation?

  5. #130
    Senior Member
    Join Date
    May 2015
    Location
    USA
    Posts
    811
    I don't use I2S at all, but restoring that rate line made no difference. A big clue: disabling all the convolution code and just taking samples, converting to float and then back also produces the distortion. But pass-through patchcords (no Q buffers) work.

    I'll keep trying. Possibly you could test USB in to i2s out? Just connect to a PC and play some music.
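
    The float round trip mentioned above can be checked off-target. A minimal sketch in portable C++, assuming the usual Q15 scaling (divide by 32768 on the way in, multiply by 32768 and saturate on the way out). `q15_to_float` and `float_to_q15` are hypothetical stand-ins written for this example, not the CMSIS routines themselves. If the assumption holds, the conversion is lossless for every 16-bit sample, which would point away from the arithmetic as the source of the static.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical stand-in for a Q15 -> float conversion: scale to [-1.0, 1.0).
inline float q15_to_float(int16_t s) {
    return static_cast<float>(s) / 32768.0f;
}

// Hypothetical stand-in for a float -> Q15 conversion with rounding and saturation.
inline int16_t float_to_q15(float x) {
    float scaled = std::round(x * 32768.0f);
    if (scaled > 32767.0f) scaled = 32767.0f;    // clip positive overshoot
    if (scaled < -32768.0f) scaled = -32768.0f;  // clip negative overshoot
    return static_cast<int16_t>(scaled);
}

// Returns true if every possible 16-bit sample survives the round trip unchanged.
inline bool round_trip_is_lossless() {
    for (int v = -32768; v <= 32767; ++v) {
        const int16_t s = static_cast<int16_t>(v);
        if (float_to_q15(q15_to_float(s)) != s) return false;
    }
    return true;
}
```

    Note that anything outside the range -1.0 to just below 1.0 saturates on the way back, so an output gain above unity can clip a signal that was fine on input.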

  6. #131
    Senior Member DD4WH's Avatar
    Join Date
    Oct 2015
    Location
    Central Europe
    Posts
    688
    Tested the USB_Passthrough example with Teensy 4.0 & Audio board Rev D, Arduino 1.8.13 TD 1.53, but I cannot get it to work.

    * Compiled the sketch with USB type: Audio
    * compiled fine
    * PC recognizes Digital audio (Teensy audio)
    * if I play a WAV file through that output, nothing happens at the output of the USB passthrough sketch (headphone I2S output)
    * same result == no audio with your sketch above . . .

    Sorry, but USB audio seems to be broken somehow ??? Or I am doing something wrong.

  7. #132
    Senior Member
    Join Date
    May 2015
    Location
    USA
    Posts
    811
    Thanks (for the help and the code). I found it was necessary to use AudioMemory(20). Sounds good now. Might be worth using 20 (or more?) in your code.
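
    As a rough sizing aid for the block pool, here is a small sketch that expresses an AudioMemory() setting as buffered time. It assumes the library defaults of 128 samples per block at about 44.1 kHz (check AUDIO_BLOCK_SAMPLES and AUDIO_SAMPLE_RATE_EXACT in your installation); the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed library defaults; verify against AudioStream.h in your installation.
constexpr int kBlockSamples = 128;
constexpr double kSampleRate = 44100.0;

// Duration of one audio block in milliseconds.
constexpr double block_ms() {
    return 1000.0 * kBlockSamples / kSampleRate;
}

// Total audio time the pool can hold if every block is in use.
constexpr double pool_ms(int blocks) {
    return blocks * block_ms();
}

// Sample payload of the pool in bytes (16-bit samples, block headers ignored).
constexpr std::size_t pool_bytes(int blocks) {
    return static_cast<std::size_t>(blocks) * kBlockSamples * sizeof(int16_t);
}
```

    Under these assumptions, 10 blocks hold about 29 ms of audio and 20 blocks about 58 ms, shared between all queues and objects in the sketch. On the device, AudioMemoryUsageMax() reports the peak number of blocks actually used, which is a more direct way to pick the value.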

    The goal is to do digital room correction of speakers like this does:

    https://www.minidsp.com/products/ope...ies/opendrc-di

    It's looking good so far!

    I use REW to measure impulse response, then this DRC code to get a correction impulse.
    http://drc-fir.sourceforge.net/doc/drc.html

    I have found this better than parametric equalization.

  8. #133
    Senior Member DD4WH's Avatar
    Join Date
    Oct 2015
    Location
    Central Europe
    Posts
    688
    Good that it works for you now!

    Very interesting!

    Would you be willing to prepare a very short How-To for your room correction for speakers somewhere?


    EDIT: Strangely, USB audio seems to work only if I use USB type "Serial + Midi + Audio"; it does not work at all with USB type "Audio". The USB input is also extremely loud, so I have to lower the PC sound source considerably in order not to overload the codec input.
    A very nice opportunity to learn how to use USB audio with the Teensy, thanks for that ;-).
    Last edited by DD4WH; 10-13-2020 at 06:32 PM.

  9. #134
    Senior Member
    Join Date
    May 2015
    Location
    USA
    Posts
    811
    Will do once I'm done. More on DRC here:

    https://en.wikipedia.org/wiki/Digita...ction%20system.

    It does require an accurate microphone to measure the speakers.

    I'm using a teensy4 with a toslink module - total cost is $24.

  10. #135
    Senior Member
    Join Date
    May 2015
    Location
    USA
    Posts
    811
    Ran into an issue - speaker/room correction is normally done with a separate impulse for each channel. And I don't understand the code well enough to change it.

  11. #136
    Senior Member DD4WH's Avatar
    Join Date
    Oct 2015
    Location
    Central Europe
    Posts
    688
    Inside the convolution loop, we are doing a complex FFT / inverse FFT routine. The real part of the input is one channel, the imaginary part of the input is the other channel. So in one loop, BOTH channels are processed with the same IR.

    The convolution is thus done with a single complex coefficient / IR set. So it is not possible with this setup to apply different IRs to the individual channels.

    The simplest and fastest solution to implement is to copy/paste the code loops, so that you use one IR for one channel and then have another copy of the code for the other channel.

    Downside:
    * doubles the CPU load
    * doubles the memory

    Alternative:
    * modify the code for usage of real FFT instead of complex FFT

    BUT:
    * if you use real FFT as implemented in the CMSIS lib:
    * same CPU load for two IRs !!!
    * double memory --> because the CMSIS real FFT routine needs double sized buffers

    So, my pragmatic recommendation:
    * just copy / paste the code for two channels
    * but beware of the size limits for the IR, but you know that :-)
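
    The "both channels, same IR" behaviour can be illustrated with a small off-target sketch. It uses direct time-domain convolution as a stand-in for the FFT-based version (both compute the same linear convolution) and assumes a real-valued IR; the function names are hypothetical. Because convolution is linear and the IR is real, the real part of the result is the filtered left channel and the imaginary part is the filtered right channel.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

using cfloat = std::complex<float>;

// Pack left into the real part and right into the imaginary part,
// the same layout the sketch uses for its FFT input buffer.
std::vector<cfloat> pack(const std::vector<float>& left, const std::vector<float>& right) {
    assert(left.size() == right.size());
    std::vector<cfloat> z(left.size());
    for (std::size_t i = 0; i < left.size(); ++i) z[i] = cfloat(left[i], right[i]);
    return z;
}

// Direct linear convolution of a complex signal with a REAL impulse response.
std::vector<cfloat> convolve_real_ir(const std::vector<cfloat>& x, const std::vector<float>& h) {
    assert(!x.empty() && !h.empty());
    std::vector<cfloat> y(x.size() + h.size() - 1);
    for (std::size_t n = 0; n < x.size(); ++n)
        for (std::size_t m = 0; m < h.size(); ++m)
            y[n + m] += x[n] * h[m];  // real h scales re and im identically
    return y;
}
```

    For example, with left = {1, 2, 3}, right = {4, 0, -1} and h = {0.5, 0.25}, the real parts of the result are {0.5, 1.25, 2.0, 0.75} (left convolved with h) and the imaginary parts are {2, 1, -0.5, -0.25} (right convolved with h).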

  12. #137
    Senior Member DD4WH's Avatar
    Join Date
    Oct 2015
    Location
    Central Europe
    Posts
    688
    I think I have to correct myself (although I am not so sure, and I do not have the time at the moment to test this or think about it properly):

    * the code as it is, does in fact offer the possibility to use different IRs for different channels
    * you can see a commented part of the code in function init_partitioned_filter_masks() (COMPLEX FILTER COEFFS)
    * you would have to modify the code, so that both IRs (for left and right speaker channels) are both stuffed into the maskgen buffer
    * maskgen should then consist of even values --> IR left and odd values --> IR right

    That is at least my working hypothesis. Good luck in trying it out! I would be very interested in your results!

  13. #138
    Senior Member
    Join Date
    May 2015
    Location
    USA
    Posts
    811
    > both IRs (for left and right speaker channels) are both stuffed into the maskgen buffer

    I thought about this, but if this would work, how does it work as the code is now? The imaginary/odd part of maskgen is currently set to all zeros. So if the imaginary part is the IR for the right channel, then I'd expect the right channel output to be silent.
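
    The algebra behind this objection can be checked with a small off-target sketch (time-domain convolution as a stand-in for the FFT multiply; names are hypothetical). With input L + jR and a complex IR hL + jhR, the product is (L*hL - R*hR) + j(L*hR + R*hL), so the imaginary part of the mask is not simply "the right channel's IR": a zero imaginary part leaves the right channel filtered by hL, and a non-zero one mixes the channels.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

using cfloat = std::complex<float>;

// Direct linear convolution of a complex signal with a COMPLEX impulse response.
std::vector<cfloat> convolve(const std::vector<cfloat>& x, const std::vector<cfloat>& h) {
    assert(!x.empty() && !h.empty());
    std::vector<cfloat> y(x.size() + h.size() - 1);
    for (std::size_t n = 0; n < x.size(); ++n)
        for (std::size_t m = 0; m < h.size(); ++m)
            y[n + m] += x[n] * h[m];  // full complex multiply: re and im interact
    return y;
}
```

    With a single input sample L = 1, R = 1: a mask of 2 + 0j gives 2 + 2j (both channels scaled by 2, right channel not silent), while a mask of 2 + 3j gives -1 + 5j (each output now contains both inputs).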

    The commercial DRC device is 6144 taps/ch x 2 channels, so 8192 x 2 would be good. I think memory use will be OK.
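
    A back-of-the-envelope check of the memory estimate, using the array shape from the sketch in post #128 (fmask[nfor][FFT_L * 2] of 32-bit floats, partition size 128). The function name is hypothetical, and how many such arrays a two-IR version needs is an assumption that depends on the implementation.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kPartition = 128;        // partitionsize in the sketch
constexpr std::size_t kFftLen = 2 * kPartition; // FFT_L in the sketch

// Bytes for one [nfor][FFT_L * 2] float array covering `taps` coefficients.
// nfor = taps / partitionsize, each row holds FFT_L interleaved re/im pairs.
constexpr std::size_t spectrum_array_bytes(std::size_t taps) {
    return (taps / kPartition) * (kFftLen * 2) * sizeof(float);
}
```

    With 4-byte floats, 6144 taps come to 96 KiB per array and 8192 taps to 128 KiB per array. A version with two masks plus one input-spectrum history would then need three such arrays, about 384 KiB at 8192 taps, before the smaller buffers are counted.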

  14. #139
    Senior Member DD4WH's Avatar
    Join Date
    Oct 2015
    Location
    Central Europe
    Posts
    688
    You are probably right . . . I remember fiddling around a lot about this when I wrote the code . . . I need to have a look into my handwritten archive for the initial preparations/thoughts on that.

    I always get confused, because my initial goal was not audio correction with IRs, but SDR DSP for I & Q signals (90 degrees out of phase), which have to be treated differently than independent left & right signals . . .

    My time is restricted at the moment, sorry. So maybe you can try filling the odd values with your second IR and just listen to what happens :-)?

    However, just doing the processing for 8192-long IRs two times seems feasible with T4.0, I think :-). So, your choice is: doing it the elegant way or just doing it quickly. The audio quality will be the same ;-).

  15. #140
    Senior Member
    Join Date
    May 2015
    Location
    USA
    Posts
    811
    Agreed, unless someone with better understanding wants to help, creating a second fmask array and then running the existing multiply/ifft process twice better fits my limited understanding of this stuff.
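
    The multiply-accumulate stage can be written so that the mask is a parameter, which makes "run it twice with a second fmask" straightforward. The following is a hedged off-target sketch that mirrors the index walk of the loop in post #128 (newest spectrum first, stepping backwards through the ring); it is not the attached alpha code, and the names are hypothetical.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

using cfloat = std::complex<float>;
using Spectrum = std::vector<cfloat>;

// Sum over partitions of (past input spectrum) x (mask partition).
// history is a ring of input spectra; `newest` is the slot written last.
// mask[j] is the spectrum of the j-th slice of the impulse response.
Spectrum accumulate(const std::vector<Spectrum>& history,
                    const std::vector<Spectrum>& mask,
                    std::size_t newest) {
    assert(!history.empty() && history.size() == mask.size());
    assert(newest < history.size());
    const std::size_t nfor = history.size();
    const std::size_t bins = history[0].size();
    Spectrum accum(bins);
    std::size_t k = newest;
    for (std::size_t j = 0; j < nfor; ++j) {
        assert(history[k].size() == bins && mask[j].size() == bins);
        for (std::size_t i = 0; i < bins; ++i)
            accum[i] += history[k][i] * mask[j][i];
        k = (k == 0) ? nfor - 1 : k - 1;  // one block further into the past, wrapping
    }
    return accum;
}
```

    Calling accumulate() once with a left mask and once with a right mask reuses the same input history. Assuming both IRs are real, one would then take the real part of the inverse FFT of the first result as the left output and the imaginary part of the second as the right output.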

  16. #141
    Senior Member
    Join Date
    May 2015
    Location
    USA
    Posts
    811
    I have code to alpha test if anyone is interested.

  17. #142
    Senior Member
    Join Date
    May 2015
    Location
    USA
    Posts
    811
    Here is the alpha code if you need two impulse responses applied to two channels:
    Attached Files

  18. #143
    Senior Member
    Join Date
    May 2015
    Location
    USA
    Posts
    811
    While 9216x2 taps is OK, it could go higher if DMAMEM could be used for some of the large arrays (e.g. fmask). But with USB input and SPDIF output, this causes some static. Any idea why, or how to fix it?

  19. #144
    Senior Member
    Join Date
    May 2015
    Location
    USA
    Posts
    811
    Neither DMAMEM nor using real FFTs worked out. But I was successful in using the symmetry of the FFT output to reduce memory use. This resulted in > 12,000 taps with stereo input and separate impulse responses for each channel.
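
    The post does not say which symmetry was used. One possibility, offered only as an assumption: the FFT of a real impulse response slice is conjugate-symmetric, X[N-k] = conj(X[k]), so only bins 0..N/2 of each mask partition need to be stored and the rest can be rebuilt on the fly. A small off-target sketch with a naive DFT (all names hypothetical):

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cdouble = std::complex<double>;

// Naive O(N^2) DFT of a real sequence; fine for a demonstration.
std::vector<cdouble> dft_real(const std::vector<double>& x) {
    const std::size_t n = x.size();
    const double two_pi = 2.0 * std::acos(-1.0);
    std::vector<cdouble> X(n);
    for (std::size_t k = 0; k < n; ++k)
        for (std::size_t t = 0; t < n; ++t)
            X[k] += x[t] * std::polar(1.0, -two_pi * double(k * t) / double(n));
    return X;
}

// Keep only bins 0..N/2 (N/2 + 1 values instead of N).
std::vector<cdouble> keep_half(const std::vector<cdouble>& X) {
    assert(!X.empty());
    return std::vector<cdouble>(X.begin(), X.begin() + (X.size() / 2 + 1));
}

// Rebuild bin k of an N-point spectrum from the stored half.
cdouble bin_from_half(const std::vector<cdouble>& half, std::size_t n, std::size_t k) {
    assert(k < n && half.size() == n / 2 + 1);
    return (k <= n / 2) ? half[k] : std::conj(half[n - k]);
}

// Largest difference between the full spectrum and the one rebuilt from the half.
double max_reconstruction_error(const std::vector<double>& x) {
    const std::vector<cdouble> full = dft_real(x);
    const std::vector<cdouble> half = keep_half(full);
    double worst = 0.0;
    for (std::size_t k = 0; k < full.size(); ++k)
        worst = std::max(worst, std::abs(bin_from_half(half, full.size(), k) - full[k]));
    return worst;
}
```

    In the interleaved layout of the sketch in post #128, this would shrink each mask partition from FFT_L * 2 floats to FFT_L + 2. It applies to the masks only: the input spectra come from a complex (left + j right) signal and are not conjugate-symmetric.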
