How to play user-defined adjustable audio spectra using Teensy in real time

Hi Paul,

I am new to both Teensy and audio signal processing. I'm looking for some guidance on how to do the following:

- Define a user-modifiable audio spectrum (magnitude and phase)
- Based on certain feedback (e.g. joystick or sensor), I want to modify the shape of the spectrum in real time while its IFFT is being played by a Teensy 4.1 over the I2S protocol with the audio shield.

I could not find a relevant example in the discussion forum. Sorry if I was not able to articulate the problem sufficiently. I would much appreciate it if you could provide some guidance/links as to how the above can be accomplished. Thank you!

Anab
(Ann Arbor, MI)
 
For real-time you may need something like partitioned convolution to get sufficiently low latency, but it all depends
on the details - how real-time? Can you define this as a maximum latency you are prepared to tolerate? Also, how many
points in the spectrum do you need? (An N-point FFT has a resolution of Fs/N, where Fs is the sampling rate.)

If you can tolerate the latency being as large as the FFT size then the overlap-add or overlap-save methods of convolution
apply directly, which will keep things simple.
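
For concrete numbers, a quick illustration in plain C (nothing Teensy-specific; the 2048 is just an example):

Code:
#include <stdio.h>

int main(void)
{
    const float Fs = 44100.0f;   // sampling rate
    const int   N  = 2048;       // FFT size

    printf("resolution   = %.2f Hz\n", Fs / N);            // ~21.5 Hz per spectral line
    printf("frame length = %.1f ms\n", 1000.0f * N / Fs);  // ~46.4 ms minimum latency
    return 0;
}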
 
Thanks, MarkT. I get the general idea, but I was hoping to get more specific guidance on how to implement it using the Teensy Library.

- Let's assume I need a 21 Hz delta freq. This means, assuming 44.1 kHz sampling, I need to maintain and modify 2048 spectral lines (mag/phase) each time I update a frame. Each frame's time-domain signal will then be 4096 samples, or ~90 ms, long at the 44.1 kHz sampling rate. I can write code to modify the spectral content at each frame update based on joystick input or sensor feedback. Then I can also implement overlap-add to stitch the frames together.

- What I don't know is whether I can do all this fast enough to make the sound realistic. Secondly, I don't know what Teensy Library object(s) to use to hand over the stitched signal (or the individual frames, assuming Teensy can do the overlap-add).

Ideally, I would just like to define/modify the spectrum at a reasonable update rate. Then I was hoping to send the new spectra over and Teensy could take care of the rest. Could you point me to an example showing which Teensy Library objects might be able to do these things?

Thanks for your help!
 
Hi,
a rough estimation:
https://www.lntwww.de/Aufgaben:Aufg... (the German page discusses how many arithmetic operations are required)
says that for an FFT you have to do
5*N*log2(N) operations, so this is 5*2048*11 = 112640 operations per buffer. Teensy 4.1 can do about 600E6/3 integer multiplications per second. https://www.quinapalus.com/cm7cycles.html
So this is about 0.09s*600e6/3 = 18E6 operations per buffer of 0.09 sec length. So the IFFT seems to be possible with fixed-point number crunching.
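
The same estimate in code, if you want to play with the numbers (plain C, just restating the arithmetic above):

Code:
#include <stdio.h>

int main(void)
{
    const double ops       = 5.0 * 2048 * 11;   // 5*N*log2(N) with N = 2048
    const double opsPerSec = 600e6 / 3;         // ~1 multiply per 3 cycles at 600 MHz
    const double budget    = 0.09 * opsPerSec;  // ops available per 0.09 s buffer

    printf("FFT cost: %.0f ops, budget: %.0f ops, headroom: %.0fx\n",
           ops, budget, budget / ops);          // 112640, 18000000, ~160x
    return 0;
}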

I won't say that it is the best way of doing things, but you could have a look at my looper: https://forum.pjrc.com/threads/69362-Core-of-a-Multitrack-Looper
There the thread doQueue() feeds frames of 128 samples into TeensyAudio from rolling buffers, which are 4096 samples long and hold 8 tracks. In my case these rolling buffers come from a file, but they could be single-track and could come from your IFFT in your case.
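
Stripped down to a single track, the idea looks roughly like this (a sketch only, not the actual looper code; the rolling-buffer names are made up):

Code:
#include <Audio.h>

AudioPlayQueue   queue1;
AudioOutputI2S   i2s1;
AudioConnection  patchCord1(queue1, 0, i2s1, 0);
AudioConnection  patchCord2(queue1, 0, i2s1, 1);

const int ROLLING_LEN = 4096;        // rolling buffer length (example)
int16_t rollingBuf[ROLLING_LEN];     // filled elsewhere - from a file, an IFFT, ...
int readPos = 0;

// Call this repeatedly from loop(); the usual AudioMemory()/codec setup applies.
void doQueueSketch()
{
    int16_t *frame = queue1.getBuffer();        // request one 128-sample block
    if (frame == NULL) return;                  // no free block yet, try again later
    for (int i = 0; i < AUDIO_BLOCK_SAMPLES; i++) {
        frame[i] = rollingBuf[readPos];
        readPos = (readPos + 1) % ROLLING_LEN;  // wrap around the rolling buffer
    }
    queue1.playBuffer();                        // hand the block to the audio engine
}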

So I would try to find 32-bit integer-math IFFT source code and do some timing experiments.

I don't know what you want to achieve in the end. But perhaps it could be done with sines, filters, wavetables or ??? using standard elements of TeensyAudio too?
Good luck!
 
Hi cebersp,

Thanks for the link. I will try out your looper and see how that works out. I do realize that oscillators and wavetables can be used to construct arbitrary waveforms, but I don't have any experience in that area. Since the reference spectrum is coming from a gray-box model, I thought the easiest thing to do would be to feed the spectrum directly to Teensy, if possible.

Just to give some background: in this hobby project, I'm trying to reproduce the load/speed-dependent sound coming out of a turbocharger in real time. These devices broadly emit two kinds of sound: i) dominant tonal orders related to the mechanical system (unbalance-induced vibration etc.); ii) more complex broadband stochastic sound related to fluid-dynamics-driven sound pressure fluctuations. So I'm developing a sort of gray-box model that attempts to reproduce these sounds given some basic properties of the device and some recordings at reference speeds/loads.

Thanks again for your help.
 
So I would try to find 32-bit integer-math IFFT source code and do some timing experiments.

You know the built-in CMSIS lib has a full suite of FFT routines? https://www.keil.com/pack/doc/CMSIS/DSP/html/group__groupTransforms.html

(Note the Teensy isn't using the latest version of CMSIS, so there are some changes.) The Audio library samples are 16-bit, the same as the q15 type in the CMSIS DSP routines.
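
For instance, an inverse complex FFT in q15 using the radix-4 routines (which the Audio library itself uses internally) might look roughly like this - a minimal sketch; check the fixed-point scaling and the exact init signature against the CMSIS version Teensyduino actually ships:

Code:
#include <arm_math.h>   // CMSIS-DSP, bundled with Teensyduino

const uint16_t FFT_LEN = 1024;     // radix-4 lengths: 16, 64, 256, 1024, 4096
q15_t spectrum[2 * FFT_LEN];       // interleaved {re, im} pairs, filled elsewhere

void spectrumToTime()
{
    arm_cfft_radix4_instance_q15 S;
    // ifftFlag = 1 selects the inverse transform,
    // bitReverseFlag = 1 gives normally ordered output.
    arm_cfft_radix4_init_q15(&S, FFT_LEN, 1, 1);
    arm_cfft_radix4_q15(&S, spectrum);   // in place: spectrum now holds time samples
    // The real parts spectrum[0], spectrum[2], ... are the q15 audio samples.
}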

Tying code into the Audio library can be done by writing new Audio library classes, or more loosely coupled using the AudioPlayQueue and AudioRecordQueue classes.

The T4.x is not going to struggle with such a light task.
 
- What I don't know is whether I can do all this fast enough to make the sound realistic. Secondly, I don't know what Teensy Library object(s) to use to hand over the stitched signal (or the individual frames, assuming Teensy can do the overlap-add).
T4.x is blindingly fast, you will not have a shortage of grunt.
As mentioned AudioPlayQueue and AudioRecordQueue classes provide programmatic interfaces for audio data.
Ideally, I would just like to define/modify the spectrum at a reasonable update rate. Then I was hoping to send the new spectra over and Teensy could take care of the rest. Could you point me to an example showing which Teensy Library objects might be able to do these things?
One thing to clarify: are you filtering with this spectrum, or is this a frequency-domain sample-stream? If the latter, you'll
risk having an apparent fundamental tone at Fs/N (the repeat rate of the spectrum).
If you filter a signal using the spectrum, it doesn't impose such an artifact.
 
I'm glad to hear T4.x is blindingly fast! If there is enough horsepower available, then I suppose I could generate the frames in the time domain and send them over to Teensy via the Queue object. This means I will modify the spectrum (based on sensor or other feedback), then perform an IFFT to create the frames. I guess I'll find out soon enough when I start playing with the 4.1 board.

BTW, does the Queue object take care of stitching these time-domain frames together properly (using overlap-add or -save)? Or do I need to do something special for that? Thanks again, folks, for all the help!
 
Hi, the Queue just takes single 128-sample frames and strings them together without gap or overlap. Perhaps you will need some windowing of your 4096-sample buffer first, and you will probably need some sort of saturation too.
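
Roughly along these lines, as a sketch (the Hann window is just one possible choice):

Code:
#include <math.h>
#include <stdint.h>

// Apply a Hann window so overlapping frames cross-fade smoothly,
// e.g. applyWindow(frame, 4096) before overlap-adding.
void applyWindow(float *x, int n)
{
    for (int i = 0; i < n; i++)
        x[i] *= 0.5f * (1.0f - cosf(6.2831853f * i / (n - 1)));
}

// Convert a float sample to int16 with saturation instead of wrap-around.
int16_t saturate16(float s)
{
    if (s >  32767.0f) return  32767;
    if (s < -32768.0f) return -32768;
    return (int16_t)s;
}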
 
Overlap-add / overlap-save is a method for using the FFT to implement convolutions.

Again, one thing to clarify: are you filtering with this spectrum, or is this a frequency-domain sample-stream?
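
For reference, the overlap-add idea in its simplest form (a sketch only; the 4096 frame length and 50% hop are illustrative):

Code:
#include <string.h>

const int FRAME_LEN = 4096;      // windowed frame length (example)
const int HOP = FRAME_LEN / 2;   // 50% overlap (example)
float outBuf[FRAME_LEN + HOP];   // running output, keeps one frame of tail

// Add a new windowed frame onto the tail of the previous ones; afterwards
// the first HOP samples of outBuf are fully summed and ready to play.
void overlapAdd(const float *frame)
{
    memmove(outBuf, outBuf + HOP, sizeof(float) * FRAME_LEN);  // discard played samples
    memset(outBuf + FRAME_LEN, 0, sizeof(float) * HOP);        // clear the new tail
    for (int i = 0; i < FRAME_LEN; i++)
        outBuf[i] += frame[i];                                 // accumulate the overlap
}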
 
Hi MarkT/cebersp,

An update. Thank you for all the suggestions. It worked out great!

I'm not too excited about the audio quality, but there is a lot of room for improvement on my end. Here is a summary of what was done.
- A table of sound spectra (mag and phase) as a function of rpm (8192 x 9) was included in a header. The spectra were created offline by analyzing an engine run-up sound originally sampled at 44.1 kHz. There was additional post-processing I did, but that is not relevant here. In summary, my extraction deltaFreq was 2.69 Hz and the cutoff was 22 kHz.
- Teensy receives the current rpm value from a pot (it will eventually come from another Teensy via I2C, which runs the physics engine and does motion control).
- I then create the corresponding spectrum by interpolating the 8192 x 9 table at the operating rpm (there is data at 9 discrete rpms).
- Then I zero-pad the spectrum, which reduces dF to 1.34 Hz and brings back 44.1 kHz sampling. After that I do an iFFT.

The FFT routine is from the book Numerical Recipes, a very good reference for offline scientific computing. All variables are declared as float in the FFT routine. It is far from optimal for a real-time application. Anyway, this operation yields 16384 audio samples. The AudioQueue fetches blocks from my buffer. As soon as I detect that all samples are consumed, I call my interpolation + iFFT routine to obtain another batch of 16384 samples corresponding to the current rpm. The spectral interpolation + iFFT takes approximately 9 ms; the iFFT alone takes about 6-7 ms. To my surprise, the audio is still pretty good. But if I switch to doubles, it takes about 15 ms and the delay becomes audible. I have a rate limiter on engine rpm, so you cannot instantly change rpm; the physics-based rate limit is 100 rpm/sec.

I am attaching the code (the spectral table in the header file is huge, so it's not included). It is very simple, kind of amateurish by your standards, but functional. I do intend to switch to an appropriate ARM CMSIS FFT routine; I expect that to significantly speed things up. Other than that, do you have any suggestions for speeding up the calculations? Thanks again for all your suggestions!! I never thought I could manage 8192 spectral lines in real time. Teensy rocks!

Anab



Code:
// Audio
#include <Audio.h>
#include <Wire.h>
#include <SPI.h>
#include <SD.h>
#include <SerialFlash.h>

// EMD645 Sound Spectral Data
#include "specdata.h"
typedef float T;


// GUItool: begin automatically generated code
AudioPlayQueue           queue1;         //xy=917,329
AudioOutputI2S           i2s2;           //xy=1121,330
AudioConnection          patchCord1(queue1, 0, i2s2, 0);
AudioConnection          patchCord2(queue1, 0, i2s2, 1);
AudioControlSGTL5000     sgtl5000_1;     //xy=1131,379
// GUItool: end automatically generated code
//
#define COUNT_OF(a) (sizeof a / sizeof a[0])
//

/* Time Keeping: */
unsigned long t1, t2, t3;
/* iFFT/RPM Interpolation Update Time: */
T CycleTime_ms = 371.5f;        // 16384 samples @ 44.1 kHz; as unsigned long the .5 would be truncated


/* Dimension of Spectral Data: */
const int ncol = 8192;          // Number of spectral lines
const int mrow = 9;             // 8 Notches + Idle = 9 RPMs
unsigned long int N = ncol*2;
T MyBuffer[ncol*2];             // Buffer containing 16384 samples from iFFT
int currIndx = 0;               // Points at current index of MyBuffer
T a[ncol*4];                    // For four1 FFT routine


/* Amplitude Modulation Function Variables: */
T psiAmp = 0.00; 
T ModMag = 0.25;
T dTglobal = (T) CycleTime_ms / 1000.0f;
T sqrtTwo = sqrt(2.0);
T PI_over_30 = (PI / 30.0);
T dtLocal = 1.0f/44100.0f;
T dtLocal_x_PI_over_30 = dtLocal * PI_over_30;


// Joystick input & rpm slew-rate limits:
T RPM_RiseRate =  100.0f; // +100 rpm/sec
T RPM_FallRate = -100.0f; // -100 rpm/sec
const int pin_js_y = 15;
float currRPM, prevRPM;



void setup()
{
/* SETUP AUDIO: */
/////////////////////////////////////////////////////////////////////////////////////////
      pinMode(13,OUTPUT);
      // Open serial communications and wait for port to open:
      Serial.begin(115200); // wait for serial port to connect.
      while (!Serial);
      delay(100);     
      if (CrashReport) 
      {
        Serial.println(CrashReport);
        CrashReport.clear();
      }    
      AudioMemory(5);     
      sgtl5000_1.enable();
      sgtl5000_1.volume(0.8);
      queue1.setBehaviour(AudioPlayQueue::NON_STALLING);   
      queue1.setMaxBuffers(5);
//////////////////////////////////////////////////////////////////////////////////////////////
/* END SETUP AUDIO */
    

/* Fill out the sample buffer for Teensy: */
      currIndx = 0;
      prevRPM = 255.0;
      currRPM = 255.5;
        
/* Call iFFT to Interpolate Spectrum @ operating RPM & generate audio samples: */  
      t1 = millis();
      iFFT();     
      t2 = millis();
      Serial.println(t2-t1);
 
}


void loop()
{
    /* Read current RPM: */     
    get_RPM();      
      
    if (currIndx > 2*ncol-1)
    {
        iFFT();
        currIndx=0;
            
    } else {
      
        int16_t* buf = queue1.getBuffer();
        if (NULL != buf)
        {         
            for (int i=0;i<AUDIO_BLOCK_SAMPLES;i++)
            {
                 T A = AmpMod();
buf[i] = (int16_t)(8000.0f*A*MyBuffer[currIndx]); // cast the whole product, not just the constant
                 currIndx++;
            }
            queue1.playBuffer();
        }      
    }
              
}


/* Amplitude Modulation Function: */
T AmpMod()
{
    /* w = RPM * PI / 30; psi = psi + w*dT */
    psiAmp += currRPM * dtLocal_x_PI_over_30;
    /* Modulus Operation: psi < 2*PI */
    if (psiAmp > TWO_PI) psiAmp -= TWO_PI;
    return  ( 1.0 - ModMag*(1.0 - cos(psiAmp))  );
}


/* Get RPM From Analog Pin: */
void get_RPM()
{
    int val = analogRead(pin_js_y);
    currRPM = (T) map(val, 0, 1023, 255, 907);
    currRPM = RateLimiter(dTglobal, RPM_RiseRate, RPM_FallRate, currRPM, prevRPM);
    prevRPM = currRPM;
}


/* Interpolate Spectrum & Perform iFFT: */
void iFFT() 
{  
    int i,j,idL=0,idU=1;
    T Slope, Mag;

    /* Initialize: Very Very Very Important!!! */
    for (i=0; i<4*ncol; i++) a[i] = 0.0;
    
    /* Bracket rpm: */
    for (i=0; i<mrow-1; i++)
    {
        if (  (currRPM >= rpm[i]) && (currRPM <= rpm[i+1])  )
        {
            idL = i;
            idU = i+1;
            break;
        }
    }

    // Interpolate Spectrum @ operating RPM: 
    j=0;
    for (i=0; i < ncol*2; i +=2)
    {
        if (i >= ncol)
        {
          a[i] = a[i+1] = 0.0;
          
        } else {
          
          phi[j] += currRPM * dtLocal_x_PI_over_30;
          if (phi[j] > TWO_PI) phi[j] -= TWO_PI;
          Slope = (Spectrum[j][idU] - Spectrum[j][idL])/(rpm[idU] - rpm[idL]);
          Mag = Spectrum[j][idL] + Slope*(currRPM - rpm[idL]);
          a[i]   = Mag*cos(phi[j]);
          a[i+1] = Mag*sin(phi[j]);
          j++;  
          
        }
    }

  
    // Call FFT (four1 uses 1-based indexing, hence the a-1 trick from Numerical Recipes):
    int isign =-1;              // isign = -1 selects the inverse transform
    four1(a-1, N, isign);
    j=0;
    for (i=0; i < ncol*4; i += 2)
    {
        MyBuffer[j] = sqrtTwo*a[i];
        j++;
    }              
    
}


/* FFT Routine from Numerical Recipes: */
void four1(T *data, unsigned long nn, int isign) 
{
  unsigned long n, mmax, m, j, istep, i;
  float wtemp, wr, wpr, wpi, wi, theta;
  float tempr, tempi;
  n = nn << 1;
  j = 1;
  for (i=1; i<n; i+=2)
  {
    if (j > i)
    {
      swap(data[j],data[i]);
      swap(data[j+1],data[i+1]);
    }
    m = n >> 1;
    while (m >= 2 && j > m)
    {
      j -= m;
      m >>= 1;
    }
    j += m;
  }
  mmax = 2;
  while (n > mmax)
  {
    istep = mmax << 1;
    theta = isign*(2.0*PI/mmax);
    wtemp = sin(0.5*theta);
    wpr = -2.0*wtemp*wtemp;
    wpi = sin(theta);
    wr = 1.0;
    wi = 0.0;
    for (m=1; m<mmax; m+=2)
    {
      for (i=m; i<=n; i+=istep)
      {
        j = i + mmax;
        tempr = wr*data[j]-wi*data[j+1];
        tempi = wr*data[j+1]+wi*data[j];
        data[j] = data[i]-tempr;
        data[j+1] = data[i+1]-tempi;
        data[i] += tempr;
        data[i+1] += tempi;
      }
      wr = (wtemp=wr)*wpr-wi*wpi+wr;
      wi = wi*wpr+wtemp*wpi+wi;
    }
    mmax = istep;
  }
}

void swap(T &x, T &y)
{
        T temp = x;
        x = y;
        y = temp;
}


/* Rate-Limiter Function: */
T  RateLimiter(T dt, T R, T F, T u, T yo)
{
  /* LINEAR DYNAMIC RAMP: 
    R -> Rising Slew Rate; F -> Falling Slew Rate; dt -> Time step
    u -> Input;  yo -> Previous output;  y -> Output
    Note: F must be negative
  */
  T rate, y;
  rate = (u - yo)/dt;
  if (rate > R)         /* Rising Slew Rate: */
  {
    y = dt*R + yo;
  } else if (rate < F) {
    y = dt*F + yo;      /* Falling Slew Rate: */
  } else {
    y = u;
  }
  return y;
}

// EOF
 
You should double-buffer the output, so you can calculate one batch of 16384 samples while the previous is being output. Then you have 1/3s or so to work with.
 
I'm not clear on how to do that. I need to know the current rpm to generate a batch of samples. 16384 samples last 371.5 ms. If I generate two of these, then the rpm had better not change much within that time frame (743 ms), but it can. So the extra samples might not represent the sound corresponding to the current rpm. Maybe I did not understand what you meant by double buffering.
 
Oh, I see. Did you mean I should call iFFT() halfway through and generate another batch so that the AudioQueue always has data to grab? Then I would be calling my routine twice as often, so I should really switch to a CMSIS FFT routine. Which one do you suggest?
 
No need to generate data twice as often, just earlier - double buffering... You won't be short of CPU cycles.
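
In outline (a sketch of the double-buffering idea only, untested; computeIFFTInto() is a made-up name for your interpolate + iFFT step):

Code:
const int BATCH = 16384;            // one iFFT batch of samples

float bufA[BATCH], bufB[BATCH];     // two batches: one playing, one being refilled
float *playBuf = bufA;              // batch currently feeding the queue
float *fillBuf = bufB;              // batch currently being recomputed
int  playIndx  = 0;
bool fillReady = false;

void computeIFFTInto(float *dst);   // hypothetical: interpolate spectrum + iFFT at current rpm

void feedAudio()
{
    // Halfway through playback, refill the spare buffer at the rpm sampled now;
    // it starts playing at most half a batch (~186 ms) later instead of ~372 ms.
    if (!fillReady && playIndx >= BATCH / 2) {
        computeIFFTInto(fillBuf);
        fillReady = true;
    }
    if (playIndx >= BATCH && fillReady) {
        float *t = playBuf; playBuf = fillBuf; fillBuf = t;  // swap the two buffers
        playIndx  = 0;
        fillReady = false;
    }
    // ... hand 128-sample blocks from playBuf[playIndx...] to queue1 as before ...
}

This way the rpm used for a batch is never more than half a batch old, and the output never stalls while the iFFT runs.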
 