New Guy Looking For Teensy DSP programming tips

DeletedUser

Guest
I'm a retired DSP software design engineer. I want to continue my previous activities from time to time. I have been working on several projects with the Arduino Nano, UNO. I just bought a Teensy 4.0 board and audio shield. I want to start a new project as soon as I receive the headers for the shield.

I've studied the Teensy GUI Audio Design Tool which looks very useful for open-loop algorithms. However, I would like to create an Automatic Gain Control for my TV or mp3 player so that I won't need to continuously push the +- remote control button every time a louder commercial interrupts the film or my next mp3 song was recorded at a different volume level.

Basically, my question is: how can I write a Teensy AGC audio function (non-optimized at first)? My Matlab code below gives the details. Thanks.

---------------------------------------------- Matlab Code ----------------------------------

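clear;
close all;
filename='Baez.wav'; % default name, overridden by the prompt below
filename=input('input wav filename: ','s');
[sig, fs]=audioread(filename); % samples are normalized to [-1, 1]
T=1/fs; % sample period in seconds
[nsamp,dummy]=size(sig);
fclose('all');
k=.00001; % adaptation step size
ref=.07; % target absolute output level
gain(1)=0.5; % initial gain
numiter=nsamp;
for iter=1:numiter
sigout(iter)=sig(iter)*gain(iter); % apply the current gain
t(iter)=(iter-1)*T; % time axis for the plots
err(iter)=abs(sigout(iter))-ref; % positive when the output is too loud
gain(iter+1)=gain(iter)-err(iter)*k; % nudge the gain toward the reference
end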

figure
plot(t,sig,'b');
grid on;
zoom on;
hold on;
plot(t,sigout,'r');
hold off;
title('Input Signal(blue), Output Signal(red)');
xlabel('seconds');

figure2.jpg
 
have a look at the source code for existing components in the audio library and try to change them. see if you can make small changes and then try compiling them. see if you can add a parameter, perhaps an extra gain param to start with.

copy AudioEffectGain.h and .cpp in the audio library, rename it, and see if you can translate each matlab function one by one to a cloned version of AudioEffectGain.

look through the other audio library source code to see what functions are available and how to import them, for instance #include <math.h> to add square root. there are lots of great examples, and the teensy audio library is a great place to see how the basic functions are implemented. unfortunately there will be a learning curve writing classes etc in c++, however it is worth it, in my experience.

hope that helps. i had a look at your matlab code, but I didn't understand the gain() function.
 
before you write an audio class, you could write a simple arduino sketch which takes an array and writes the gain for each sample to the serial port.

int16_t sig[10] = { 0, 1, 2, 1, 0, -1 … };
double gain = 0.5;
for (int i=0; i<10; i++) {
gain = …;
Serial.printf("gain for sample %i is %f\n", i, gain);
}

once you have that working as expected, it should be much easier to transfer to an audio class.
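
As a starting point for that kind of test, the Matlab loop from the first post could be translated to plain C++ roughly as follows. This is only a sketch under stated assumptions: `AgcState` and `agcProcess` are hypothetical names (not part of the Teensy Audio Library), samples are assumed to be floats normalized to [-1, 1], and the constants are the ones from the Matlab script.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical helper, not part of the Teensy Audio Library.
struct AgcState {
  float gain = 0.5f;     // initial gain, as in the Matlab script
  float ref  = 0.07f;    // target absolute output level
  float k    = 0.00001f; // adaptation step size
};

// Process n samples from in[] to out[], updating the gain after every sample.
void agcProcess(AgcState &s, const float *in, float *out, std::size_t n) {
  for (std::size_t i = 0; i < n; i++) {
    out[i] = in[i] * s.gain;               // apply the current gain
    float err = std::fabs(out[i]) - s.ref; // > 0 when the output is too loud
    s.gain -= err * s.k;                   // move the gain toward the reference
  }
}
```

In a sketch, the gain after each call could be printed over the serial port to check that it falls for loud input and rises for quiet input.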
 
thanks
Don't want to change the library. Just want to write my agc function in the Arduino code loop(). I don't know how to get and send data from my Arduino code to the I/O I2S objects. Sure would be great to have an example sketch of copying input samples to output samples via a declared buffer in my sketch. Maybe I need play and record objects using queues?

As for your misunderstanding of my Matlab code: if you're not familiar with Matlab, it's difficult to understand. In the script, gain(iter) indexes an array; it is not a function call. I always simulate functions before I code them, and the forum rules recommend furnishing current code. The gain is just the multiplier applied to each input sample to obtain the output sample's desired average absolute level.
 
The code below should use the queues to just pass input from line-in to the output.
You can then modify the loop which does the copying to implement the AGC. I may have a shot at it myself :)

Pete

Code:
// AGC - see https://forum.pjrc.com/threads/68320

#include <Audio.h>
#include <Wire.h>
#include <SPI.h>
#include <SD.h>
#include <SerialFlash.h>

// GUItool: begin automatically generated code
AudioInputI2S            i2s1;           //xy=305.8833312988281,324.8833312988281
AudioRecordQueue         queue1;         //xy=569.8833312988281,248.88333129882812
AudioPlayQueue           queue2;         //xy=759.8833312988281,249.88333129882812
AudioOutputI2S           i2s2;           //xy=683.8833312988281,319.8833312988281
AudioConnection          patchCord1(i2s1, 0, queue1, 0);
AudioConnection          patchCord2(queue2, 0, i2s2, 0);
AudioConnection          patchCord3(queue2, 0, i2s2, 1);
AudioControlSGTL5000     sgtl5000_1;     //xy=97.88333129882812,111
// GUItool: end automatically generated code
const int myInput = AUDIO_INPUT_LINEIN;


void setup(void)
{
  Serial.begin(9600);
  while(!Serial);
  delay(1000);

  AudioMemory(32);

  sgtl5000_1.enable();
  sgtl5000_1.inputSelect(myInput);
  sgtl5000_1.volume(0.75);
  // Start the record queue
  queue1.begin();
}


void loop(void)
{
  short *bp1,*bp2;
  // When an input buffer becomes available
  // process it
  if (queue1.available() >= 1) {
    bp1 = queue1.readBuffer();
    bp2 = queue2.getBuffer();
    // Copy from input to output buffer
    for(int i = 0;i < AUDIO_BLOCK_SAMPLES;i++) {
      *bp2++ = *bp1++;
    }
    // Free the input audio buffer
    queue1.freeBuffer();
    // and play it back into the audio queue
    queue2.playBuffer();
  }

  // volume control
  static int volume = 0;
  // Volume control
  int n = analogRead(15);
  if (n != volume) {
    volume = n;
    sgtl5000_1.volume(n / 1023.);
  }

}
 
Oh Great! Just what I needed. I will try it as soon as I get those rare headers to connect the audio shield. Amazon or AliExpress don't have them and the delay from the PJRC store to my home in France is several weeks.

Yes, try it out if you want. It's the AGC algo similar to the one I used for the V.32bis modem when I was working. However, the adaptation step size was much faster than this one.

1. I am surprised that there are very few closed-loop algorithms (LMS, AGC, DPLL, etc.) in the Teensy library. Maybe later on? I don't believe there is a mips issue with a Cortex M7 core.
2. Not enough or haven't found sufficient documentation to do basic setups like what you have just shown me.

Thanx!
 
Ah, just one more question so I don't flood the forum with my new-guy questions. Can you please show me what the GUItool looks like for automatically generating this code? Thanx again!
 
If you can't find these, why not use these with these?
They can even be made from a much longer strip for ease of supply.
If making them from a longer strip, don't forget to allow for scrapping the
pin between the two 14 pin strips, i.e a 28 pin strip will not make two 14 pin strips. You will need at least 29 pins.

Or you could use two 10 + 4 socket headers and two 14 pin headers from this kit. You should be able to get that tomorrow with Prime.
 

Bill:

I hope this answers the question that you are asking. Using Pete's code as follows as an example:

Code:
// AGC - see https://forum.pjrc.com/threads/68320

#include <Audio.h>
#include <Wire.h>
#include <SPI.h>
#include <SD.h>
#include <SerialFlash.h>

// GUItool: begin automatically generated code
AudioInputI2S            i2s1;           //xy=305.8833312988281,324.8833312988281
AudioRecordQueue         queue1;         //xy=569.8833312988281,248.88333129882812
AudioPlayQueue           queue2;         //xy=759.8833312988281,249.88333129882812
AudioOutputI2S           i2s2;           //xy=683.8833312988281,319.8833312988281
AudioConnection          patchCord1(i2s1, 0, queue1, 0);
AudioConnection          patchCord2(queue2, 0, i2s2, 0);
AudioConnection          patchCord3(queue2, 0, i2s2, 1);
AudioControlSGTL5000     sgtl5000_1;     //xy=97.88333129882812,111
// GUItool: end automatically generated code
const int myInput = AUDIO_INPUT_LINEIN;


void setup(void)
{
  Serial.begin(9600);
  while(!Serial);
  delay(1000);

  AudioMemory(32);

  sgtl5000_1.enable();
  sgtl5000_1.inputSelect(myInput);
  sgtl5000_1.volume(0.75);
  // Start the record queue
  queue1.begin();
}


void loop(void)
{
  short *bp1,*bp2;
  // When an input buffer becomes available
  // process it
  if (queue1.available() >= 1) {
    bp1 = queue1.readBuffer();
    bp2 = queue2.getBuffer();
    // Copy from input to output buffer
    for(int i = 0;i < AUDIO_BLOCK_SAMPLES;i++) {
      *bp2++ = *bp1++;
    }
    // Free the input audio buffer
    queue1.freeBuffer();
    // and play it back into the audio queue
    queue2.playBuffer();
  }

  // volume control
  static int volume = 0;
  // Volume control
  int n = analogRead(15);
  if (n != volume) {
    volume = n;
    sgtl5000_1.volume(n / 1023.);
  }

}

You can cut/paste everything between the two "// GUItool" comments & "Import" that excerpt into the GUItool.

Code:
// GUItool: begin automatically generated code
AudioInputI2S            i2s1;           //xy=305.8833312988281,324.8833312988281
AudioRecordQueue         queue1;         //xy=569.8833312988281,248.88333129882812
AudioPlayQueue           queue2;         //xy=759.8833312988281,249.88333129882812
AudioOutputI2S           i2s2;           //xy=683.8833312988281,319.8833312988281
AudioConnection          patchCord1(i2s1, 0, queue1, 0);
AudioConnection          patchCord2(queue2, 0, i2s2, 0);
AudioConnection          patchCord3(queue2, 0, i2s2, 1);
AudioControlSGTL5000     sgtl5000_1;     //xy=97.88333129882812,111
// GUItool: end automatically generated code

This allows you to duplicate the layout & connections that someone (or yourself) has defined/designed. Any changes/additions/corrections that you make can then be "Exported" back into your program. Hope that helps . . .

Feel free to ask any other questions that you may have. This is a wonderful forum, as there are several individuals on here that are always willing to answer questions & to help with understanding any & all aspects of the Teensy & its peripherals.

Mark J Culross
KD5RXT
 
I ordered female headers with long pins from eBay for the Teensy, so I can optionally plug the sound card (male headers, short pins) on top and also optionally plug both boards, or just the Teensy, into a perfboard with female headers (short pins) on it. Hope that makes sense.
 
GREAT! THANX!
 
If all you want is AGC, then the audio processing block that you're looking for is a "compressor". I've always been surprised that there isn't one in the Teensy Audio Library. I'd do one myself, but my integer math skills are lacking. Instead...

You're running a Teensy 4, which means that you can do floating point math as fast as integer math. Therefore, you are not limited to the Teensy Audio Library (with its focus on integer math). There are several folks with floating-point extensions of the Teensy Audio Library. You can run any one of them without penalty.

One choice is the OpenAudio library (https://github.com/chipaudette/OpenAudio_ArduinoLibrary). I started it, but it is now maintained by others. Most important for you is that it has two different compressors that you can use. I'd start with the simpler one. One of the examples that comes with the library shows you how to use it: https://github.com/chipaudette/OpenAudio_ArduinoLibrary/tree/master/examples/BasicCompressor_Float

Here's a blog post explaining its approach...it's a totally standard dynamic range compressor: https://openaudio.blogspot.com/search/label/Compression

Chip

Results-AmpSweep.png
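
For readers unfamiliar with the term, the static gain curve of a generic hard-knee compressor can be sketched as below. This is a textbook form written for illustration, not the OpenAudio library's implementation; `compressorGainDb` and `dbToLinear` are hypothetical names, and the attack/release smoothing that a real compressor applies to the level estimate is left out.

```cpp
#include <cmath>

// Textbook hard-knee compressor static curve (illustrative only;
// not taken from the OpenAudio library).
// levelDb:     estimated signal level in dBFS
// thresholdDb: level above which compression starts
// ratio:       input/output slope above the threshold (e.g. 4 for 4:1)
// Returns the gain to apply, in dB (always <= 0).
float compressorGainDb(float levelDb, float thresholdDb, float ratio) {
  if (levelDb <= thresholdDb) return 0.0f; // below threshold: unity gain
  float over = levelDb - thresholdDb;      // dB above the threshold
  return over / ratio - over;              // output rises 1/ratio dB per input dB
}

// Convert a gain in dB to a linear multiplier.
float dbToLinear(float db) { return std::pow(10.0f, db / 20.0f); }
```

With a -20 dB threshold and a 4:1 ratio, a level of -12 dB is 8 dB over the threshold and gets -6 dB of gain, while anything below the threshold passes unchanged.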
 
Yes, I will look into this. I have developed my fixed-point code inside the Arduino loop(), but I'm still waiting for hardware before I can test it. Simulations work in Matlab for floating point and in C (mingw64) for fixed point. I'm using input and output audio wav files for now. It would be interesting to compare the number of cycles per sample needed for the floating-point compressor and the fixed-point AGC.
agc.jpg
 
I thought about the similarities and differences between an AGC and a compressor. I have never compared 2 audio clips processed by, respectively, the 2 algorithms. But it seems that we would lose the "nuances" (dynamic range) in a song processed by compression. With an AGC using a small adaptation constant (step size), we would observe the different passages played at pp(pianissimo) or ff(fortissimo). With the compression, we would hear, most likely, mf(mezzo forte) or f(forte). My goal is to maintain the high fidelity on 2 different songs recorded with 2 different recording levels.
Voila, my thoughts of the day.
Bill
 
Thanx for the help. I was able to implement your program and modify it with the AGC that I plan to use for my TV and stereo. I apply the current agcGain to the L and R queues and play them. I then update agcGain according to the sum of the absolute values of the 128 output samples compared to a reference. I'll try to show the results below (I'm not yet very skilled at using the tools on this forum). It works. I finally got the headers I needed for attaching the audio shield to the Teensy 4.0. I have 3 questions.
1. Before I used the Line in input, I loaded the wav file on the SD card and looped on it. The SD player froze up a few times. Has anyone else noticed that?
2. How do I determine the size x for AudioMemory(x)? I needed to put 128 to reduce the SD player's freezing up.
3. Is there a way to find the number of instruction cycles needed to process each sample queue?

Code:
// AGC - see https://forum.pjrc.com/threads/68320

#include <Audio.h>
#include <Wire.h>
#include <SPI.h>
#include <SD.h>
#include <SerialFlash.h>

//#define USE_SD //audio from SD card instead of linein
//#define PASSTHRU //direct copy of input to output samples

// GUItool: begin automatically generated code

#ifdef USE_SD
AudioPlaySdWav           playSdWav1;
#else
AudioInputI2S            i2s1;           //xy=139.1999969482422,214.99998474121094
#endif
AudioRecordQueue         queue1;         //xy=369.20001220703125,209.99998474121094
AudioRecordQueue         queue3;         //xy=370.20001220703125,262.20001220703125
AudioPlayQueue           queue2;         //xy=546.2000122070312,210.1999969482422
AudioPlayQueue           queue4;         //xy=548.2000122070312,268.20001220703125
AudioOutputI2S           i2s2;           //xy=750.2000122070312,216.99998474121094
#ifdef USE_SD
AudioConnection          patchCord1(playSdWav1, 0, queue1, 0);
AudioConnection          patchCord2(playSdWav1, 1, queue3, 0);
#else
AudioConnection          patchCord1(i2s1, 0, queue1, 0);
AudioConnection          patchCord2(i2s1, 1, queue3, 0);
#endif
AudioConnection          patchCord3(queue2, 0, i2s2, 0);
AudioConnection          patchCord4(queue4, 0, i2s2, 1);
AudioControlSGTL5000     sgtl5000_1;     //xy=132.20001220703125,142

#ifdef USE_SD
#define SDCARD_CS_PIN    10
#define SDCARD_MOSI_PIN  7
#define SDCARD_SCK_PIN   14
#else 
const int myInput = AUDIO_INPUT_LINEIN;
#endif
int32_t agcGain1=0x08000000;//0.5 with 28 fractional bits (1.0 = 0x10000000)
int32_t agcGain2=0x08000000;//0.5 with 28 fractional bits (1.0 = 0x10000000)

void setup(void)
{
  Serial.begin(9600);
  AudioMemory(128);
  sgtl5000_1.enable();
  sgtl5000_1.volume(0.5);
#ifdef USE_SD
  SPI.setMOSI(SDCARD_MOSI_PIN);
  SPI.setSCK(SDCARD_SCK_PIN);
  if (!(SD.begin(SDCARD_CS_PIN))) {
    while (1) {
      Serial.println("Unable to access the SD card");
      delay(500);
    }
  }
  delay(1000);
#else
  sgtl5000_1.inputSelect(myInput);
#endif
  // Start the record queue
  queue1.begin();
  queue3.begin();
}


void loop(void)
{
  int16_t *bp1,*bp2,*bp3,*bp4;
  int32_t sum,err;
  int32_t ref=293504;//.07*AUDIO_BLOCK_SAMPLES in Q15 format(2293*128)
#ifdef USE_SD
  if (playSdWav1.isPlaying() == false) 
  {
    Serial.println("Start playing");
//    playSdWav1.play("test.wav");
    playSdWav1.play("Tower.wav");
    delay(10);
  }
#endif  
  // When an input buffer becomes available
  // process it
  if (queue1.available() >= 1) {
    bp1 = queue1.readBuffer();
    bp2 = queue2.getBuffer();
#ifdef PASSTHRU
    // Copy from input to output buffer
    for(int i = 0;i < AUDIO_BLOCK_SAMPLES;i++) {
      *bp2++ = *bp1++;
    }
#else
    // Apply agcGain on input with saturation to output buffer
    for(int i = 0;i < AUDIO_BLOCK_SAMPLES;i++) 
    { 
      //ref. dspinst.h in Teensy audio library
      //*bp2++=saturate16((((int32_t)(*bp1++))*(agcGain1>>15))>>13); 
      *bp2++=saturate16(signed_multiply_32x16b(agcGain1>>12,(*bp1++)));
    }
    bp2 = queue2.getBuffer();
    sum=0;
    // Calculate sum of abs values of output buffer
    for(int i = 0;i < AUDIO_BLOCK_SAMPLES;i++)
    { 
       sum+=(int32_t)(abs(*bp2++)); 
    }
    // Calculate new agcGain for next input buffer
    err=ref-sum; 
    agcGain1+=err>>3; //update gain 
    agcGain1=min(agcGain1,0x3fffffff); //keep agcGain < 4.0
#endif
    // Free the input audio buffer
    queue1.freeBuffer();
    // and play it back into the audio queue
    queue2.playBuffer();
  }
  if (queue3.available() >= 1) {
    bp3 = queue3.readBuffer();
    bp4 = queue4.getBuffer();
#ifdef PASSTHRU
    // Copy from input to output buffer
    for(int i = 0;i < AUDIO_BLOCK_SAMPLES;i++) {
      *bp4++ = *bp3++;
    }
#else
    // Apply agcGain on input with saturation to output buffer
    for(int i = 0;i < AUDIO_BLOCK_SAMPLES;i++) 
    { 
      //ref. dspinst.h in Teensy audio library
      //*bp4++=saturate16((((int32_t)(*bp3++))*(agcGain2>>15))>>13); 
      *bp4++=saturate16(signed_multiply_32x16b(agcGain2>>12,(*bp3++)));
    }
    bp4 = queue4.getBuffer();
    sum=0;
    // Calculate sum of abs values of output buffer
    for(int i = 0;i < AUDIO_BLOCK_SAMPLES;i++)
    { 
       sum+=(int32_t)(abs(*bp4++)); 
    }
    // Calculate new agcGain for next input buffer
    err=ref-sum; 
    agcGain2+=err>>3; //update gain 
    agcGain2=min(agcGain2,0x3fffffff); //keep agcGain < 4.0
#endif
    // Free the input audio buffer
    queue3.freeBuffer();
    // and play it back into the audio queue
    queue4.playBuffer();
  }
  // volume control
  static int volume = 0;
  // Volume control
  int n = analogRead(0);
  if (n != volume) 
  {
    volume = n;
    sgtl5000_1.volume((float)n / 1023.);
  }
}

results.jpg
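
The fixed-point scaling in the listing above can be checked on a PC with a portable emulation. In this sketch `mul32x16b` and `sat16` are hypothetical stand-ins for the library's `signed_multiply_32x16b` and `saturate16` from dspinst.h, assumed here to compute (a * b) >> 16 on the low 16 bits of b and to clamp to the int16_t range. Under that assumption 0x08000000 acts as a gain of 0.5, so the gain carries 28 fractional bits and 0x3fffffff is just under 4.0.

```cpp
#include <cstdint>

// Portable stand-in (assumption) for signed_multiply_32x16b in dspinst.h:
// multiply a 32-bit value by a 16-bit sample and drop the low 16 bits
// of the product, i.e. (a * b) >> 16.
static inline int32_t mul32x16b(int32_t a, int16_t b) {
  return (int32_t)(((int64_t)a * b) >> 16);
}

// Portable stand-in (assumption) for saturate16: clamp to the int16_t range.
static inline int16_t sat16(int32_t v) {
  if (v > 32767) return 32767;
  if (v < -32768) return -32768;
  return (int16_t)v;
}

// Apply a gain stored with 28 fractional bits (1.0 == 0x10000000) to one sample.
// gain >> 12 leaves 16 fractional bits, which the >> 16 in the multiply removes.
static inline int16_t applyGain(int32_t gain, int16_t sample) {
  return sat16(mul32x16b(gain >> 12, sample));
}
```

For example, applyGain(0x08000000, 1000) gives 500, and a gain near the 4.0 limit applied to a sample of 20000 saturates at 32767.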
 
1. Before I used the Line in input, I loaded the wav file on the SD card and looped on it. The SD player froze up a few times. Has anyone else noticed that?
Unfortunately, that's normal for the waveplayer :)

2. How do I determine the size x for AudioMemory(x)? I needed to put 128 to reduce the SD player's freezing up.
I don't think that helps, because the player does not use the additional memory.
You can find out more about memory here:
https://www.pjrc.com/teensy/td_libs_AudioConnection.html

3. Is there a way to find the number of instruction cycles needed to process each sample queue?

For your code?
Yes, you can use micros() - gives you the microseconds counter.

Or, better, you can use ARM_DWT_CYCCNT, which gives you exact cycles (it is a 32-bit counter, and at 600 MHz it rolls over pretty fast).
uint32_t t = ARM_DWT_CYCCNT;
//*do your stuff *
t = ARM_DWT_CYCCNT - t;
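
A note on the rollover: because the counter is an unsigned 32-bit value, the subtraction still gives the right answer if the counter wrapped once between the two readings. A small host-side sketch of that arithmetic (the helper names are hypothetical, and the counter readings are passed in as plain numbers):

```cpp
#include <cstdint>

// Elapsed cycles between two readings of a free-running 32-bit counter.
// Unsigned subtraction wraps modulo 2^32, so the result stays correct
// even if the counter rolled over once between the two readings.
static inline uint32_t elapsedCycles(uint32_t start, uint32_t end) {
  return end - start;
}

// Seconds until a 32-bit cycle counter rolls over at the given clock rate.
static inline double rolloverSeconds(double clockHz) {
  return 4294967296.0 / clockHz;
}
```

At 600 MHz the counter wraps roughly every 7 seconds, so any single measurement has to be shorter than that.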
 
OK
For the SD card, I don't intend to use it anyway but it's good to know that what I observed appears to be normal.
I found that I use 15 cycles/sample with AGC ON and 4 in PASSTHRU. Guess there should be about (13605-15) free cycles @ fs=44100 & fclk=600 MHz. Not bad!
I found AudioMemoryUsageMax=7.
Looks like there's some working space and time left over for some more complicated projects. I like this Teensy!
Thank you for the info!

BR Bill
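
The budget estimate above can be written out as a small calculation. This is a sketch with hypothetical helper names; it uses only the figures quoted in the post (600 MHz clock, 44100 Hz sample rate, 15 cycles per sample per channel).

```cpp
// Cycles available per audio sample period.
static inline double cyclesPerSamplePeriod(double clockHz, double sampleRateHz) {
  return clockHz / sampleRateHz;
}

// Fraction of the CPU used by a routine that costs cyclesPerSample cycles
// on each of numChannels channels.
static inline double cpuLoad(double cyclesPerSample, int numChannels,
                             double clockHz, double sampleRateHz) {
  return cyclesPerSample * numChannels /
         cyclesPerSamplePeriod(clockHz, sampleRateHz);
}
```

With those figures there are about 13605 cycles per sample period, and the two-channel AGC at 15 cycles per sample uses a little over 0.2% of them.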
 
I don't need to, but I'm curious to see whether I can reduce the number of cycles/sample in my AGC program. In the AUDIO_BLOCK_SAMPLES loop that calculates and applies the agcGain, I now use only 1 loop per channel instead of 2, which gives 12 instead of 15 cycles per sample (see my last post showing the code). I then tried to take advantage of the ARM's ability to execute 2 instructions simultaneously, and I now get 8.8 cycles per sample. I am also trying to take advantage of the optimized instructions listed in dspinst.h. The new code is shown below if anyone is interested in commenting. Thanks.

Code:
// AGC - see https://forum.pjrc.com/threads/68320

#include <Audio.h>
#include <Wire.h>
#include <SPI.h>
#include <SD.h>
#include <SerialFlash.h>

//#define USE_SD //audio from SD card instead of linein
//#define PASSTHRU //direct copy of input to output samples
//#define NCYCLES

// GUItool: begin automatically generated code

#ifdef USE_SD
AudioPlaySdWav           playSdWav1;
#else
AudioInputI2S            i2s1;           //xy=139.1999969482422,214.99998474121094
#endif
AudioRecordQueue         queue1;         //xy=369.20001220703125,209.99998474121094
AudioRecordQueue         queue3;         //xy=370.20001220703125,262.20001220703125
AudioPlayQueue           queue2;         //xy=546.2000122070312,210.1999969482422
AudioPlayQueue           queue4;         //xy=548.2000122070312,268.20001220703125
AudioOutputI2S           i2s2;           //xy=750.2000122070312,216.99998474121094
#ifdef USE_SD
AudioConnection          patchCord1(playSdWav1, 0, queue1, 0);
AudioConnection          patchCord2(playSdWav1, 1, queue3, 0);
#else
AudioConnection          patchCord1(i2s1, 0, queue1, 0);
AudioConnection          patchCord2(i2s1, 1, queue3, 0);
#endif
AudioConnection          patchCord3(queue2, 0, i2s2, 0);
AudioConnection          patchCord4(queue4, 0, i2s2, 1);
AudioControlSGTL5000     sgtl5000_1;     //xy=132.20001220703125,142

#ifdef USE_SD
#define SDCARD_CS_PIN    10
#define SDCARD_MOSI_PIN  7
#define SDCARD_SCK_PIN   14
#else 
const int myInput = AUDIO_INPUT_LINEIN;
#endif
int32_t agcGain1=0x08000000;//0.5 with 28 fractional bits (1.0 = 0x10000000)
int32_t agcGain2=0x08000000;//0.5 with 28 fractional bits (1.0 = 0x10000000)
#ifdef NCYCLES
uint32_t iter=0;
#endif

void setup(void)
{
  Serial.begin(9600);
  AudioMemory(32);
  sgtl5000_1.enable();
  sgtl5000_1.volume(0.5);
#ifdef USE_SD
  SPI.setMOSI(SDCARD_MOSI_PIN);
  SPI.setSCK(SDCARD_SCK_PIN);
  if (!(SD.begin(SDCARD_CS_PIN))) {
    while (1) {
      Serial.println("Unable to access the SD card");
      delay(500);
    }
  }
  delay(1000);
#else
  sgtl5000_1.inputSelect(myInput);
#endif
  // Start the record queue
  queue1.begin();
  queue3.begin();
}


void loop(void)
{
  int16_t *bp1,*bp2,*bp1a,*bp2a,*bp3,*bp4,*bp3a,*bp4a;
  int32_t sum,sum1,err,res,res1;
  int32_t ref=293504;//.07*AUDIO_BLOCK_SAMPLES in Q15 format(2293*128)
#ifdef NCYCLES
  uint32_t t;
#endif
#ifdef USE_SD
  if (playSdWav1.isPlaying() == false) 
  {
    Serial.println("Start playing");
    playSdWav1.play("Tower.wav");
    delay(10);
  }
#endif  
  // When an input buffer becomes available
  // process it
  if (queue1.available() >= 1) 
  {
#ifdef NCYCLES
    t = ARM_DWT_CYCCNT;
#endif
    bp1 = queue1.readBuffer();
    bp1a= bp1+1;
    bp2 = queue2.getBuffer();
    bp2a= bp2+1;
#ifdef PASSTHRU
    // Copy from input to output buffer
    for(int i = 0;i < AUDIO_BLOCK_SAMPLES;i++) 
    {
      *bp2++ = *bp1++;
    }
#else
    // Apply agcGain on input with saturation to output buffer
    sum=0;
    sum1=0;
    for(int i = 0;i < AUDIO_BLOCK_SAMPLES/2;i++) 
    { 
      //ref. dspinst.h in Teensy audio library
      res=signed_multiply_32x16b(agcGain1>>12,*bp1);
      res1=signed_multiply_32x16b(agcGain1>>12,*bp1a);
      *bp2=saturate16(res);
      *bp2a=saturate16(res1);
      sum+=abs(res);
      sum1+=abs(res1);
      bp1+=2;bp1a+=2;bp2+=2;bp2a+=2;
    }
    sum+=sum1;
    // Calculate new agcGain for next input buffer
    err=ref-sum; 
    agcGain1+=err>>3; //update gain 
    agcGain1=min(agcGain1,0x3fffffff); //keep agcGain < 4.0
#endif
    // Free the input audio buffer
    queue1.freeBuffer();
    // and play it back into the audio queue
    queue2.playBuffer();
#ifdef NCYCLES
    t = ARM_DWT_CYCCNT - t;
    iter++;
    if (iter==1000)
    {
      iter=0;
      Serial.print("cycles for left channel AGC on 128 samples: "); //1122 cycles or 8.8 cycles/sample
      Serial.print(t);
      Serial.print(" AudioMemoryUsageMax: ");
      Serial.println(AudioMemoryUsageMax()); //32
    }
#endif
  } //if (queue1.available() >= 1)
  if (queue3.available() >= 1) 
  {
    bp3 = queue3.readBuffer();
    bp3a=bp3+1;
    bp4 = queue4.getBuffer();
    bp4a=bp4+1;
#ifdef PASSTHRU
    // Copy from input to output buffer
    for(int i = 0;i < AUDIO_BLOCK_SAMPLES;i++) 
    {
      *bp4++ = *bp3++;
    }
#else
    // Apply agcGain on input with saturation to output buffer
    sum=0;
    sum1=0;
    for(int i = 0;i < AUDIO_BLOCK_SAMPLES/2;i++) 
    { 
      res=signed_multiply_32x16b(agcGain2>>12,*bp3);
      res1=signed_multiply_32x16b(agcGain2>>12,*bp3a);
      *bp4=saturate16(res);
      *bp4a=saturate16(res1);
      sum+=abs(res);
      sum1+=abs(res1);
      bp3+=2;bp3a+=2;bp4+=2;bp4a+=2;
    }
    sum+=sum1;    
    // Calculate new agcGain for next input buffer
    err=ref-sum; 
    agcGain2+=err>>3; //update gain 
    agcGain2=min(agcGain2,0x3fffffff); //keep agcGain < 4.0
#endif
    // Free the input audio buffer
    queue3.freeBuffer();
    // and play it back into the audio queue
    queue4.playBuffer();
  } //if (queue3.available() >= 1)
  // volume control
  static int volume = 0;
  // Volume control
  int n = analogRead(0);
  if (n != volume) 
  {
    volume = n;
    sgtl5000_1.volume((float)n / 1023.);
  }
}
 
That looks pretty good.

If you want to play a bit more, you can try to use 32-bit reads and writes and process two samples per channel at once.
It *may* be a little faster, as two fewer reads and stores would be done on each loop, but it may need a shift. Often the ARM can do that on the side at no cost; worst case it needs 1 cycle per shift (the CPU has a barrel shifter).
As said, if you want to try, it may save a few more cycles. (Edit: And it automatically unrolls the loop by two.)

Another option is to just unroll the loop a little more.

It all depends on the free registers. If the compiler needs more registers inside the loop than the CPU has available, things might become slower.
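
The 32-bit read/write idea can be sketched in portable C++ as follows. The helper names are hypothetical and the halves are split with plain shifts and masks; on the Teensy, instructions that operate on the bottom or top half of a packed word directly would take the place of the unpacking step. The low/high ordering assumes a little-endian CPU such as the Cortex-M7.

```cpp
#include <cstdint>
#include <cstring>

// Load two adjacent int16_t samples with a single 32-bit read.
// memcpy avoids alignment and aliasing problems; compilers typically
// turn it into one load instruction.
static inline uint32_t loadPair(const int16_t *p) {
  uint32_t w;
  std::memcpy(&w, p, sizeof w);
  return w;
}

// Split the packed word: on a little-endian CPU the first sample sits
// in the low half and the second sample in the high half.
static inline int16_t lowHalf(uint32_t w)  { return (int16_t)(w & 0xFFFF); }
static inline int16_t highHalf(uint32_t w) { return (int16_t)(w >> 16); }

// Pack two results back into one word for a single 32-bit store.
static inline uint32_t packPair(int16_t lo, int16_t hi) {
  return ((uint32_t)(uint16_t)hi << 16) | (uint16_t)lo;
}

// Store two adjacent int16_t samples with a single 32-bit write.
static inline void storePair(int16_t *p, uint32_t w) {
  std::memcpy(p, &w, sizeof w);
}
```

Whether this beats two 16-bit accesses depends on how much the extra shifting and packing costs, so it is worth measuring with the cycle counter rather than assuming.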
 
Thanks for the tip. I might try those 32-bit R/W's. I'll let you know. Bye
 
FYI

1. 32-bit R/W with signed_multiply_32x16b and signed_multiply_32x16t requires a few more cycles.
2. Unrolling the last code I showed you to process 4 samples/loop requires 7 instead of 8.8 cycles/sample.

I think I'll stop playing. Thanks.
 
And the final code

Code:
// AGC - see https://forum.pjrc.com/threads/68320

#include <Audio.h>
#include <Wire.h>
#include <SPI.h>
#include <SD.h>
#include <SerialFlash.h>

//#define USE_SD //audio from SD card instead of linein
//#define PASSTHRU //direct copy of input to output samples
//#define DEBUG

// GUItool: begin automatically generated code

#ifdef USE_SD
AudioPlaySdWav           playSdWav1;
#else
AudioInputI2S            i2s1;           //xy=139.1999969482422,214.99998474121094
#endif
AudioRecordQueue         queue1;         //xy=369.20001220703125,209.99998474121094
AudioRecordQueue         queue3;         //xy=370.20001220703125,262.20001220703125
AudioPlayQueue           queue2;         //xy=546.2000122070312,210.1999969482422
AudioPlayQueue           queue4;         //xy=548.2000122070312,268.20001220703125
AudioOutputI2S           i2s2;           //xy=750.2000122070312,216.99998474121094
#ifdef USE_SD
AudioConnection          patchCord1(playSdWav1, 0, queue1, 0);
AudioConnection          patchCord2(playSdWav1, 1, queue3, 0);
#else
AudioConnection          patchCord1(i2s1, 0, queue1, 0);
AudioConnection          patchCord2(i2s1, 1, queue3, 0);
#endif
AudioConnection          patchCord3(queue2, 0, i2s2, 0);
AudioConnection          patchCord4(queue4, 0, i2s2, 1);
AudioControlSGTL5000     sgtl5000_1;     //xy=132.20001220703125,142

#ifdef USE_SD
#define SDCARD_CS_PIN    10
#define SDCARD_MOSI_PIN  7
#define SDCARD_SCK_PIN   14
#else 
const int myInput = AUDIO_INPUT_LINEIN;
#endif
int32_t agcGain1=0x08000000;//0.5 with 28 fractional bits (1.0 = 0x10000000)
int32_t agcGain2=0x08000000;//0.5 with 28 fractional bits (1.0 = 0x10000000)
int32_t ref=293504;//.07*AUDIO_BLOCK_SAMPLES in Q15 format(2293*128)
#ifdef DEBUG
uint32_t iter=0;
#endif

void setup(void)
{
  Serial.begin(9600);
  AudioMemory(32);
  sgtl5000_1.enable();
  sgtl5000_1.volume(0.5);
#ifdef USE_SD
  SPI.setMOSI(SDCARD_MOSI_PIN);
  SPI.setSCK(SDCARD_SCK_PIN);
  if (!(SD.begin(SDCARD_CS_PIN))) {
    while (1) {
      Serial.println("Unable to access the SD card");
      delay(500);
    }
  }
  delay(1000);
#else
  sgtl5000_1.inputSelect(myInput);
#endif
  // Start the record queue
  queue1.begin();
  queue3.begin();
}

void loop(void)
{
#ifdef DEBUG
  uint32_t t;
#endif
#ifdef USE_SD
  if (playSdWav1.isPlaying() == false) 
  {
    Serial.println("Start playing");
    playSdWav1.play("Tower.wav");
    delay(10);
  }
#endif

//--------------------------- Left channel -----------------------------
  if (queue1.available() >= 1) //Left Channel
  {
#ifdef DEBUG
    t = ARM_DWT_CYCCNT;
#endif
    agcCalc(queue1.readBuffer(),queue2.getBuffer(),&agcGain1);
    // Free the input audio buffer
    queue1.freeBuffer();
    // and play it back into the audio queue
    queue2.playBuffer();
#ifdef DEBUG
    t = ARM_DWT_CYCCNT - t;
    iter++;
    if (iter==1000)
    {
      iter=0;
      Serial.print("left queue processing duration in cycles/sample: "); //7.7 cycles/sample
      Serial.print((float)t/(float)AUDIO_BLOCK_SAMPLES);
      Serial.print(" AudioMemoryUsageMax: ");
      Serial.print(AudioMemoryUsageMax()); //32
      Serial.print(" agcGain: ");
      Serial.println((float)agcGain1/(float)0x10000000);
    }
#endif
  } //if (queue1.available() >= 1)

//--------------------------- Right channel -----------------------------
  if (queue3.available() >= 1) 
  {
    agcCalc(queue3.readBuffer(),queue4.getBuffer(),&agcGain2);
    // Free the input audio buffer
    queue3.freeBuffer();
    // and play it back into the audio queue
    queue4.playBuffer();
  } //if (queue3.available() >= 1)
  
//--------------------------- volume control ----------------------------
  static int volume = 0;
  // Volume control
  int n = analogRead(0);
  if (n != volume) 
  {
    volume = n;
    sgtl5000_1.volume((float)n / 1023.);
  }
}

void agcCalc(int16_t *inbuf,int16_t *outbuf,int32_t *agcGain)
{
  int16_t *bp1,*bp2;
  int32_t sum,sum1,sum2,sum3,res,res1,res2,res3,err;
  bp1=inbuf;
  bp2=outbuf;
#ifdef PASSTHRU
    // Copy from input to output buffer
    for(int i = 0;i < AUDIO_BLOCK_SAMPLES;i++) 
    {
      *bp2++ = *bp1++;
    }
#else
    // Apply agcGain on input with saturation to output buffer
    sum=0;
    sum1=0;
    sum2=0;
    sum3=0;
    for(int i = 0;i < AUDIO_BLOCK_SAMPLES>>2;i++) 
    { 
      //ref. dspinst.h in Teensy audio library
      res=signed_multiply_32x16b(*agcGain>>12,*bp1++);
      res1=signed_multiply_32x16b(*agcGain>>12,*bp1++);
      res2=signed_multiply_32x16b(*agcGain>>12,*bp1++);
      res3=signed_multiply_32x16b(*agcGain>>12,*bp1++);
      *bp2++=saturate16(res);
      *bp2++=saturate16(res1);
      *bp2++=saturate16(res2);
      *bp2++=saturate16(res3);
      sum+=abs(res);
      sum1+=abs(res1);
      sum2+=abs(res2);
      sum3+=abs(res3);
    }
    sum+=(sum1+sum2+sum3);
    // Calculate new agcGain for next input buffer
    err=ref-sum; 
    *agcGain+=err>>3; //update gain 
    *agcGain=min(*agcGain,0x3fffffff); //keep agcGain < 4.0
#endif
}
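
For anyone checking the fixed-point numbers: a small sketch of the scaling, assuming (from the debug print, which divides by 0x10000000, and from the "< 4.0" clamp at 0x3fffffff) that the gain carries 28 fractional bits. GAIN_ONE, BLOCK, gainFromFloat, gainToFloat, refFromLevel and updateGain are hypothetical names used only here; BLOCK assumes the default AUDIO_BLOCK_SAMPLES of 128.

```cpp
#include <cassert>
#include <cstdint>

constexpr int32_t GAIN_ONE = 0x10000000;  // 1.0 with 28 fractional bits (assumption)
constexpr int BLOCK = 128;                // assumed AUDIO_BLOCK_SAMPLES

// Convert between a float gain and the fixed-point gain word.
constexpr int32_t gainFromFloat(float g) { return (int32_t)(g * (float)GAIN_ONE); }
constexpr float gainToFloat(int32_t q) { return (float)q / (float)GAIN_ONE; }

// Reference level: target mean |sample| as a fraction of full scale (Q15),
// multiplied by the block length because the loop sums over a whole block.
constexpr int32_t refFromLevel(float level) {
  return (int32_t)(level * 32768.0f) * BLOCK;
}

// One per-block gain update, mirroring the end of agcCalc.
int32_t updateGain(int32_t gain, int32_t sumAbs, int32_t ref) {
  int32_t err = ref - sumAbs;     // positive when the block was too quiet
  gain += err >> 3;               // loop gain of 1/8 per block
  if (gain > 0x3fffffff) gain = 0x3fffffff;  // keep gain below 4.0
  return gain;
}
```

With these definitions, gainFromFloat(0.5f) gives 0x08000000 and refFromLevel(0.07f) gives 293504, matching the constants in the sketch.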
 
Once I receive all the HW I ordered for hooking up the AGC to my TV/stereo, I want to add a pre-emphasis filter upstream of the AGC. It's for the higher frequencies, for us older folks. I have used the FIR library programmed with coefficients I designed offline, and I am able to make it work with the AGC (see the attached template and the recording of a composite sine wave at the sound card's stereo headphone jack). I would like to know if anyone has an easy method for tracing the digital-only transfer function without going through the sound card (maybe via the SD card?). This would be especially useful for selective biquad designs. Thanks.

Attachments: templateFIRAGC.jpg, recordingFIRAGC.jpg
 