Creating Digital Gain from I2S microphone

TinKanMan

Hello, delving into the Teensy world, and I'm excited.

I have got Teensy to act as a multi-channel USB Audio Interface with help from this thread: https://forum.pjrc.com/index.php?th...-multi-channel-outputs-not-just-stereo.70176/

So far, I have attached a pair of I2S microphones (ICS43434) to a Teensy 4.0 and it is working well. However, I am a little disappointed by the input gain from these microphones. I intend to record background noise, so the levels are coming in very quiet. I get poor resolution by just boosting the 16-bit recording I get from the Teensy. So I have been digging around to find a way to "boost" the signal within the mics/Teensy signal chain...

The microphones themselves apparently transmit a 24-bit PCM signal through I2S (using a word length of 32 bits I believe), however the Teensy Audio library is 16-bit, so I lose 8 bits of data in a 24-to-16bit conversion process somewhere (I assume - most likely the last 8 bits just gets thrown away).

It occurs to me that if I can get into that conversion process, I could create a "digital gain", where the quiet signal in the 24-bit samples is converted to louder 16-bit samples, losing minimal resolution. If the gain factor is a power of 2, then this conversion should be no more than omitting the leading bits from the 24-bit sample (with some consequences accepted).
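To make the idea concrete, here is a minimal sketch in portable C++ (not Teensy library code). It assumes the 24-bit sample arrives left-justified in a signed 32-bit word, and that `>>` on a negative value is an arithmetic shift (true for GCC on ARM, and guaranteed from C++20). The names `shift_to_16` and `gain_pow2` are made up for illustration.

```cpp
#include <cstdint>

// Hypothetical helper: convert a left-justified 32-bit I2S word to 16 bits.
// gain_pow2 = 0 reproduces plain truncation (keep the top 16 bits);
// each increment doubles the output level and keeps one more low-order bit.
// There is no overflow protection here: loud input will wrap around.
static int16_t shift_to_16(int32_t word32, int gain_pow2) {
    return static_cast<int16_t>(word32 >> (16 - gain_pow2));
}
```

With a quiet sample of 291 (0x000123 as 24 bits, so 0x00012300 in the 32-bit word), no gain leaves just 1, whereas a gain of 2^8 recovers the full 291.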

I was hoping that I might be able to do such a thing by building a new child class from the AudioInputI2S class along the lines of below:
C++:
class AudioInputI2SGain : public AudioInputI2S
{
public:
    // gain: integer gain factor
    AudioInputI2SGain(int gain) : AudioInputI2S() { /* some constructor */ }
    /* some methods as needed */
};
...but I am likely not digging in the right place. I am willing to do some work to get this working, but I could do with a few pointers as to where to look in the Teensy 4 core Audio library. Importantly, where in the library do the 24-bit samples from the I2S inputs get decimated to 16 bits? The AudioStream class? DMAChannel class?

I'll be willing to work to get this sorted and share the code with others. Thanks in advance.


==== More Info ====
spec sheet for the ICS43434: https://cdn-shop.adafruit.com/product-files/6049/6049_DS-000069-ICS-43434-v1.2.pdf

Paul says 24 bit may come in future ( https://www.pjrc.com/teensy/td_libs_AudioRoadmap.html ), but I don't think I need to wait for this.
 
This would be my recommendation: switch to the great OpenAudio library , this will allow you to capture the full 24bits signal and manipulate it with floats - the Teensy 4.x has a built-in hw fpu, which is more than capable of doing audio processing with full F32 precision instead of integers. It also avoids issues with clipping, giving you ungodly amounts of headroom to play with.

Then, it's as simple as adding a gain block to get to the level you need.

Note that the effective number of bits is always going to be less than 24 bits - with this mic you'll get a bit less than 20 bits under perfect conditions; according to the spec sheet, it has "high SNR and 120 dB SPL AOP in all operational modes" - if you trust the manufacturer, that is...

Marc
 
Oooo. That is quite a project. I have had a look, and I have a few initial concerns about going towards the OpenAudio library, though I am glad I found it for future reference!

Firstly, I intend to use the input_i2s_hex and input_i2s_oct, not just the stereo version. I can't see an OpenAudio library version for higher numbers of channels. It can't be too difficult to transfer them over (the input_i2s_f32.cpp file is reportedly just copied from Teensy Audio Library), but then my next concern would be if the Teensy could handle so many channels at once at this higher bitrate?

Secondly, I also want to make Teensy into a multi-channel USB audio interface (see https://forum.pjrc.com/index.php?th...-multi-channel-outputs-not-just-stereo.70176/ ). So I would presumably have to port that project over to 32 bits too?

Thirdly, my intention was to have the Teensy USB audio interface to be plugged into a Raspberry Pi Zero 2, using 6 or 8 channels. I thought 6 channels would be ok, 8 channels maybe pushing it, but a data transfer rate of 16 bits, @ 48kHz, it could work. Going 32 bits doubles the data, and I worry that my Pi might not cope.

So efficiency is key in my application. I assume the base Teensy Audio library is very efficient, so I thought it best to build on that. I'd be very happy to have my concerns allayed.

On the effective number of bits: I thought that the bit clock frame for I2S was 32 bits long, and that I get the 24 significant bits within that frame? Even if only 20 of those bits are good, my concept of "bit gain" would still allow a 2^4 = 16x gain when decimating from 20 bits to 16.
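As a quick check of that arithmetic, each spare bit doubles the amplitude, which is roughly 6 dB. The helper below is only a sketch, and `shift_gain_db` is a made-up name: 4 spare bits work out to 16x, or about 24 dB, and 8 spare bits to 256x, or about 48 dB.

```cpp
#include <cmath>

// Level change, in dB, produced by shifting a sample left by `bits` places.
// Each bit doubles the amplitude, which is about 6.02 dB.
static double shift_gain_db(int bits) {
    return 20.0 * std::log10(std::ldexp(1.0, bits)); // ldexp(1.0, n) == 2^n
}
```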


Having dug around some more, and reading @Paul 's post here: https://forum.pjrc.com/index.php?threads/setting-up-custom-i2s-communication.65229/#post-263104

If I go back to my original thinking (and I think aloud now), I have been looking at the input_i2s.cpp file, specifically this method:
C++:
void AudioInputI2S::isr(void)
{
    uint32_t daddr, offset;
    const int16_t *src, *end;
    int16_t *dest_left, *dest_right;
    audio_block_t *left, *right;

#if defined(KINETISK) || defined(__IMXRT1062__)
    daddr = (uint32_t)(dma.TCD->DADDR);
    dma.clearInterrupt();
    //Serial.println("isr");

    if (daddr < (uint32_t)i2s_rx_buffer + sizeof(i2s_rx_buffer) / 2) {
        // DMA is receiving to the first half of the buffer
        // need to remove data from the second half
        src = (int16_t *)&i2s_rx_buffer[AUDIO_BLOCK_SAMPLES/2];
        end = (int16_t *)&i2s_rx_buffer[AUDIO_BLOCK_SAMPLES];
        if (AudioInputI2S::update_responsibility) AudioStream::update_all();
    } else {
        // DMA is receiving to the second half of the buffer
        // need to remove data from the first half
        src = (int16_t *)&i2s_rx_buffer[0];
        end = (int16_t *)&i2s_rx_buffer[AUDIO_BLOCK_SAMPLES/2];
    }
    left = AudioInputI2S::block_left;
    right = AudioInputI2S::block_right;
    if (left != NULL && right != NULL) {
        offset = AudioInputI2S::block_offset;
        if (offset <= AUDIO_BLOCK_SAMPLES/2) {
            dest_left = &(left->data[offset]);
            dest_right = &(right->data[offset]);
            AudioInputI2S::block_offset = offset + AUDIO_BLOCK_SAMPLES/2;
            arm_dcache_delete((void*)src, sizeof(i2s_rx_buffer) / 2);
            do {
                *dest_left++ = *src++;
                *dest_right++ = *src++;
            } while (src < end);
        }
    }
#endif
}
I am thinking of doing my bit gain idea on the lines:
Code:
*dest_left++ = *src++;
*dest_right++ = *src++;

...but this is confusing... i2s_rx_buffer[] is defined as uint32_t - presumably the 32 bits taken directly from the I2S device:
Code:
DMAMEM __attribute__((aligned(32))) static uint32_t i2s_rx_buffer[AUDIO_BLOCK_SAMPLES];
...in the AudioInputI2S::isr() method, this is cast down to an int16_t pointer:
C++:
// assign second half of DMA
src = (int16_t *)&i2s_rx_buffer[AUDIO_BLOCK_SAMPLES/2];
// ...or assign first half:
src = (int16_t *)&i2s_rx_buffer[0];
... which I originally thought was where my 24 bit samples were being decimated to 16 bits, but then I see this bit later:
Code:
            do {
                *dest_left++ = *src++;
                *dest_right++ = *src++;
            } while (src < end);
... which leads me to believe that the 32 bit i2s_rx_buffer[] was actually only storing two 16-bit samples from the Left and Right channels.
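Here is how I picture that layout, as a portable sketch rather than the library's code. It assumes the left sample sits in the low half of each 32-bit word, which is what you get when a little-endian CPU reads the buffer as consecutive int16_t values with left first. `unpack_stereo16` is a made-up name.

```cpp
#include <cstdint>
#include <cstddef>

// Split 32-bit words, each holding two packed 16-bit samples, into separate
// left/right arrays. Assumes the left sample occupies the low half of each
// word (the first int16_t in memory on a little-endian CPU).
static void unpack_stereo16(const uint32_t* words, size_t count,
                            int16_t* left, int16_t* right) {
    for (size_t i = 0; i < count; ++i) {
        left[i]  = static_cast<int16_t>(words[i] & 0xFFFFu);
        right[i] = static_cast<int16_t>(words[i] >> 16);
    }
}
```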

So it means that to get hold of my 24-bit samples, I have to go deeper than the AudioInputI2S class, to where i2s_rx_buffer is filled, via dma.TCD->DADDR I think. I will report further....
 
I wonder if I have come to a show stopper for my project above. Please do enlighten me or correct me...

I know that Teensy out of the box (with audio library) does 16-bit, stereo I2S, with variations in sample rate. The manual for Teensy 4 tells me on p1989-1990, that the SAI (of which I2S is a part):
• Support word length of 8 to 32 bits per word
- First word length and remaining word lengths can be configured separately
So from above, I would assume that getting all of my 24bits of resolution I2S MEMS microphone is possible (well within 32 bit word length), and not have to make do with a truncated 16bit input.

BUT! Perhaps I have misunderstood the deeper realities. I note in the Audio library code (in input_i2s.cpp ) that the DMA pulls I2S data as 32-bit words, which then get split into 16-bit left and right channels (16-bit words). Even though the manual says SAI can do 32-bit words, in the case of I2S, do the two left and right channels (each 16-bits), together count as one, 32-bit word? In other words, is the Audio Library already maxing out the bit depth of the I2S interface at an effective 16-bit word per L/R channel? I.e - is my project already doomed?

I cannot find any code that has people attempting more than 16 bits stereo (even the OpenAudio project uses 16-bit I2S input as with the main library, although the signal processing itself is in float-32).

IF! My fears are overblown, then it is clear that I need to start fiddling around with the AudioOutputI2S::config_i2s(bool only_bclk) method, or even the (dreaded) set_audioClock() function. Which is OK, but given the learning curve, I would like to know that I am not doomed to failure...

Thanks
 
output_spdif and output_spdif2 both configure I2S to use 32-bit words with 4 words per frame.
 
That's a good point. Thanks for the lead. I followed the config functions for both I2S and SPDIF - the clock is the same, and many configurations are the same, except for these blocks of code:

in AudioOutputI2S::config_i2s(bool only_bclk):
C++:
    if (!only_bclk)
    {
      CORE_PIN23_CONFIG = 3;  //1:MCLK
      CORE_PIN20_CONFIG = 3;  //1:RX_SYNC  (LRCLK)
    }
    CORE_PIN21_CONFIG = 3;  //1:RX_BCLK

    int rsync = 0;
    int tsync = 1;

    I2S1_TMR = 0;
    //I2S1_TCSR = (1<<25); //Reset
    I2S1_TCR1 = I2S_TCR1_RFW(1);
    I2S1_TCR2 = I2S_TCR2_SYNC(tsync) | I2S_TCR2_BCP // sync=0; tx is async;
            | (I2S_TCR2_BCD | I2S_TCR2_DIV((1)) | I2S_TCR2_MSEL(1));
    I2S1_TCR3 = I2S_TCR3_TCE;
    I2S1_TCR4 = I2S_TCR4_FRSZ((2-1)) | I2S_TCR4_SYWD((32-1)) | I2S_TCR4_MF
            | I2S_TCR4_FSD | I2S_TCR4_FSE | I2S_TCR4_FSP;
    I2S1_TCR5 = I2S_TCR5_WNW((32-1)) | I2S_TCR5_W0W((32-1)) | I2S_TCR5_FBT((32-1));

    I2S1_RMR = 0; // RMR - looks like a register to IMXRT_SAI1.RMR
    //I2S1_RCSR = (1<<25); //Reset
    I2S1_RCR1 = I2S_RCR1_RFW(1); // Receive FIFO watermark p2021
    I2S1_RCR2 = I2S_RCR2_SYNC(rsync) | I2S_RCR2_BCP 
                // 0=async 1=sync with transmitter
                // BitClockPolarity (active on low), p2022
            | (I2S_RCR2_BCD | I2S_RCR2_DIV((1)) | I2S_RCR2_MSEL(1));
                //Bit Clock Direction (generated internally by master mode)
                // Bit clock divide by (DIV+1)*2 (Divides down the audio master clock to generate the bit clock when configured for an internal bit clock. The division value is (N + 1) * 2.)
                // MCLK select, 0=bus clock, 1=I2S0_MCLK, p2024
    I2S1_RCR3 = I2S_RCR3_RCE; // Receive Channel Enable, p2024
    I2S1_RCR4 = I2S_RCR4_FRSZ((2-1)) | I2S_RCR4_SYWD((32-1)) | I2S_RCR4_MF
                // Frame Size (Configures the number of words in each frame. The value written must be one less than the number of words in the frame. For example, write 0 for one word per frame. The maximum supported frame size is 32 words.)
                // Sync Width - Configures the length of the frame sync in number of bit clocks. The value written must be one less than the number of bit clocks. For example, write 0 for the frame sync to assert for one bit clock only. The sync width cannot be configured longer than the first word of the frame. 
                // MSB First - Configures whether the LSB or the MSB is received first. 0b - LSB is received first. 1b - MSB is received first.    
            | I2S_RCR4_FSE | I2S_RCR4_FSP | I2S_RCR4_FSD;
                // Frame Sync Early Frame Sync Early (0b - Frame sync asserts with the first bit of the frame, 1b - Frame sync asserts one bit before the first bit of the frame.)
                // Frame Sync Polarity - Configures the polarity of the frame sync. (0b - Frame sync is active high, 1b - Frame sync is active low.)
                // Frame Sync Direction - Configures the direction of the frame sync. (0b - Frame Sync is generated externally in Slave mode, 1b - Frame Sync is generated internally in Master mode)
    I2S1_RCR5 = I2S_RCR5_WNW((32-1)) | I2S_RCR5_W0W((32-1)) | I2S_RCR5_FBT((32-1));
                // Word N width - Configures the number of bits in each word, for each word except the first in the frame. The value written must be one less than the number of bits per word. Word width of less than 8 bits is not supported.
                // Word 0 width - Configures the number of bits in the first word in each frame. The value written must be one less than the number of bits in the first word. Word width of less than 8 bits is not supported if there is only one word per frame.
                // First Bit Shifted - Configures the bit index for the first bit received for each word in the frame. If configured for MSB First, the index of the next bit received is one less than the current bit received. If configured for LSB First, the index of the next bit received is one more than the current bit received. The value written must be greater than or equal to the word width when configured for MSB First. The value written must be less than or equal to 31-word width when configured for LSB First.
...and in AudioOutputSPDIF::config_SPDIF(void):
C++:
    int rsync = 0;
    int tsync = 1;
    // configure transmitter
    I2S1_TMR = 0;
    I2S1_TCR1 = I2S_TCR1_RFW(0);  // watermark
    I2S1_TCR2 = I2S_TCR2_SYNC(tsync) | I2S_TCR2_MSEL(1) | I2S_TCR2_BCD | I2S_TCR2_DIV(0);
    I2S1_TCR3 = I2S_TCR3_TCE;

    //4 Words per Frame 32 Bit Word-Length -> 128 Bit Frame-Length, MSB First:
    I2S1_TCR4 = I2S_TCR4_FRSZ(3) | I2S_TCR4_SYWD(0) | I2S_TCR4_MF | I2S_TCR4_FSP | I2S_TCR4_FSD;
    I2S1_TCR5 = I2S_TCR5_WNW(31) | I2S_TCR5_W0W(31) | I2S_TCR5_FBT(31);

    //I2S1_RCSR = 0;
    I2S1_RMR = 0;
    //I2S1_RCSR = (1<<25); //Reset
    I2S1_RCR1 = I2S_RCR1_RFW(0);
    I2S1_RCR2 = I2S_RCR2_SYNC(rsync) | I2S_TCR2_MSEL(1) | I2S_TCR2_BCD | I2S_TCR2_DIV(0);
    I2S1_RCR3 = I2S_RCR3_RCE;
    I2S1_RCR4 = I2S_TCR4_FRSZ(3) | I2S_TCR4_SYWD(0) | I2S_TCR4_MF | I2S_TCR4_FSP | I2S_TCR4_FSD;
    I2S1_RCR5 = I2S_TCR5_WNW(31) | I2S_TCR5_W0W(31) | I2S_TCR5_FBT(31);

So the solution to my problem is somewhere in this code... I've taken a bit of time writing the parameters from the manual to the code above... (I'll report back).
 
The solution was not in that code, but in the AudioInputI2S::begin() function:
C++:
    dma.TCD->SADDR = (void *)((uint32_t)&I2S1_RDR0 + 0); // take the full 32 bit (not just upper half)
    //dma.TCD->SADDR = (void *)((uint32_t)&I2S1_RDR0 + 2); // Original Setting
...I am new to programming micros, and I made the mistake of thinking that the data coming in from the I2S (MSB first) would be stored in the same order in the RDR0 register. It is not: the register, being little-endian, stores the most significant bytes last (at the higher addresses), so the original +2 offset was picking up only the upper 16 bits of each word. That distinction kept this newbie tearing his hair out.
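For other newcomers, this little sketch models only the byte layout, not the DMA setup. The register is represented as four bytes in little-endian order, and `read16_le` / `read32_le` are made-up helper names. Reading 16 bits at byte offset 2 yields the upper half of the word, and reading all four bytes from offset 0 yields the whole thing.

```cpp
#include <cstdint>

// Read a 16-bit value starting at byte `offset` of a 4-byte little-endian
// register image (reg_bytes[0] is the least significant byte).
static uint16_t read16_le(const uint8_t reg_bytes[4], int offset) {
    return static_cast<uint16_t>(reg_bytes[offset] |
                                 (reg_bytes[offset + 1] << 8));
}

// Read all four bytes as one 32-bit value.
static uint32_t read32_le(const uint8_t reg_bytes[4]) {
    return static_cast<uint32_t>(reg_bytes[0]) |
           (static_cast<uint32_t>(reg_bytes[1]) << 8) |
           (static_cast<uint32_t>(reg_bytes[2]) << 16) |
           (static_cast<uint32_t>(reg_bytes[3]) << 24);
}
```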

To anyone playing around with DMA, I recommend you visit this thread: https://forum.pjrc.com/index.php?threads/teensy-4-1-how-to-start-using-dma.63353/

But I finally made the module to add gain to I2S. I have written versions for Quad, Hex, and Oct I2S inputs too:

I2S Gain Module
The attached files contain classes that replace the AudioInputI2S, AudioInputI2SQuad, AudioInputI2SHex and AudioInputI2SOct classes, and can be used interchangeably on Teensy 4.X.

PURPOSE:
This class is a replacement for AudioInputI2S that adds a fixed digital gain to I2S inputs.

Teensy Audio Library currently only uses 16-bit samples in its chain and through its I2S inputs. Therefore with a 24-bit or higher bit resolution input, the extra bits are simply discarded. The AudioInputI2SGain class takes in up to 32-bit samples as an input from I2S, adds gain, then converts to the 16-bit audio stream, thus preserving a greater resolution from the input signal.

The gain control is fixed at compile time using an integer value in INPUT_I2S_GAIN_AMOUNT. The gain is applied in powers of 2 - this circumvents rounding issues, and keeps the processing load minimal.

The code has only been tested on Teensy 4.X. Tested with ICS43434 I2S microphones, which output 24-bit depth samples.

USAGE:
1. Create your Teensy Audio chain as normal using the Teensy Audio Design Tool, remembering to add the I2S input modules.
2. In your Sketch, add the following lines to the top of the Sketch file:
C++:
#include "input_i2s_gain.h"
3. In your Sketch, replace the instances of AudioInputI2S class with AudioInputI2SGain.
4. Copy the input_i2s_gain.h and input_i2s_gain.cpp files to your sketch folder.
5. In the input_i2s_gain.h file, set the desired gain by editing this line:
C++:
#define INPUT_I2S_GAIN_AMOUNT X //...replacing X with an integer between 0 to 16.
The input signal will have gain applied at a magnitude of 2^X. For example, if X=0, no gain is applied.
If X=1, the signal is amplified by 2x, if X=2, the signal is amplified by 4x, and so on up to X=16

=== REQUEST FOR REVIEW ===
As I said earlier, I am new to micro controller programming, and I wish to have a particular part of my code reviewed. The part that adds gain to a 32 bit number then converts to a 16 bit integer is in this static member function:
C++:
#define GAIN_AMOUNT 3
int16_t AudioInputI2SGain::gain(const int32_t* sample) {
    int16_t out_sample = (int16_t)((*sample)/(pow(2,16-GAIN_AMOUNT))); // power scales between 16 (no gain) and 8 (max gain), for a 24 bit input
    return out_sample;
}
This is a safe way to add gain. The pow(2,16-GAIN_AMOUNT) should be calculated at compile-time. The division operation scales the 32 bit number ready for conversion to a 16-bit integer.

I am concerned though that this division operation is too computationally expensive. I have been looking at bit shifting operations to perform the division, but I really can't get my head around it for twos complement system. I'd be grateful if anyone can suggest a more performant bit of code here.
 

Attachments

  • input_i2s_gain.zip
Code:
  int16_t out_sample = (int16_t) ((*sample) >> (16 - GAIN_AMOUNT));

If GAIN_AMOUNT is constant declare the method inline for extra performance, although the compiler probably figures this out already.
 
I feel like there should be some limit checking in there rather than a direct cast (truncation) to 16 bits.
Also note that for negative values, shifting will round down but a division will round towards zero. E.g. -1 >> 1 = -1, but -1 / pow(2,1) = 0.
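The difference is easy to see in isolation. This is a minimal sketch with made-up function names, and it assumes `>>` on a negative value is an arithmetic shift (as it is with GCC on ARM, and by definition from C++20).

```cpp
#include <cstdint>

// Halve by arithmetic right shift: rounds toward negative infinity.
static int32_t halve_by_shift(int32_t v) { return v >> 1; }

// Halve by integer division: rounds toward zero.
static int32_t halve_by_division(int32_t v) { return v / 2; }
```

For positive inputs the two agree; they differ only for odd negative inputs.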
 
This works! Thank you, that saved a lot of bother. I have the method declared as a static method, so I believe the compiler takes care of the rest as with inline.

On limit checking: I think you are right. With just bit shifting for division, when clipping occurs, the sample value isn't just clipped at its extreme min/max value, but wraps around to a different integer altogether, with results like in the waveform below:
[Image: Screenshot 2025-09-29 at 20.37.51.png]

Pretty horrible. I thought maybe I could live with it, but it feels wrong to leave it like this. So checking limits I think is important. I'll try and do some more learning about how to do this performantly.
On shifting rounding negative values down where division rounds towards zero: I think in the context of audio samples, this is ideal. Consider two integer values: -3 and +3. The difference between them is +6. If you were to halve these integers by shifting:
-3/2 = -1.5 => round down to -2
+3/2 = +1.5 => round down to +1
The difference in value between the divided samples is +1 - (-2) = +3. Before the division, the original difference was +6. Divide the original difference between samples: 6/2 = +3, which is what you want.

Given that samples' relative values to each other give audio its quality, I happily don't have to worry about this quirk about signed integers (I have enough to worry about elsewhere!).
 
OK, so how about:
C++:
int16_t out_sample = (int16_t) signed_saturate_rshift(*sample, 16, 16 - GAIN_AMOUNT);
This is provided in Audio/utility/dspinst.h, so it is available to any audio object; if you look at the KINETISL version for cores without the DSP instructions, you can see how it works internally. Takes 1 CPU cycle.

If I were coding this, I would not make the gain fixed at compile time, would probably make the gain per-channel, and have a separate flow for GAIN_AMOUNT==0 for all channels, for a super-efficient default case.
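For anyone without the header to hand, the behaviour can be sketched in portable C++: shift right, then clamp to the range of a signed `bits`-bit integer. This is a stand-in written for illustration, not the library's implementation, and both function names are made up. The plain shift-and-cast version is included for comparison.

```cpp
#include <cstdint>

// Plain shift and cast: out-of-range results wrap around.
static int16_t shift_and_truncate(int32_t val, int rshift) {
    return static_cast<int16_t>(val >> rshift);
}

// Portable stand-in for a saturating shift: shift right, then clamp the
// result to the range of a signed `bits`-bit integer.
static int32_t saturate_rshift_ref(int32_t val, int bits, int rshift) {
    const int32_t max = (int32_t{1} << (bits - 1)) - 1;
    const int32_t min = -max - 1;
    const int32_t out = val >> rshift;
    if (out > max) return max;
    if (out < min) return min;
    return out;
}
```

A value that shifts down to 40000 wraps to -25536 with the plain cast, but is held at 32767 by the saturating version.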
 
You can cherry-pick examples that will round either way. Samples of -3 and -4 shifted by 1 will end up with a difference of 0, whereas -2 and -3 will end up with a difference of 1.
The problem with shifting is that a negative value, no matter how small, can never be reduced to zero. Consider the case where you have "original" samples of 0 and -1 with 0 gain (shift of 16):
0 >> 16 = 0
-1 >> 16 = -1
This is effectively saying that -1/65536 is equal to -1, which is a very significant approximation.
 
signed_saturate_rshift() has done the trick! Very happy to arrive at such a performant solution. Waveforms are now being clipped normally:
[Image: Screenshot 2025-09-30 at 09.48.54.png]

Thank you to everyone for sharing their experience!
Dynamic, per-channel gain control is on my mind. I can see that it is not too difficult or particularly cumbersome on the CPU. This weekend perhaps, I may get on to fitting these methods into the classes:
C++:
AudioInputI2SGain::gain(int16_t amount); // gain for all channels
AudioInputI2SGain::gain(int16_t amount, int16_t channel); // gain per channel
 
Great news. How about
C++:
// gain per channel, or all channels if -1
// result true if amount and channel are valid
bool AudioInputI2SGain::gain(int32_t amount, int16_t channel=-1);
amount must be a power of 2 (easy to check, and more intuitive in use); channel must be in range. Otherwise there's no change, and the method returns false.

An easy check for amount being a power of two is:
Code:
if (amount == (~(amount-1) & amount))
    ...

and to get the correct bit shift, you use:
Code:
shift = __builtin_clz(amount) - 15; // count of leading zeroes, less 15

You could also cater for a gain of 0 muting the relevant channel(s).
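Putting the two pieces together, a sketch might look like the following. `gain_to_shift` is a made-up name, the range limit of 1 to 65536 is my assumption, and the power-of-two test uses the more common `amount & (amount - 1)` form, which accepts the same values. `__builtin_clz` is a GCC/Clang builtin and is undefined for 0, so 0 is rejected first; a mute case would need separate handling.

```cpp
#include <cstdint>

// Map a linear gain (1, 2, 4, ... 65536) to the right-shift used when
// reducing a 32-bit sample to 16 bits. Returns -1 if `amount` is not a
// power of two in that range. Relies on the GCC/Clang builtin __builtin_clz.
static int gain_to_shift(int32_t amount) {
    if (amount < 1 || amount > 65536) return -1;   // out of range (also rejects 0)
    if ((amount & (amount - 1)) != 0) return -1;   // more than one bit set
    return __builtin_clz(static_cast<uint32_t>(amount)) - 15;
}
```

A gain of 1 gives a shift of 16 (plain truncation), and a gain of 256 gives a shift of 8.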
 
So I have been pondering how to make a variable gain for each channel. Using the signed_saturate_rshift() function is actually very difficult to do, as it expects the rshift (the gain) parameter to be a compile-time constant.

Given the limited number of rshift values, I could make a runtime dispatch with a template:
C++:
// Wrapper function for runtime dispatch
static inline int32_t signed_saturate_rshift(int32_t val, int bits, int rshift) {
    switch (rshift) {
        case 0: return signed_saturate_rshift_template<0>(val, bits);
        case 1: return signed_saturate_rshift_template<1>(val, bits);
        case 2: return signed_saturate_rshift_template<2>(val, bits);
        case 3: return signed_saturate_rshift_template<3>(val, bits);
        case 4: return signed_saturate_rshift_template<4>(val, bits);
        case 5: return signed_saturate_rshift_template<5>(val, bits);
        case 6: return signed_saturate_rshift_template<6>(val, bits);
        case 7: return signed_saturate_rshift_template<7>(val, bits);
        case 8: return signed_saturate_rshift_template<8>(val, bits);
        case 9: return signed_saturate_rshift_template<9>(val, bits);
        case 10: return signed_saturate_rshift_template<10>(val, bits);
        case 11: return signed_saturate_rshift_template<11>(val, bits);
        case 12: return signed_saturate_rshift_template<12>(val, bits);
        case 13: return signed_saturate_rshift_template<13>(val, bits);
        case 14: return signed_saturate_rshift_template<14>(val, bits);
        case 15: return signed_saturate_rshift_template<15>(val, bits);
        case 16: return signed_saturate_rshift_template<16>(val, bits);
        default:
            // Handle invalid rshift values
            __builtin_unreachable(); // Compiler hint: this should never happen
    }
}
...but this would mean a switch statement for each sample converted, which is not great.

I really need to get out of the ::isr() static method for this, and do the work in the update() method. This means working with blocks of samples passed from ::isr() to ::update(). And I can see that the library is very... "C", and not very "C++" when it comes to this area - it throws around pointers between functions, which must be very organised in its own way, but leaves this C++ programmer feeling vulnerable.

I think I got my head around audio_block_t and how these are used by AudioStream class. The problem is that (I think) I need 32-bit versions of audio_block_t, to pass data between ::isr() and ::update() methods. I would like to vaguely follow the format:
C++:
// Somewhere in the AudioInputI2S::update() method:
int32_t * block_32_left = new int32_t[AUDIO_BLOCK_SAMPLES]; // block to fill in ::isr()
int32_t * block_32_right = new int32_t[AUDIO_BLOCK_SAMPLES]; // block to fill in ::isr()
// Pass to DMA, apply gain and copy to regular 16-bit audio_block_t for transmission
delete[] block_32_left; // delete the allocation once data is copied to regular blocks
delete[] block_32_right;
Is this fine? Is there anything I need to know about the relationship between the ::isr() and update() functions that I need to be aware of? I ask this as the AudioStream::allocate() and AudioStream::release() methods do more than just allocate memory, but track memory usage.
 
Firstly I would say, whatever you do, do not use new / delete / malloc / free in an ISR. They are not designed to be thread-safe, as far as I'm aware...

That's annoying about signed_saturate_rshift() only using a constant shift value. However, it doesn't invalidate your approach: you just need to have your switch / case execute a single channel de-interleave on a per-case basis, rather than a per-sample one. It'll be slightly less efficient than the existing full buffer de-interleave into N audio blocks, but it's probably a good compromise between speed and code size. If you give the de-interleave function a stride parameter, you can use it for all channel counts: stereo has a stride length of 2 int32_t values, quad is 4, and so on, corresponding with how the raw values are located in the DMA buffer.

Personally I'd leave the overall processing as-is, i.e. do the de-interleave in the DMA ISR, rather than push it into update().
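A rough sketch of that structure, in portable C++ so it can be tried off-target: the shift is a template parameter, and the switch runs once per channel per block instead of once per sample. All names here are made up, a simple clamp stands in for signed_saturate_rshift(), and only shifts 8 to 16 are covered on the assumption of a 24-bit source.

```cpp
#include <cstdint>
#include <cstddef>

// Clamp a 32-bit value to the int16_t range.
static inline int16_t clamp16(int32_t v) {
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return static_cast<int16_t>(v);
}

// Copy one channel out of an interleaved buffer with a compile-time shift.
// `src` points at the channel's first sample; `stride` is the channel count.
template <int RSHIFT>
static void deinterleave_channel(const int32_t* src, int16_t* dest,
                                 size_t frames, size_t stride) {
    for (size_t i = 0; i < frames; ++i) {
        dest[i] = clamp16(src[i * stride] >> RSHIFT);
    }
}

// Runtime dispatch, executed once per channel per block.
// Returns false (and writes nothing) for an unsupported shift.
static bool deinterleave_with_shift(const int32_t* src, int16_t* dest,
                                    size_t frames, size_t stride, int rshift) {
    switch (rshift) {
        case 8:  deinterleave_channel<8>(src, dest, frames, stride);  return true;
        case 9:  deinterleave_channel<9>(src, dest, frames, stride);  return true;
        case 10: deinterleave_channel<10>(src, dest, frames, stride); return true;
        case 11: deinterleave_channel<11>(src, dest, frames, stride); return true;
        case 12: deinterleave_channel<12>(src, dest, frames, stride); return true;
        case 13: deinterleave_channel<13>(src, dest, frames, stride); return true;
        case 14: deinterleave_channel<14>(src, dest, frames, stride); return true;
        case 15: deinterleave_channel<15>(src, dest, frames, stride); return true;
        case 16: deinterleave_channel<16>(src, dest, frames, stride); return true;
        default: return false;
    }
}
```

For the right channel of a stereo buffer you would pass `buffer + 1` with a stride of 2; for quad, hex and oct the stride becomes 4, 6 and 8.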
 
Is there a significant improvement making it run-time changeable?

Are the uppermost bits from the 24bit input signal even used?
Also most of the time you have an idea of the maximum noise level and adjust up front. I don't think lots of applications operate at 120 dBA and need 16-bit SNR.
 
It's nicer to have run-time changeable gain, IMO. I am thinking about making an autogain extension on the back of this another day. I have found a way to achieve this without hurting performance (I will share shortly).

So the default Teensy library (AudioInputI2S) would throw away the least significant 8 bits of a 24bit signal. This extension/AudioObject tries to recover some of those bits in the case that the signal is quiet by using this digital "bit gain" but still outputs to the 16bit audio stream that the rest of Teensy uses.

In my use case, I have an ICS43434 I2S mic, which allegedly outputs a 24-bit signal depth (there are other I2S mics that send fewer than 24 bits, but that's another story). My mic doesn't have an independent, adjustable analog gain. When recording quieter sounds without this extension, I essentially can't use the most significant 2-3 bits (effectively I can only get 13-14 bits of depth), so this extension allows me to record sounds leveraging the full 16-bit depth of the Teensy AudioStream. Also, I am using these to make sensitive measurements: it isn't about sound quality in my case.
 
Thank you for your suggestion - MUCH easier not having to deal with allocations. I've followed your advice and have made the changes. We now have dynamic gain setting on a per-audio-block basis. Processing is done in the ISR function. I managed to reduce it down to one switch/case call per audio block, per channel. If the signed_saturate_rshift() function is as low-overhead as you say, this extension should be very efficient indeed.

Hopefully job done! Let me know if you have any other suggestions.

TITLE: AudioInputI2SGain v0.2

PURPOSE

This class is a replacement for AudioInputI2S that adds a digital gain to I2S inputs. The gain is bit-shift based, so is fast and efficient.

Teensy Audio Library currently only uses 16-bit samples in its chain and through its I2S inputs. Therefore with an input with 17bits or higher sample resolution, the extra bits are simply discarded. The AudioInputI2SGain class takes in up to 32-bit samples as an input from I2S, adds gain, then converts to the 16-bit audio stream, thus preserving a greater resolution of the input signal.

The gain is controlled using the method AudioInputI2SGain::set_gain(). The gain is a power of 2.

The code has only been tested on Teensy 4.X.

USAGE:
1. Create your Teensy Audio chain as normal using the Teensy Audio Design Tool, remembering to add the I2S input modules.
2. In your Sketch, add the following lines to the top of the Sketch file:
Code:
#include "input_i2s_gain.h"
3. In your Sketch, replace the instances of AudioInputI2S class with AudioInputI2SGain.
4. Copy the input_i2s_gain.h and input_i2s_gain.cpp files to your sketch folder.
5. In your sketch file, under setup(), or loop() functions, you can set the gain using:
C++:
// "i2s1" is a placeholder for the name of your AudioInputI2SGain instance
i2s1.set_gain(X); // sets the gain of both channels to 2^X
//...or....
i2s1.set_gain(X, 0); // sets the gain of channel 0 (left) to 2^X

The input signal will have gain applied at a magnitude of 2^X. For example, if X=0, no gain is applied. If X=1, the signal is amplified by 2x, if X=2, the signal is amplified by 4x, and so on up to X=16.
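For reference, each step of X doubles the amplitude, which is roughly 6.02 dB. The small helper below is an illustrative sketch (the function names are not part of the library) that converts X to the linear multiplier and to decibels.

```cpp
#include <cmath>
#include <cstdint>

// Linear multiplier for a bit gain of x (x = 0 -> 1x, x = 1 -> 2x, ...).
inline uint32_t shift_to_multiplier(unsigned x)
{
    return 1u << x;
}

// Same gain expressed in dB: 20 * log10(2^x), about 6.02 dB per step.
inline double shift_to_db(unsigned x)
{
    return 20.0 * std::log10(2.0) * static_cast<double>(x);
}
```

So X=4 corresponds to 16x, or about 24 dB, and the maximum X=16 to 65536x, or about 96 dB.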

An example Sketch.ino file is included with options on implementation.
 

Attachments

  • input_i2s_gain.zip
    30.1 KB · Views: 17
Nice that it can be made with almost no overhead.

After looking at the code I have some questions/remarks, which apply generally to embedded libraries and beyond:

a) If an invalid gain is set, there's a print statement. On an embedded target that isn't always useful; is it the way to go?
b) There's no member to test whether a gain is valid, which could be useful.
c) There's no member to get the current gain. This is useful for changing it "on the fly", i.e. increasing or decreasing the gain.
 
on a) the print statement I put there to warn the user of an error in the input. I am new to embedded systems, and I haven't been able to find "a proper way to do error reporting", if such a thing exists. So I put that print statement in so at least there might be some feedback when the engineer is testing their application. Really more as a development aid than anything else. I'd be interested to hear how others deal with reporting errors.

on b) I suppose I've rolled this into the print statement as in (a). In the context of (c), this makes sense to implement.

on c) yes, I can see how this will be useful. All it requires is an accessor. I'll do this soon.

...and thank you for the feedback!
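One possible shape for points (a) to (c), as a sketch only: the setter reports failure through its return value instead of printing, a static check lets the caller test a value up front, and an accessor returns the current setting. The class and method names here are hypothetical and are not the ones in the attached files.

```cpp
#include <cstdint>

// Hypothetical holder for one channel's bit gain; illustrative names only.
class GainState {
public:
    static constexpr int kMaxShift = 16;

    // (b) test a value without applying it
    static bool is_valid_shift(int shift)
    {
        return shift >= 0 && shift <= kMaxShift;
    }

    // (a) report errors via the return value; the gain is unchanged on failure
    bool set_shift(int shift)
    {
        if (!is_valid_shift(shift)) return false;
        shift_ = static_cast<uint8_t>(shift);
        return true;
    }

    // (c) read back the current gain, e.g. to step it up or down
    uint8_t shift() const { return shift_; }

private:
    uint8_t shift_ = 0;
};
```

If the value is read from an ISR, as in this thread, the member would probably also need to be `volatile` or otherwise protected; that is left out here for brevity.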
 
It's (almost) never a good idea to put printf statements in library code, as it makes a lot of assumptions about the environment your final application will be running in. It's obviously fine while you're debugging, as long as you remember to remove them at the end.

I sort of implied in post #14 above that bool AudioInputI2SGain::gain(int32_t amount, int16_t channel=-1) should return a flag; typically this would be true to indicate success, and it's up to the programmer to check the result and take the appropriate action. Of course, they shouldn't really make an error in the first place, but sometimes it's easier to ask forgiveness than permission:
C++:
void change_gain(bool up)
{
    static int32_t current_gain = 0;
    int32_t new_gain = current_gain;
    
    // compute new gain value - power of two
    if (up)
    {
        new_gain *= 2;
        if (0 == new_gain) // up from muted?
            new_gain = 1;  // yes, set low gain
    }
    else
        new_gain /= 2; // could go to 0, that's OK, mutes
        
    if (inputI2S.gain(new_gain))
        current_gain = new_gain;   
}

I do think you should take the trouble to convert the requested gain from a power-of-two value parameter to an internally-stored bit shift, for consistency with (most of) the rest of the audio library. AudioSynthWavetable is an example of inconsistency here - its volume() method uses 0.0 to 1.0 values, but playFrequency() and playNote() use 0-127 (i.e. MIDI values).
 
I've made the changes in the post above for just the stereo-input case, in the attached.
Which existing library object do you suggest I try to emulate? I have looked at AudioMixer, which converts a float to an integer multiplier (negative for divide, I assume). Would you suggest that I store gain as mult and use applyGain function like in mixer.cpp?:
C++:
static void applyGain(int16_t *data, int32_t mult)
{
    uint32_t *p = (uint32_t *)data;
    const uint32_t *end = (uint32_t *)(data + AUDIO_BLOCK_SAMPLES);

    do {
        uint32_t tmp32 = *p; // read 2 samples from *data
        int32_t val1 = signed_multiply_32x16b(mult, tmp32);
        int32_t val2 = signed_multiply_32x16t(mult, tmp32);
        val1 = signed_saturate_rshift(val1, 16, 0);
        val2 = signed_saturate_rshift(val2, 16, 0);
        *p++ = pack_16b_16b(val2, val1);
    } while (p < end);
}

... I guess that would be an additional CPU cycle per sample compared with my current implementation, but would work without losing resolution.
 

Attachments

  • input_i2s_small.zip
    8.7 KB · Views: 12
You don't need a shift and a multiply, but I do recommend you convert the power-of-two gain value (1, 2, 4 ... 65536) to the appropriate bit shift. You then store the shift value, rather than a mult value as the mixer does.

Once again, refer to post #14 which has some code fragments, noting that a gain of 65536 corresponds to a right-shift of 0 bits, 32768 to 1 bit ... 1 shifts 16 bits, followed by using the LSint16 of the resulting value.
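A minimal sketch of that mapping, assuming the convention stated above (65536 gives a right-shift of 0, 1 gives a right-shift of 16). The function name is hypothetical, and a gain of 0 (mute) is simply rejected here to keep the example short.

```cpp
#include <cstdint>

// Convert a power-of-two gain (1, 2, 4 ... 65536) to a right-shift amount.
// Returns false, leaving `rshift` untouched, if `gain` is out of range or
// not a power of two.
inline bool gain_to_rshift(int32_t gain, uint8_t &rshift)
{
    if (gain < 1 || gain > 65536) return false;
    if ((gain & (gain - 1)) != 0) return false;    // more than one bit set

    uint8_t exponent = 0;                          // log2(gain)
    while ((gain >> exponent) != 1) ++exponent;

    rshift = static_cast<uint8_t>(16 - exponent);  // 65536 -> 0, 1 -> 16
    return true;
}
```

The returned flag can then be passed straight back from the gain setter, so the caller decides how to handle an invalid request.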

For the AudioMixer4, the conversion is from a float to a 32-bit fixed point fraction, where 65536 is a gain of 1.0, 32768 is 0.5, 131072 is 2.0 and so on. A negative value actually gives a negative gain, i.e. inverts the signal, which can be useful.
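To illustrate that fixed-point convention, here is a plain C++ sketch of converting a float gain to a 16.16 multiplier and applying it to one sample. It is a portable approximation and not the mixer's actual code (which uses the DSP intrinsics quoted earlier); the function names and the clamp limit of ±32767.0 are assumptions made for this example.

```cpp
#include <cstdint>

// Float gain -> 32-bit fixed point, where 65536 represents 1.0.
// The clamp keeps the product inside int32_t range (assumed limit).
inline int32_t float_gain_to_fixed(float gain)
{
    if (gain > 32767.0f) gain = 32767.0f;
    else if (gain < -32767.0f) gain = -32767.0f;
    return static_cast<int32_t>(gain * 65536.0f);
}

// Apply the multiplier to one 16-bit sample, saturating the result.
inline int16_t apply_fixed_gain(int16_t sample, int32_t mult)
{
    int64_t v = (static_cast<int64_t>(sample) * mult) >> 16;
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return static_cast<int16_t>(v);
}
```

A negative multiplier inverts the signal, matching the behaviour described above.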
 
Thank you for the advice, as ever. I have implemented most of what you have suggested now. The old behaviour of accepting bit shift values has been moved to the set_gain_shift() method.

The standard behaviour of set_gain() is to accept a gain multiplier. If it is not a power of 2, the method now returns false. There is no method to increment/decrement the gain currently, but those with coding experience can write a simple function of their own to do this.

I've also made a Base Class and rearranged class methods to make the code more maintainable.

Attached is the latest version of the plugin. I hope it serves others as it will serve me!

C++:
/**
 * TITLE: AudioInputI2SGain v0.3
 * AUTHOR: TinKanMan 13/10/2025
 * Modified from AudioInputI2S to add gain.  License as above
 *
 * PURPOSE: This class is a replacement for AudioInputI2S that adds a digital gain to
 * I2S inputs.  The gain is bit-shift based, so is fast and efficient.
 *
 * Teensy Audio Library currently only uses 16-bit samples in its chain and through its I2S
 * inputs.  Therefore with an input with 17bits or higher sample resolution, the extra bits are simply
 * discarded.  The AudioInputI2SGain class takes in up to 32-bit samples as an input from I2S,
 * adds gain, then converts to the 16-bit audio stream, thus preserving a greater resolution
 * of the input signal.
 *
 * The gain is controlled using the function set_gain().  The gain is a power of 2.
 *
 * The code has only been tested on Teensy 4.X.
 *
 * USAGE: 1. Create your Teensy Audio chain as normal using the Teensy Audio Design Tool,
 * remembering to add the I2S input modules.
 * 2. In your Sketch, add the following lines to the top of the Sketch file:
 *    #include "input_i2s_gain.h"
 * 3. In your Sketch, replace the instances of AudioInputI2S class with AudioInputI2SGain.
 * 4. Copy the input_i2s_gain.h and  input_i2s_gain.cpp files to your sketch folder.
 * 5. In your sketch file, under setup(), or loop() functions, you can set the gain using:
 *  AudioInputI2SGain.set_gain(MULT); // sets the gain of all channels to MULT - an integer multiple (must be a power of two)
 * ...or....
 *  AudioInputI2SGain.set_gain(MULT, 1); // sets the gain of channel 1 (left) to MULT - an integer multiple (must be a power of two)
 * ...or....
 *  AudioInputI2SGain.set_gain_shift(POW); // sets the gain of both channels to 2^POW
 * ...or....
 *  AudioInputI2SGain.set_gain_shift(POW, 1); // sets the gain of channel 1 (left) to 2^POW
 *
 * The input signal will have gain applied at a magnitude of 2^POW.  For example, if POW=0, no gain is applied.
 * If POW=1, the signal is amplified by 2x, if POW=2, the signal is amplified by 4x, and so on up to POW=16
 */
 

Attachments

  • input_i2s_gain_v0.3.zip
    30.1 KB · Views: 17