I just tried to implement an FIR low-pass filter, but I'm horrible at software, so I failed miserably.
I got *something* to work, but when I tried more than 30 taps the Teensy began to drop samples. It was a mess. And with fewer than 30 taps the filter was useless. So I'll leave FIR to the professionals.
So then I read this paper:
http://dspguru.com/sites/dspguru/files/cic.pdf
I thought I couldn't mess that up too badly, so I implemented a 3-stage CIC filter (three comb and three integrator stages). And to my surprise, after some fiddling, it works! Have a look at this:
Channel 4 (red): linear interpolation, implemented by Frank
Channel 1 (yellow): 3 stage CIC filter, pretty much exactly what you find in the paper I linked.
Channel 3 (blue): the yellow signal after the same RC low-pass used in the other measurements
This is a 9kHz signal, I chose it because the difference is best visible in this frequency range.
For comparison, here is an FFT of the same 5kHz sine I used to compare OS/no OS and filter/no filter in post #52. The screenshots are directly comparable (same scope settings for all of them), but this time I used CIC filtering plus the RC low-pass.
Compare this to the last FFT picture in #52: it's even better!
Next I'll try to expand the CIC. Up to 8 stages should be possible with 32-bit integers without overflow (CIC filters have gain, so a higher number of stages needs a lot of headroom).
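To put a number on the headroom, here is a minimal sketch of the usual gain estimate for a CIC interpolator. It assumes a differential delay of 1 (as in the code below) and 16-bit input samples; the function names are made up for illustration and are not part of the Audio lib.

Code:
#include <cassert>
#include <cstdint>

// Worst-case (DC) gain of an N-stage CIC interpolator with rate change R
// and differential delay 1: R^N / R = R^(N-1).
inline uint64_t cicInterpGain(unsigned stages, unsigned R) {
    assert(stages >= 1 && R >= 1);
    uint64_t g = 1;
    for (unsigned i = 1; i < stages; ++i) g *= R;
    return g;
}

// Smallest b such that 2^b >= x.
inline unsigned ceilLog2(uint64_t x) {
    unsigned b = 0;
    while ((uint64_t{1} << b) < x) ++b;
    return b;
}

// Word width needed at the filter output for the given input width.
inline unsigned cicInterpWordBits(unsigned inputBits, unsigned stages, unsigned R) {
    return inputBits + ceilLog2(cicInterpGain(stages, R));
}

For 3 stages and R = 4 this gives a gain of 16 (4 bits of growth), and for 8 stages 14 bits of growth, so 30 bits in total for 16-bit samples.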
Here's the modified part in Frank's output_pt8211.cpp, this goes inside if (blockL && blockR) {...}
Code:
//memcpy_tointerleaveLR(dest, blockL->data + offsetL, blockR->data + offsetR);
for (int i = 0; i < AUDIO_BLOCK_SAMPLES / 2; i++, offsetL++, offsetR++) {
    int32_t valL = blockL->data[offsetL];
    int32_t valR = blockR->data[offsetR];
    // int32_t nL = (oldL+valL) >> 1;
    int32_t nR = (oldR+valR) >> 1;

    // Left channel: three comb stages, run once per input sample.
    // oldL serves as the delay element of the first comb.
    int32_t comb[3] = {0};
    static int32_t combOld[2] = {0};
    comb[0] = valL - oldL;
    comb[1] = comb[0] - combOld[0];
    comb[2] = comb[1] - combOld[1];
    // comb[2] now holds the comb output that feeds the integrators
    combOld[0] = comb[0];
    combOld[1] = comb[1];

    // Left channel: three integrator stages, run four times per input sample
    for (int j = 0; j < 4; j++) {
        int32_t integrate[3];
        static int32_t integrateOld[3] = {0};
        // zero-stuffing: only the first of the four inputs is non-zero
        integrate[0] = ( (j==0) ? (comb[2]) : (0) ) + integrateOld[0];
        integrate[1] = integrate[0] + integrateOld[1];
        integrate[2] = integrate[1] + integrateOld[2];
        // integrate[2] now holds j'th upsampled value
        *(dest+j*2) = integrate[2] >> 4;   // remove the filter gain of 16
        integrateOld[0] = integrate[0];
        integrateOld[1] = integrate[1];
        integrateOld[2] = integrate[2];
    }

    // Right channel: still Frank's linear interpolation
    // *(dest+0) = (oldL+nL) >> 1;
    *(dest+1) = (oldR+nR) >> 1;
    // *(dest+2) = nL;
    *(dest+3) = nR;
    // *(dest+4) = (nL+valL) >> 1;
    *(dest+5) = (nR+valR) >> 1;
    // *(dest+6) = valL;
    *(dest+7) = valR;
    dest+=8;
    oldL = valL;
    oldR = valR;
}
"Incoming" values go through the comb filters, and the comb output is then fed to the integrator filters. The next three inputs for the integrators are zero (zero-stuffing). This reduces the output to 1/4, but the filter gain is 64, so the effective filter gain is 64*(1/4)=16, which is accounted for by bit-shifting (>> 4) at the output.
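To check the gain of 16 without a scope, the left-channel filter can be pulled out into a standalone struct and fed a constant. This is only a sketch of the same structure for offline testing; the name Cic3Interp4 is made up, and it is not what runs in output_pt8211.cpp.

Code:
#include <cstdint>

// 3-stage CIC interpolator by 4, one channel, differential delay 1.
struct Cic3Interp4 {
    int32_t combDelay[3] = {0, 0, 0};  // previous input of each comb stage
    int32_t integ[3] = {0, 0, 0};      // running sums of the integrators

    // Consume one input sample, produce four output samples.
    void process(int16_t in, int16_t out[4]) {
        // Comb section, runs at the input rate: y[n] = x[n] - x[n-1]
        int32_t c = in;
        for (int s = 0; s < 3; ++s) {
            int32_t y = c - combDelay[s];
            combDelay[s] = c;
            c = y;
        }
        // Integrator section, runs at 4x the input rate with zero-stuffing
        for (int j = 0; j < 4; ++j) {
            int32_t x = (j == 0) ? c : 0;
            for (int s = 0; s < 3; ++s) {
                integ[s] += x;
                x = integ[s];
            }
            out[j] = static_cast<int16_t>(x >> 4);  // gain 64 * (1/4) = 16
        }
    }
};

With a constant input of 1000, the output ramps up over the first two input samples and sits at exactly 1000 from the third input sample on, which is what the >> 4 is supposed to achieve.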
Edit: It turns out filter orders >3 significantly reduce amplitude in the passband; they are not useful with a resampling factor as low as 4. I'll leave the filter as it is right now and will add a higher order analog output filter when I get the parts needed.
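The passband loss can be estimated from the textbook CIC magnitude response. The sketch below assumes a 44.1kHz input rate, a resampling factor of 4 and a differential delay of 1; the function names are made up for illustration, and the result is an ideal-filter estimate, not a measurement.

Code:
#include <cassert>
#include <cmath>

// Magnitude of an N-stage CIC interpolator, normalised to 1.0 at DC:
// | sin(pi*f*R/fsOut) / (R * sin(pi*f/fsOut)) | ^ N, with fsOut = R * fsIn.
inline double cicMagnitude(double freqHz, double inputRateHz, int stages, int R) {
    assert(stages > 0 && R > 0 && inputRateHz > 0.0);
    const double pi = 3.14159265358979323846;
    if (freqHz == 0.0) return 1.0;  // limit of the expression at DC
    const double outRate = inputRateHz * R;
    const double num = std::sin(pi * freqHz * R / outRate);
    const double den = R * std::sin(pi * freqHz / outRate);
    return std::pow(std::fabs(num / den), stages);
}

// Same value in dB (negative numbers mean attenuation).
inline double cicDroopDb(double freqHz, double inputRateHz, int stages, int R) {
    return 20.0 * std::log10(cicMagnitude(freqHz, inputRateHz, stages, R));
}

Under these assumptions, cicDroopDb(20000, 44100, 3, 4) comes out at about -9 dB and cicDroopDb(20000, 44100, 8, 4) at about -24 dB, which fits the observation that more than 3 stages costs too much amplitude at the top of the audio band.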
@Frank, do you have an idea how we could add options for the Frame-Sync-Early bit and for oversampling to the output_i2s object so it can eventually be merged to the Audio lib?