Design filter coefficients of FIR and Biquad

Status
Not open for further replies.
A

Decimating 10:1 can only be done AFTER you brick-wall filter. Else you will end up with aliasing mirror images of frequencies in the output signal, which will give you false readings.

Here brick wall filtering in the sense doing filtering by software filters right with steep cut off? Not from analog filters right?
 
"jwatte" Refer to below 1024 FFT in teensy
analyze_fft1024.cpp
Code:
/* Audio Library for Teensy 3.X
 * Copyright (c) 2014, Paul Stoffregen, paul@pjrc.com
 *
 * Development of this audio library was funded by PJRC.COM, LLC by sales of
 * Teensy and Audio Adaptor boards.  Please support PJRC's efforts to develop
 * open source software by purchasing Teensy or other PJRC products.
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 * copies of the Software, and to permit persons to whom the Software is
 * furnished to do so, subject to the following conditions:
 *
 * The above copyright notice, development funding notice, and this permission
 * notice shall be included in all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 * THE SOFTWARE.
 */

#include "analyze_fft1024.h"
#include "sqrt_integer.h"
#include "utility/dspinst.h"


// 140312 - PAH - slightly faster copy
static void copy_to_fft_buffer(void *destination, const void *source)
{
	const uint16_t *src = (const uint16_t *)source;
	uint32_t *dst = (uint32_t *)destination;

	for (int i=0; i < AUDIO_BLOCK_SAMPLES; i++) {
		*dst++ = *src++;  // real sample plus a zero for imaginary
	}
}

static void apply_window_to_fft_buffer(void *buffer, const void *window)
{
	int16_t *buf = (int16_t *)buffer;
	const int16_t *win = (int16_t *)window;;

	for (int i=0; i < 1024; i++) {
		int32_t val = *buf * *win++;
		//*buf = signed_saturate_rshift(val, 16, 15);
		*buf = val >> 15;
		buf += 2;
	}

}

void AudioAnalyzeFFT1024::update(void)
{
	audio_block_t *block;

	block = receiveReadOnly();
	if (!block) return;

#if defined(KINETISK)
	switch (state) {
	case 0:
		blocklist[0] = block;
		state = 1;
		break;
	case 1:
		blocklist[1] = block;
		state = 2;
		break;
	case 2:
		blocklist[2] = block;
		state = 3;
		break;
	case 3:
		blocklist[3] = block;
		state = 4;
		break;
	case 4:
		blocklist[4] = block;
		state = 5;
		break;
	case 5:
		blocklist[5] = block;
		state = 6;
		break;
	case 6:
		blocklist[6] = block;
		state = 7;
		break;
	case 7:
		blocklist[7] = block;
		// TODO: perhaps distribute the work over multiple update() ??
		//       github pull requsts welcome......
		copy_to_fft_buffer(buffer+0x000, blocklist[0]->data);
		copy_to_fft_buffer(buffer+0x100, blocklist[1]->data);
		copy_to_fft_buffer(buffer+0x200, blocklist[2]->data);
		copy_to_fft_buffer(buffer+0x300, blocklist[3]->data);
		copy_to_fft_buffer(buffer+0x400, blocklist[4]->data);
		copy_to_fft_buffer(buffer+0x500, blocklist[5]->data);
		copy_to_fft_buffer(buffer+0x600, blocklist[6]->data);
		copy_to_fft_buffer(buffer+0x700, blocklist[7]->data);
		if (window) apply_window_to_fft_buffer(buffer, window);
		arm_cfft_radix4_q15(&fft_inst, buffer);
		// TODO: support averaging multiple copies
		for (int i=0; i < 512; i++) {
			uint32_t tmp = *((uint32_t *)buffer + i); // real & imag
			uint32_t magsq = multiply_16tx16t_add_16bx16b(tmp, tmp);
			output[i] = sqrt_uint32_approx(magsq);
		}
		outputflag = true;
		release(blocklist[0]);
		release(blocklist[1]);
		release(blocklist[2]);
		release(blocklist[3]);
		blocklist[0] = blocklist[4];
		blocklist[1] = blocklist[5];
		blocklist[2] = blocklist[6];
		blocklist[3] = blocklist[7];
		state = 4;
		break;
	}
#else
	release(block);
#endif
}

analyze_fft1024.h

Code:
/* Audio Library for Teensy 3.X
 * Copyright (c) 2014, Paul Stoffregen, paul@pjrc.com
 *
 * Development of this audio library was funded by PJRC.COM, LLC by sales of
 * Teensy and Audio Adaptor boards.  Please support PJRC's efforts to develop
 * open source software by purchasing Teensy or other PJRC products.
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 * copies of the Software, and to permit persons to whom the Software is
 * furnished to do so, subject to the following conditions:
 *
 * The above copyright notice, development funding notice, and this permission
 * notice shall be included in all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 * THE SOFTWARE.
 */

#ifndef analyze_fft1024_h_
#define analyze_fft1024_h_

#include "Arduino.h"
#include "AudioStream.h"
#include "arm_math.h"

// windows.c
extern "C" {
extern const int16_t AudioWindowHanning1024[];
extern const int16_t AudioWindowBartlett1024[];
extern const int16_t AudioWindowBlackman1024[];
extern const int16_t AudioWindowFlattop1024[];
extern const int16_t AudioWindowBlackmanHarris1024[];
extern const int16_t AudioWindowNuttall1024[];
extern const int16_t AudioWindowBlackmanNuttall1024[];
extern const int16_t AudioWindowWelch1024[];
extern const int16_t AudioWindowHamming1024[];
extern const int16_t AudioWindowCosine1024[];
extern const int16_t AudioWindowTukey1024[];
}

class AudioAnalyzeFFT1024 : public AudioStream
{
public:
	AudioAnalyzeFFT1024() : AudioStream(1, inputQueueArray),
	  window(AudioWindowHanning1024), state(0), outputflag(false) {
		arm_cfft_radix4_init_q15(&fft_inst, 1024, 0, 1);
	}
	bool available() {
		if (outputflag == true) {
			outputflag = false;
			return true;
		}
		return false;
	}
	float read(unsigned int binNumber) {
		if (binNumber > 511) return 0.0;
		return (float)(output[binNumber]) * (1.0 / 16384.0);
	}
	float read(unsigned int binFirst, unsigned int binLast) {
		if (binFirst > binLast) {
			unsigned int tmp = binLast;
			binLast = binFirst;
			binFirst = tmp;
		}
		if (binFirst > 511) return 0.0;
		if (binLast > 511) binLast = 511;
		uint32_t sum = 0;
		do {
			sum += output[binFirst++];
		} while (binFirst <= binLast);
		return (float)sum * (1.0 / 16384.0);
	}
	void averageTogether(uint8_t n) {
		// not implemented yet (may never be, 86 Hz output rate is ok)
	}
	void windowFunction(const int16_t *w) {
		window = w;
	}
	virtual void update(void);
	uint16_t output[512] __attribute__ ((aligned (4)));
private:
	void init(void);
	const int16_t *window;
	audio_block_t *blocklist[8];
	int16_t buffer[2048] __attribute__ ((aligned (4)));
	//uint32_t sum[512];
	//uint8_t count;
	uint8_t state;
	//uint8_t naverage;
	volatile bool outputflag;
	audio_block_t *inputQueueArray[1];
	arm_cfft_radix4_instance_q15 fft_inst;
};

#endif
 
Yes, that's the FFT. It doesn't have an IFFT or a FFT multiply, though, which means doing the full convolution needs additional code.

Regarding anti-aliasing, if you use an analog filter at 2 kHz, you could sample at 4 kHz, and not need to decimate. However, analog brick wall filters are really hard and expensive to make (long series of very high precision opamps and inductor/capacitor cascades.) Thus, it's typically better to use a shallower anti-aliasing filter and use a higher sampling frequency, and then brick wall filter in software.

"brick wall" is just a term for a low-pass filter with a very sharp cut-off curve (60 dB/octave or more, typically.) Those are typically used specifically for anti-aliasing purposes.

The highest-quality digital brick wall filters are long FIR filters (512 coefficients or more.) To implement those efficiently, you need to implement some form of "long convolution," of which overlap-add using FFT - multiply - IFFT is the most efficient known implementation. Given that you only have the FFT primitive in the current Teensy library, you can't do all of that using only code that ships in Teensyduino, but there are a number of other FFT libraries out there you might use instead.

In the end, you can get "good enough" by ganging a number of bi-quads, then decimating, then doing the FFT analysis of collected samples after decimation. You'd get a little more aliasing, and the bi-quads would introduce more phase warp, but for visualizing/detecting specific speech-related sounds (which is what the 350-2k range sounds like to me) that'd probably be "good enough."

So, input -> biquad -> biquad -> biquad -> biquad -> biquad -> biquad -> biquad -> biquad -> buffer
Set the biquads to 3 dB point at around 1.5 kHz -- this will give some loss at 2 kHz, but if you decimate 10:1, you don't want much frequency content above 2 kHz.
Use every 10th sample (this is decimation) and when you have 1024 samples, feed that buffer into the AudioAnalysisFFT1024 class for a spectrum. Ignore the first 80 or so buckets/bins you get out, because those will represent signals < 350 Hz.
This should be quite easy to mock up using the GUI tool, and should let you see whether the mechanism is "good enough" or not.

note that, with 1024 sample buffers, at 4410 Hz sampling rate, you'll only get one spectrum out every 250 milliseconds or so. This is the necessary trade-off to get the narrower bins.
 
So, input -> biquad -> biquad -> biquad -> biquad -> biquad -> biquad -> biquad -> biquad -> buffer
Set the biquads to 3 dB point at around 1.5 kHz -- this will give some loss at 2 kHz, but if you decimate 10:1, you don't want much frequency content above 2 kHz.

This should be quite easy to mock up using the GUI tool, and should let you see whether the mechanism is "good enough" or not.

Alright. In GUI design tool "one biquad" internally can cascade upto four "direct 2 form" stages. Your asking just to adjust this one biquad with internal four stages as well as biquad2 --> biquad3 .. like wise?

Or just take biquad1 and configure it to four stages?Which has 1.5kHz cut off?
 
Also to obtain much better resolution can't we decimate 44100 by 10 which get 4410Hz with about 4.3Hz bin size. Don't know whether is this possible?

Something to consider is 1024 samples at 4.41 kHz sample rate is 232 ms of data, or nearly a quarter of 1 second. If this were used for something like music visualization, an update rate of only 4 Hz probably wouldn't seem very satisfying.
 
As the very first step i implemented a low pass filter which has cutoff frequency around 2kHz by using 1 biquad which has 4 stages. Simply 8th order filter.By using below MATLAB code obtained coefficients

Code:
clc
clear
close all

%Lowpass Filter 
f1=2000;
fsh=44100/2;

[z1,p1,k1]=butter(8,f1/fsh,'low');
[s1,g1]=zp2sos(z1,p1,k1)
s1x=s1(:,[1,2,3,5,6]);

fprintf('{%.8f, %.8f, %.8f, %.8f, %.8f},\n',s1x')

And wrote arduino code as below.

Code:
#include <Audio.h>
#include <Wire.h>
#include <SPI.h>
#include <SD.h>
#include <SerialFlash.h>

// GUItool: begin automatically generated code
AudioInputAnalog         adc1;           //xy=68,274
AudioFilterBiquad        biquad1;        //xy=305,273
AudioOutputUSB           usb1;           //xy=559,160
AudioOutputAnalog        dac1;           //xy=565,313
AudioConnection          patchCord1(adc1, biquad1);
AudioConnection          patchCord2(biquad1, 0, usb1, 0);
AudioConnection          patchCord3(biquad1, 0, usb1, 1);
AudioConnection          patchCord4(biquad1, dac1);
// GUItool: end automatically generated code

double gain = 8.6096e-08 ; 
double Coeff[4][5]=
{{1.0,  2.0, 1.0, -1.50453551, 0.56775487},
 {1.0,  2.0, 1.0, -1.55572300, 0.62109321},
 {1.0,  2.0, 1.0, -1.66008363, 0.72983899},
 {1.0, -2.0, 1.0, -1.81956193, 0.89601845}};

 

void setup() {
  // Audio connections require memory to work.  For more
  // detailed information, see the MemoryAndCpuUsage example
  AudioMemory(120);

// AudioNoInterrupts();
  for(int ii=0;ii<4;ii++)
    for(int jj=0; jj<3;jj++) Coeff[ii][jj] *= sqrt(sqrt(gain));
    
  for(int ii=0; ii<4; ii++)
  { biquad1.setCoefficients(ii,Coeff[ii]);
  }
 // noise1.amplitude(0.7);

//  AudioInterrupts();
}

void loop() {
  // Do nothing here.  The Audio flows automatically
  // When AudioInputAnalog is running, analogRead() must NOT be used.
}


Simply What i'm doing is taking Inputs from ADC object and pass the samples through biquad. Output is going to connect into USB + DAC(listening from microphone). By using "Audacity" analyzing the spectrum.
But when analyzing spectrum seems like low pass filter is not properly working. Can't figure out what's wrong with this. Can you guys check and let me know what's wrong with this?
 
Simply What i'm doing is taking Inputs from ADC object and pass the samples through biquad. Output is going to connect into USB + DAC(listening from microphone). By using "Audacity" analyzing the spectrum.
But when analyzing spectrum seems like low pass filter is not properly working. Can't figure out what's wrong with this. Can you guys check and let me know what's wrong with this?

I did no look all code but
the sign of b1 of the last biquad section is wrong
(all LP biquad have positive b values (1 2 1))
 
the sign of b1 of the last biquad section is wrong
(all LP biquad have positive b values (1 2 1))

Exactly. What a silly mistake. Now its perfectly working but only when gain is distribute among all four stages and when using sqrt(sqrt(gain));

In the end, you can get "good enough" by ganging a number of bi-quads, then decimating, then doing the FFT analysis of collected samples after decimation. You'd get a little more aliasing, and the bi-quads would introduce more phase warp, but for visualizing/detecting specific speech-related sounds (which is what the 350-2k range sounds like to me) that'd probably be "good enough."

So, input -> biquad -> biquad -> biquad -> biquad -> biquad -> biquad -> biquad -> biquad -> buffer
Set the biquads to 3 dB point at around 1.5 kHz -- this will give some loss at 2 kHz, but if you decimate 10:1, you don't want much frequency content above 2 kHz.

As jwatte stated on here do we need to cascade another biquad in series at GUI tool(biquad2)??? Or just only 1 biquad and configure it to 4 stages?
How to set the biquads to 3 dB point?
 
Assuming a bi-quad with four stages gives you four stages of 12 dB/octave each:

4 stages gives you 48 dB per octave. If your cut-off is 1.5 kHz, and your half sampling (nyquistt) frequency is 2.2 kHz, that's only half an octave, so you only have 27 dB rejection of aliasing (half an octave plus the 3 dB point at 1.5 kHz.) Two biquads, each with four stages, would give you twice as much rejection (51 dB) which would start getting okay. (Design targets are often 60 dB or better; much better for audiophile quality but that's not the target here.)

This is why over-sampling (aiming for a higher sampling frequency, such as 8.8 kHz) gives you so much more headroom; aliasing doesn't start folding down until another octave up, AND the folded aliasing stays out of your area of interest for another (reflected) octave. So, aliasing actually affecting your measurement area would be attenuated by three effective octaves, instead of one-half to one. But you'd have to live with the coarser frequency bins at 8.8 KHz sampling frequency. (I actually think that would be fine for this application.)

The "3 dB point" is the "design frequency" you put into the filter design formulas. I e, when you say "a low-pass filter with a corner frequency of 1.5 kHz" that means that the low-pass filter will have 3 dB attenuation at the frequency 1.5 kHz. (It will actually have some measurable fraction of attenuation all the way down to DC -- you don't get a 100% flat filter!) Another way to look at it is that the 3 dB point is where the filter goes from "mostly flat" into "mostly starting to fall off" for a variety of statistical properties.

In this case, input -> biquad-times-4 -> decimate-by-5 -> FFT-1024 -> throw away bins below 350 and above 2 kHz -> you have about 192 sample points that actually matter, and you get them about every 125 milliseconds. I think you can do a lot of good with that!

At this point, you really should try playing around with the options and build your own picture of how things work. This will be easier for you if you have an easy way of generating signals of known frequency and amplitude. If you have Audacity, you can for example generate a number of sine wave files and play them back through some good-quality speakers. (Steps up from that include things like using synthesizers, or signal generators, and plugging signals straight into the electrical interface instead of going through microphones/speakers.)
 
Just three quick suggestions:

- you could use ZoomFFT to get higher resolution of your FFT for specific parts of your spectrumwithout using higher FFT sizes OR

- if you are only interested in determining the frequency of the loudest signal in your audio data (which hopefully is the frequency of the mosquito you want to identify), use an FFT of small size (say 256) and use quadratic interpolation of the three bins (bin with loudest signal and the two adjacent bins) to determine the exact frequency of your signal

- decimating by lowpass filtering with an IIR and subsequent downsampling is probably more processor-intense than taking the optimized CMSIS functions, which use polyphase implementations of FIR filters before downsampling. Also consider multistage decimation to keep number of FIR taps low and efficient.

All three techniques have been implemented in the Teensy Convolution SDR with the Teensy 3.6 (does not run on a 3.2, because you need the FPU), you can have a look at the code on github, if you are interested (search for DD4WH and Teensy Convolution SDR). in the code, you will find "autotune", which implements the quadratic interpolation and "ZoomFFT" and decimation with CMSIS lib. Also reading Rich Lyons book "Understanding Digital Signal Processing" will probably help a lot.

Sorry for creating further confusion with my suggestions ;-),

Frank DD4WH
 
Status
Not open for further replies.
Back
Top