Need help understanding FFT bins

Status
Not open for further replies.

harald25

Well-known member
Hello!
I am making an LED installation that reacts to audio. The idea is to be able to control, live from TouchOSC, which frequency the LEDs react to.
I've "wired up" an analog input to a biquad filter, and the biquad is "wired" to an FFT analyzer. I started with only Analog in -> FFT, but I thought I could specify even more accurately the frequency I want my LEDs to react to if I also ran the signal through a biquad using the bandpass functionality.
So, from TouchOSC I can control the frequency and width of the bandpass filter, and I can control the start and stop bin for the FFT readings. My confusion sets in when adjusting these parameters and trying to understand exactly how they function. Here's an example:
I'm listening to a song, and the bass I want my LEDs to react to is at 65 Hz. I would then set the bandpass filter to 65 Hz, frequency bin start to 2, and frequency bin stop to 2. The reason I set the bins to 2 and 2 is that the frequency resolution of the FFT is 43 Hz (I'm using the 1024 point FFT), so bin 2 should cover 43 Hz to 86 Hz. This is the part I'd like to have confirmed as correct or not.
The reason I started doubting this is that if I set the bandpass frequency to 1000 Hz, and the start and stop bins to 23 and 24, I get no reaction at all from the LEDs. The level returned by the FFT is 0.00 (or close to it).

My code is here: https://github.com/harald25/MicroTree
The file in question that does the audio stuff is this one: https://github.com/harald25/MicroTree/blob/master/src/audio_react.cpp


Hope someone can clarify for me exactly how the bins work, because I clearly don't understand it fully!
 
I suggest you remove your biquad filter, apply signals of known frequency to the input and see what bins are populated. You should be able to figure out how it works in short order.

I have played with FFTs of my own design, without any windowing functions, and have had good luck interpreting my results as follows:
With a 100 Hz resolution, bin zero represents any DC voltage offset of the AC signal. Bin 1 is 100 Hz, bin 2 is 200 Hz and so forth. A signal at 250 Hz would show peaks at bins 2 and 3 and also raise the noise floor of all the bins.
 
Two things...

First, ordinary music will have very little (if any) energy at 65 Hz.

Second, you may be misunderstanding or overestimating how FFT bins work. 1024 point FFT at 44.1 kHz sample rate will have bins 43 Hz apart. Actually 43.0664 Hz, which can matter...
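To make the bin arithmetic concrete, here is a minimal sketch in plain C++ (nothing Teensy-specific). The function names are made up for illustration, and the 44.1 kHz sample rate and 1024-point size are simply the figures quoted above.

```cpp
#include <cmath>

// Spacing between adjacent FFT bins in Hz: sample rate divided by FFT size.
double binSpacingHz(double sampleRateHz, int fftSize) {
    return sampleRateHz / fftSize;
}

// Centre frequency of bin k. Bin 0 is DC (0 Hz).
double binCenterHz(int k, double sampleRateHz, int fftSize) {
    return k * binSpacingHz(sampleRateHz, fftSize);
}

// Index of the bin whose centre lies closest to freqHz.
int nearestBin(double freqHz, double sampleRateHz, int fftSize) {
    return static_cast<int>(std::lround(freqHz / binSpacingHz(sampleRateHz, fftSize)));
}
```

With those assumed values, 65 Hz sits roughly halfway between the centres of bin 1 and bin 2, and 1000 Hz is closest to bin 23, whose centre is near 990.5 Hz.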

If you have a pure steady sine wave at *exactly* 43.0664 Hz, you can expect its energy to completely show up in that first bin. Likewise for an 86.1328 Hz waveform in the 2nd bin. That is, if you run the FFT without a "window".

But FFT without a window is pretty much worthless for arbitrary signals that aren't *perfectly* sync'd to the FFT's 1024 samples. For example, if you have a 50 Hz waveform, you might intuitively expect most of its energy to show up in the 43 Hz bin and the rest to appear in the 86 Hz bin. But that's not how FFT works.
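If you want to see this for yourself off-target, a rough sketch like the following works on a PC. It evaluates single DFT bins directly (slow but simple, not a real FFT and not the Audio library's code), with no window. The helper names `makeSine` and `dftBinMagnitude` are invented for this example.

```cpp
#include <cmath>
#include <vector>

const double kPi = 3.14159265358979323846;

// Generate n samples of a unit-amplitude sine at freqHz.
std::vector<double> makeSine(double freqHz, double sampleRateHz, int n) {
    std::vector<double> x(n);
    for (int i = 0; i < n; ++i) {
        x[i] = std::sin(2.0 * kPi * freqHz * i / sampleRateHz);
    }
    return x;
}

// Magnitude of DFT bin k with no window, scaled so that a unit sine
// sitting exactly on a bin centre reads 1.0 in that bin.
double dftBinMagnitude(const std::vector<double>& x, int k) {
    const int n = static_cast<int>(x.size());
    double re = 0.0, im = 0.0;
    for (int i = 0; i < n; ++i) {
        const double phase = 2.0 * kPi * k * i / n;
        re += x[i] * std::cos(phase);
        im -= x[i] * std::sin(phase);
    }
    return 2.0 * std::sqrt(re * re + im * im) / n;
}
```

Feeding it `makeSine(44100.0 / 1024, 44100.0, 1024)` puts essentially everything in bin 1 and nothing elsewhere. Feeding it a 50 Hz sine still peaks in bin 1, but bin 2 and even far-away bins such as bin 10 are no longer zero.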

Here's one of the pictures from the audio library tutorial, on page 28 of the PDF, where the FFT's 1024 samples capture a sine wave which isn't perfectly aligned to a bin.

1.png

To intuitively understand FFT limitations, you need to realize that the FFT only "sees" those 1024 points of the waveform. Implicit in the FFT algorithm is that each bin is a perfect, pure, never-ending sine wave. So the FFT assumes those 1024 points repeat over and over. Your eyes see a sine wave, because you're a human with intuition. But FFT is math, so here's what the FFT "sees":

2.png

This isn't really a sine wave. It has very sharp non-sine features every 1024 points. When you input any waveform that doesn't perfectly repeat its cycles on the 1024-sample boundaries, you're giving the FFT something quite non-sine to analyze. The more times it repeats within the 1024 points, the more of the energy lands in the intended bins, but those sharp spikes will still show up as energy scattered across lots of the other bins.
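One rough way to put a number on that sharp feature: compare where the real sine would be at sample 1024 with where the implied repetition puts it (back at sample 0). This is only an illustrative sketch; `wrapJump` is a made-up name, and it assumes a unit sine that starts at zero phase.

```cpp
#include <cmath>

// Size of the mismatch at the block boundary for a unit sine starting at
// phase 0: the real signal would continue to sample N, but the implied
// repetition restarts at sample 0 instead.
double wrapJump(double freqHz, double sampleRateHz, int fftSize) {
    const double pi = 3.14159265358979323846;
    const double continued = std::sin(2.0 * pi * freqHz * fftSize / sampleRateHz);
    const double repeated = std::sin(0.0);  // value of sample 0
    return std::fabs(continued - repeated);
}
```

For a sine exactly on a bin centre the mismatch is essentially zero, so the repetition is seamless. For a 50 Hz sine at 44.1 kHz and 1024 points it is a large fraction of the full amplitude.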

The proper term for this is "spectral leakage". Google will turn up lots of info, but sadly much of it is written (or copied) by academicians who love to use rigorous math and seem to disdain plain language descriptions that people can actually understand.

The "window" is a scaling of the samples before performing the FFT. It greatly reduces the spectral leakage problem, but also tends to "smear" the energy into nearby bins. There are lots of different window shapes which trade off these downsides, but there is no perfect answer. The FFT algorithm fundamentally assumes the data points repeat infinitely (both before and after) and gives you the set of pure perfect sine waves corresponding to that hypothetical reality.
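Here is a small PC-side sketch of that trade-off, using a Hann window written out by hand. It is an illustration under my own assumptions, not the Audio library's implementation: the function names are invented, the window is the "periodic" form (divided by n rather than n - 1), and bins are evaluated one at a time with a direct DFT sum.

```cpp
#include <cmath>

const double kTwoPi = 6.28318530717958647692;

// Hann window coefficient for sample i of n: 0 at the start of the block,
// rising to 1 in the middle and falling back toward 0 at the end.
double hannCoefficient(int i, int n) {
    return 0.5 * (1.0 - std::cos(kTwoPi * i / n));
}

// Magnitude of DFT bin k for a unit sine at freqHz, with or without the
// Hann window. Dividing by the sum of the window weights keeps both cases
// on the same scale (a bin-centred sine reads 1.0 in its own bin).
double sineBinMagnitude(double freqHz, double sampleRateHz, int n, int k, bool useHann) {
    double re = 0.0, im = 0.0, weightSum = 0.0;
    for (int i = 0; i < n; ++i) {
        const double w = useHann ? hannCoefficient(i, n) : 1.0;
        const double sample = w * std::sin(kTwoPi * freqHz * i / sampleRateHz);
        re += sample * std::cos(kTwoPi * k * i / n);
        im -= sample * std::sin(kTwoPi * k * i / n);
        weightSum += w;
    }
    return 2.0 * std::sqrt(re * re + im * im) / weightSum;
}
```

Two things to try: a 1000 Hz sine read at a distant bin such as bin 40 comes out far smaller with the window than without (less leakage), while a sine exactly on the centre of bin 23 reads about half strength in bins 22 and 24 with the window, where it read nothing without it (the smear).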

I'm going to cut this message short and point you to the tutorial PDF, where I wrote more about this, with more pictures that hopefully explain it. Alysia and I also made a lengthy video a few years ago, which you'll find on that page. The video's material is all the same as the PDF, but if you prefer listening and watching, maybe the video can help too.

The bottom line is you can expect any given frequency to appear in several bins, when using the default Hanning window. FFT is powerful, but has a lot of limitations, especially when you wish to analyze arbitrary sounds that aren't composed only from absolutely pure waveforms that perfectly align to the 1024 points the FFT analyzes.
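In practice that suggests reading a small range of bins around the frequency of interest rather than a single bin. A minimal sketch of picking such a range follows; `BinRange`, `binsAround` and the padding of one bin per side are my own assumptions for illustration, not anything from the library.

```cpp
#include <algorithm>
#include <cmath>

// First and last bin of a range to read (both inclusive).
struct BinRange {
    int first;
    int last;
};

// Bins to read for a tone near centerHz, padded by padBins on each side
// to catch energy the window smears into neighbouring bins.
BinRange binsAround(double centerHz, int padBins, double sampleRateHz, int fftSize) {
    const double spacing = sampleRateHz / fftSize;
    const int center = static_cast<int>(std::lround(centerHz / spacing));
    const int maxBin = fftSize / 2 - 1;  // a real-input FFT yields fftSize/2 usable bins
    BinRange r;
    r.first = std::max(0, center - padBins);
    r.last = std::min(maxBin, center + padBins);
    return r;
}
```

For 1000 Hz with one bin of padding this gives bins 22 to 24. On the Teensy side, the pair could then be handed to the FFT object's two-argument `read(first, last)`, which sums the levels over that range.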


(skip forward to 34:56 for the FFT part)
 
Alright!
I've checked the video, and I think I understand it a little better now. After watching the video I even think the peak module is better suited for the thing I'm trying to do now!

And thank you so much for taking the time to write a response! I really appreciate it! It's a major reason why I like the Teensy and your other products so much! :D
 