what resolution is possible (in theory) for a FFT?

Status
Not open for further replies.

mala96

Member
Hey guys,

Analyzing the 1024-point FFT code from the audio example codes over the last couple of weeks, I've been wondering how high the resolution of a fast Fourier transform could possibly be. Or better: how high does it need to be?

Do I get it right that for a 1024-point FFT and a spectrum from about 20 Hz to 22050 Hz you'd get 1024 "frequency bands", each having a bandwidth of around 21.5 Hz? (22.05 kHz / 1024 = 21.5 Hz)
Or what is meant with 1024 "points" otherwise?

So back to the actual question: would it be possible to write an analyzer (probably using oversampling) which gets better resolution?
Would it make sense to even have "1 Hz bands", for example? So is having an independent value for EVERY frequency a benefit, or does it not matter because the quantization of the spectrum doesn't need to be that dense?

Sorry guys, maybe I'm sounding stupid here, but it's just something I find very fascinating and confusing at the same time: that in DSP the frequency domain is discrete as well... :confused:

With kind regards, Mala
 
FFT isn't nearly as good as you're assuming!

1024 FFT refers to the number of audio samples that are the input to the FFT. A very important aspect (limitation) of the FFT is that it gives you the spectrum as if those 1024 samples repeat over and over. It doesn't know anything before or after those 1024, so the Fourier transform assumes that whatever happens in those 1024 points repeats (which generally is not true for highly dynamic signals like music or speech).

The FFT gives half as many frequency bins out as the number of samples in. Well, at least that's the common use. If the input is complex numbers, then you can talk about all sorts of special mathematical properties. But in the case of real-only input (no imaginary numbers in), the number of frequency bins you get out is half the number of samples you put in.

The FFT outputs complex numbers, which you could use to know the relative phase of each frequency bin. The FFT1024 in the Teensy Audio Library discards the phase info and gives you only the magnitude of each frequency bin, which is generally the only thing anyone wants for projects where you use the data for audio visualization. Applications where you would want to retain the real+imaginary numbers are usually ones where you want to alter the frequency data and later turn it back into audio samples. The audio library FFT isn't designed for that sort of signal processing / filtering with FFT, and in general that approach isn't very popular because its latency is usually much worse than non-FFT techniques. So Teensy's FFT gives you just a single number for each frequency bin, which is the real+imaginary converted to a simple magnitude, without any way to know the relative phase of that bin.
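
To make that concrete, here's a toy pure-Python sketch (a slow textbook DFT standing in for a real FFT, no libraries needed; the size N=16 and the 3-cycle test tone are arbitrary demo choices). It shows both properties: for real input the upper half of the bins mirrors the lower half, and each complex bin can be collapsed to a plain magnitude, the way FFT1024 does:

```python
import cmath, math

def dft(x):
    # Textbook DFT, O(N^2): fine for a demo, not for real-time audio
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

N = 16
signal = [math.sin(2 * math.pi * 3 * n / N) for n in range(N)]  # 3 cycles in N samples
bins = dft(signal)

# Conjugate symmetry: bin N-k mirrors bin k for real input,
# so only the first N/2 bins carry independent information
assert abs(bins[N - 3] - bins[3].conjugate()) < 1e-9

# Collapse real+imaginary to a simple magnitude, discarding phase
magnitudes = [abs(b) for b in bins[:N // 2]]
peak = max(range(N // 2), key=lambda k: magnitudes[k])
print(peak)  # prints 3: the 3-cycle sine lands in bin 3
```

So 16 real samples in, 8 usable magnitude bins out, matching the half-as-many rule above.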

But that isn't the end of the story about the limitations of FFT. Remember, the analysis is the spectrum as if those exact 1024 points repeat over and over. If you have a frequency whose period doesn't fit a whole number of times into those 1024 samples, the repetition creates a sharp discontinuity, which means a big mess of false frequency data. This is called "spectral leakage". I tried to explain this in the audio tutorial with pictures that make it clearer why the FFT sees frequencies it shouldn't. Turn to page 24 in the PDF for the info about FFT. The picture of what the FFT actually analyzes is on page 28.
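
A small pure-Python illustration of the leakage (again a slow textbook DFT stands in for the FFT; the 64-sample size and the 1%-of-peak threshold are arbitrary demo choices). A tone that fits exactly 8 cycles into the frame lands in a single bin; a tone at 8.5 cycles doesn't fit, so its energy smears across many bins:

```python
import cmath, math

def dft_mag(x):
    # Magnitudes of the first N/2 bins (real input), textbook DFT
    N = len(x)
    return [abs(sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N)))
            for k in range(N // 2)]

N = 64
fits   = [math.sin(2 * math.pi * 8.0 * n / N) for n in range(N)]  # exactly 8 cycles
doesnt = [math.sin(2 * math.pi * 8.5 * n / N) for n in range(N)]  # 8.5 cycles: discontinuity

clean = dft_mag(fits)
leaky = dft_mag(doesnt)

# Count bins carrying non-trivial energy (> 1% of the spectrum's peak)
def spread(mags):
    top = max(mags)
    return sum(1 for m in mags if m > 0.01 * top)

print(spread(clean), spread(leaky))  # the off-bin tone smears over many bins
```

The exactly-fitting tone occupies one bin; the half-bin-offset tone spills measurable energy into most of the spectrum.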

Because of the spectral leakage problem, the input data is scaled by a "window" before the FFT is computed. The window greatly reduces the spectral leakage, but the cost is "smearing" the data across neighboring bins.
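
Here's the same worst-case tone as above, with and without a Hann window, in a pure-Python sketch (textbook DFT in place of the FFT; the bin ranges and the 5% threshold are arbitrary demo choices). Far from the tone, the windowed spectrum carries drastically less leaked energy, while the peak itself is smeared over a few neighboring bins:

```python
import cmath, math

def dft_mag(x):
    N = len(x)
    return [abs(sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N)))
            for k in range(N // 2)]

N = 64
x = [math.sin(2 * math.pi * 8.5 * n / N) for n in range(N)]         # worst-case leakage
hann = [0.5 - 0.5 * math.cos(2 * math.pi * n / N) for n in range(N)]

raw = dft_mag(x)                                  # no window
win = dft_mag([a * w for a, w in zip(x, hann)])   # Hann-windowed

# Leaked energy far from the true frequency (bins 16..31)
tail_raw = sum(m * m for m in raw[16:])
tail_win = sum(m * m for m in win[16:])
print(tail_win < 0.05 * tail_raw)  # True: the window crushes far-off leakage
# The price: the windowed peak is spread over bins 8, 9 and their neighbors
```

That's the trade the answer above describes: much less false energy far away, a slightly blurrier peak up close.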

Now, to answer your question:

So back to the actual question: would it be possible to write an analyzer (probably using oversampling) which gets better resolution?
Would it make sense to even have "1 Hz Bands" for example?

FFT has a fundamental trade-off between frequency resolution and temporal precision. The more accurately you resolve a frequency, the less precisely you know *when* that frequency was present. If you want 1 Hz frequency resolution, or 22050 frequency bands, then you need to analyze 44100 samples. That's 1 second of audio to analyze. When you get the result, you can know with 1 Hz precision (minus the smearing by the window scaling) what frequencies were present. But you don't know when they happened within that 1 second (except that the window shape means you're looking much more at the middle of that 1 second and much less near the edges).
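
The trade-off in numbers, assuming a 44100 Hz sample rate: the bin width is sample_rate / N, the analysis window is N / sample_rate seconds, and their product is always 1, so finer bins always mean a longer window:

```python
sample_rate = 44100
for N in (1024, 4096, 65536):
    bin_hz = sample_rate / N      # frequency resolution
    window_s = N / sample_rate    # time you must wait for one analysis
    print(f"N={N:6d}  bin width={bin_hz:7.2f} Hz  window={window_s:6.3f} s")
```

So the library's 1024-point FFT gives roughly 43 Hz bins from 23 ms of audio, while anything close to 1 Hz bins forces you to a window over a second long.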

Of course you can't analyze 44100 samples. While the audio library offers up to 1024, people have done 4096. Search this forum. It's been discussed many times. Code has been posted. And in general, the first F in FFT, for Fast, means the number of input samples must be a power of 2. So if you wanted to analyze 44100 samples, you'd actually need to do 65536, which is such a long time for music and most other audio that even when you do get extremely fine resolution in the frequency data, it covers such a long time that it's probably not very useful.

The bottom line is FFT has a lot of limitations. You can still use it for lots of awesome things. But if you get greedy for more frequency resolution, to get it with Fourier Transform you have to analyze much longer groups of samples which ultimately means your detailed knowledge of the spectrum becomes not so useful because you have so little idea of when those frequencies were present.
 
Another way to look at this:
For a 1 Hz resolution you need an analysis window of 1 second or so, and the signal must be stable for all of that time.
If the signal is only 0.1 s long, then its spectrum will be at least 10 Hz wide. A 1 s analysis window will give you no better resolution than 10 Hz, but the accuracy will stay at 1 Hz.
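
A pure-Python sketch of that point (textbook DFT; the 256-sample "second", the 0.1-length burst, and the half-power threshold are all toy choices): a tone present for the whole window occupies one bin, but the same tone lasting only a tenth of the window spreads over roughly ten times as many bins, no matter how long the window is:

```python
import cmath, math

def dft_mag(x):
    N = len(x)
    return [abs(sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N)))
            for k in range(N // 2)]

N, M = 256, 26                     # window of N samples; burst lasts M ~ 0.1*N
full  = [math.sin(2 * math.pi * 32 * n / N) for n in range(N)]   # tone the whole time
burst = [full[n] if n < M else 0.0 for n in range(N)]            # tone for 0.1 window

def width(mags):
    # How many bins sit above half the spectrum's peak
    top = max(mags)
    return sum(1 for m in mags if m > 0.5 * top)

narrow, wide = width(dft_mag(full)), width(dft_mag(burst))
print(narrow, wide)   # the burst's spectrum is roughly N/M ~ 10x wider
```

The window length sets the best-case resolution; the signal's own duration can only make the spectrum wider.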
 
The discrete wavelet transform (DWT) is one way to improve the frequency resolution at low frequencies without compromising the timeliness of information at higher frequencies in a spectrogram. However, it's more complicated, and the simpler STFT is usually adequate for many purposes: simply running short FFTs repeatedly, with overlap, over the same data to produce a 2D view of the time-frequency plane.

As for the maximum resolution of an FFT, that's limited basically by the amount of RAM, and there are even versions of the FFT optimized for sequential access of external memory, so in effect you can FFT any amount of data you can access.

In practice, if you need much higher frequency resolution, then multi-rate techniques are often used, as decimation followed by a small FFT takes less storage than a large FFT. This is what you are getting at by speaking of using oversampling for better resolution.

Commonly in spectrum analyzers the Welch method is used: the results of many successive FFTs on the input data are averaged to reduce the noise in the frequency domain. Simply using a larger FFT to analyze more samples doesn't reduce the noise level in a spectrum at all (and that noise can be hiding many interesting details and small peaks).

Lots of information out there, and alas you do need some maths to get the most out of it.
https://en.wikipedia.org/wiki/Discrete_wavelet_transform
https://en.wikipedia.org/wiki/Short-time_Fourier_transform
https://en.wikipedia.org/wiki/Spectrogram
https://holometer.fnal.gov/GH_FFT.pdf
 