Forum Rule: Always post complete source code & details to reproduce any issue!
Results 1 to 6 of 6

Thread: Genericize FFT? Also window functions representation

  1. #1
    Senior Member
    Join Date
    Jul 2020
    Posts
    927

    Genericize FFT? Also window functions representation

    Could the AudioAnalyzeFFT256/1024 not be subclasses of a generic FFT class, able to handle a wider range of powers of two?

    I was looking at the implementation using arm_cfft_radix4_q15, whereas the fastest method seems to be arm_rfft_q15
    as only real values are fed to the FFT? I benchmarked rfft as faster than cfft on Teensy 4.0 by about 50% - although it may be less
    numerically accurate due to the pre- and post-processing.

    The arm_cfft_radixX_XXX functions are also deprecated, according to the documentation I found. The other methods like
    arm_cfft_q15 and arm_rfft_q15 allow powers of 2 for FFT size, not limited to powers of 4.


    The implementation of window functions as precompiled tables becomes an issue with this, if you want to support 32/64/128/...../4096/...
    sizes you'd need to be able to generate them at runtime for the windows/sizes used.

    I note the existing tables are wasteful of space in that they don't take advantage of the even symmetry of a window function.
    They are also not DFT-even style which may cause some issues in some contexts.

    And lastly the window functions aren't structs/objects so there's no record of their noise bandwidth factor which is needed
    for measuring noise rather than signal peaks.

  2. #2
    Senior Member
    Join Date
    Jul 2020
    Posts
    927
    [ I've noticed there's already an outstanding pull-request for the deprecation issue, https://github.com/PaulStoffregen/Audio/pull/331 (though it misses fft256). ]

  3. #3
    Senior Member
    Join Date
    Jul 2020
    Posts
    927
    I have a fairly stable implementation now for an improved FFT along these lines, although it
    turns out only the T4.x have a decent implementation of arm_rfft_q31 with a wide range of
    points from 32..8192 (The T3 seems to be limited to 32/128/512 and there seem to be issues
    with it (doesn't match T4 behaviour).

    Branch is at https://github.com/MarkTillotson/Audio/tree/generic_fft
    if you want to give it a go, theres docs and example now.

    It auto-generates windows, knows about window processing gain and noise-bandwidth and compensates for these,
    and you can read either power spectrum or power spectral density, as dB for power or magnitude for amplitude.
    You can set the nominal full-scale voltage to calibrate should you want this.

    Example included as Audio/Analysis/GenericFFT which compares it against AudioAnalyzeFFT1024 for a set of
    fixed tones and noise-floor.

  4. #4
    Senior Member
    Join Date
    Jul 2020
    Posts
    927
    Comparisons of the generic fft v. the existing FFT1024, for several windows, with
    input being 5 test tones at -1dB, -20dB, -40dB, -60dB and -80dB compared to full scale sine.

    Click image for larger version. 

Name:	fft_comparisons.png 
Views:	15 
Size:	138.8 KB 
ID:	23877

    Notable is the variation in gain in FFT1024 with different windows and its poor noise floor.

  5. #5
    Senior Member
    Join Date
    Feb 2018
    Location
    Corvallis, OR
    Posts
    334
    The last time I played around with the ARM CMSIS FFT on the Teensy it was with the T3.6. I found that a Real FFT took about 1.02mSec for FFT1024 with a Hanning window. A friend is working over the same calculations and tested the code on a T4.1 and found that the Real FFT32 took about 0.243 mSec---roughly in proportion to the clock speed difference. The calculations were a pretty straightforward translation of a Matlab script for determining power density in turbulence calculations. We stuck with 32-bit floats at the time, since this was in the pre-T4.x era and we didn't have access to a double precision FPU.

    I'm curious about the relative speeds of the rfft_q15 and rfft_fast_f32 versions on the T4.X. IIRC, the q15 versions were widely used before FPUs were available--but I'm not familiar with the history. I've only used the FFTs in scientific calculations and most often left Matlab to make the decisions about algorithms.

  6. #6
    Senior Member
    Join Date
    Jul 2020
    Posts
    927
    I think I did measure the performance of the various FFT types a while back on the T4, though I guess it matters more
    on slower chips - I only have T4.0 and T3.2 at the moment to play with.

    The theoretical number of bits needed for an 2^N-point FFT for a signal of M bits is close to M + N/2, since the
    smallest possible signal is one LSB over 2^N samples, which gives a flat spectrum at -6M -3(N-1) dB, which for
    16 bit signal and 1024 point FFT = -6.16 -3.9 = -123dB

    Or put another way the quantization noise is -6M-2 = -98dB, and that's spread over 512 frequency bins,
    so each sees 1/512 of that (-27dB relative to -98dB total noise = -125dB noise floor in the power spectrum)
    [ the true noise power spectral density depends also on the window and sample rate ]

    So to get full information from 16 bit and 1024 FFT needs 125/6 bits at least (21). The FFT1024 in the
    audio library is actually 15 bits, not 16, as it uses q15 fixed point values which range from -2.0 to +2.0.

    So you only get about -90dB noise floor at best from the data representation, though the q15 fft primitives
    lose several more bits in practice due to rounding errors (and it doesn't compensate for window attenuation
    - as that would overflow the q15 representation - so you might get another 5 to 13dB loss from that). Practical
    measurements with FFT1024 and flattop window show a noise floor of -78dB or so, rather than the theoretical
    possible -125dB.

    Using Python I determined 24 bits is a good minimum number for representing FFT output at 1024 points and
    16 bit data:

    Click image for larger version. 

Name:	fft_precision.png 
Views:	10 
Size:	167.8 KB 
ID:	23885
    Note that due to the reasons above the actual Audio lib fft1024 is considerably worse than the worst of these
    graphs. Using q31 versions of FFT I seem to be able to get about -110dB noise floor or thereabouts with
    a window function.

    The graphs use 6 test tones at -1/-20/-40/-60/-80/-100dB and some band limited noise (amplitude 10 LSBs
    before low-pass filtering). Python code available if anyone wants.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •