FFT code question

Status
Not open for further replies.

Avi Harel

Active member
Dear All,
I found an apparent contradiction in the analyze_fft1024.cpp code,
I would be much obliged if anyone could resolve this :

On the one hand in the function copy_to_fft_buffer, there's the loop :

Code:
for (int i=0; i < AUDIO_BLOCK_SAMPLES; i++) {
		*dst++ = *src++;  // real sample plus a zero for imaginary
	}

and this is called from the update() by

Code:
copy_to_fft_buffer(buffer+0x000, blocklist[0]->data);
		copy_to_fft_buffer(buffer+0x100, blocklist[1]->data);
		copy_to_fft_buffer(buffer+0x200, blocklist[2]->data);
		copy_to_fft_buffer(buffer+0x300, blocklist[3]->data);
		copy_to_fft_buffer(buffer+0x400, blocklist[4]->data);
		copy_to_fft_buffer(buffer+0x500, blocklist[5]->data);
		copy_to_fft_buffer(buffer+0x600, blocklist[6]->data);
		copy_to_fft_buffer(buffer+0x700, blocklist[7]->data);

since the jumps are in multiples of 256 samples this means
(and please correct me if I am wrong)
that buffer results in alternating blocks of 128 samples and 128 zero samples,
all in all 2048 samples as the size of the buffer.
(btw where is buffer zeroed out in the first place, or how should it be assumed ?)

The problem with this is that in the function apply_window_to_fft_buffer,
which is also called from update, we have :

Code:
static void apply_window_to_fft_buffer(void *buffer, const void *window)
{
	int16_t *buf = (int16_t *)buffer;
	const int16_t *win = (int16_t *)window;;

	for (int i=0; i < 1024; i++) {
		int32_t val = *buf * *win++;
		//*buf = signed_saturate_rshift(val, 16, 15);
		*buf = val >> 15;
		buf += 2;
	}

}

which seems to suggest that the samples held in buffer
actually alternate between samples and zeros, i.e. each sample
is followed by a zero as should be the reasonable thing and
it is also supported in the short documentation line in copy_to_fft_buffer above
which says "real sample plus a zero for imaginary".

I could not find any documentation on arm_cfft_radix4_q15 which explains
what it expects in buffer...
so what is the resolution of this apparent contradiction ?
Thanks in advance, Avi
 
I'm not sure what you see as a contradiction.
In copy_to_fft_buffer, dst points to a 32-bit integer whereas src points to a 16-bit integer. The result of the copy is that it moves a sample into the high-order half (real) of a 32-bit word and zero in the lower half (imaginary). Each cal to that function creates a new buffer with 128 real,imaginary pairs. Doing that 8 times results in the 1024 (real,imaginary) pairs needed for the FFT.
In the windowing, the window is applied to a 16-bit quantity pointed to by buf which is then incremented by 2 to skip over the imaginary component.

Pete
 
I'm not sure what you see as a contradiction.
In copy_to_fft_buffer, dst points to a 32-bit integer whereas src points to a 16-bit integer. The result of the copy is that it moves a sample into the high-order half (real) of a 32-bit word and zero in the lower half (imaginary). Each cal to that function creates a new buffer with 128 real,imaginary pairs. Doing that 8 times results in the 1024 (real,imaginary) pairs needed for the FFT.
In the windowing, the window is applied to a 16-bit quantity pointed to by buf which is then incremented by 2 to skip over the imaginary component.

Pete

Thanks, I see what I missed : In the code

Code:
static void copy_to_fft_buffer(void *destination, const void *source)
{
	const uint16_t *src = (const uint16_t *)source;
	uint32_t *dst = (uint32_t *)destination;

	for (int i=0; i < AUDIO_BLOCK_SAMPLES; i++) {
		*dst++ = *src++;  // real sample plus a zero for imaginary
	}
}

for some reason I mixed the uint16_t definition of src with that of dst !
But still, where has it been zeroed ? I do not see that in the code
 
Hi,

I'll explain : the ADC stores values from 0 to 65535.
imagine you have a sample with value 24512 stored in a 16bit uint (the resolution of the ADC). That will be 0101111111000000.
But it will be 0000000000000000 0101111111000000 if stored in a 32bit uint (well it depends on the edianness of the processor really but the idea is here)
Then if you read this 32bit value as 2x 16 bit values, it will give you one zero and one 24512...

Here is the zero !

Hope this helps.

Cheers
 
that makes sense, but shouldn't this be the other way around ?
It says here
that the input to the fft should be :

{real[0], imag[0], real[1], imag[1],..}

so shouldn't the "bottom" value be the zero ? I see why this does not matter
in computing the magnitude but in general, if one wants to run an inverse fourier transform
this is a 90 degrees phase shift on the input...
(sorry if this seems picky, I am truly trying to figure out these functions...)
 
As Nitnelav says it depends of the assumed "endianness" of the processor. Take a look at it on Wikipedia...
 
Ok, I think I understand what the "endianness" property means
but I just want to make sure I understand the code correctly :
in :

Code:
static void apply_window_to_fft_buffer(void *buffer, const void *window)
{
	int16_t *buf = (int16_t *)buffer;
	const int16_t *win = (int16_t *)window;;

	for (int i=0; i < 1024; i++) {
		int32_t val = *buf * *win++;
		//*buf = signed_saturate_rshift(val, 16, 15);
		*buf = val >> 15;
		buf += 2;
	}

}

after the line
Code:
int32_t val = *buf * *win++;

is executed : are the two parts being switched so the zero is actually the bottom part now in val ?
this seems to be the only logical possibility since arm_cfft_radix4_q15 expects its input to be a *q15_t
which means it does not consider its inputs to be pairs so it does not does not seem likely it would flip them and this means that
the convention here is "Little-Endian", am I correct ? btw where is documented what is the endianness ?
 
Yes, this is indeed where then edianness comes into play.
The teensy processor is Little Endian obviously. which means that 32 bits values are stored with the LSB (least significant bytes) before the MSB (most significant bytes).
So 24512 32bit integer is stored in "reversed order" so to speak, like this :
11000000 01011111 00000000 00000000
 
No, it's not reorganizing the order. The lines
Code:
int32_t val = *buf * *win++;
	*buf = val >> 15;
simply multiply two Q1.15 values (the buffer and window function elements) to generate a Q2.30 (in an int32_t) value, then shifts right 15 places to convert back to Q1.15 (q15_t) in *buf.
The buf += 2; line addresses only the even values in the buffer (0, 2, 4,...).
 
Last edited:
...then I still do not understand where is the zero coming from :
*buf here is an int16_t, it gets the value val>>15 and then we advance
it by 2 which seems to imply that the next sample - (buf+1) - was zeroed
which is not done explicitly anywhere in the code
 
but it was already zero because it was deposited in the buffer as a little endian uint32_t with zero in the top two bytes, and hasn't been touched since.
 
I'm going to bring this topic back up again, as I'm trying to understand the analyze_fft1024 code for a musical project of my own.

In analyze_fft1024.h, we see, under the "private" section, that "buffer" is defined as follows:

Code 1
Code:
int16_t buffer[2048] __attribute__ ((aligned (4)));

You can see that it is type int16_t and length 2048.

Then, in analyze_fft1024.cpp, the following code is used to accept pointers to "buffer" as the "destination."

Code 2
Code:
static void copy_to_fft_buffer(void *destination, const void *source)
{
	const uint16_t *src = (const uint16_t *)source;
	uint32_t *dst = (uint32_t *)destination;

	for (int i=0; i < AUDIO_BLOCK_SAMPLES; i++) {
		*dst++ = *src++;  // real sample plus a zero for imaginary
	}
}

The relevant line is

Code:
	uint32_t *dst = (uint32_t *)destination;

What exactly does this line accomplish? I get that it uses a typecast to give "destination" the type "uint32_t," but does this mean that "destination" stays the same length while doubling its size, or halves in length while remaining the same size? I ask in the context of the function being called later:

Code 3
Code:
copy_to_fft_buffer(buffer+0x000, blocklist[0]->data);
copy_to_fft_buffer(buffer+0x100, blocklist[1]->data);
copy_to_fft_buffer(buffer+0x200, blocklist[2]->data);
copy_to_fft_buffer(buffer+0x300, blocklist[3]->data);
copy_to_fft_buffer(buffer+0x400, blocklist[4]->data);
copy_to_fft_buffer(buffer+0x500, blocklist[5]->data);
copy_to_fft_buffer(buffer+0x600, blocklist[6]->data);
copy_to_fft_buffer(buffer+0x700, blocklist[7]->data);

buffer is copied to in chunks of 256 bytes, which implies that the typecasting halves the number of indexes, somehow, in order for the second half of Code 2 to work out (since AUDIO_BLOCK_SAMPLES = 128).

I have a few more questions about the window functions and the following fft functions as well - mainly to do with understanding how a length 2048 input buffer leads to 512 output bins. I'll try to understand it a bit more on my own before posing more specific questions.

Thanks!

-David
 
Hi,

The idea is that a 2048 int16_t table takes the same memory space as a 1024 int32_t table (same as a 512 int64_t, a 4096 int8_t etc...)
You can read the same memory region (a variable) with different "cast", you just change the size of the values you read each time.

In this line :
Code:
copy_to_fft_buffer(buffer+0x000, blocklist[0]->data);
You need to understand that the variable buffer is declared as a 2048 int16_t table

So when you take block->data, a 128 int16_t table, and put it in a 128 int32_t table, that means the buffer is filled with 256 int16_t values.
That's why the next call to copy_to_fft_buffer, buffer starts at 0x100, which is 256 in hexadecimal...

The idea of copy_to_fft_buffer is to fill the fft buffer with {0 , value, 0, value, 0, value, etc.}, thats why, each time its called, the buffer is filled by twice the size of block->data.
... as explained earlier in this thread.

I hope this helps...
 
Nitnelav says it well: the buffer is just a block of 4096 bytes of memory, and items stored there can be accessed according to their ASSUMED type - which you can change at will.

copy_to_fft_buffer() is a bit tricky because it assumes uint32_t variables when storing in buffer, while the incoming data is int16_t. Thus the pointer arithmetic

*dst++ = *src++; // real sample plus a zero for imaginary

when incrementing causes the address pointed to by dst to increase by 4 bytes (because dst is defined to be a pointer to unint32_t), but the address pointed to by src is only incremented by 2 bytes (because src is defined as a pointer to int16_t). Tricky!
 
Last edited:
Status
Not open for further replies.
Back
Top