In the FFT_1024 function, what does the 1024 refer to?
1024 is the number of audio samples it analyzes. At 44.1 kHz, each FFT result represents the spectrum of 23.2 ms of audio. By default a Hanning window is applied, so the results mostly represent the middle ~13 ms of that time. This is covered in detail in the tutorial. Check out pages 27-29 in the tutorial PDF, or watch that section of the video walkthrough.
https://www.pjrc.com/store/audio_tutorial_kit.html
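If you want to check the 23.2 ms figure yourself, here is a quick sketch of the arithmetic in Python. It assumes a nominal 44100 Hz sample rate, as in the text above. The bin spacing line is my own addition; it simply follows from dividing the sample rate by the FFT size.

```python
SAMPLE_RATE_HZ = 44100   # nominal rate used in the post (an assumption here)
FFT_SIZE = 1024          # the "1024" in the object's name

# How much time one block of 1024 samples covers.
window_ms = FFT_SIZE / SAMPLE_RATE_HZ * 1000.0

# Spacing between adjacent frequency bins (derived, not stated in the post).
bin_width_hz = SAMPLE_RATE_HZ / FFT_SIZE

print(f"{window_ms:.1f} ms of audio per FFT")
print(f"{bin_width_hz:.1f} Hz between bins")
```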
Fundamentally speaking, FFT math is complex. Not necessarily "complex" as in hard to understand, though it certainly can seem that way from the terribly written explanations in many academic textbooks, and on sites like Wikipedia which essentially copy textbooks rather than trying to explain. FFT is "complex" in that both its input and its output are real+imaginary numbers (or 2D vectors or single frequency waveforms or however you like to think of real+imaginary numbers).
The FFT1024 feature in the Teensy audio lib deals only with ordinary real numbers. Internally it's feeding audio data into the FFT as real numbers. The imaginary part of the input is set to zero. The FFT math gives a complex (real+imaginary) number output for each frequency bin. Conceptually, each frequency has an amplitude (or "magnitude" would be the more mathematically correct term) and a phase shift which is relative to the 23.2 ms time period where the analysis was done. That is *why* FFT output must be real+imaginary numbers; you simply can't represent amplitude *and* phase shift with a single number! The audio library's FFT1024 is written with the assumption you're doing music visualization or other spectral analysis where you only care about how intense each frequency bin is, but you couldn't care less about the relative timing or phase shift between each frequency bin. So it combines the real & imaginary numbers into only a single "magnitude" output for each bin. The downside is you can't get the phase info (at least not using the simple object from the library) but the library is simpler to use for most ordinary projects where the phase info isn't important.
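To make the amplitude-versus-phase point concrete, here is a small sketch in plain Python. It is not the audio library's code: `dft_bin` is a hypothetical helper that computes one bin with a slow textbook DFT, and the 64-sample block is just to keep it quick. A sine and a cosine at the same frequency produce the same magnitude in that bin, but different phase, which is exactly the information you lose when only the magnitude is kept.

```python
import cmath
import math

def dft_bin(samples, k):
    """One bin of a naive DFT (hypothetical helper, for illustration only)."""
    n = len(samples)
    return sum(x * cmath.exp(-2j * math.pi * k * i / n)
               for i, x in enumerate(samples))

N = 64   # small block size chosen for the example, not 1024
K = 5    # the tone completes exactly 5 cycles in the block

sine = [math.sin(2 * math.pi * K * i / N) for i in range(N)]
cosine = [math.cos(2 * math.pi * K * i / N) for i in range(N)]

bin_sine = dft_bin(sine, K)
bin_cosine = dft_bin(cosine, K)

# Magnitude: how strong the frequency is. Identical for both signals.
mag_sine = abs(bin_sine)
mag_cosine = abs(bin_cosine)

# Phase: where the waveform sits in time within the block. Differs by 90 degrees.
phase_sine = cmath.phase(bin_sine)
phase_cosine = cmath.phase(bin_cosine)

print(mag_sine, mag_cosine)
print(math.degrees(phase_sine), math.degrees(phase_cosine))
```

The unscaled magnitude here comes out as N/2 for a unit-amplitude tone. How the library scales its own output is a separate question this sketch does not try to answer.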
FFT has a special property if you give it only real numbers as input (the imaginary part of all 1024 inputs is zero). The 2nd half of the output is a redundant mirrored copy of the first half (strictly, its complex conjugate, so the magnitudes are identical). So you put in 1024 real-only numbers, and you get out 1024 real+imaginary complex numbers, but only the first half of those numbers is meaningful. Many textbooks go into rigorous proofs of this property, which is great if you're a mathematician, but a distraction if you only want to learn how to actually use FFT. Rather than talk in terms of hard-to-follow equations, I'll briefly mention this is related to Nyquist sampling theory, which says time-sampled (real, not imaginary numbers) data can only represent frequencies up to 1/2 of the sample rate. Again textbooks go into too much math to prove a point I believe pretty much everyone accepts, that 44 kHz sampled data represents 22 kHz of audio bandwidth.
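If you'd rather see the mirroring than read a proof, here is a rough sketch using a slow textbook DFT in plain Python. The `dft` helper, the 16-sample block, and the random test data are all made up for the example. It feeds in real-only numbers and checks that each bin in the second half is the complex conjugate of its partner in the first half.

```python
import cmath
import math
import random

def dft(samples):
    """Naive O(N^2) DFT returning all N complex bins (illustration only)."""
    n = len(samples)
    return [sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(samples))
            for k in range(n)]

random.seed(1)
N = 16   # small block size for the example
real_input = [random.uniform(-1.0, 1.0) for _ in range(N)]   # imaginary part is zero

spectrum = dft(real_input)

# Bin N-k should be the complex conjugate of bin k, so its magnitude is the same.
mirrored = all(abs(spectrum[N - k] - spectrum[k].conjugate()) < 1e-9
               for k in range(1, N // 2))

# Bins that carry unique information: DC (0) through Nyquist (N/2).
unique_bins = N // 2 + 1

print("second half mirrors first half:", mirrored)
print("unique bins:", unique_bins)
```

With N = 1024 the same count gives 513, or 512 if the code drops the Nyquist bin.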
This is the reason why FFT implementations meant for real input, which covers pretty much everything you'd use for audio, give you only half as many output numbers as the size written in their descriptions. FFT1024 means it takes in 1024 audio samples. You get 512 (or sometimes 513, depending on whether the code includes the bin at exactly half the sample rate) frequency bin numbers output.
Hopefully this explains what the 1024 means. If you haven't read the tutorial or watched the video, please do. I put quite a bit more info in there about the practical realities of actually using FFT. Unless you're analyzing waveforms which are perfectly phase sync'd to the FFT (which is pretty much never for any ordinary signals), you really do need to use a window to avoid the spectral leakage problem. Again, hopefully I explained that well enough in the tutorial material?
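For anyone who wants to see spectral leakage in numbers, here is a rough sketch in plain Python. It is not the library's implementation: the DFT is the slow textbook kind, the block is only 64 samples, and the window is the standard raised-cosine formula rather than whatever table the library uses. The test tone completes 5.5 cycles in the block, so it is deliberately not sync'd to the FFT length.

```python
import cmath
import math

def dft_magnitudes(samples):
    """Magnitudes of bins 0..N/2 from a naive O(N^2) DFT (illustration only)."""
    n = len(samples)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples)))
            for k in range(n // 2 + 1)]

N = 64
CYCLES = 5.5   # not a whole number of cycles, so the block ends mid-waveform

signal = [math.sin(2 * math.pi * CYCLES * i / N) for i in range(N)]

# Raised-cosine (Hann / "Hanning") window: tapers both ends of the block to zero.
hann = [0.5 - 0.5 * math.cos(2 * math.pi * i / N) for i in range(N)]
windowed = [s * w for s, w in zip(signal, hann)]

plain_mags = dft_magnitudes(signal)
hann_mags = dft_magnitudes(windowed)

# Add up what lands in bins 12 and above, well away from the real tone near bin 5.5.
far_leak_plain = sum(plain_mags[12:])
far_leak_hann = sum(hann_mags[12:])

print(f"far-bin leakage, no window:   {far_leak_plain:.3f}")
print(f"far-bin leakage, Hann window: {far_leak_hann:.3f}")
```

Without the window, the tone smears into bins that have nothing to do with it. With the window, the far-away bins stay close to zero, at the cost of a slightly wider peak around the real frequency.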