
Thread: Excellent results with Floating Point FFT/IFFT Processing and Teensy 3.6

  1. #1
    Junior Member
    Join Date
    Sep 2016
    Posts
    13

    Excellent results with Floating Point FFT/IFFT Processing and Teensy 3.6

    I'm not sure where to post this, but I'm very pleased with the results so far from testing FFT/IFFT processing using the 32-bit floating point versions of the arm_math library on the Teensy 3.6 and its FPU.

    Previously, passing audio through pairs of the q15-based arm_math library FFT/IFFT functions gave very noisy results, due to the limitations of the 16-bit depth. Using arm_cfft_radix4_f32 as pretty much a drop-in replacement for the arm_cfft_radix4_q15 function, the results are dramatically better.
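
To illustrate the bit-depth point above, here is a minimal host-side sketch (plain C++, not Teensy code). It does not call the arm_math functions and does not reproduce their internal arithmetic. It assumes a simplified model: a naive DFT scaled by 1/N, every spectrum value rounded either to a 16-bit fixed-point grid or to 32-bit float, then an inverse DFT. All names here (`to_q15_grid`, `to_float32`, `round_trip_snr_db`) are hypothetical helpers made up for this example.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <limits>
#include <vector>

using cplx = std::complex<double>;
using Quantizer = double (*)(double);

// Round to the nearest multiple of 2^-15 (a crude stand-in for q15 storage).
double to_q15_grid(double v) { return std::round(v * 32768.0) / 32768.0; }

// Round to the nearest 32-bit float (stand-in for f32 storage).
double to_float32(double v) { return static_cast<double>(static_cast<float>(v)); }

// Naive O(N^2) forward DFT, scaled by 1/N so the bins stay below 1.0.
std::vector<cplx> dft_scaled(const std::vector<double>& x) {
    const std::size_t n = x.size();
    const double pi = std::acos(-1.0);
    std::vector<cplx> X(n);
    for (std::size_t k = 0; k < n; ++k) {
        cplx acc(0.0, 0.0);
        for (std::size_t t = 0; t < n; ++t) {
            const double ang = -2.0 * pi * static_cast<double>(k * t) / static_cast<double>(n);
            acc += x[t] * cplx(std::cos(ang), std::sin(ang));
        }
        X[k] = acc / static_cast<double>(n);
    }
    return X;
}

// Naive inverse DFT (unscaled, matching the 1/N applied above); real part only.
std::vector<double> idft_real(const std::vector<cplx>& X) {
    const std::size_t n = X.size();
    const double pi = std::acos(-1.0);
    std::vector<double> out(n);
    for (std::size_t t = 0; t < n; ++t) {
        cplx acc(0.0, 0.0);
        for (std::size_t k = 0; k < n; ++k) {
            const double ang = 2.0 * pi * static_cast<double>(k * t) / static_cast<double>(n);
            acc += X[k] * cplx(std::cos(ang), std::sin(ang));
        }
        out[t] = acc.real();
    }
    return out;
}

// SNR in dB of a 1 kHz sine (44.1 kHz sample rate, amplitude 0.5) after
// DFT -> quantise every bin -> inverse DFT.
double round_trip_snr_db(std::size_t n, Quantizer quantize) {
    assert(n > 0);
    const double pi = std::acos(-1.0);
    std::vector<double> x(n);
    for (std::size_t t = 0; t < n; ++t) {
        x[t] = 0.5 * std::sin(2.0 * pi * 1000.0 * static_cast<double>(t) / 44100.0);
    }
    std::vector<cplx> X = dft_scaled(x);
    for (cplx& bin : X) {
        bin = cplx(quantize(bin.real()), quantize(bin.imag()));
    }
    const std::vector<double> y = idft_real(X);
    double signal = 0.0;
    double error = 0.0;
    for (std::size_t t = 0; t < n; ++t) {
        signal += x[t] * x[t];
        error += (y[t] - x[t]) * (y[t] - x[t]);
    }
    if (error == 0.0) {
        return std::numeric_limits<double>::infinity();
    }
    return 10.0 * std::log10(signal / error);
}
```

Comparing `round_trip_snr_db(256, to_q15_grid)` with `round_trip_snr_db(256, to_float32)` should show a much lower SNR for the fixed-point grid. The absolute figures from this toy model are not expected to match the measurements in this post, since the real library's scaling and rounding are different.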

    For example, with a 1024 FFT/IFFT length, artefact tone levels dropped from around -40/-45 dB to around -78 dB. Base level white noise went down from around -63 dB at, say, 5 kHz to occasional peaks of about -78 dB, mostly at much lower levels than that.

    I have attached some spectrographs of the FFT/IFFT reconstruction of a 1 kHz sine wave to show the differences, both at 1024 and 256 lengths. Note that so far there is no shaped windowing or overlapping etc. done at all, so I'm hoping for some reduction in some of the artefact peaks on these graphs with some window processing.

    Music played from iTunes through the FFT/IFFT pair at either 1024 or 256 lengths is quite acceptable for casual listening at least. There is certainly a slight increase in background hiss noticeable with more careful listening.

    Now I'm ready to start on the timestretching, pitch shifting, sound tone freezing, sound blending across bins, and so on.
    I expect this will take a while to get together though. I am hoping it will be usable in a realtime performance context eventually.

    For the graphs shown, the audio is played into Line-in on the Audio shield and received via a record 'queue' Library object, an FFT/IFFT is performed on it, and it is then sent to Line-out via a play 'queue' Library object.

    Measured Latency data -
    1024 length FFT/IFFT passthrough: 29.57 ms (23.22 ms of that would be serialisation delay)

    256 length FFT/IFFT passthrough: 12.15 ms (5.8 ms of that would be serialisation delay)

    For reference, for a 128 length Audio block passthrough via Library 'queue' objects without any FFT/IFFT processing, the latency measures as 9.25 ms (2.9 ms of that would be serialisation delay)
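
The serialisation figures above follow from block length divided by sample rate. A minimal sketch of that arithmetic, assuming the 44.1 kHz sample rate used elsewhere in this thread (the function names are hypothetical):

```cpp
#include <cassert>
#include <cmath>

// Time needed to clock `samples` samples through at `sample_rate_hz`, in ms.
double serialisation_ms(int samples, double sample_rate_hz = 44100.0) {
    assert(samples > 0 && sample_rate_hz > 0.0);
    return 1000.0 * static_cast<double>(samples) / sample_rate_hz;
}

// What is left of a measured latency after removing one block of serialisation.
double residual_ms(double measured_ms, int samples, double sample_rate_hz = 44100.0) {
    return measured_ms - serialisation_ms(samples, sample_rate_hz);
}
```

`serialisation_ms(1024)`, `serialisation_ms(256)` and `serialisation_ms(128)` give about 23.22, 5.80 and 2.90 ms. Subtracting those from the measured 29.57, 12.15 and 9.25 ms leaves roughly 6.35 ms in all three cases, so in these measurements the non-serialisation part of the latency barely changes with FFT length.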

    A big thanks to everyone who helped get me to this point, particularly DerekR and Duff. I suspect the next bits will be harder :-)

    Let me know if you have any questions.


    Attached spectrographs:
    TEENSY_IFFT_Q15_256_Sine1k_Sgraph.jpg
    TEENSY_IFFT_Q15_1024_Sine1k_Sgraph.jpg
    TEENSY_IFFT_32F_256_Sine1k_Sgraph.jpg
    TEENSY_IFFT_32F_1024_Sine1k_Sgraph.jpg

  2. #2
    Senior Member
    Join Date
    Oct 2015
    Location
    Vermont, USA
    Posts
    190
    Thanks for this post. I, too, have found the Int32 and Float32 FFTs to be very nice. They're also quite fast on the Teensy 3.6:

    https://openaudio.blogspot.com/2016/...6-is-fast.html

    This part of your message caught my eye:

    Quote Originally Posted by Ray.E View Post
    For reference, for a 128 length Audio block passthrough via Library 'queue' objects without any FFT/IFFT processing, the latency measures as 9.25 ms (2.9 ms of that would be serialisation delay)
    Can you say a little more about how you did this test? For my audio processing projects, latency will end up being quite important. If a straight pass-through of 128 points takes 9.25 ms, instead of 128/44100 x 2 = 5.8 ms as expected, that could be problematic.

    How did you do your latency test?

    Thanks,

    Chip

  3. #3
    Junior Member
    Join Date
    Sep 2016
    Posts
    13
    Chip,
    Just to be pedantic a little for clarity, I'm looking to measure the _additional_ latency of passing through the teensy. A wire connecting two devices directly will have a 'latency' (all serialisation) of 128/44100 x 1 in this case. Putting a Teensy in the path adds additional latency - another 128/44100 serialisation (2.9 ms) plus processing time.

    That said, for my processing in the Teensy I'm using the Audio Library queue functions to get a queue receive buffer, receive the 128 samples into the buffer, then get an output queue buffer, copy the input to the output (memcpy), then send the buffer to the output queue. Not exactly direct, but it gives me visibility of the samples at least, and I don't have the knowledge to do native access to the Audio shield. The cost is the processing time for all this.

    The Audio Library also offers a more direct input-to-output streaming connection with no sketch code involved. With a Teensy 3.2/Audio Shield this has an additional latency figure (including serialisation) of around 6.3 ms. Even this is generalised library code to some extent, so it may be possible to optimise it further.

    Unfortunately I can't test this with Teensy 3.6 as the Audio library function for this direct streaming currently has a bug for the 3.6, at least at high clock rates.

    In more general terms, I measure the additional latency by comparing the timing of a purely wired analogue audio loop with an analogue wired loop including the Teensy. This is done with Metric Halo Spectrafoo's Transfer Function tool, set to detect the _difference_ in timing between audio from the purely wired loop compared with the loop including the Teensy. I've found it to be quite accurate. I can describe the setup in more detail if it is of interest.
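
For readers without that tool, the idea of finding the timing difference between two captures of the same audio can be sketched with a plain time-domain cross-correlation. This is a swapped-in technique for illustration only; it is not a description of how the Transfer Function tool works internally. All function names are hypothetical, and `make_noise` and `delay_by` exist only to build test signals.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Lag (0..max_lag samples) at which `delayed` best lines up with `reference`,
// found by brute-force cross-correlation.
std::size_t estimate_delay_samples(const std::vector<double>& reference,
                                   const std::vector<double>& delayed,
                                   std::size_t max_lag) {
    assert(reference.size() == delayed.size());
    assert(max_lag < reference.size());
    std::size_t best_lag = 0;
    double best_score = 0.0;
    for (std::size_t lag = 0; lag <= max_lag; ++lag) {
        double score = 0.0;
        for (std::size_t i = lag; i < reference.size(); ++i) {
            score += reference[i - lag] * delayed[i];
        }
        if (lag == 0 || score > best_score) {
            best_score = score;
            best_lag = lag;
        }
    }
    return best_lag;
}

// Convert a delay in samples to milliseconds (44.1 kHz assumed by default).
double samples_to_ms(std::size_t samples, double sample_rate_hz = 44100.0) {
    return 1000.0 * static_cast<double>(samples) / sample_rate_hz;
}

// Deterministic pseudo-random test signal in [-1, 1), from a simple LCG.
std::vector<double> make_noise(std::size_t n, std::uint32_t seed = 1u) {
    std::vector<double> v(n);
    std::uint32_t state = seed;
    for (double& s : v) {
        state = state * 1664525u + 1013904223u;
        s = static_cast<double>(state >> 8) / 8388608.0 - 1.0;
    }
    return v;
}

// Copy of `x` shifted later by `d` samples, zero-padded at the start.
std::vector<double> delay_by(const std::vector<double>& x, std::size_t d) {
    std::vector<double> y(x.size(), 0.0);
    for (std::size_t i = d; i < x.size(); ++i) {
        y[i] = x[i - d];
    }
    return y;
}
```

As a usage example, delaying `make_noise(4096)` by 280 samples with `delay_by` and passing both signals to `estimate_delay_samples` with `max_lag` of 512 should recover the 280-sample lag, which `samples_to_ms` converts to about 6.35 ms. Broadband test audio works best here; a pure sine would give ambiguous peaks one period apart.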

    Hope this helps.

  4. #4
    Senior Member
    Join Date
    Oct 2015
    Location
    Vermont, USA
    Posts
    190
    Thanks for the added detail.

    I like the fact that you're comparing to literally a purely wired loop. This directly addresses my concerns that you might have been using your soundcard as an oscilloscope, but NOT allowing for the possibility that your soundcard had its own latency. It sounds like your test method is excellent.

    Thanks,

    Chip

  5. #5
    Senior Member PaulStoffregen's Avatar
    Join Date
    Nov 2012
    Posts
    20,597
    Quote Originally Posted by Ray.E View Post
    Unfortunately I can't test this with Teensy 3.6 as the Audio library function for this direct streaming currently has a bug for the 3.6, at least at high clock rates.
    Can you please point me to more info about this bug? I've missed quite a lot of forum posts over the last few weeks.

  6. #6
    Senior Member
    Join Date
    Nov 2012
    Posts
    1,155
    @Paul:
    I reported some problems with audio in the K66 beta thread.

    Can't get microphone or line-in audio to play straight through.
    The code in this message still doesn't work on a new T3.6
    msg #936 https://forum.pjrc.com/threads/34808...ta-Test/page38

    Then see #955 on the next page

    These are specifically about clock speed
    #997 and #999 on page 40

    There were earlier messages from others about I2S problems in the K66 beta thread.

    Pete

  7. #7
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    9,777
    @Pete: on windows at least with edge/ie you can right click on the post number and copy link to give direct access:

    936
    955
    997
    999

  8. #8
    Senior Member
    Join Date
    Nov 2012
    Posts
    1,155
    @defragster: Thanks. Works with Firefox too.

    Pete

  9. #9
    Senior Member
    Join Date
    Oct 2015
    Location
    Vermont, USA
    Posts
    190
    Quote Originally Posted by el_supremo View Post
    @Paul:
    Can't get microphone or line-in audio to play straight through.
    The code in this message still doesn't work on a new T3.6
    I, too, have problems with any sort of straight-through signal chain...i2s_in -> i2s_out. I tried to get smart and insert a "dummy" filter algorithm, but the optimizer or whatever sees right through my ploy. Works on Teensy 3.2, doesn't on Teensy 3.6.

    My code is below, and on my github:

    https://github.com/chipaudette/OpenA...Through_LineIn

    Chip


    Code:
    #include <Audio.h>
    #include <Wire.h>
    #include <SPI.h>
    #include <SD.h>
    #include <SerialFlash.h>
    
    class AudioFilterEmpty : public AudioStream
    {
      public:
        // const char* so that string literals such as "Filter1" can be passed legally
        AudioFilterEmpty(const char *foo_txt) : AudioStream(1, inputQueueArray) {myName = foo_txt; }
        const char* myName;
        void update(void)
        {
          audio_block_t *block;
          block = receiveWritable();
          if (!block) {
            return;
          }
          transmit(block);
          release(block);
        }
    
      private:
        audio_block_t *inputQueueArray[1];
    };
    
    AudioInputI2S            i2s1;         
    AudioOutputI2S           i2s2;        
    AudioFilterEmpty         filter1("Filter1");
    AudioFilterEmpty         filter2("Filter2");
    AudioConnection          patchCord1(i2s1, 0, filter1, 0);
    AudioConnection          patchCord2(i2s1, 1, filter2, 0);
    AudioConnection          patchCord3(filter1, 0, i2s2, 1);
    AudioConnection          patchCord4(filter2, 0, i2s2, 0);
    AudioControlSGTL5000     sgtl5000_1;     
    
    void setup() {
      AudioMemory(20);
      delay(500);
    
      // Enable the audio shield and set the output volume.
      sgtl5000_1.enable();
      sgtl5000_1.inputSelect(AUDIO_INPUT_LINEIN);
      sgtl5000_1.volume(0.45); //headphone volume
    }
    
    void loop() {
       delay(20);
    }

  10. #10
    Senior Member+ Frank B's Avatar
    Join Date
    Apr 2014
    Location
    Germany NRW
    Posts
    5,679
    Interesting. Is it speed-dependent? Does it work with 96 MHz on the 3.6?

  11. #11
    Senior Member
    Join Date
    Nov 2012
    Posts
    1,155
    Yep, I noted speed dependence in https://forum.pjrc.com/threads/34808...l=1#post110959

    Pete

  12. #12
    Senior Member
    Join Date
    Oct 2015
    Location
    Vermont, USA
    Posts
    190
    Quote Originally Posted by Frank B View Post
    Interesting. Is it speed-dependent? Does it work with 96 MHz on the 3.6?
    Great question. I didn't think to try that.

    It turns out that it works at 96 MHz and 120 MHz, but it doesn't work at 144, 168, or 180 MHz.

    Does that help locate the problem?

    Chip

    Updated (simplified) code is on my GitHub and below:

    Code:
    #include <Audio.h>
    #include <Wire.h>
    #include <SPI.h>
    #include <SD.h>
    #include <SerialFlash.h>
    
    AudioControlSGTL5000     sgtl5000_1;    
    AudioInputI2S            i2s1;         
    AudioOutputI2S           i2s2;
    
    //simplest pass-through  (On Teensy 3.6: works at 96 MHz and 120MHz.  Not at 144, 168, or 180 MHz)
    AudioConnection          patchCord1(i2s1, 0, i2s2, 0);
    AudioConnection          patchCord2(i2s1, 1, i2s2, 1);
    
    void setup() {
      Serial.begin(115200);
      delay(500);
      Serial.println("Pass-Through Line-In to Headphone...");
      
      AudioMemory(20);
      delay(250);
    
      // Enable the audio shield and set the output volume.
      sgtl5000_1.enable();
      sgtl5000_1.inputSelect(AUDIO_INPUT_LINEIN);
      sgtl5000_1.volume(0.45); //headphone volume
      //sgtl5000_1.lineInLevel(11, 11); //max is 15, default is 5
    }
    
    void loop() {
       delay(20);
    }

  13. #13
    Senior Member
    Join Date
    Oct 2015
    Location
    Vermont, USA
    Posts
    190
    Quote Originally Posted by el_supremo View Post
    Yep, I noted speed dependence in https://forum.pjrc.com/threads/34808...l=1#post110959

    Pete
    Ah, Pete's been all over this problem. Sorry to duplicate!

  14. #14
    Senior Member+ Frank B's Avatar
    Join Date
    Apr 2014
    Location
    Germany NRW
    Posts
    5,679
    Quote Originally Posted by chipaudette View Post
    Great question. I didn't think to try that.

    It turns out that it works at 96 MHz and 120 MHz, but it doesn't work at 144, 168, or 180 MHz.

    Does that help locate the problem?
    I hope NOT!
    These frequencies suggest that it has something to do with HSRUN. That would not be good. It could indicate a silicon bug.
    But we have to review the code first...
    Can you run a little test for me? I don't know whether it is related or not: on my 3.6 beta board and the pre-production board, I have massive problems with I2S output when using more than 192 MHz. Can you check this on your Teensy? Something with the timing is totally wrong and the output is very distorted.
    Unfortunately, these two are the only 3.6 boards I have at the moment. I'm still waiting for the others.
    Last edited by Frank B; 10-19-2016 at 09:02 PM.

  15. #15
    Senior Member
    Join Date
    Oct 2015
    Location
    Vermont, USA
    Posts
    190
    Quote Originally Posted by Frank B View Post
    I hope NOT!
    These frequencies suggest that it has something to do with HSRUN. That would not be good. It could indicate a silicon bug.
    But we have to review the code first...
    Can you run a little test for me? I don't know whether it is related or not: on my 3.6 beta board and the pre-production board, I have massive problems with I2S output when using more than 192 MHz. Can you check this on your Teensy? Something with the timing is totally wrong and the output is very distorted.
    Unfortunately, these two are the only 3.6 boards I have at the moment. I'm still waiting for the others.
    I don't know what you mean by more than 192MHz. Is that the clock rate of the I2S bus? Regardless, how do I change it to do your test?

    Chip

  16. #16
    Senior Member+ Frank B's Avatar
    Join Date
    Apr 2014
    Location
    Germany NRW
    Posts
    5,679
    Please try 216 MHz "Overclock" / 240 MHz "Overclock" and "PlaySynthMusic" from the examples. Up to 192 MHz, everything is OK on my 3.6. But not above.
    I'm sure that, for MY 3.6, it is a hardware problem.
    Last edited by Frank B; 10-19-2016 at 09:36 PM.

  17. #17
    Senior Member PaulStoffregen's Avatar
    Join Date
    Nov 2012
    Posts
    20,597
    Quote Originally Posted by chipaudette View Post
    It turns out that it works at 96 MHz and 120 MHz, but it doesn't work at 144, 168, or 180 MHz.
    Thanks. I've got this on my high-priority bug list now.

    At this moment I'm working on bringing SDIO support into the Arduino SD library. That's at the absolute top of my list right now!

    Will try to get as much of this other stuff as I can before 1.31-beta2. But realistically, I really want to get beta2 out ASAP with the most urgent fixes. This and quite a few other things might end up waiting another week for beta3.

    It's now on my short list, and we're *finally* recovering from Kickstarter and then a few weeks of being short-staffed due to much-needed vacations... so I will look at this soon.

  18. #18
    Senior Member+ Frank B's Avatar
    Join Date
    Apr 2014
    Location
    Germany NRW
    Posts
    5,679
    SDIO: Great news. I can't wait to publish the "official" version of "TeensyPlaysVideo" :-)

  19. #19
    Senior Member PaulStoffregen's Avatar
    Join Date
    Nov 2012
    Posts
    20,597
    Just to set expectations appropriately (low), this first version will read/write only 1 sector at a time using polling. Pretty much the same as the SPI code does, but with 4 bits parallel at 25 MHz.

  20. #20
    Senior Member+ Frank B's Avatar
    Join Date
    Apr 2014
    Location
    Germany NRW
    Posts
    5,679
    OK, perhaps that's fast enough. We'll see.

    Paul: Another note regarding I2S. I don't know whether this is useful to know or not, but I thought I should mention it: my problems with I2S at 216/240 MHz exist only with the SGTL5000, not with the PT8211.
    I wonder why, and I have no real explanation for this, since there are only small differences.

  21. #21
    Senior Member
    Join Date
    Oct 2015
    Location
    Vermont, USA
    Posts
    190
    Quote Originally Posted by Frank B View Post
    Please try 216 MHz "Overclock" / 240 MHz "Overclock" and "PlaySynthMusic" from the examples. Up to 192 MHz, everything is OK on my 3.6. But not above.
    I'm sure that, for MY 3.6, it is a hardware problem.
    My audio pass-through test does not work at any of the overclock settings. 120 MHz or below is as fast as it'll work with the Teensy 3.6.

    Arduino 1.6.11. Teensy Loader 1.30. USB set to "Serial". Teensy 3.6 from Kickstarter. Audio Shield Rev B.

    Chip
    Last edited by chipaudette; 10-20-2016 at 01:41 AM.

  22. #22
    Senior Member
    Join Date
    Oct 2015
    Location
    Vermont, USA
    Posts
    190
    Ooops, I just saw that you asked for "PlaySynthMusic". This is purely about sound generation, not about the Line-In. So, maybe it'll work at the higher speeds? We'll see. I repeated my tests:

    * 120 MHz: plays OK
    * 180 MHz: plays OK
    * 192 MHz: plays OK
    * 216 MHz: the song is recognizable but the pitches warble hilariously. Funny! And, it stopped playing unexpectedly part way through.
    * 240 MHz: plays OK

    So, my only trouble was with 216 MHz. Everything else was fine.

    By the way, on my PC (Thinkpad T410, Core i5 M540 @ 2.53 GHz, 8 GB RAM, non-SSD hard drive, Win7 64-bit), recompiling for the Teensy 3.6 takes longer than the song takes to play. It's quite a bit slower than I'd like, and slower than is comfortable for rapid-fire iterative programming.

    Hope this helps...

    Chip

  23. #23
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    9,777
    Quote Originally Posted by chipaudette View Post
    recompiling for the Teensy 3.6 takes longer than the song takes to play. It's quite a bit slower than I'd like, and slower than is comfortable for rapid-fire iterative programming.
    Chip - what IDE are you using? It seemed 1.6.12 put some attention to recompiling and sped it up.

    from below re-compile is much faster using the 1.6.12:
    [30 second full compile on desktop SSD with IDE 1.6.12 :: recompile 4-5 seconds]
    Last edited by defragster; 10-20-2016 at 04:23 AM.

  24. #24
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    9,777
    PlaySynthMusic on Beta2 T_3.6 - all these speeds sound fine and seem to play through at same speed.
    Using PJRC AUDIO shield :: plays to stop in about 1:22 (timed if noted **)::
    240 - FINE - >> F_BUS 120M
    216 - FINE - ** >> F_BUS 54M default and F_BUS 108M
    192 - FINE - ** >> F_BUS 48M
    180 - FINE - >> F_BUS 90M
    120 - FINE - ** >> F_BUS 60M
    [30 second full compile on desktop SSD with IDE 1.6.12 :: recompile 4-5 seconds]

    NOTE: I have my HSRUN changes on this machine - but that only affects EEPROM writes and Up front Serial# read.

    Nothing 'local' in libraries::
    Multiple libraries were found for "SD.h"
    Used: C:\arduino-1.6.12\hardware\teensy\avr\libraries\SD
    Not used: C:\arduino-1.6.12\libraries\SD
    Using library Audio at version 1.3 in folder: C:\arduino-1.6.12\hardware\teensy\avr\libraries\Audio
    Using library SPI at version 1.0 in folder: C:\arduino-1.6.12\hardware\teensy\avr\libraries\SPI
    Using library SD at version 1.0.8 in folder: C:\arduino-1.6.12\hardware\teensy\avr\libraries\SD
    Using library SerialFlash at version 0.4 in folder: C:\arduino-1.6.12\hardware\teensy\avr\libraries\SerialFlash
    Using library Wire at version 1.0 in folder: C:\arduino-1.6.12\hardware\teensy\avr\libraries\Wire
    <Update> ::
    Compiled for Beta T_3.5 and works at 120 MHz
    Failure of T_3.5 to run at 144M and 168 - is work needed for them to run or should those [and higher] be pulled from BOARDS.txt?
    Last edited by defragster; 10-20-2016 at 04:19 AM.

  25. #25
    Senior Member
    Join Date
    Oct 2015
    Location
    Vermont, USA
    Posts
    190
    Quote Originally Posted by defragster View Post
    Chip - what IDE are you using? It seemed 1.6.12 put some attention to recompiling and sped it up.

    from below re-compile is much faster using the 1.6.12:
    [30 second full compile on desktop SSD with IDE 1.6.12 :: recompile 4-5 seconds]
    I'm using 1.6.11. When I installed Teensyduino, it wouldn't let me install into 1.6.12, so I downgraded to 1.6.11. This is with the default downloadable Teensyduino installer for Windows. If you can run it on 1.6.12, and if it's faster, that's great! I'm looking forward to it!

    Chip
