Thread: Everything works fine then dramatically slows after about a minute

  1. #1
    Junior Member
    Join Date
    Feb 2016
    Posts
    18

    Everything works fine then dramatically slows after about a minute

    Hi - I am running an 8x32 matrix on a Teensy 3.2 using FastLED's parallel output. I'm displaying a sinusoid function. It works fine for about a minute, then slows to a frame rate of about 2 FPS and keeps getting slower after that. Any ideas about what could be causing it?

    Code:
     
    
    #include "FastLED.h"
    uint8_t Width  = 32;
    uint8_t Height = 8;
     // y dimension  x=0, y=0 at lower left hand corner (pixel 8)
    
    float speed = 1.0; // speed of the movement along the Lissajous curves
    float size = 4;    // amplitude of the curves
    
    // NUM_LEDS = Width * Height
    #define NUM_LEDS_PER_STRIP 64
    // Note: this can be 12 if you're using a teensy 3 and don't mind soldering the pads on the back
    #define NUM_STRIPS 4
    #define NUM_LEDS      256
    #define BRIGHTNESS    100
    #define FPS 100
    #define FPS_DELAY 1000/FPS
    CRGB leds[NUM_LEDS];
    
    void setup() {
      LEDS.addLeds<WS2811_PORTD, NUM_STRIPS, GRB>(leds, NUM_LEDS_PER_STRIP);
      FastLED.setBrightness(BRIGHTNESS);
    }
    
    void loop() 
    {
    sinusoid();
      FastLED.show();
    //  FastLED.delay(FPS_DELAY);
    }
    
    void sinusoid()
    {
    for (uint8_t y = 0; y < Height; y++) {
        for (uint8_t x = 0; x < Width; x++) {
    
          float cx = y + float(size * (sinf (float(speed * 0.003 * (millis() ))) ) ) - (Width/2);  // the 8 centers the middle on a 16x16
          float cy = x + float(size * (cosf (float(speed * 0.0022 * (millis()))) ) ) - (Height/2);
          float v = 127 * (1 + sinf ( sqrtf ( ((cx * cx) + (cy * cy)) ) ));
          uint8_t data = v;
          leds[XY(x, y)].r = data;
    
          cx = x + float(size * (sinf (speed * float(0.0021 * (millis()))) ) ) - (Width/2);
          cy = y + float(size * (cosf (speed * float(0.002 * (millis() ))) ) ) - (Height/2);
          v = 127 * (1 + sinf ( sqrtf ( ((cx * cx) + (cy * cy)) ) ));
          data = v;
          leds[XY(x, y)].b = data;
    
          cx = x + float(size * (sinf (speed * float(0.0041 * (millis() ))) ) ) - (Width/2);
          cy = y + float(size * (cosf (speed * float(0.0052 * (millis() ))) ) ) - (Height/2);
          v = 127 * (1 + sinf ( sqrtf ( ((cx * cx) + (cy * cy)) ) ));
          data = v;
          leds[XY(x, y)].g = data;
    
        }
      }
    }
    // Helper function that translates from x, y into an index into the LED array
    uint16_t XY( uint8_t x, uint8_t y)
    {
      uint16_t ledNum;
      if ( x & 0x01)
      {
        // Odd rows run backwards
        ledNum = ((x+1) * Height) - (y+1);
      }
      else
      {
        // Even rows run forwards
        ledNum = ((x * Height) + y);
      }
    
    
      return ledNum;
    }
    Last edited by Theremingenieur; 01-23-2018 at 09:00 AM.

  2. #2
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    6,842
    Code is easier to read/scan if it is auto-formatted in the IDE ("Ctrl+T") before posting, and placed inside a CODE tag block using the "#" button on the REPLY toolbar.
    A delay of some sort in loop() might help show what is going on. I'm not sure whether calling show() too fast can cause issues. delay(10) is the easy option, but gating the update on an elapsedMillis variable lets loop() keep running between frames.

    Code:
    elapsedMillis wLED;
    
    void setup() {
      // ...
      wLED = 0;
    }
    
    void loop() 
    {
      if ( wLED > 10 ) {
        wLED -= 10;
        sinusoid();
        FastLED.show();
        // FastLED.delay(FPS_DELAY);
      }
    }

  3. #3
    Senior Member
    Join Date
    May 2017
    Posts
    141
    For something that gets slower and slower, I would be very suspicious of the formulas that use millis() in the calculation, although I don't have an explanation as to why. For example:
    cx = x + float(size * (sinf (speed * float(0.0021 * (millis()))) ) ) - (Width/2);

  4. #4
    Junior Member
    Join Date
    Feb 2016
    Posts
    18
    The floating-point calcs with the rapidly increasing millis() value are my suspect as well. I tried converting all of the calcs to integers and using the sin8 approximation in FastLED. I was about 30% successful. I converted the first of the three calcs to integers only, which had no effect on speed. As I converted the second and third calcs, things started going wrong with the colors. I tried for a few hours but couldn't get it. I think tonight I will just switch over to a Teensy 3.6, which should have no problem with the floating-point calcs and was also the board used in the example I'm basing this on. I'll let you know how it goes.

  5. #5
    Senior Member pictographer's Avatar
    Join Date
    May 2013
    Location
    San Jose, CA
    Posts
    633
    I'd suggest you narrow down the root cause by divide and conquer. If you just switch hardware, you might avoid the problem or you might delay it, but either way you won't learn what it was.

    Comment out half the code. See if it still slows down. If so, comment half of what remains. If not, the problem is in something you commented, so uncomment half. Etc. It shouldn't take more than half a dozen tests to narrow it down to a single line or give you a clue about which function is slowing down.

    If you decide to do this, you might as well grab the value of millis() once per iteration and stick it in a variable.

  6. #6
    Senior Member+ KurtE's Avatar
    Join Date
    Jan 2014
    Posts
    3,653
    Sorry I may be missing something obvious, but what is the purpose of passing in millis() to these functions?

    That is, if you look at the simple sub expressions like: (sinf (float(speed * 0.003 * (millis() )))
    After something like 2 minutes, you will be taking sinf(360.0). I assume the sinf code does something to normalize this to the 0 to 2π range, if my rusty math knowledge is correct. I wonder whether it does that with a remainder calculation or with repeated subtraction. If it is repeated subtraction, then each call will take longer as the argument grows.

    As for using sin8, I assume you did something to constrain the values passed in to be between 0 and 255 (i.e. the input theta is a uint8_t value).
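
    One way that constraint could look, as a hedged sketch in plain C++. It assumes the 8-bit angle convention in which 256 steps make one full turn, which is how FastLED's sin8 input is understood here; `radians_to_theta8` is a hypothetical helper, not a FastLED function.

```cpp
#include <cmath>
#include <cstdint>

// Convert an angle in radians to an 8-bit "binary angle", where 256 steps
// make one full turn. Negative and very large angles wrap around.
uint8_t radians_to_theta8(float radians) {
    const float two_pi = 6.28318530718f;
    float turns = radians / two_pi;
    turns -= std::floor(turns);  // keep only the fractional part of the turn count
    // The mask covers the rare case where rounding leaves turns at exactly 1.0.
    return static_cast<uint8_t>(static_cast<int>(turns * 256.0f) & 0xFF);
}
```

    The result would be the value handed to sin8 in place of the float angle. Note that sin8 returns an unsigned value centered on 128 rather than a signed one centered on 0, so the surrounding arithmetic has to be adjusted as well.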

    Edit: forgot to mention, if it were me, I would also instrument the code. In each pass I would probably calculate a delta time for the calculations and print out a sum of these every 100 times through the loop to see if they keep increasing. I might actually keep a separate sum for each of the calculations, and maybe one for the LED show call.
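
    A rough sketch of that kind of instrumentation, in plain C++ so the bookkeeping can be checked off the board. `PassTimer` is a hypothetical name; on the Teensy the per-pass value would come from something like an elapsedMicros variable wrapped around sinusoid(), and the reported total would go to Serial.println().

```cpp
#include <cstdint>

// Hypothetical accumulator: sums per-pass durations and reports the total
// once every `window` passes, so a steadily growing total is easy to spot.
struct PassTimer {
    uint32_t window;      // passes per report, e.g. 100
    uint32_t count = 0;   // passes seen in the current window
    uint64_t sum_us = 0;  // accumulated microseconds in the current window

    explicit PassTimer(uint32_t w) : window(w) {}

    // Record one pass. Returns true when a full window has been collected;
    // in that case `total_us` receives the sum and the window restarts.
    bool add(uint32_t delta_us, uint64_t &total_us) {
        sum_us += delta_us;
        if (++count < window) return false;
        total_us = sum_us;
        count = 0;
        sum_us = 0;
        return true;
    }
};
```

    One instance per measured section (each color calculation, the show call) would give the separate sums described above.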

  7. #7
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    6,842
    Nice to see the CODE # tag! It is readable now.

    Indeed - millis() was what I thought to find as a problem but couldn't bother to read the flattened code.

    How about something like ( millis() % 360 ), or whatever makes sense to get a usable 'remainder' and limit the value to a proper range?
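
    A minimal sketch of the range-limiting idea. Since sinf and cosf take radians, the wrap point used here is 2π rather than 360; that choice, and the helper name `wrap_phase`, are assumptions and not from the post above. Whether this is cheaper than the range reduction sinf does internally on the Teensy has not been measured here.

```cpp
#include <cmath>

// Reduce an angle in radians to roughly [0, 2*pi) before calling sinf/cosf.
float wrap_phase(float radians) {
    const float two_pi = 6.28318530718f;
    float r = std::fmod(radians, two_pi);
    if (r < 0.0f) r += two_pi;  // fmod keeps the sign of its first argument
    return r;
}
```

    Taking an integer remainder of millis() before scaling is a different operation: unless the wrap point lines up with a whole number of periods for every frequency in use, the pattern will jump each time the value wraps.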

  8. #8
    Senior Member PaulStoffregen's Avatar
    Join Date
    Nov 2012
    Posts
    18,337
    Just played with this code a bit. Indeed the trig functions become *much* slower when given angles far more than 2*PI. Whatever they're doing to turn such a huge number into the 0-2PI range takes quite a lot of CPU time.

    Here's one possible idea to keep the angles/phase inputs from growing without bound. Instead of using millis(), I just put in a fixed increment (based on the assumption you want a consistent frame rate and will add an elapsedMicros or similar delay in loop). All 6 phases are pre-computed at the beginning of the update, and if any have grown beyond 2*PI, they're trimmed before use with the trig functions.

    I also put a bit of code into loop() to print the number of microseconds taken for the computation. Before this change, it started around 80000 and after running for some time would grow to about 450000, which corresponds with the 2 Hz refresh rate you're seeing. This approach of incrementing the phase/angle and limiting it to 2*PI gives a pretty consistent 62000 to 74000 microsecond compute time, even after running for many minutes.

    Hope this helps?

    Code:
    #include "FastLED.h"
    uint8_t Width  = 32;
    uint8_t Height = 8;
     // y dimension  x=0, y=0 at lower left hand corner (pixel 8)
    
    float speed = 1.0; // speed of the movement along the Lissajous curves
    float size = 4;    // amplitude of the curves
    
    // NUM_LEDS = Width * Height
    #define NUM_LEDS_PER_STRIP 64
    // Note: this can be 12 if you're using a teensy 3 and don't mind soldering the pads on the back
    #define NUM_STRIPS 4
    #define NUM_LEDS      256
    #define BRIGHTNESS    100
    #define FPS 100
    #define FPS_DELAY 1000/FPS
    CRGB leds[NUM_LEDS];
    
    void setup() {
      LEDS.addLeds<WS2811_PORTD, NUM_STRIPS, GRB>(leds, NUM_LEDS_PER_STRIP);
      FastLED.setBrightness(BRIGHTNESS);
    }
    
    void loop()  {
      elapsedMicros usec=0;
      sinusoid();
      Serial.println(usec);
      FastLED.show();
    //  FastLED.delay(FPS_DELAY);
      
    }
    
    const unsigned long millis_increment = 80;
    float phase1 = 0.0;
    float phase2 = 0.0;
    float phase3 = 0.0;
    float phase4 = 0.0;
    float phase5 = 0.0;
    float phase6 = 0.0;
    
    void sinusoid() {
      phase1 += speed * 0.0030 * millis_increment;
      phase2 += speed * 0.0022 * millis_increment;
      phase3 += speed * 0.0021 * millis_increment;
      phase4 += speed * 0.0020 * millis_increment;
      phase5 += speed * 0.0041 * millis_increment;
      phase6 += speed * 0.0052 * millis_increment;
    
      const float pi2 = PI * 2.0;
    
      if (phase1 > pi2) phase1 -= pi2;
      if (phase2 > pi2) phase2 -= pi2;
      if (phase3 > pi2) phase3 -= pi2;
      if (phase4 > pi2) phase4 -= pi2;
      if (phase5 > pi2) phase5 -= pi2;
      if (phase6 > pi2) phase6 -= pi2;
      
      for (uint8_t y = 0; y < Height; y++) {
        for (uint8_t x = 0; x < Width; x++) {
    
          float cx = y + size * sinf(phase1) - (Width/2);  // offset by half the matrix width to center the pattern
          float cy = x + size * cosf(phase2) - (Height/2);
          float v = 127 * (1 + sinf ( sqrtf ( ((cx * cx) + (cy * cy)) ) ));
          uint8_t data = v;
          leds[XY(x, y)].r = data;
    
          cx = x + size * sinf(phase3) - (Width/2);
          cy = y + size * cosf(phase4) - (Height/2);
          v = 127 * (1 + sinf ( sqrtf ( ((cx * cx) + (cy * cy)) ) ));
          data = v;
          leds[XY(x, y)].b = data;
    
          cx = x + size * sinf(phase5) - (Width/2);
          cy = y + size * cosf(phase6) - (Height/2);
          v = 127 * (1 + sinf ( sqrtf ( ((cx * cx) + (cy * cy)) ) ));
          data = v;
          leds[XY(x, y)].g = data;
    
        }
      }
    }
    // Helper function that translates from x, y into an index into the LED array
    uint16_t XY( uint8_t x, uint8_t y)
    {
      uint16_t ledNum;
      if ( x & 0x01)
      {
        // Odd rows run backwards
        ledNum = ((x+1) * Height) - (y+1);
      }
      else
      {
        // Even rows run forwards
        ledNum = ((x * Height) + y);
      }
    
    
      return ledNum;
    }

  9. #9
    Senior Member PaulStoffregen's Avatar
    Join Date
    Nov 2012
    Posts
    18,337
    Just for fun, I tried running this on Teensy 3.6. With the FPU and 180 MHz clock speed, it does the computation in 2100 to 2500 us.

    Also tried the original code. Indeed a similar slowdown happens on Teensy 3.6, originally taking ~2600 us and slowing to ~22000 us after a few minutes.

  10. #10
    Junior Member
    Join Date
    Feb 2016
    Posts
    18
    Thanks guys! It does work! I had come at this a much clumsier way by using a counter instead of millis, and limiting the counter to about 20,000. That also worked but it had a seam when the counter restarted. Paul's solution is much more elegant and doesn't have a seam.
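
    A small sketch of why a wrapping counter can leave a seam. It assumes the counter is fed straight into the original formula in place of millis(), which is a guess about the setup described above; `seam_jump` is a hypothetical helper. The jump is zero only when rate * limit is a whole multiple of 2π, and with six different rates that cannot hold for all of them at one limit unless the rates are chosen for it.

```cpp
#include <cmath>
#include <cstdint>

// Size of the jump in sin(rate * counter) when the counter wraps from
// `limit` back to 0. Zero only if rate * limit is a whole multiple of 2*pi.
float seam_jump(float rate, uint32_t limit) {
    float before = std::sin(rate * static_cast<float>(limit));
    float after = std::sin(0.0f);
    return std::fabs(after - before);
}
```

    With the first rate from the original code (0.003) and a limit of 20000, the angle just before the wrap is about 60 radians, so the sine term steps by roughly 0.3 of its full amplitude in a single frame.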

    This was way better than desoldering the 3.2! The old chip has still got some tricks! I'm making a hat, which is why the Teensy isn't socketed: there isn't enough space. I'll post some more video when I get that done.


  11. #11
    Senior Member PaulStoffregen's Avatar
    Join Date
    Nov 2012
    Posts
    18,337
    I believe you may be able to roughly double your frame rate (reaching approx 30 Hz video speed) by moving the 6 trig functions outside the loop. They take the same input data during each loop iteration, so there's no need to recompute them for every pixel. This seems like the sort of thing the compiler should optimize, but perhaps it doesn't "know" the trig functions are purely a math function of their input?

    Code:
    #include "FastLED.h"
    uint8_t Width  = 32;
    uint8_t Height = 8;
     // y dimension  x=0, y=0 at lower left hand corner (pixel 8)
    
    float speed = 1.0; // speed of the movement along the Lissajous curves
    float size = 4;    // amplitude of the curves
    
    // NUM_LEDS = Width * Height
    #define NUM_LEDS_PER_STRIP 64
    // Note: this can be 12 if you're using a teensy 3 and don't mind soldering the pads on the back
    #define NUM_STRIPS 4
    #define NUM_LEDS      256
    #define BRIGHTNESS    100
    #define FPS 100
    #define FPS_DELAY 1000/FPS
    CRGB leds[NUM_LEDS];
    
    void setup() {
      LEDS.addLeds<WS2811_PORTD, NUM_STRIPS, GRB>(leds, NUM_LEDS_PER_STRIP);
      FastLED.setBrightness(BRIGHTNESS);
    }
    
    void loop()  {
      elapsedMicros usec=0;
      sinusoid();
      Serial.println(usec);
      FastLED.show();
    //  FastLED.delay(FPS_DELAY);
      
    }
    
    const unsigned long millis_increment = 80;
    float phase1 = 0.0;
    float phase2 = 0.0;
    float phase3 = 0.0;
    float phase4 = 0.0;
    float phase5 = 0.0;
    float phase6 = 0.0;
    
    void sinusoid() {
      phase1 += speed * 0.0030 * millis_increment;
      phase2 += speed * 0.0022 * millis_increment;
      phase3 += speed * 0.0021 * millis_increment;
      phase4 += speed * 0.0020 * millis_increment;
      phase5 += speed * 0.0041 * millis_increment;
      phase6 += speed * 0.0052 * millis_increment;
    
      const float pi2 = PI * 2.0;
    
      if (phase1 > pi2) phase1 -= pi2;
      if (phase2 > pi2) phase2 -= pi2;
      if (phase3 > pi2) phase3 -= pi2;
      if (phase4 > pi2) phase4 -= pi2;
      if (phase5 > pi2) phase5 -= pi2;
      if (phase6 > pi2) phase6 -= pi2;
    
      float s1 = sinf(phase1) * size;
      float c2 = cosf(phase2) * size;
      float s3 = sinf(phase3) * size;
      float c4 = cosf(phase4) * size;
      float s5 = sinf(phase5) * size;
      float c6 = cosf(phase6) * size;
      
      for (uint8_t y = 0; y < Height; y++) {
        for (uint8_t x = 0; x < Width; x++) {
    
          float cx = y + s1 - (Width/2);  // offset by half the matrix width to center the pattern
          float cy = x + c2 - (Height/2);
          float v = 127 * (1 + sinf ( sqrtf ( ((cx * cx) + (cy * cy)) ) ));
          uint8_t data = v;
          leds[XY(x, y)].r = data;
    
          cx = x + s3 - (Width/2);
          cy = y + c4 - (Height/2);
          v = 127 * (1 + sinf ( sqrtf ( ((cx * cx) + (cy * cy)) ) ));
          data = v;
          leds[XY(x, y)].b = data;
    
          cx = x + s5 - (Width/2);
          cy = y + c6 - (Height/2);
          v = 127 * (1 + sinf ( sqrtf ( ((cx * cx) + (cy * cy)) ) ));
          data = v;
          leds[XY(x, y)].g = data;
    
        }
      }
    }
    // Helper function that translates from x, y into an index into the LED array
    uint16_t XY( uint8_t x, uint8_t y)
    {
      uint16_t ledNum;
      if ( x & 0x01)
      {
        // Odd rows run backwards
        ledNum = ((x+1) * Height) - (y+1);
      }
      else
      {
        // Even rows run forwards
        ledNum = ((x * Height) + y);
      }
    
      return ledNum;
    }

  12. #12
    Junior Member
    Join Date
    Feb 2016
    Posts
    18
    I think that is probably right! I will check it tomorrow. Thank you so much for your help! You have saved me all the time I expected to spend resoldering everything. My coding chops are way out of practice. There are surprisingly few code examples of matrix effects that translate well to a simple WS2811 array. If I can pull a few together in one place I'll try to publish them, including this example. I love the Teensy and have used it for all kinds of practical things, but my understanding of the intersection of math and art is lacking.
