I wanted to share a debugging technique I came across. Most of you will no doubt find it basic, but it may help other newbies like me.
I'm working on a project to use a Teensy 3.2 with FastLED to drive 8 strands of 550 WS2812 LEDs, and I'm trying to get it running at 60 frames per second.
That works if the pixel data is generated quickly onboard. But I want to send the pixel data over Ethernet, and once you account for the time it takes to process the data received over the network, I can only get up to about 8x425 pixels at full speed. So I've been trying to find the bottleneck and optimize that code.
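For context on how tight the 60 fps target is, here is a rough back-of-the-envelope calculation. It assumes the usual WS2812 figures of an 800 kHz data rate and 24 bits per LED, plus a 50 us latch gap between frames (that gap is an assumption; some parts need longer). The function names are made up for illustration. Since the 8 strands are driven in parallel, only the length of one strand matters.

```cpp
// Rough WS2812 frame timing. All names here are hypothetical helpers.
constexpr double kBitTimeUs = 1.25;   // 800 kHz data rate
constexpr int kBitsPerLed = 24;       // 8 bits each for G, R, B
constexpr double kResetUs = 50.0;     // assumed latch gap; check your LEDs' datasheet

// Time on the wire for one frame of one strand, in milliseconds.
constexpr double frameWireTimeMs(int ledsPerStrand) {
    return (ledsPerStrand * kBitsPerLed * kBitTimeUs + kResetUs) / 1000.0;
}

// Frame period at 60 frames per second, in milliseconds.
constexpr double kFrameBudgetMs = 1000.0 / 60.0;

// Time left in a 60 fps frame after the data has been clocked out.
constexpr double slackMs(int ledsPerStrand) {
    return kFrameBudgetMs - frameWireTimeMs(ledsPerStrand);
}

// frameWireTimeMs(550) is about 16.55 ms, against a budget of about 16.67 ms.
// frameWireTimeMs(425) is about 12.8 ms.
```

So at 550 LEDs per strand the wire time alone nearly fills the frame. With DMA the output can overlap with CPU work, but any CPU work that cannot overlap has to fit in whatever is left.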
This project uses FastLED with the OctoWS2811Controller.
I suspect that a lot of CPU time is being spent rearranging bytes from the CRGB array format to the format needed for DMA. But without a debugger or profiler, it's hard to actually measure that.
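To give an idea of the kind of rearranging I mean, here is a naive sketch. This is not FastLED's or OctoWS2811's actual code. It assumes the DMA buffer wants one byte per bit time, with one bit per strand, most significant bit first, and strand s in bit s (that bit assignment is my assumption). `transpose8` is a made-up name.

```cpp
#include <array>
#include <cstdint>

// Take one color byte from each of 8 strands and turn it into 8 output bytes.
// out[i] holds bit (7 - i) of every strand's byte, with strand s stored in bit s.
std::array<uint8_t, 8> transpose8(const std::array<uint8_t, 8>& in) {
    std::array<uint8_t, 8> out{};
    for (int bit = 0; bit < 8; ++bit) {
        uint8_t packed = 0;
        for (int strand = 0; strand < 8; ++strand) {
            // Pick out one bit of this strand's byte, MSB first.
            const uint8_t b = (in[strand] >> (7 - bit)) & 1u;
            packed |= static_cast<uint8_t>(b << strand);
        }
        out[bit] = packed;
    }
    return out;
}
```

Done this way, every frame of 8x550 pixels needs 550 x 3 = 1650 of these transposes, each with 64 shift-and-mask steps, so it is easy to imagine this adding up.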
So here's what I did. I hooked up a little Saleae logic analyzer to a couple of the free output pins. Right before the code I want to "time", I set one of those pins HIGH, and after the code, I set that pin to LOW. For example:
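Code:
void setup() {
  pinMode(0, OUTPUT);  // pin 0 is the timing marker watched by the logic analyzer
}
// later in my code:
digitalWriteFast(0, HIGH);  // rising edge: the measured section starts
// ... the code I want to measure goes here ...
digitalWriteFast(0, LOW);   // falling edge: the measured section ends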
The logic analyzer now gives me a nice little picture of exactly how much time is being spent in which parts of the code: the width of each HIGH pulse is the time spent in the section being measured.
You can use multiple pins at the same time to watch different parts of the code.
I could see from this analysis that 6.573 ms is being spent in that block of code on every pass through the loop. I'm only at the beginning of the journey to optimize this code, but at least I can measure it now!
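The multiple-pin idea could be wrapped up something like this. It is only a sketch: `ScopedPin`, `receiveFrame`, `showFrame` and the pin numbers are all made up for illustration, and the `#ifndef ARDUINO` part is just a stand-in so the same file also compiles on a PC. The small class sets a pin HIGH when it is created and LOW when it goes out of scope, so each pin shows one part of the loop on its own analyzer channel.

```cpp
#include <cstdint>

#ifndef ARDUINO
// Host-side stand-ins (for illustration only). On the Teensy these come from the core.
constexpr uint8_t HIGH = 1;
constexpr uint8_t LOW = 0;
constexpr uint8_t OUTPUT = 1;
static uint8_t hostPinState[8] = {};
inline void pinMode(uint8_t, uint8_t) {}
inline void digitalWriteFast(uint8_t pin, uint8_t value) { hostPinState[pin] = value; }
#endif

// Hypothetical pin assignments: one analyzer channel per section of code.
constexpr uint8_t kPinReceive = 0;
constexpr uint8_t kPinShow = 1;

// Sets a pin HIGH on construction and LOW on destruction,
// so the pulse width equals the time spent in the enclosing scope.
class ScopedPin {
public:
    explicit ScopedPin(uint8_t pin) : pin_(pin) { digitalWriteFast(pin_, HIGH); }
    ~ScopedPin() { digitalWriteFast(pin_, LOW); }
    ScopedPin(const ScopedPin&) = delete;
    ScopedPin& operator=(const ScopedPin&) = delete;
private:
    uint8_t pin_;
};

// Placeholders for the real work (hypothetical names, empty bodies).
void receiveFrame() {}
void showFrame() {}

void setup() {
    pinMode(kPinReceive, OUTPUT);
    pinMode(kPinShow, OUTPUT);
}

void loop() {
    {
        ScopedPin marker(kPinReceive);  // channel 0 is HIGH while receiving
        receiveFrame();
    }
    {
        ScopedPin marker(kPinShow);     // channel 1 is HIGH while showing
        showFrame();
    }
}
```

One thing to keep in mind: the pin writes themselves take a little time, so very short sections will look slightly longer than they really are.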
PS: You can use an oscilloscope instead of a logic analyzer if that's what you have handy.
PPS if anyone is familiar with FastLED's COctoWS2811Controller::showPixels function and has some ideas for making it faster, I'd love to hear them!