Performance measurements with a logic analyzer

Status
Not open for further replies.

spolsky

Member
I just wanted to share a debugging technique I came across which most of you will no doubt think is basic, but may be helpful to other newbies like myself.

I'm working on a project to use a Teensy 3.2 with FastLED to drive 8 strands of 550 WS2812 LEDs, and I'm trying to get it running at 60 frames per second.

It works if the pixel data is generated quickly onboard. But I want to send the pixel data over Ethernet, and once I account for the time it takes to process the data received over the network, I can only get up to about 8x425 pixels at full speed. So I've been trying to find the bottleneck and optimize that code.

This project uses FastLED with the OctoWS2811Controller (see here for documentation).
I suspect that a lot of CPU time is being spent rearranging bytes from the CRGB array format into the format needed for DMA. But without a debugger or profiler, it's hard to actually measure that.
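
To give a feel for what that rearranging might involve, here is a naive sketch. It assumes (my assumption, not taken from the FastLED source) a DMA layout in which each output byte carries one bit for each of the 8 strands, most significant bit first. The function name is hypothetical and this is not FastLED's actual code.

```cpp
#include <cstdint>

// Assumed layout, for illustration only: for one color byte position, the DMA
// buffer holds 8 output bytes (MSB first), and bit s of each output byte
// belongs to strand s.
inline void transposeByteAcrossStrands(const uint8_t in[8], uint8_t out[8]) {
    for (int bit = 0; bit < 8; ++bit) {
        uint8_t packed = 0;
        for (int strand = 0; strand < 8; ++strand) {
            // Pick bit (7 - bit) of this strand's byte, so the MSB goes out first.
            const uint8_t b = (in[strand] >> (7 - bit)) & 1u;
            packed |= static_cast<uint8_t>(b << strand);
        }
        out[bit] = packed;
    }
}
```

Done this way, every color byte position costs 64 shift-and-mask steps, so 550 pixels per strand with 3 color bytes each comes to about 105,600 steps per frame. That is the kind of work the timing technique below can put a number on.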

So here's what I did. I hooked up a little Saleae logic analyzer to a couple of the free output pins. Right before the code I want to "time", I set one of those pins HIGH, and after the code, I set that pin to LOW. For example:

Code:
   void setup() { pinMode(0, OUTPUT); }

   // later in my code:
   digitalWriteFast(0, HIGH);
   // followed by the code I want to measure
   digitalWriteFast(0, LOW);

The logic analyzer now gives me a nice little picture of exactly how much time is being spent in which parts of the code, for example:

[Attached image: Screen Shot 2020-01-06 at 3.42.22 PM.jpg]

You can use multiple pins at the same time to watch different parts of the code.
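
One way to keep the HIGH/LOW pairs from getting out of sync when watching several regions is a small scope guard, sketched below. ScopedPinMarker and g_pinState are hypothetical names of mine. The #else branch is only a host-side stand-in so the snippet compiles off-target; on a Teensy the real digitalWriteFast from Arduino.h is used, and each pin still needs pinMode(pin, OUTPUT) in setup().

```cpp
#include <cstdint>

#ifdef ARDUINO
#include <Arduino.h>
#else
// Host-side stand-ins (hypothetical, for illustration only): record the pin
// level in an array instead of driving real hardware.
constexpr uint8_t HIGH = 1, LOW = 0;
static uint8_t g_pinState[64] = {};
inline void digitalWriteFast(uint8_t pin, uint8_t val) { g_pinState[pin] = val; }
#endif

// Drives Pin HIGH on construction and LOW on destruction, so the pin is high
// for exactly as long as the enclosing scope is executing.
template <uint8_t Pin>
class ScopedPinMarker {
public:
    ScopedPinMarker() { digitalWriteFast(Pin, HIGH); }
    ~ScopedPinMarker() { digitalWriteFast(Pin, LOW); }
    ScopedPinMarker(const ScopedPinMarker&) = delete;
    ScopedPinMarker& operator=(const ScopedPinMarker&) = delete;
};
```

Usage would look something like `{ ScopedPinMarker<0> whole; readPacket(); { ScopedPinMarker<1> inner; FastLED.show(); } }`, where readPacket() is a placeholder for your own code. Pin 0 then shows the whole pass and pin 1 shows just the nested part, each on its own analyzer channel.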

From this capture I could see that the measured block of code takes 6.573 ms on every pass through the loop. I'm only at the beginning of the journey to optimize this code, but at least I can measure it now!

PS you can use an oscilloscope instead of a logic analyzer if that's what you have handy.
PPS if anyone is familiar with FastLED's COctoWS2811Controller::showPixels function and has some ideas for making it faster, I'd love to hear them!
 
Yep, that is exactly correct. The speed of the protocol means that if you want to update 60 times a second for high-quality persistence-of-vision effects, you can only have 552 pixels per strand. On a Teensy, there is a nice library that lets you bang out 8 strands at once using DMA, so you can do 4416 pixels, in theory.
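
The per-strand limit can be roughly checked from the protocol timing. The sketch below assumes an 800 kHz data rate and 24 bits per pixel (30 us per pixel), plus a reset gap between frames whose length is my assumption and varies between chip revisions. maxPixelsPerStrand is a hypothetical helper, not part of any library.

```cpp
#include <cstdint>

// Assumed WS2812 timing: 800 kHz bit rate, 24 bits per pixel.
constexpr uint32_t kBitRateHz = 800000;
constexpr uint32_t kBitsPerPixel = 24;

// Largest number of pixels one strand can be sent per frame at the given
// frame rate, after reserving resetGapUs microseconds for the latch/reset gap.
constexpr uint32_t maxPixelsPerStrand(uint32_t fps, uint32_t resetGapUs) {
    const uint64_t frameNs = 1000000000ull / fps;
    const uint64_t pixelNs = 1000000000ull * kBitsPerPixel / kBitRateHz;  // 30000 ns
    const uint64_t resetNs = static_cast<uint64_t>(resetGapUs) * 1000;
    return frameNs > resetNs
               ? static_cast<uint32_t>((frameNs - resetNs) / pixelNs)
               : 0;
}

// At 60 frames per second:
static_assert(maxPixelsPerStrand(60, 50) == 553, "50 us reset gap");
static_assert(maxPixelsPerStrand(60, 100) == 552, "100 us reset gap");
```

So depending on the reset gap you assume, the answer lands at 552 or 553 pixels per strand, in line with the figure above, and 8 x 552 gives the 4416 total.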

There are still reasons to prefer WS281x LED strips (one wire for the protocol instead of two, and there are 12 V variants which don’t need power injection as often).
 