OctoWS2811 flicker

jimparis

Active member
Hi,

This is related to the "OctoWS2811 flicker when connected via USB" thread, but after further testing, USB is not the root cause, nor is it even necessary to show the problem.

It seems that OctoWS2811 has bus arbitration issues with the DMA whenever the CPU is doing anything else. With a small strip of 100 WS2812B LEDs (from Adafruit here, with the OctoWS2811 adapter board), the following code flickers very badly, particularly towards the end of the strip:

Code:
#include <OctoWS2811.h>

#define STRIP_LEN 100

static DMAMEM int displayMem[STRIP_LEN * 8];
static int drawMem[STRIP_LEN * 8];
OctoWS2811 octo(STRIP_LEN, displayMem, drawMem, WS2811_GRB | WS2811_800kHz);

void setup()
{
        Serial.begin(115200);
        octo.begin();
        for (int i = 0; i < STRIP_LEN * 8; i++)
                octo.setPixel(i, 0x010101);
}

void loop()
{
        static float foo = 1234.45;
        octo.show();
        while (octo.busy()) {
                float m = cos(sin(millis())) + 10;
                foo *= m;
                foo /= m;
        }
}

Looking at it with a logic analyzer, I sampled the Teensy's output for 10 seconds at 50 MHz and plotted a histogram of the "1" pulse widths. Here are all pulses shorter than 400ns:

t1-short.png

Here are the pulses longer than 400ns:

t1-long.png

The short pulses around 220ns are probably causing at least some of the problem -- looking further down the chain of LEDs, it appears that they're too short, and getting dropped. I think this is happening because the DMA that sets the output high is getting delayed. There may be other problems too; this is just the most obvious.

The raw data for those histograms (pulse width in ns, then number of pulses) is:
Code:
       219    146003
       220     86207
       239    215491
       240    274804
       259     86392
       260    166190
       279    294279
       280   1484490
       299   3423085
       300    487988
       319      3080
       320      1179
       699       148
       700       164
       719       373
       720      1142
       739      1923
       740       506
       759      2865
       760      2059
       779     11777
       780     11783
       799     21832
       800     46214
       819     68660
       820     90952
       839    110116
       840    361677
       859     28411
       860     94094
       879     68680
       880      4320
       899     16555
       900      5197
       919      1336
       920       681
       939       524
       940       631
       959        80
       960        41
 
Adjusting the timings to shift the center of the histograms closer to 400/800ns doesn't help. To validate my hardware, I wrote a test that uses the SPI peripheral at 1.2 MHz, with Continuous SCK and Continuous Selection to generate unbroken waveforms (as shown in Figure 45-76 in the reference manual). I'm sending:
Code:
  1 is 416.66 ns high, 833.33 ns low;
  0 is 833.33 ns high, 416.66 ns low.
With this, the lights are perfectly stable, with no flicker whatsoever. With the SPI FIFO being 4 entries deep and 12 bits per transfer (to simplify breaking up the bytes), this gives us almost 20 us of hardware buffer; maybe the DMA is good enough to feed that without hiccups? I dunno... It has the drawback of needing 4 times the RAM and can only handle one output pin.
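As a rough check of the "almost 20 us" figure, here is a small sketch of the arithmetic. It assumes each SPI bit lasts one 416.66 ns slot, so three slots make up one 1250 ns LED bit, as in the timings listed above. The names are made up for illustration and are not part of any library.

```cpp
// Assumption: one SPI bit occupies a 416.66 ns slot, i.e. three slots
// per 1250 ns LED bit (taken from the timings quoted above).
constexpr double kSlotNs = 1250.0 / 3.0;

// Time in ns covered by a FIFO holding `entries` transfers of
// `bitsPerEntry` bits each.
constexpr double fifoBufferNs(int entries, int bitsPerEntry)
{
    return entries * bitsPerEntry * kSlotNs;
}

// Number of LED bits carried by one SPI transfer (3 SPI bits per LED bit).
constexpr int ledBitsPerTransfer(int bitsPerEntry)
{
    return bitsPerEntry / 3;
}
```

With 4 entries of 12 bits, fifoBufferNs(4, 12) comes out at about 20000 ns, and each 12-bit transfer carries 4 LED bits, so every byte of pixel data splits into exactly two transfers.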
 
The bus arbitration does cause some jitter in the waveform timing.

On the LEDs I've tested here, and hundreds of other projects people have built, this small jitter has not caused any visibly noticeable problems.

I have no idea why it's not working for you.
 
Maybe it's specific to my strips, which I get here. Does my above code work for you with no flicker, on a 3.1 board? Would it be useful if I mailed you a strip that exhibits this problem?
 
StableWS2811

I wrote a library that utilizes the SPI hardware FIFO to eliminate jitter, while still feeding data via DMA so that the CPU is free to do other things. It ended up needing even more RAM than I thought, and can only handle one output at a time, but at least it's completely flicker-free.

https://github.com/jimparis/StableWS2811.git
 
I was able to get my hands on an older oscilloscope. I am clearly seeing jitter in the data line. It is consistently reproducible for me with the stock Plasma Animation example. The flicker is the worst near the end of my strips. Lowering the data rate to 400 kHz helps eliminate most of the visible flicker, but the jitter is still in the data line.

It is interesting to note that this only happens on Plasma, when I switch to the Rainbow or Basic Test there is no jitter in the signal. I have another sketch that rotates between modes and it was easy to test each of them.

Here is an example of what I'm seeing on the Oscilloscope: http://imgur.com/0yQxkVf You can clearly see the changes in timing. This doesn't happen with the other examples. Video: https://www.youtube.com/watch?v=3jkspH_u5HA

My setup:
Teensy 3.1 @ 96 MHz
OctoWS2811 Adaptor board.
Arduino 1.0.5 Linux 64bit
Teensyduino 1.18
8x 120px WS2812b strips (Thanks Ray Wu)
60A power supply.


@jimparis: I wasn't able to use StableWS2811 as a drop-in replacement for OctoWS2811. The sketch would upload then die and need to be reset. I was able to get the FlickerTest to run.
 
Since FlickerTest worked, my guess would be that your buffers aren't sized right. For OctoWS2811 with 800 LEDs (100 LEDs per strip), you'd use something like
Code:
DMAMEM int displayMemory[100 * 6];
int drawingMemory[100 * 6];
OctoWS2811 leds(100, displayMemory, drawingMemory, config);
whereas for StableWS2811, which only supports one strip, 800 LEDs would be:
Code:
DMAMEM uint32_t spiBuf[800 * 6];
uint8_t pixelBuf[800 * 3];
StableWS2811 leds(800, spiBuf, pixelBuf, config);
If you paste your full code I can look.
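To make the sizing rule easier to check, here is a minimal sketch of the arithmetic behind those declarations. The per-LED factors (6 ints for OctoWS2811, 6 uint32_t plus 3 bytes for StableWS2811) are the ones used in the snippets above; the helper names are hypothetical and not part of either library.

```cpp
#include <cstddef>
#include <cstdint>

// OctoWS2811: 8 strips in parallel, 3 bytes per LED, so each LED position
// takes 24 bytes = 6 four-byte ints. This applies to each of the two buffers.
constexpr std::size_t octoBufferInts(std::size_t ledsPerStrip)
{
    return ledsPerStrip * 6;
}

// StableWS2811 (sizes as used in this thread): a single strip with
// 6 uint32_t of SPI data plus 3 bytes of pixel data per LED.
constexpr std::size_t stableSpiWords(std::size_t leds)   { return leds * 6; }
constexpr std::size_t stablePixelBytes(std::size_t leds) { return leds * 3; }

// Total RAM in bytes used by each library's two buffers.
constexpr std::size_t octoTotalBytes(std::size_t ledsPerStrip)
{
    return 2 * octoBufferInts(ledsPerStrip) * sizeof(std::int32_t);
}

constexpr std::size_t stableTotalBytes(std::size_t leds)
{
    return stableSpiWords(leds) * sizeof(std::uint32_t) + stablePixelBytes(leds);
}
```

For 800 LEDs this works out to 4800 bytes for OctoWS2811 (8 strips of 100) and 21600 bytes for StableWS2811 (one strip of 800), which is why a buffer sized for the one will not do for the other.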
 
Jim, how many LEDs are you using?

OctoWS2811 does indeed get some jitter in the timing due to bus latency from the DMA. It's easy to see on a scope with infinite persistence.

Over the last 2 weeks, PJRC built a display with 4320 LEDs (a grid of 90 by 48). They're controlled by a single Teensy 3.1. Each output controls 540 LEDs (six rows of 90). I've been testing it extensively over the last few days. No flicker I can see.

I also developed a modified VideoDisplay that reads from a SD card instead of receiving the data over USB. I also used the audio library, so it can read audio and video from the SD card. A single Teensy 3.1 can play to all 4320 and output a 16 bit 44100 Hz audio stream via the DAC (which is only 12 bits, the low 4 audio bits are lost).

This 4320 LED project will be on display at Maker Faire next weekend. Really, it looks great, without any noticeable flickering.
 
I see it with as few as 120 LEDs, when it's really bad (unlucky code layout + USB connection active). I really suspect that it is related to the specific batch of LEDs; my WS2812B chips from Adafruit seem very different from those others are using, based on their 5V requirement, their timings being a little different from what others have measured, etc.

I'd be happy to mail you a strip that exhibits the problem, if you'd like.
 
Yes, I would definitely like to get some of those problematic LEDs for testing!

I have some extra reels (240 LEDs per reel) of the ones we just used, which work well. They were purchased from an Aliexpress merchant. How about we swap 240 LEDs?

I still have the original 1920 board. It's been reconfigured as 8 strips of 240. They're the oldest WS2812's (with 6 pins, not the "B" version). They also work great.
 
Paul, do you plan on releasing that code? I'd be very curious about the SD reading. I did something similar for the Artnet library (to record an Artnet stream on SD and play it back), but the frame rate drops with more than 480 LEDs...
 
Yes. In fact, since it'll probably be a while until I get around to "cleaning it up" for a proper release as OctoWS2811 version 1.2, I'm going to just dump all the raw, ugly Maker Faire code right here, right now!

This is for a demo to be shown in Freescale's booth. A few weeks ago they contacted me and wanted some way to showcase stuff people have made with their chips. I offered to build them a LED display if they covered the costs. So when you see references to stuff about Freescale, that's why.

First, here's the Teensy hardware. It's just the OctoWS2811 adaptor and WIZ820+SD adaptor we sell, soldered together with long pins.

freescale_demo_hardware.jpg

Actually, what you can't see is pin 4 is cut between the Teensy and WIZ820+SD, and between the WIZ820+SD and Octo board, pins 3 and 4 are shorted together. This routes the SD card's CS signal to pin 3. You can see SD.begin(3) in the code, but I want to be clear this tiny hack is needed. You can't use pin 4, because it's reserved for OctoWS2811.

I should also mention, this is using a work-in-progress version of OctoWS2811, which you can get from github.

https://github.com/PaulStoffregen/OctoWS2811

Version 1.1 can't work together with the audio library, so you must use this newer code which fixes that issue.

On the Teensy, the code you need is Freescale_Demo_3.ino ...and no, I'm not releasing the earlier 1 and 2 versions.... ;)

Then all you need to do is get "DEMO.BIN" onto your SD card. This part requires several steps.

First, you run Processing with movie2sdcard.pde. This is similar to movie2serial.pde for live playing, except it writes the data to a file. The filenames and other stuff are hard coded. Pay attention to the frame rate. You MUST edit the Processing code with the exact frame rate. The Processing code emits headers into the data with elapsed microseconds, which need to be correct for audio to stay in sync. In other words, if your video is 30 frames/sec, the code will emit 33333, 33333, 33334, 33333, etc, so over the long run the frames with 1 extra microsecond make the overall frame rate correct.
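One way to get that 33333, 33333, 33334 pattern is to take the difference between consecutive rounded-down timestamps. This is only a sketch of the idea and is not taken from movie2sdcard.pde; the function names are hypothetical.

```cpp
#include <cstdint>

// Microsecond timestamp at which frame `n` ends, for a video running at
// `fps` frames per second (integer division rounds down).
constexpr std::uint64_t frameEndMicros(std::uint64_t n, std::uint64_t fps)
{
    return (n * 1000000ULL) / fps;
}

// Duration to write into the header of frame `n` (1-based). Each value is
// the difference of two rounded-down timestamps, so the rounding error
// never accumulates from one frame to the next.
constexpr std::uint32_t frameMicros(std::uint64_t n, std::uint64_t fps)
{
    return static_cast<std::uint32_t>(frameEndMicros(n, fps) -
                                      frameEndMicros(n - 1, fps));
}
```

At 30 frames/sec, frames 1 through 4 get 33333, 33333, 33334 and 33333 us, and the first 30 frames add up to exactly 1000000 us.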

Unfortunately, the Processing code plays the video at 1X speed. If anyone knows a way to get Processing to just run as fast as possible, please let me know.

The .BIN file from the Processing code can play (if renamed to "DEMO.BIN"), but there's no audio. Adding the audio requires 3 steps.

First, extract the audio from the original video using ffmpeg. For example:

./ffmpeg -i Freescale1.mov -vn -f wav Freescale1.wav

Then use sox to convert from WAV to raw data, and adjust the sample rate. Teensy actually uses 48e6/1088 as its sample rate, which is slightly faster than 44100. Getting it exactly matched will keep your audio in sync. Here's the command:

./sox Freescale1.wav -c 1 -b 16 -r 44117.647 Freescale1.raw
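To see why the exact rate matters, here is a rough sketch of the drift you would get by leaving the audio at 44100 Hz. The only input taken from the post is the 48e6/1088 sample rate; the names and the drift formula are illustrative assumptions.

```cpp
// Assumption from the post above: the Teensy audio sample rate is 48e6 / 1088.
constexpr double kTeensySampleRate  = 48000000.0 / 1088.0;  // about 44117.647 Hz
constexpr double kNominalSampleRate = 44100.0;

// Seconds by which the audio runs ahead of the video after `seconds` of
// playback, if the file holds `fileRate` samples per second of video but
// the hardware consumes `playRate` samples per second.
constexpr double driftSeconds(double seconds, double fileRate, double playRate)
{
    return seconds * (1.0 - fileRate / playRate);
}
```

For one minute of video, driftSeconds(60.0, kNominalSampleRate, kTeensySampleRate) is about 0.024, so unresampled 44100 Hz audio would finish roughly 24 ms early per minute.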

After this, run the "addaudio" command. Source code is below. This reads the original .BIN file from Processing and the .RAW file with the audio and writes a new .BIN file with both streams.

As with everything else, this is an extremely ugly hack (in fact, I wrote all this code within the last 12 hours) with file names and other important stuff hard coded. Hopefully everything you need to edit is near the top of each file. Expect to have to do some fiddling. The addaudio program has only been tested on Mac OSX 10.7. It will probably work on Linux. Windows is less likely to be good.

Once you get the DEMO.BIN from addaudio, just copy it onto the SD card and Freescale_Demo_3.ino ought to be able to play it.
 

Attachments

  • Freescale_Demo_3.ino (5.6 KB)
  • movie2sdcard.pde (5.8 KB)
  • addaudio.c (2.7 KB)
You'll probably notice the Freescale_Demo_3.ino still has lots of Serial.print() stuff. It prints a line for every video frame, and a line for every audio frame.

The video frame lines show the frame time, and the elapsed time into the frame when all audio and video data has been read into buffers. The code spends the rest of the video frame running a busy loop. When I ran this here, with 4320 LEDs and 44117 Hz audio, it was printing numbers in the 15000 to 17000 range. In other words, about 50% CPU usage, since that's about half of the 33333 us video frame.

I also did some tests with video only, getting about 10000 to 11000, and reading the audio into memory but not running the audio library, getting about 13000 to 14000.

So I'd imagine this can still scale up quite a bit to larger LED arrays.
 
Thanks a lot! I see you're reading in 512-byte chunks from SD. In my code I use readBytes for each frame (each frame is 1492 bytes). I guess reading in smaller chunks is more efficient?
 
I have some "older" WS2812B strips from Adafruit that work just fine, but my "newer" ones flicker. The chips themselves are almost identical, but they do have slight differences in the LED die bonding, at least, so they're definitely different manufacturing runs:

Works fine:
fine.jpg

Flickers:
flickers.jpg
(note the difference in the bottom right pad)

I'll mail you a completely wired up test including 59 LEDs, Teensy, and OctoWS2811 adapter, just to make sure this is super easy for you to reproduce. It's currently running the code below, and you can just power it via USB to see the problem (PC not necessary; you can just use a charger). The brightness is kept low enough that the USB current limit should not be a problem. The flicker is very pronounced because there's a loop that purposely does a lot of CPU stuff while the LEDs are updating.

If you undefine "USE_OCTOWS2811", then StableWS2811 is used instead, and there is no flicker at all.

It might be interesting to hook up your 240 LEDs to this same test code, to see if you get any flicker there.

Code:
#include <OctoWS2811.h>
#include <StableWS2811.h>

/* Define this to 1 to use OctoWS2811 instead of StableWS2811 */
#define USE_OCTOWS2811 1

const int stripLen = 59;
const int config = WS2811_GRB | WS2811_800kHz;

#if defined(USE_OCTOWS2811) && USE_OCTOWS2811 != 0

/* OctoWS2811 */
static DMAMEM int displayBuffer[stripLen * 6];
static int drawBuffer[stripLen * 6];
const int ledStart = stripLen * 2;  /* 3rd output */
OctoWS2811 leds(stripLen, displayBuffer, drawBuffer, config);

#else

/* StableWS2811 */
static DMAMEM uint32_t spiBuf[stripLen * 6];
static uint8_t pixelBuf[stripLen * 3];
const int ledStart = 0;
StableWS2811 leds(stripLen, spiBuf, pixelBuf, config);

#endif

void setup() {
        leds.begin();
        leds.show();
}

elapsedMillis moveElapsed;
int moveLoc = 0;

void loop()
{
        if (moveElapsed > 50) {
                moveElapsed -= 50;
                moveLoc++;
                if (moveLoc >= stripLen)
                        moveLoc = 0;
        }

        for (int i = 0; i < stripLen; i++)
                leds.setPixel(ledStart + i, 0x010101);

        /* Red pixel in front, green in middle, blue trailing.  These
           are kept at a low brightness to keep power usage down, so
           that any flickering is more likely to be caused by software
           problems. */
        leds.setPixel(ledStart + ((moveLoc + 10) % stripLen), 0x110000);
        leds.setPixel(ledStart + ((moveLoc + 5)  % stripLen), 0x001100);
        leds.setPixel(ledStart + ((moveLoc + 0)  % stripLen), 0x000011);

        leds.show();

        /* This loop makes the flicker really bad and super obvious on
           OctoWS2811, although it's also present without it; you just
           need to look very carefully at the last few LEDs */
        static float foo = 1234.45;
        while (leds.busy()) {
                float m = cos(sin(millis())) + 10;
                foo *= m;
                foo /= m;
        }
}
 
Oh, yeah, this is something I learned with quite a bit of experimentation on the audio library. The Arduino SD library performs much better if you always read 512 bytes at a time, which corresponds to the SD card sector size.

In all this work, the SD library is definitely the CPU hog. I'd be really curious to see if Bill's newer code in SdFat makes a big difference. Not curious enough to divert my attention from Maker Faire prep, or even away from Teensyduino 1.19 and lots of other important stuff after Maker Faire. But curious to hear if someone else does all the hard work of investigating....
 
Sounds good. Please do put something printed inside the box, with the URL of this thread.

I'll probably end up working with these in mid-June. Quite a bit of stuff has piled up in the last few weeks while I've been focusing on Maker Faire. I also really want to get Teensyduino 1.19 published, with only bug fixes.

Investigating this flicker issue is pretty high on my priority list. This is pretty good timing, since I've now got a lot of spare LEDs and power supplies and other stuff here due to this Maker Faire build.
 
I used arbitrary lengths, but made a sd_card_read() function with a static buffer to reuse any partial chunks of previous 512 byte reads.

The fact that this extra memory-to-memory copy makes things much faster really speaks volumes about the opportunities to speed up the Arduino SD library. Maybe someday..... when I have lots of free time!
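For anyone wanting to try the same trick, here is a minimal sketch of the idea: serve arbitrary-length reads out of a buffer that is only ever refilled with whole 512-byte sectors. This is not the actual sd_card_read() from the demo; the class name and the readSector callback are hypothetical stand-ins for whatever does the real SD read.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <functional>

// Reads arbitrary lengths on top of a source that is only ever asked for
// whole 512-byte sectors. Leftover bytes from the last sector are kept
// and handed out first on the next call.
class SectorBufferedReader {
public:
    static constexpr std::size_t kSectorSize = 512;

    // Hypothetical callback: fills `dst` with up to 512 bytes and returns
    // how many bytes it actually wrote (0 at end of data).
    using ReadSector = std::function<std::size_t(std::uint8_t* dst)>;

    explicit SectorBufferedReader(ReadSector readSector)
        : readSector_(readSector), pos_(0), len_(0) {}

    // Copies up to `count` bytes into `dst`; returns the number copied.
    std::size_t read(std::uint8_t* dst, std::size_t count)
    {
        std::size_t done = 0;
        while (done < count) {
            if (pos_ == len_) {          // buffer empty: fetch one sector
                len_ = readSector_(buf_);
                pos_ = 0;
                if (len_ == 0)
                    break;               // end of data
                assert(len_ <= kSectorSize);
            }
            std::size_t n = len_ - pos_;
            if (n > count - done)
                n = count - done;
            std::memcpy(dst + done, buf_ + pos_, n);
            pos_ += n;
            done += n;
        }
        return done;
    }

private:
    ReadSector readSector_;
    std::uint8_t buf_[kSectorSize];
    std::size_t pos_;   // next unread byte in buf_
    std::size_t len_;   // number of valid bytes in buf_
};
```

On a Teensy the callback would wrap a 512-byte read from the open file. A caller can then ask for, say, a 1492-byte frame, and the underlying card is still only read in sector-sized pieces.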
 
To make Processing play the movie faster than 1X: in Processing 2.0 you can use myMovie.speed(1.5), myMovie.speed(2.0), or whatever factor you like, like this:

Code:
  frameRate(30);
  myMovie = new Movie(this, "anamedclip.mov");
  myMovie.speed(2.0);
  myMovie.loop();

BUT, I found that realtime playback out of Processing started to hit output issues (on an older-style Mac).
 
Check out my oscilloscope capture: you can see the ghosts of several jitters of up to ~120 ns, which is close to a 10% shift of the 1250 ns (1.25 us) bit period. Take a look: http://imgur.com/0yQxkVf You can see the waveform taking 8 sections (1.25 us each, one bit period at 800 kHz).

I don't doubt that you have massive numbers of LEDs working wonderfully. I need some help getting to the bottom of this. My issue is measured before it gets to the LEDs. I have a second Teensy and a second adapter board I can test with if needed.

Paul, What version of Java do you use?
 
The tests that I have been running have used the stock PlasmaAnimation (https://github.com/PaulStoffregen/O.../examples/PlasmaAnimation/PlasmaAnimation.ino)

Even when the cat6/data line is not connected to anything (no LEDs connected) I see this data-line jitter. If I add a delay line after the LEDs are updated, then it appears to run smoothly and the data signal is clear and consistent.
Code:
leds.show(); 
delayMicroseconds(3650); //1.25 per bit * 24 * 120px = 3600.  + 50 for the latch pause.
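The 3650 in that delay comes straight from the strip length, so here is a small sketch of the calculation for other lengths. It assumes 800 kHz timing (1.25 us per bit), 24 bits per LED, and takes the latch pause as a parameter; the names are made up for illustration.

```cpp
// WS2811 at 800 kHz: 1.25 us per bit and 24 bits per LED.
constexpr double kBitMicros  = 1.25;
constexpr int    kBitsPerLed = 24;

// Time in microseconds to clock out one strip of `ledsPerStrip` LEDs,
// plus `latchMicros` of idle time at the end (50 us in the snippet above).
constexpr double stripUpdateMicros(int ledsPerStrip, double latchMicros)
{
    return ledsPerStrip * kBitsPerLed * kBitMicros + latchMicros;
}
```

stripUpdateMicros(120, 50.0) gives the 3650 us used here. For the 540-LED outputs mentioned earlier in the thread it gives 16250 us, or roughly 61 updates per second at most.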

I'm thinking that something it is doing while the LEDs are updating is causing this jitter. 64-bit compiler? Something in the fastCosineCalc function? pgm_read_byte_near? inline?

PS. Paul, Have a blast at Maker Faire!
 
Off hand, you might be experiencing cache thrashing, if each of the buffers being processed start on the same cache boundary (if you have a N-way cache, and you are processing N+1 streams, all aligned to a cache boundary, each load/store will evict the cache of another stream). If you bias the buffers it might help. Alternatively, if you are processing the streams sequentially, it may make sense to use software prefetching. I don't know what the cache layout is for the Teensy's, so I'm just guessing.
 
I am experiencing this problem as well. We have an irregularly-shaped array of some 1920 LEDs (well within the capabilities of the Teensy 3.1), that will flicker when connected to the source computer (running Movie2Serial inside our custom graphics generation Processing sketch) even when simply running the Rainbow program.

We have the WS2811 strips from AliExpress, and have swapped almost all of them out in pursuit of a solution. We've added buffers, shielded signal cables, and other protective elements to eliminate signal issues, and it's now down to a technical level I don't have the knowledge to troubleshoot.

One route of inquiry we've explored is how the resulting frame rate in our Processing sketch (which is generating graphics that are scraped into the PImage and sent to the Teensy) affects this flicker. We have quite a bit going on, admittedly, with a Kinect tracking a user's hands and affecting a simple particle system to draw colored circles. If we eliminate the Kinect aspect, or the particle aspect (whereby the frame rate increases), we see the flickering occur at a higher frequency.

How could we diagnose and quantify the burden a Processing sketch is having on communication with Teensy? We've already planned in a future system to have all sensor and rendering activity performed on a second computer, with the Teensy video display generation happening on a dedicated computer with no processing burden.
 