Is FastLED / WS2812Serial Broken in Teensyduino 1.59?

PaulStoffregen · Apr 4, 2024

jmarsh said:
What if you put that same code (to dump the vtable address and init function pointer) in the ClocklessController constructor?

Something very interesting happens. I put this constructor into clockless_arm_k20.h.

Code:

        ClocklessController() {
                Serial.printf("ClocklessController ctor, this = %08X, *this = %08X\n",
                        (uint32_t)this, *(uint32_t *)this);
        }

I get this printed in the serial monitor

Code:

ClocklessController ctor, this = 1FFF8EB0, *this = 000097D8
addLeds, pLed = 1FFF8EB0, Led = 00000000, init = 000012CD

Looks like memory at 1FFF8EB0 did have the vtable address when the instance was constructed, but then later when addLeds() runs it's been changed to zero.

Looking in the asm listing, location 000097D8 is indeed a vtable

Code:

000097d0 <vtable for ClocklessController<1, 24, 60, 36, (EOrder)10, 0, false, 50>>:
        ...
    97d8:       00000c51 00000b01 000004e5 00000d9d     Q...............
    97e8:       000004d5 000004dd 00000a65              ........e...

And sure enough, the number 8 bytes into the vtable is 000004e5 (LSB set to mean execute in Thumb mode) and when I look at address 4e4 in the asm listing, it is indeed the init() function.

Code:

000004e4 <ClocklessController<1, 24, 60, 36, (EOrder)10, 0, false, 50>::init()>:
        virtual void init() {
     4e4:       b510            push    {r4, lr}
        virtual int read() { return usb_serial_getchar(); }
        virtual int peek() { return usb_serial_peekchar(); }
        virtual void flush() { usb_serial_flush_output(); }  // TODO: actually wait for data to leave USB...
        virtual void clear(void) { usb_serial_flush_input(); }
        virtual size_t write(uint8_t c) { return usb_serial_putchar(c); }
        virtual size_t write(const uint8_t *buffer, size_t size) { return usb_serial_write(buffer, size); }
     4e6:       2118            movs    r1, #24
     4e8:       4604            mov     r4, r0
     4ea:       4809            ldr     r0, [pc, #36]   ; (510 <ClocklessController<1, 24, 60, 36, (EOrder)10, 0, false, 50>::init()+0x2c>)
   ........ many more asm instructions ...........

Crazy as this sounds (given FastLED's very mature status), it's starting to look like we may be dealing with a previously unknown buffer overflow or other memory corruption bug.

jmarsh · Apr 4, 2024

What about the stack? I'm not familiar with the memory layout on Teensy 3.x so clutching at straws a bit, but consider:

This post says it is possible to call init() but then the call to addLeds never returns, presumably because the call to init() inside of it wanders off into the unused ISR instead.

Code:

#ifdef USE_WS2812SERIAL
    template<SWS2812 CHIPSET, int DATA_PIN, EOrder RGB_ORDER>
    static CLEDController &addLeds(struct CRGB *data, int nLedsOrOffset, int nLedsIfOffset = 0)
    {
        Serial.println("GFV - Constructing Controller");
        static CWS2812SerialController<DATA_PIN,RGB_ORDER> controller;
        controller.init();
        Serial.println("GFV - Calling addLeds");
        return addLeds(&controller, data, nLedsOrOffset, nLedsIfOffset);
        Serial.println("GFV - Back from addLeds");
    }
#endif

But there's a couple of gotchas in that code:
- The "GFV - Back from addLeds" print statement is after a return statement (the previous line), so it would never be shown regardless of the function completing
- There's a Serial.println() between the init() call and the addLeds() call.

What I'm wondering: could the Serial.println() be overflowing the stack and overwriting the controller's vtable ptr? It seems like the only sane explanation for the vtable/init() function working one moment and then gone the next.

PaulStoffregen · Apr 4, 2024

jmarsh said:
What about the stack?

I added code to read the stack pointer, both in the constructor and right before the failing addLeds() call.

Code:

ClocklessController ctor, this = 1FFF8EB0, *this = 000097E4
sp = 20007FD0
addLeds, pLed = 1FFF8EB0, Led = 00000000, init = 000012E5
sp = 20007FB8

Looks like the stack is nowhere near. That's what you'd expect from the static memory summary too.

Code:

Global variables use 4200 bytes (6%) of dynamic memory, leaving 61336 bytes for local variables. Maximum is 65536 bytes.

Of course it's still theoretically possible something could have temporarily pushed stuff about 60K onto the stack, but seems pretty unlikely.

jmarsh · Apr 4, 2024

True, I'd imagine pretty much every program would be dying on the 3.2 if that were the case.

jmarsh · Apr 4, 2024

I had a quick look, I think I know what the cause is but I'm not sure of a few details.

If you look at the memory allocated immediately before the ClocklessController there should a 4-byte guard variable. Guards are generally used for local static objects to ensure they only get initialized once (when they're first used) - I think it's a thread safety thing, since with a static object you can't be sure which thread will touch it first. g++ calls specific functions for this that are implemented in Teensyduino's mk20dx128.c:

Code:

__extension__ typedef int __guard __attribute__((mode (__DI__)));

__attribute__((weak))
int __cxa_guard_acquire (__guard *g)
{
    return !(*g);
}

__attribute__((weak))
void __cxa_guard_release(__guard *g)
{
    *g = 1;
}

The problem I'm seeing here, is that the compiler thinks __guard is an 8-byte type (that's what the mode __DI__ attrib does). So the ClocklessController gets initialized (assigns vtable ptr, calls the constructor) and then calls __cxa_guard_release() - which puts a 1 in the 4 bytes allocated for the guard variable, and stomps a 0 on the next 4 bytes which happens to be the vtable pointer that it just updated. That's why the vtable appears correct in the constructor and disappears immediately afterwards.

For Teensy 4.x, the __cxa_guard functions use char* arguments instead of __guard*, so they don't have an issue no matter what size is actually allocated for the guard variable (as long as it's at least as big a char / 1 byte).
For Teensy 3.x, it looks like the allocated size for __guard changed from 8 bytes to 4 bytes for some platforms but not others. I haven't really dug deep enough to figure this out, but regardless the easy fix would be copying the usage of char* like what is used for Teensy 4.x.

PaulStoffregen · Apr 4, 2024

jmarsh said:
but regardless the easy fix would be copying the usage of char* like what is used for Teensy 4.x.

Yes, sounds good to me. I'm running it now and FastLED seems to be working.

Committed the fix just now!

Fix __cxa_guard_release (thanks jmarsh) · PaulStoffregen/cores@638a207

https://forum.pjrc.com/index.php?threads/74548/post-341648

github.com

gfvalvo · Apr 5, 2024

PaulStoffregen said:
Yes, sounds good to me. I'm running it now and FastLED seems to be working.

Committed the fix just now!

Fix __cxa_guard_release (thanks jmarsh) · PaulStoffregen/cores@638a207

https://forum.pjrc.com/index.php?threads/74548/post-341648

github.com

@PaulStoffregen
@jmarsh

Wow, great detective work! Thanks!

Two questions:
1. Out of curiosity, can you go back in Teensyduino history and find where this broke? FastLED definitely worked in some releases prior to 1.59.

2. Can you propose a code that triggers this failure outside of the FastLED framework?

Thanks again.

PaulS · Apr 5, 2024

gfvalvo said:
1. Out of curiosity, can you go back in Teensyduino history and find where this broke? FastLED definitely worked in some releases prior to 1.59.

I think it changed here. I assume it was for a good reason then - hope it's not going to bite us somewhere else now.
Anyway, thanks very much to @PaulStoffregen and @jmarsh after indepth sleuthing!

Paul

jmarsh · Apr 5, 2024

PaulS said:
I think it changed here. I assume it was for a good reason then - hope it's not going to bite us somewhere else now.

I didn't realize this was a recent change to Teensy's code, I thought it was a change in compiler behavior (from upgrading gcc) that didn't get picked up until now. I guess reverting to char* would bring the warning back with LTO?

gfvalvo · Apr 11, 2024

PaulStoffregen said:
Yes, sounds good to me. I'm running it now and FastLED seems to be working.

Committed the fix just now!

Fix __cxa_guard_release (thanks jmarsh) · PaulStoffregen/cores@638a207

https://forum.pjrc.com/index.php?threads/74548/post-341648

github.com

@PaulStoffregen,

I see you've committed another change in the GitHub to 'new.cpp' and 'new.h' that appears to be related to this issue. I'd like to try it in my FastLED application. How do I download and use this unreleased version of Teensyduino? All I know how to do is download and run a released Teensyduino installer and point it at Arduino 1.8.x. Or, perhaps better, since I use Eclipse / Sloeber, how do I get a link like this for the unreleased version???
https://www.pjrc.com/teensy/package_teensy_index.json

PaulStoffregen · Apr 11, 2024

If using Linux or Windows or MacOS using Arduino 2.x.x, simplest way is to find those files on your hard drive and simply replace them with copies downloaded from github. The new IDE installs stuff into hidden folder, AppData on Windows, ~/Library on MacOS, ~/.arduino15 on Linux. If you use a search to find the file, you might need to tell it to look those hidden locations.

On MacOS with the old IDE, those files are inside the application bundle. You can control-click and "show package contents" to access them, but editing anything inside the bundle runs afoul of MacOS Notarization. Expect MacOS to eventually notice the software has been tampered with and complain.

gfvalvo · Apr 11, 2024

PaulStoffregen said:
If using Linux or Windows or MacOS using Arduino 2.x.x, simplest way is to find those files on your hard drive and simply replace them with copies downloaded from github. The new IDE installs stuff into hidden folder, AppData on Windows, ~/Library on MacOS, ~/.arduino15 on Linux. If you use a search to find the file, you might need to tell it to look those hidden locations.

On MacOS with the old IDE, those files are inside the application bundle. You can control-click and "show package contents" to access them, but editing anything inside the bundle runs afoul of MacOS Notarization. Expect MacOS to eventually notice the software has been tampered with and complain.

Thanks @PaulStoffregen
So there's no installer or .json link for unreleased versions? Which "those" files do I need to replace? 'new.cpp', 'new.h', mk20dx128.c? Any others?

PaulStoffregen · Apr 11, 2024

You only really need mk20dx128.c.

Is FastLED / WS2812Serial Broken in Teensyduino 1.59?

PaulStoffregen

Well-known member

jmarsh

Well-known member

PaulStoffregen

Well-known member

jmarsh

Well-known member

jmarsh

Well-known member

PaulStoffregen

Well-known member

Fix __cxa_guard_release (thanks jmarsh) · PaulStoffregen/cores@638a207

gfvalvo

Well-known member

Fix __cxa_guard_release (thanks jmarsh) · PaulStoffregen/cores@638a207

PaulS

Well-known member

jmarsh

Well-known member

gfvalvo

Well-known member

Fix __cxa_guard_release (thanks jmarsh) · PaulStoffregen/cores@638a207

@PaulStoffregen,

PaulStoffregen

Well-known member

gfvalvo

Well-known member

PaulStoffregen

Well-known member

Is FastLED / WS2812Serial Broken in Teensyduino 1.59?

Well-known member

Well-known member

Well-known member

Well-known member

Well-known member

Well-known member

Well-known member

Well-known member

Well-known member

Well-known member

@PaulStoffregen,​

Well-known member

Well-known member

Well-known member

@PaulStoffregen,