Frightening rare failure... Any wisdom?

Davidelvig

I use a standard Teensy 4.1 NE on my own board, connecting D+ and D- with pogo pins so that I can present USB C to the outside.
I've shipped over 50 of these without getting this issue.

I left the device (a digital trumpet) plugged in to my Mac overnight (I often do) and the app running on the Teensy was hung in the morning. That's not that unusual... My code has bug remnants...

But it would not restart:
- power off then power on
- boot attempt via PlatformIO and with Arduino (USB C)
- holding the Teensy reset button during power-on (this made no difference)

Upon opening my device, I see the repeating sequence of 4 red LED blinks near the bootloader chip on the Teensy.
I was able to flash the Teensy through the USB Micro port, and it's operating normally now, including flashing through USB C

Questions:
- what might cause the Teensy to get in this state (4-flash LED)?
- what might be different between the USB Micro connection and the USB C connection via the pogo-pin D+ and D- contacts?

In USB C mode, GND is hard wired and VUSB goes through an OnOff circuit on my board that delivers slightly less than VUSB to VIN on the Teensy when "On". When USB Micro is connected, of course, it delivers full VUSB to the Teensy.

Attached are 2 pics of the pogo pin connection.
Pogo Pins Top.jpg
Pogo Pins Bottom.jpg
Board.jpg

Any ideas for how I can sleep at night with these units in the field (and how to design better for the next 50?)

Also, this may be a board from PJRC or one from SparkFun. I don't know how to tell, and I've mixed my stock.
 
Oh my goodness... as I continued testing, I saw the state again...4-flash and otherwise unresponsive.

When I unplugged a screen module (in the 1/10th inch header above) the flashing went away and the board booted normally. Plugged the screen back in and it continued to work, and has worked since. I'm trying wiggling the display, and no repeat of failure yet.

So, I'm doubting it was the USB C and pogo pins.

I'm expecting something with a faulty connection of one of the 8 display pins which are:

1: Teensy 33: Backlight_Display - 220 ohm series resistor
2: Teensy 36: CS_Display - 10K ohm pull-up
3: Teensy 37: DC_Display
4: Teensy 38: Reset_Display
5: Teensy MOSI
6: Teensy SCK
7: Teensy 3v3
8: Teensy GND

Any of these pins seem especially tempting for a short to GND or 3v3 causing this 4-blink Teensy state?

And as for what this 4-flash mode actually is, I get this from ChatGPT:

The red LED blinking four times near the USB port on your Teensy 4.1 typically indicates that the bootloader chip is trying to communicate but no user code is loaded or there was a problem with the uploaded code.

Thanks


[edit - DC and Reset pin nums corrected]
 
Latest:

Problem restarted on its own while sitting on the desk, plugged in.
4-blink red LED near USB persists.
I can press and hold reset for 15 seconds (13-17) and the red LED glows for 45 seconds, then returns to 4-blink pattern
This on USB C or Micro-USB power with no display plugged in.

Any more ideas?
1750884625503.png
 
This link to the PJRC website shows the official explanation of the blink codes if you scroll down near the bottom of the page. https://www.pjrc.com/store/ic_mkl02_t4.html

If you have 50 units in the field and no customer complaints, you are probably dealing with an issue specific to your current setup rather than a design issue.

I would suspect the USB cable getting worn out if you plug it in and out a lot in testing as I have seen somewhat similar random red LED blinking issues when a USB cable has passed its useful life, perhaps due to high contact resistance. The fact you see it with both USB ports and with different USB cables would seem to rule that out.

Do you have another system you can try to reproduce the issue with? Perhaps the one you are working with just happens to have a marginal crystal or solder joint, especially if it is an engineering unit that may have been abused a bit.
 
Thanks for that link @KenHahn. So 4 blinks has something to do with:
- the 24MHz crystal on the board not oscillating (I can see this and think I could test this)... or
- a particular connection between the boot loader chip and the main IMXRT (I would not know how to test this)

These are all new components in this failed unit (the Teensy, my carrier board). And like you say, different USB cables. Both of which work to flash other Teensys.

This is likely above my pay grade, but I expect an oscilloscope on the crystal pins would tell me about the first possibility. And if it's that, bad Teensy, I guess. I've had hundreds of issues to work through over the last decade, and I don't think it's ever been the Teensy itself.

I can live with a single bad board, but I'd sleep better if I knew why.

Thanks for your help!
 
That page with the blink codes is mainly for people who are designing their own Teensy boards and integrating the bootloader chip into their own PCB. The explanations of the blink codes are oriented more toward troubleshooting PCB design defects, like forgetting to run a trace or not attaching a crystal, though they may give you a clue.

As for the crystal, it can be hard to probe, as the capacitance of an oscilloscope probe is usually enough to throw it out of whack or stop it from oscillating altogether. Using a x10 probe has less of a negative effect than a x1 probe. The parts are also very small and hard to probe without shorting things out and inducing issues like red blink codes. This I know from experience.

The only other thing I can think of is that though your Teensy looks nice and clean, it might be worth giving it another good cleaning with some IPA to make sure there isn't some flux residue or other contaminant hiding somewhere that is causing your issue. The nature of crystals makes them sensitive to contaminants. A spray bottle with something like 99% IPA can be useful for flushing stuff off the board.
 
@KenHahn, I inspected those areas (24MHz crystal and bootloader) and saw no visible defects. Then I did a scrubbing with a paper towel with some 99% IPA. I even doused it a bit and scrubbed again.

It didn't come back alive right away. And now I think that was because it was still perhaps a little wet with the alcohol (though my friend ChatGPT says it's a very poor conductor).

Two hours later now I plugged it in and did not get the evil four blinks. And I was able to flash the board through my usual mechanisms.

I'll be thinking through what I should do with my processes, and perhaps whether I should be giving my newer Teensys a bath before shipping them to customers. I hope this was a one-off. I'll be testing this board for the foreseeable future under different scenarios of leaving it on and leaving it off.

Thanks again, @KenHahn and @jmarsh for your comments!
 
Washing with isopropyl alcohol probably fixed it.

4 blinks almost always means a problem with the 24 MHz crystal, especially on a properly manufactured board that went through testing.

Even on custom PCBs, it's the same JTAG pins for communication except for 1 signal which selects what part of the chip will respond to JTAG. Part of the chip works with JTAG even without a crystal. The other part only talks if the crystal is working. If you get 4 blinks rather than 2 blinks, that means JTAG was working when talking to the circuitry that doesn't need the crystal. The problem can theoretically be with the signal that selects which part of the chip talks JTAG. But that's pretty reliable and gets well tested on every manufactured Teensy. The crystal also gets well tested. But crystal oscillators are high impedance circuits that are quite sensitive to physical contamination.... the sort that cleans up easily with isopropyl alcohol.
 
Thanks @Paul
The problem is now intermittent on this one board, and may be related to time-on. Blinking yesterday after being plugged in a while. Unplugged overnight. Working correctly this morning on first plugging in. I'm going to leave it for a few hours to see what happens.

Temperature maybe? Nothing feels hot when it's failing.

SerialNumber: 69295218-3F2041D7

... is the serial number of the chip using
teensyUID64(uint8_t *uid64)
from TeensyID.cpp
(Copyright (c) 2017 Stefan Staub)
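For anyone who wants to log the same ID from their own units, here is a minimal sketch of the formatting step only. It assumes the `teensyUID64(uint8_t *uid64)` call above fills an 8-byte buffer, and that the serial number shown is simply those 8 bytes in hex with a dash in the middle. `formatUid64` is a hypothetical helper name, not part of TeensyID, and the byte order is an assumption.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper (not part of TeensyID): render 8 ID bytes as
// "XXXXXXXX-XXXXXXXX". The byte order is assumed, not confirmed.
std::string formatUid64(const uint8_t uid[8]) {
    static const char kHex[] = "0123456789ABCDEF";
    std::string out;
    for (int i = 0; i < 8; ++i) {
        if (i == 4) out += '-';        // dash between the two 4-byte halves
        out += kHex[uid[i] >> 4];      // high nibble
        out += kHex[uid[i] & 0x0F];    // low nibble
    }
    return out;
}
```

On the Teensy itself, the buffer would presumably be filled with `teensyUID64(uid)` and the result printed with `Serial.println(formatUid64(uid).c_str())`.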

[Edit] 90 minutes later after being left plugged in. 4 blinks.

[Edit] 2 hours after that and having been unplugged. Plugged in to see 4 flashes.
Now it's in the fridge for a while. I'll see if cooling causes any positive changes.

After that, perhaps it's time for progressive destructive changes to the Teensy's environment. Desolder from the daughter board if I can, and see if the problem persists.
 
I now have my second failing teensy 4.1 mounted.
Both blink the same 4-blink pattern on the bootloader-associated red LED.
Sometimes they start blinking right away. Other times, it takes a few hours for it to start.
When not blinking, my code executes without error.
In fact, this second failing occurred after about a day of being plugged in... having operated normally through testing.

video of behavior

I think that both of these Teensys are from Sparkfun's processes. (I say that because they appear consistently different from older Teensys in their absence of a small gap between the Teensy board and its pre-mounted header pins). I encountered no such problems in hundreds of Teensys prior to the beginning of this thread.

Old and New Teensy 41-lowQ.jpeg


This is a bigger problem because it is happening up to a day after assembly (now 48 solder points in, plus 16 more for a second soldered board), and it worries me to have this in customers' hands (totally unresponsive to flashing or Blinky-recovery, and invisible blinking inside my enclosure.)

@Paul are you the one that could help debug this, or should I seek someone at Sparkfun?
I'm happy to provide the failing mounted boards for testing. Or try to detach the 48 solder points and test the board alone.

Any advice?
 
There is a 'sparkfun now what' thread where Paul noted the 'born on date' of the MCU for Sparkfun production will be after 'some date'
Are they purchased with pins soldered?

Can you post clear photos of the MCU/T_4.1 to show the date and board? Might give Paul a glance.
This shows in the video:
1755109015979.png
- not a Locked T_4.1?

What happens with a 15 second Restore? Does it return to Blink - and will that keep running, or if a simple sketch is uploaded doing similar - maybe add USB echo back to see it running and blinking.
 
I like Defragster's idea of turning one of your units into a socketed test fixture if that is an option. Would allow you to do a burn-in test of the Teensy before permanent install while working through this issue.

As another thought, I have seen posts where Paul has also correlated 4 blink error codes to being a problem with the flash chip. I also saw a post where someone induced the 4 blink code by accidentally removing the USB cable in the middle of a download thus corrupting the flash code.

Is your software by chance writing to the Flash during program execution? Perhaps occasionally stepping on data it shouldn't?
 
Thanks @defragster !

15 second hold of the reset button does the following:

Press and hold reset
Press on at 0
0-15 seconds = 4 blink
release reset ==> solid red
15-60 seconds = solid red
60....... = 4 blink
10 minutes (or so, on one of the two boards) = 9 blinks

9 Blinks = ARM JTAG DAP Init Error
The ARM JTAG DAP was detected (4 blinks) but could not be initialized. This error is rather unlikely by hardware! However, a software crash resulting in hard fault or CPU lockup (typically very early in startup) can result in this error.
documentation here

Trying to duplicate.

On Bad Board 2 (only):
0-15 seconds 4 blink
release
15-58 seconds solid red
58 -~66 seconds reset button white light on 1 second off 1 second pattern
66... seconds 9 blinks

...but on two subsequent tests: straight to 4 red blinks at 58 seconds with no reset light interval.

On another note: My daughterboard delivers 5V power through the 5V pin when on (daughterboard has on-off circuit). It delivers about 13 mV to the Teensy 5V pin when "off". Nothing in the microUSB socket. Connection between microUSB-VUSB and 5V pin intact.

Here are the two failing boards:
Bad Board 1.jpeg

Bad Board 2.jpeg


I have created a testing board with male headers and I'm flashing it and letting it sit.
Thanks for your advice!

Dave
 
KenHahn said:
As another thought, I have seen posts where Paul has also correlated 4 blink error codes to being a problem with the flash chip. I also saw a post where someone induced the 4 blink code by accidentally removing the USB cable in the middle of a download thus corrupting the flash code.

Is your software by chance writing to the Flash during program execution? Perhaps occasionally stepping on data it shouldn't?

I do implement my own firmware flashing code after downloading a hex file (via BLE to the uSD card), using FlasherX. I don't think I had yet done that with these two units. That's usually done in the field in the customers' hands.

I also access EEPROM regularly through what I think are the usual methods.

How might I be corrupting bootloader code or its flash memory? Where is that? In the same flash as I have access to?
That would be great news (something I might be able to fix and avoid)

Thanks, @KenHahn
 
History? These are not new design PCB's [ 1/3/25 19B* ]? Been used with PJRC provided T_4.1's in the past?
Any reading of the date code post linked above?

The vBat is installed it seems? Any change with that removed?

5V from PCB - is USB power clipped/diode protected on programming?
 
I've been shipping a Teensy 4.1-based product mounted on my own board for about 2 years. This latest board, since early this year. No Teensy-related problems before these two.

On the two problematic boards, I see the code
CTQB2503J

I have one header-mounted one with this same date code (CTQB2503J) in testing now (plugged in, current code running). There are differences in practice between this header-mounted board and a soldered board, and they include:
- no D+ and D- connection for the USB C I have on my board
- no screen or touch pin connections (though I might add those for completeness.)

I don't use VBAT. I use a separate clock chip connected via I2C. It uses the coin cell. It also shares the GND and 3v3 bus with the Teensy.

VUSB goes through a diode to get to VIN (V5 pin on Teensy).
My D+ and D- (when mounted to the board) go straight from the USB socket to the Teensy without any diodes. Should there be something there?

Attached are two pictures showing what I think are old (from PJRC) and new (from Sparkfun) Teensys.
The new ones, in my case, have had black text on the bag, have no gap between the PCB and header insulators, and a 2025 date code.
The older ones, in my case, have had white text on the bag, a small gap between the PCB and header insulators, and a 2024 date code.
TeensysAndBags.jpeg
Teensy Chips.jpeg

1755129825844.png
1755129855673.png
 
Either a photography illusion (msg#16) or those header pins all have cold solder joints (balls of solder and donuts of solder surrounding the pins).

You could try holding the Program button first and then apply power to Teensy. At the 13-17 second window, the red LED should flash briefly just 1 time (at least that's what it does on my T4.1, white lettering on bag). That 4 second window is when you have to release the Program button. Now, erasing/programming of FLASH takes place.

Teensy PCB is very thin. Maybe a solder joint cracked when extracting Teensy from a mating header. Heating might be an issue, thus exposing the cracked connection?

Still sounds like a 24MHz crystal failure of some nature. Hopefully not the Bootloader interface.
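When timing the button hold with a stopwatch or a test jig, the 13-17 second window mentioned above can be written down as a simple check. This is only a sketch: the constant and function names are hypothetical, and treating the window edges as inclusive is an assumption.

```cpp
#include <cstdint>

// Window taken from the "13-17 second" figure quoted in this thread.
// Treating both edges as inclusive is an assumption.
constexpr uint32_t kRestoreWindowStartMs = 13000;
constexpr uint32_t kRestoreWindowEndMs = 17000;

// Returns true if releasing the button after heldMs milliseconds would
// fall inside the restore window.
constexpr bool releaseInRestoreWindow(uint32_t heldMs) {
    return heldMs >= kRestoreWindowStartMs && heldMs <= kRestoreWindowEndMs;
}
```

For example, a 15000 ms hold falls inside the window, while a 5000 ms or 20000 ms hold does not.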
 
I started to write a post about possible code corruption but after re-reading your posts I have convinced myself that it is not a factor. You sometimes have Teensy self-heal after sitting for a while and in other cases they are not responsive to the 15 second button push which should clear any likely Flash corruption that might have occurred.

I have many SparkFun and PJRC Teensy 4.1 here. The header pin plastic spacers and bag labeling are different, as you noted, and of course there are newer date codes on the NXP chips. One other difference is that the PJRC Teensy with presoldered pins used no-clean flux and were not cleaned after assembly, at least on the ones I received. The SparkFun parts are cleaned after pin assembly. That would generally seem to be a good thing, but perhaps contaminants in some cases could migrate under the BGA during cleaning?

Seems unlikely, but if there are any remaining concerns about possible contamination being an issue on the problem units, I am a fan of using 99% IPA in a spray bottle to fully flush any contaminants off the board while holding the Teensy vertically and scrubbing with something like a solder brush with the bristles cut moderately short for a little better scrubbing action. The 99% stuff flushes out under the BGA better, does not leave water behind like the drugstore stuff, and is easy to get on Amazon. You do want to wear a rubber glove if you are holding the Teensy, as it will also do a good job of cleaning all the oils out of your skin.
 
Thank you both, @BillFM and @KenHahn !
As I look at the soldering on the new and old boards, they do look different, but I don't see any clear cold solder joints. Zoom into the picture on post 12 above and you'll see that the old boards are more Hershey's kiss like and the new boards are a little more domed.

I have some 99% isopropyl alcohol here in my hand, but I don't specifically have a solder brush. I imagine a toothbrush would do some good. I can only get at the tops of these problematic boards, and I imagine it's particularly the area around the boot loader chip that I want to rinse and scrub. I can make sure to put the IPA on top of the main chip as well with the board held vertically so it might drain underneath.

Fingers crossed!

A newer board has been sitting on my header pin test unit for a couple of hours now with no difficulty. I may re-flash it with a version that emulates what the product does when more actively in use.
 
Progress Report:

I took one of the failing soldered-on boards and gave it a good scrubbing on its top side with a toothbrush and 99% IPA. [The alcohol, not the beer :) ]

I also held the board vertically and trickled some IPA across it in the area of the main CPU hoping that the underside might get a rinse. Allowing 12 hours or so before testing to dry, the behavior is the same… Still failing.

Separately, I mounted a brand new Teensy 4.1 on one of my daughter boards with headers and loaded and ran my application with a fair workload running, playing a rapid series of musical notes throughout the night through USB and through a BLE connection. This morning, the music is still playing, so no failure with this board. It is one of the newer Teensys from Sparkfun (25 model year).

I will certainly run this testing process for the time being for each customer unit going out, though it does not leave me with full comfort. The root cause is still unknown.

I'm happy to take any steps that would help debug the problem for my benefit and possibly others... including separating the problematic Teensy from my daughter board and seeing if the Teensy still misbehaves.

I presume to do this: (and I'd welcome advice)
  1. solder-wick or heated suction to remove the solder from the daughter board side of the junction
    or
  2. somehow remove the insulation from the pins between the boards
    then
  3. pry the boards apart
    or
  4. cut the pins between the boards
    then
  5. retest the Teensy by itself

Before taking any such destructive steps though, I'd like to enlist any resources from PJRC or Sparkfun in the event they want to find the source of this problem in a different way.

Any direction, @Paul ?
 
I would suggest cutting (carefully) the pins between the boards.
Attempting to de-solder and separate the boards in my mind is a recipe for disaster.
It's always difficult to pull more than one pin out from a board at the same time, trying to do that with 48 or more pins, in my mind, would be virtually impossible without damage.
 
OK, Done with a Dremel.
Results:
Same behavior from isolated Teensy (2025 model) liberated from the second failing board
1) power on: 4 flash at boot loader
2) 15-second-reset-hold: 45 second solid red followed by 4-flash at boot loader
3) unplug and replug - 4 flash at boot loader.

So it appears that the teensy itself in isolation, at least at this point, fails in a particular way.

Does access to the backside of the board offer me any opportunities for additional debugging?
Any other ideas?

IMG_3090.jpeg

IMG_3091.jpeg

IMG_3093.jpeg
 