Teensy 4.0 First Beta Test

Status
Not open for further replies.
Just a quick update - the contract manufacturer put the chips onto the 6 layer boards... and I've been soldering the rest of the parts. The boards look awesome! Robin and I keep having a "double take" when we glance at them. They kind of look like a Teensy 3.2 from a distance, but then you notice it's different. The arrangement of parts on the top side isn't changed much from the center region of the first beta, but still, seeing it now actually in the Teensy form factor looks awesome.

It's only been a few months since I brought up the first round of beta boards, but it kinda feels like a lifetime ago. And ~2090 messages ago! At first I had a terrible sinking feeling, when the first 2 boards weren't working at all. A few hours into troubleshooting, I found my saved notes about the test procedure, which involves the important first step of setting a couple of the OTP fuses. With all this focus on Arduino stuff, and then on the PCB layout, I'd mostly forgotten that process which is based on stuff from my earliest experiments with the chip (long before the bootloader).

Great news! I am very curious what they look like.
On the whole, the software already works, and the beta boards are easy to use - even if there are still some small things missing in the software. The home straight is in sight!

Do you already know more about the expected price? I know you've written about it before, but I can't find it.
 
Nope. The PCBs and assembly costs are still unknowns. Most of the work I've been doing for the last few weeks is changes that will help lower the assembly cost. It's still a big unknown, and I'm a little worried it's going to be more than we had planned due to parts on both sides of the board. But we can't avoid bottom side parts, since the power supply decoupling capacitors really must be directly underneath the center of the chip.
 
defragster said:
@mjs513 - got the buddhaBrot with nice shifted display pixels clearing the text area, works well.

Compiling the T_3.6 fastest/LTO/mpure drops 460K cycles down to 405K.

If you wanted to add cycle cnt display I did this:
Code:
…….
<edit> - Mike - it gets boring when it runs beyond some point - I added at loop() end this for a fresh random start:
Code:
…...
Nothing specific about 10M - though it does start to get muddy about then - maybe a snapshot BMP at 9M?

Glad we got Buddhabrot all working along with the BMP save function. That made me happy.

As for the different datums - we just added that in the last library update; it was based on what someone created for the Due and the Seeed Studio ILI9341 libs. There is also an example called TFT_StringAlign if you want to see the different datums. Of course you don't even need to specify one and can just use coordinates.

Good idea about the iteration limit and cycle counts. One thing I was thinking about doing was adding a pause and take snapshot command once USB is updated. Don't want to do it now though. No rush on that.
 
I'm still working on a couple unknown issues involving startup & rebooting. The 1062 chip is so similar, but does have minor differences.

I'm also still waiting on PCBs for pogo pins to mate with the new boards. For these first 2 boards, I've got a little cable harness soldered to test points. My hope is for those first test fixture boards sometime next week. We have several of these boards set aside for everyone who's been highly active on this thread. I believe we'll be able to ship them in about 10 days. Another larger batch (not involving me hand soldering) will happen in about 4 weeks. At that time, we'll make 1062-based boards available to all beta testers.

Just to give you a realistic (and perhaps somewhat disappointing) expectation, over the next 3-4 weeks almost all of my time & attention is going to go into the bed of nails test fixture and a variety of small but critically important issues dealing with the final PCB panel shape and how our contract manufacturer is going to work with this new board. This stuff needs to happen now, if we're going to be able to actually release the product and ship in volume.

I know there are a lot of issues that have come up here. I know it's frustrating when I don't get to pull requests and reply on this thread promptly. Please try to be patient, and please understand this is going to get worse over the next several weeks as we enter the final stretch of manufacturing & testing stuff. Then it will get so much better, once we're up and running with a proper test fixture and all the little manufacturing details are finalized. Then I'll be able to pour all my time into the software & documentation side... and there is so much I want to do there. Please keep posting here about any issues. I will be reviewing *all* the messages on this mammoth thread.


Wow! Paul, I don't know how you do it. I'd love to help test, but I don't really have a need for one of these yet, as my only project is to replace the control system on my animatronic costume wings with a Teensy instead of a propeller. But keep up the good work! I keep reading this post every day, sometimes more than once a day :)
 
UHS-I SD

I assume UHS-I SD cards will be supported. About a year ago I bought a STM32H7 400 MHz Nucleo board with UHS-I 208 MHz support.

It's really difficult to get even fair DMA performance. UHS-I cards have 512 KB "Record Units", RU. To get full speed, a write command must write one or more RUs.

If a write is less than one RU, data is copied from a partial RU to a new RU and the entire 512 KB of flash is programmed. With Teensy 3.6, I was able to do large writes by using the SDIO FIFO instead of DMA.

Here are Teensy 3.6 DMA test results compared to a model for transfers up to 128KB. I used four cards for tests but only plotted one for 64KB and 128KB transfers since all cards were similar. There were curious artifacts for small transfers.

The model is simple: it assumes a 50 MHz SDIO clock and an internal move/programming rate of about 600 MB/sec for the 512 KB RU.
DMA50MHz.jpg

If I increase the SDIO clock to 208 MHz, I get this model:
DMA208MHz.jpg

So you need 512 KB buffers to get full speed. I tested the write speed for the SanDisk UHS-I SD used in the Teensy 3.6 tests with an Android Samsung UHS-I driver on a Cortex-A tablet. The result was 67.82 MB/sec, close to the 73.8 MB/sec for the model.

This means there will be rotten performance for small DMA transfers. Also, DMA will often require a copy of the data for 4-byte alignment, making large transfers impossible.

Looks like the best hope for Teensy 4.0 is the FIFO trick I used on Teensy 3.6.

New SD cards are extremely complicated. The SD standard now defines a 128TB 985MB/sec NVMe card. These cards, like previous cards, are required to be formatted exactly to the standard by the Association formatter. You must use special commands to write directory or bit-map/FAT areas.

These cards are designed for Application Processors like Cortex-A where you have GBs of RAM for buffers and caches.
 
My T4 is stuck in customs because they want me to name a "realistic value" for the device. Any pointers from people in Germany who successfully got their T4? What did you suggest to them? Any ideas from PJRC?
 
What's the price for a Chinese Nano ? 3.- EUR ?
They can't tell the difference.. - both are small microcontrollers.

German Customs (via DHL) is funny: if it comes in via Leipzig airport, I've never had to pay anything. If it comes in via Cologne airport (sadly, everything from the US goes this way), I have to pay a minimum of 21 EUR.
Airport Frankfurt - don't remember..
 
I'm sorry you're having problems with customs. There seem to be challenges with both DHL and UPS when shipping to Germany.
I'll send you an email directly to see how we can help with this.
 
IIRC, there is a 5.0USD value given on the invoice with a -5.0USD reduction for sample.
So, maybe PJRC could send a declaration that the given value is the material value and that no cost (material+shipping) is charged to recipient.
 
Paul and I were discussing this last night. We'll probably start listing a higher dollar value on the samples that we send out and include clear documentation that it is a commercial sample and that there is no charge to the receiver. I'll also be asking our international beta testers which carrier works best in their area - UPS or DHL.
 
Last time I complained to DHL. They replied that the absolute value of the delivery is the deciding factor. Subtracting amounts is not considered. "Free sample" does not count, esp. if it was sent by a company. This also includes the delivery costs. If both together are >22EUR, you have to pay. DHL wants 21EUR as handling fee, which is added to customs. The tax itself was a few cents only.

This can hurt sometimes... I got something which was 23 EUR incl shipping... had to pay the same amount again.
 
question about "FRAMESZ":

would it make sense to maybe integrate this with, say, SPISettings? further upthread KurtE mentioned something along these lines ("a) Update Transfer16 should be simple... Just need to update TCR.FRAMESZ - question to self is to change once and back after the transfer16, or cache the information and only change when needed...").

as is, transfer16 sets TCR every time it's called, which kind of does add up. setting LPSPI_TCR_FRAMESZ(15) resp. LPSPI_TCR_FRAMESZ(31) once and for all saves some time (about 20%); it's easy enough to comment out, though, i suppose.
 
I am surprised that it would add up as much, when considering the time that it takes to actually shift out the data, but then again I knew it added some...

There are a few options we could maybe look at:

1) We could cache in the SPI class whether the last call was 8 bit or 16 bit and only update TCR if we really need to. This would add test-and-set code to all of the transfer functions. Not difficult.

2) Earlier I had a version of the SPI library where I implemented transfer16(buf, retbuf, cnt), and likewise the asynchronous version, as a way to package all of these up with only one round of setup overhead. It also made sure the SPI FIFO was not empty, so again reduced overhead... I had the start of a frame buffer implementation for one of the displays that used these functions instead of rolling its own... But I am not sure I have it anymore... The alternative was to hack up the code that stored 16 bit color in the frame buffer to store it byte-wise, so as to allow SPI.transfer(frame_buff, NULL, cnt*2)... but I punted...
 
it's not very dramatic. i mainly wanted to see how fast things can go. fwiw, i've tested with a board which has an 8 channel ADC and 8 channel DAC on the same SPI bus, both devices have 32 bit wide registers, so that's 32 x write16 per ISR. with transfer16 as is, the (unrolled) read/write stuff would take 31us (vs. 35us with T3.2 @ 30 MHz SCK); without the TCR update that's down to 25us; with LPSPI_TCR_FRAMESZ(31) (and w/o TCR update) it's ~ 20us. that's more or less in the vicinity of T3.2 + SPIFIFO.h. (i don't understand the CS mechanics of LPSPI in this regard, ditto for DMA. i've poked around a bit in your ILI9341_t3n library for clues; it looks quite a bit more daunting than interleaving some value into SPI0_PUSHR. but then here there's only one CS pin anyways.)
 
@Paul and others,
@KurtE and @defragster

Well, here we go again with my T4. Had it up and running for a while without the display attached. Attached the display and it ran fine for a couple of iterations and then I made a change to the sketch and it hung again - but this time I had Serial4 attached:
Code:
Fault irq 3
stacked_r0 :: 20002E14
stacked_r1 :: 00000055
stacked_r2 :: 00000010
stacked_r3 :: B1134B04
stacked_r12 :: 20007784
stacked_lr :: 0000B38F
stacked_pc :: 0000B192
stacked_psr :: 41000000
_CFSR :: 00000082
_HFSR :: 40000000
_DFSR :: 00000000
_AFSR :: 00000000
_BFAR :: B1134B04
_MMAR :: B1134B04
need to switch to alternate clock during reconfigure of ARM PLL
USB PLL is running, so we can use 120 MHz
Freq: 12 MHz * 75 / 3 / 1
ARM PLL=80002064
ARM PLL needs reconfigure
ARM PLL=8000204B
New Frequency: ARM=300000000, IPG=150000000
Decreasing voltage to 1150 mV
It looks like the fault was for IRQ3 - DMA channel3???????
While working with the T4 and the ILI9341 display and USBHost_t36 with a wired joystick I ran into the problem with the T4 hanging so I attached it to serial4 to get a debug print. @KurtE suggested that it would probably be better to post this conversation over on the T4 thread. Things have a way of overlapping when you do testing.

@defragster provided me a pdf from arm/Keil, Using Cortex-M3/M4/M7 Fault Exceptions, to try to debug the issue. The " _CFSR :: 00000082" is indicating that:
the processor attempted a load or store at a location that does not permit the operation. The PC value stacked for the exception return points to the faulting instruction. The processor has loaded the MMFAR with the address of the attempted access.

Not sure what is causing this.

Now for another data point. To get the sketch to run I reload the code with no power to display and the sketch will run. If I unplug and reattach the power line to the ILI9341 the sketch will run fine with a working display and no errors. This is the only way to fix the problem.
 
@defragster
What do you think about adding this - start for more details on the fault:
Code:
  if((_CFSR & 1) == 1){
	printf_debug("\t(IACCVIOL) Instruction Access Violation\n");
  } else  if(((_CFSR & (0x02))>>1) == 1){
	printf_debug("\t(DACCVIOL) Data Access Violation\n");
  } else if(((_CFSR & (0x08))>>3) == 1){
	printf_debug("\t(MUNSTKERR) MemManage Fault on Unstacking\n");
  } else if(((_CFSR & (0x10))>>4) == 1){
	printf_debug("\t(MSTKERR) MemManage Fault on stacking\n");
  } else if(((_CFSR & (0x20))>>5) == 1){
	printf_debug("\t(MLSPERR) MemManage Fault on FP Lazy State\n");
  }
  if(((_CFSR & (0x80))>>7) == 1){
	printf_debug("\t(MMARVALID) MemManage Fault Address Valid\n");
  }
 
Very good start … Something like that was my intent when I searched and found that PDF 2+ months back. IIRC I also found the 'Reason for Fault' coding - like somebody pulled out for T_3's - that might be helpful as well. I could make a project of that - getting it to express those strings - or you could … let me know.
 
The Restart Reason codes are more limited and boring than the T_3 ones … and only able to see :: Indicates whether reset was the result of ipp_reset_b pin (Power-up sequence)

Not sure if it gets cleared during init before setup() - or if any power cycle or upload results in a 'reset through Power-up sequence' from the red eyed bootloader chip?

Code:
void resetReason( ) {
	uint16_t mask = 1;
	int resetReasonHw;
#if defined(__IMXRT1052__) || defined(__IMXRT1062__)
	resetReasonHw = SRC_SRSR;
#else
	resetReasonHw = RCM_SRS0;
	resetReasonHw |= (RCM_SRS1 << 8);
#endif
	Serial.print(">>> Reason for 'reset': ");
	Serial.print(resetReasonHw, HEX);
	do {
		switch (mask & resetReasonHw) {
#if defined(__IMXRT1052__) || defined(__IMXRT1062__)
		// SRC_SRSR // 20.8.3 SRC Reset Status Register (SRC_SRSR)
		// bit 3 is IPP_USER_RESET_B, so the later flags sit one bit higher
		case 0x0001: Serial.print(" IPP_RESET_B"); break;
		case 0x0002: Serial.print(" LOCKUP_SYSRESETREQ");  break;
		case 0x0004: Serial.print(" CSU_RESET_B"); break;
		case 0x0008: Serial.print(" IPP_USER_RESET_B"); break;
		case 0x0010: Serial.print(" WDOG_RST_B"); break;
		case 0x0020: Serial.print(" JTAG_RST_B"); break;
		case 0x0040: Serial.print(" JTAG_SW_RST"); break;
		case 0x0080: Serial.print(" WDOG3_RST_B"); break;
		case 0x0100: Serial.print(" TEMPSENSE_RST_B"); break;
#else
		//RCM_SRS0
		case 0x0001: Serial.print(F(" wakeup")); break;
		case 0x0002: Serial.print(F(" LowVoltage"));  break;
		case 0x0004: Serial.print(F(" LossOfClock")); break;
		case 0x0008: Serial.print(F(" LossOfLock")); break;
		//case 0x0010 reserved
		case 0x0020: Serial.print(F(" wdog")); break;
		case 0x0040: Serial.print(F(" ExtResetPin")); break;
		case 0x0080: Serial.print(F(" PwrOn")); break;

		//RCM_SRS1
		case 0x0100: Serial.print(F(" JTAG")); break;
		case 0x0200: Serial.print(F(" CoreLockup")); break;
		case 0x0400: Serial.print(F(" SoftWare")); break;
		case 0x0800: Serial.print(F(" MDM_AP")); break;

		case 0x1000: Serial.print(F(" EZPT")); break;
		case 0x2000: Serial.print(F(" SACKERR")); break;
			//default:  break;
#endif
		}
	} while (mask <<= 1);
	Serial.println(" :: done Reason");
}
 
Tim, I see you've been busy while I was sleeping. Question - where are you putting the reset function - in startup?

This is what I have in startup.c so far, but now I am getting gibberish on Serial4 when an error occurs using my problematic sketch.
Code:
  //Memory Management Faults
  if((_CFSR & 1) == 1){
	printf_debug("      (IACCVIOL) Instruction Access Violation\n");
  } else  if(((_CFSR & (0x02))>>1) == 1){
	printf_debug("      (DACCVIOL) Data Access Violation\n");
  } else if(((_CFSR & (0x08))>>3) == 1){
	printf_debug("      (MUNSTKERR) MemManage Fault on Unstacking\n");
  } else if(((_CFSR & (0x10))>>4) == 1){
	printf_debug("      (MSTKERR) MemManage Fault on stacking\n");
  } else if(((_CFSR & (0x20))>>5) == 1){
	printf_debug("      (MLSPERR) MemManage Fault on FP Lazy State\n");
  }
  if(((_CFSR & (0x80))>>7) == 1){
	printf_debug("      (MMARVALID) MemManage Fault Address Valid\n");
  }
  //Bus Fault Status Register
  if(((_CFSR & 0x100)>>8) == 1){
	printf_debug("      (IBUSERR) Instruction Bus Error\n");
  } else  if(((_CFSR & (0x200))>>9) == 1){
	printf_debug("      (PRECISERR) Data bus error (address in BFAR)\n");
  } else if(((_CFSR & (0x400))>>10) == 1){
	printf_debug("      (IMPRECISERR) Data bus error but address not related to instruction\n");
  } else if(((_CFSR & (0x800))>>11) == 1){
	printf_debug("      (UNSTKERR) Bus Fault on unstacking for a return from exception\n");
  } else if(((_CFSR & (0x1000))>>12) == 1){
	printf_debug("      (STKERR) Bus Fault on stacking for exception entry\n");
  } else if(((_CFSR & (0x2000))>>13) == 1){
	printf_debug("      (LSPERR) Bus Fault on FP lazy state preservation\n");
  }
  if(((_CFSR & (0x8000))>>15) == 1){
	printf_debug("      (BFARVALID) Bus Fault Address Valid\n");
  }
  //Usage Fault Status Register
  if(((_CFSR & 0x10000)>>16) == 1){
	printf_debug("      (UNDEFINSTR) Undefined instruction\n");
  } else  if(((_CFSR & (0x20000))>>17) == 1){
	printf_debug("      (INVSTATE) Instruction makes illegal use of EPSR\n");
  } else if(((_CFSR & (0x40000))>>18) == 1){
	printf_debug("      (INVPC) Usage fault: invalid EXC_RETURN\n");
  } else if(((_CFSR & (0x80000))>>19) == 1){
	printf_debug("      (NOCP) No Coprocessor\n");
  } else if(((_CFSR & (0x1000000))>>24) == 1){
	printf_debug("      (UNALIGNED) Unaligned access UsageFault\n");
  } else if(((_CFSR & (0x2000000))>>25) == 1){
	printf_debug("      (DIVBYZERO) Divide by zero\n");
  }
 