Teensy 4? testing mbed NXP MXRT1050-EVKB (600 Mhz M7)

Is that a Rigol 1000Z series scope?

Those are at most 100 MHz bandwidth, and have 150 MHz probes, right?

Yep. NXP LPSPI/DMA configuration also might still need tuning ... I'll wait for Teensy 4 before spending more cycles tuning the NXP eval board LPSPI.
 
Pretty sure the key to high sustained LPSPI throughput for a larger block of data is configuring a big frame size in TCR. It's on my list to do soon. Right now dealing with the finer points of USB VBUS detection and dynamically reprogramming the LDO power supply settings....
 
Since I couldn't get the basic spi_interrupt example running on the 1052 board, I decided to give the basic quadrature encoder/decoder example a try and compare it to what I got from a T3.5 using the Encoder library. For this initial test I used a 12 V uxcell motor. Both gave me pretty much the same values at 12 V, approximately 4400. I'm going to be a little more scientific about this in the next post, as I just wanted to see if the example worked. There are a couple of nice features they add.
 
As a follow-on to post #129, this is a better comparison between values obtained from the 1052 and a T3.5, based on an average of several readings of 10 seconds each:
Code:
Volts      1052      T3.5   (counts)
  5      21,801    21,155
7.5      32,104    32,009
 10      42,833    43,262
Since I don't have a digital power supply, some of the differences between the 1052 and the T3.5 could be due to repeatability in setting the supply voltage to the motor.
 
For some strange reason I still cannot get the lpspi_interrupt example to work; it just hangs. So I finally dug out my scope from storage and think I figured out how to use it again. I played around with the example file and looked at clock and MOSI. The clock frequency was at 262 MHz, and I ran the test with an SPI rate of 80 MHz. I am posting the photo but am not sure it's right. Can someone give me a sanity check? I did make the modifications to the pad configurations, and used delays of 100 ns:
20181204_172057.png

Bottom trace is the clock and top trace is mosi.

Thanks
Mike

EDIT: just as an FYI, the scope is a Hantek DSO5102P (100 MHz). There is a hack to get it up to 200 MHz but I haven't done it yet. If you all think it's worth it I will give it a try.
 
Re: tuning LPSPI
On the non-DMA SDK example with max SPI CLK 65.45 MHz/2 and a requested SPI CLK of 40 MHz, the SDK example sets DIV in CCR to 0 and calculates DBT as 0. The scope shows SPI CLK at 31.2 MHz. With the FIFO high-water mark at 14 (lower ISR overhead), the data rate is 14.4 Mbit/s. Reducing the high-water mark to 8 increases the data rate to 18.8 Mbit/s. Finally, increasing the SPI frame size from 8 bits to 32 bits in TCR raises the data rate to 30.4 Mbit/s, getting close to the SPI CLK rate. I haven't tested the DMA version with 32-bit frames; presumably it would also benefit from them.
 
Out of curiosity I ported the T3.0 Slave Library to run on the T3.2 and attached it to the 1052 board, using the master interrupt example from the SDK. To get it to work I did have to reduce the frame size to 16, the SPI clock to 20 MHz, and the transfer size to 256 (a limitation of the slave library). Data did come across from the 1052 to the T3.2 with the following results:
Code:
                 33222222222211111111110000000000
                 10987654321098765432109876543210
                 --------------------------------
SPIO_MCR:        00000000000000000000000000000000
SPIO_CTAR_SLAVE: 01111000000000000000000000000000
SPI0_SR:         01000010000000000000000000000000
SPI0_RSER:       00000000000000100000000000000000

Frame Size:      16
Data Length:     256
Packets:         1
Bytes Sent:      512
Time Elapsed:    155
uSecs/Byte:      0.30
Mbps:            3.30

From the 1052 side, using modified code from @manitou:
Code:
SPI CLK 20000000   master clock 261818172 261818172
fifo 16  watermark 8
txcount 256 257 us  15937 kbs

However, the data printed didn't come over in sequential order; I have to figure that one out. I've been testing speeds and waveforms but wanted to see if data transfers were actually working correctly.
 
If you are not using 8-bit frame size, then you may have to worry about byte order. Conveniently, the 1052 LPSPI has a control bit to re-order bytes, BYSW in TCR.
 
I'm always amazed at the sheer volume of extra stuff to read just to figure out what anything really does. I suppose programmers who write code in that style feel it's good practice. Or maybe NXP has corporate requirements documents & standards which require all code written & formatted a certain way? I'm sure it's all done with the best of intentions, but the end result is an excessive amount of verbiage to sift through, just to figure out what anything actually does.

I really don't like that highly verbose style. My preference could be summed up as "less is more".

Amen, brother!
 
Changed everything to use an 8-bit frame size as a second test. This is what is happening:
Code:
data[0]: 231
data[1]: 232
data[2]: 233
data[3]: 234
data[4]: 235
data[5]: 236
data[6]: 237
data[7]: 238
data[8]: 239
data[9]: 240
data[10]: 241
data[11]: 242
data[12]: 243
data[13]: 240
data[14]: 244
data[15]: 245
data[16]: 246
data[17]: 247
data[18]: 248
data[19]: 249
data[20]: 250
data[21]: 251
data[22]: 252
data[23]: 253
data[24]: 254
data[25]: 255
data[26]: 252
data[27]: 13
data[28]: 14
data[29]: 15
data[30]: 12
data[31]: 16
data[32]: 17
data[33]: 18
After this it continues sequentially. Have a funny feeling it may be on the Teensy slave side.

EDIT: Yep - problem was on the slave side. Had to add a short delay, delay(1) (which is 1 millisecond in Arduino), to the following function:
Code:
void T3SPI::rx16(volatile uint16_t *dataIN, int length){
	dataIN[dataPointer] = SPI0_POPR;
	dataPointer++;
	if (dataPointer == length){
		dataPointer=0;
		packetCT++;
		delay(1);  // added: pause once a full packet has been received
	}
	SPI0_SR |= SPI_SR_RFDF;
}
Not sure if there is a better way to accomplish this.
 
Just a follow-up to my SPI testing using a Teensy as a slave. I attached a T3.5 and set the 1052 to transmit at 60 MHz; data came across correctly. I was a bit surprised at this, but to make it work I had to overclock the T3.5 to 168 MHz; at 120 MHz I would get data errors.

Code:
1052 settings:
SPI CLK 60000000   master clock 261818172 261818172
fifo 16  watermark 8
txcount 256 121 us  33851 kbs
 
In the latest version of EVKB SDK, mbedtls and wolfssl are using DCP to accelerate AES and SHA -- about a 3x speedup. The mbedtls and wolfssl libs also use the TRNG for entropy source.

With FreeRTOS included in the SDK, the low-power app is available; it runs the MCU at 600, 528, 132, or 24 MHz.
https://www.nxp.com/docs/en/application-note/AN12094.pdf

I measured milliamps while running the app on my EVKB board, and the results are in post #1
 
Funny, the numbers are about what I'd expect according to my calculator … except it doesn't mention the 90 mA base current :) The eval board has quite a few added components, right? Can you add the board current when the MCU is sleeping?

To test power consumption at different frequencies, I ran the SDK power_mode_switch app, described here, with a meter and a hacked USB cable. At 600 MHz the app/board consumed 158.9 mA, at 528 MHz 150.3 mA, at 132 MHz 116.9 mA, and at 24 MHz 93.9 mA. The specs say the MCU should consume 0.11 mA/MHz.

Cool news on the Teensy4 tree population 'First commit, WIP' … 4 days ago and updating ...
 
Yup, I finally started bringing in the many C-only "experiments" into Arduino. :)

At this moment I have 11 high priority items on my list which need to be completed before we start shipping beta boards. None are in the core library, so don't expect to see too much more activity on github for the next few days.

Let's talk about 2 of those 11 items: testing the crystal accuracy (on the 24 MHz crystal and the 32.768 kHz crystal). We don't get software trim-able capacitors like on Teensy 3.x. As a result, we built the first batch of betas with a variety of capacitor values. Right now, each board is in its own anti-static bag with the capacitor values used. We also used 5 different inductors for the DC-DC step down power supply. Before we send these out, I want to make some uniform measurements on all of them. (power efficiency measurement makes 3 of 11....)

For the 24 MHz, I can use analogWriteFrequency and use a frequency counter. I recently bought a low-end 10 MHz GPS disciplined oscillator (still sitting unused) for my frequency counter's reference input. FWIW, the counter is a BK Precision model 1823A.

Still searching for ideas to verify the 32 kHz oscillator. Sadly, I didn't bring out the AD_B0_00 pin (BGA location M14) to a test pad. D'oh! Hindsight....

Anyone have any ideas on the 32 kHz crystal measurement?
 
On the NXP eval board I measured 24 MHz crystal drift with both NTP and GPS PPS. For the 32 kHz, I configured the SRTC and, each time the seconds value changed, compared it to micros(), so I measured 32 kHz drift relative to 24 MHz drift. On the SDK, my "micros()" is the free-running GPT timer. I started with one of the SDK examples that tested the SRTC ... holler if you want more details/code. My drift values are in post #1.

Surprisingly, the RTC drift was -46 ppm. The BOM says the 32 kHz crystal should be 20 ppm, so the eval board may have sub-optimal capacitors for the 32 kHz crystal.
Code:
uint32_t rtc_secs()
{
    uint32_t seconds = 0;
    uint32_t tmp = 0;

    /* Do consecutive reads until value is correct */
    do
    {
        seconds = tmp;
        tmp = (SNVS->LPSRTCMR << 17U) | (SNVS->LPSRTCLR >> 15U);
      //  tmp = (SNVS->HPRTCMR << 17U) | (SNVS->HPRTCLR >> 15U);
    } while (tmp != seconds);

    return seconds;
}
...
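    uint32_t secs, us = 0, us0 = 0, secs0 = 0;
    secs = rtc_secs();
    while (1) {

        if (secs != rtc_secs()) {
            /* RTC second just changed: timestamp it with micros() */
            us = micros();
            secs = rtc_secs();
            if (us0 == 0) {
                /* first rollover seen: remember the starting point */
                us0 = us;
                secs0 = secs;
            } else {
                /* (RTC elapsed - micros elapsed) / micros elapsed, in ppm.
                   Note: (us - us0) wraps after about 71.6 minutes. */
                float ppm = 1000000. * (((secs - secs0) * 1000000.) - (us - us0))
                        / (us - us0);
                PRINTF("%d secs  %d ppm   %u us\n", secs - secs0, (int)ppm, us);
            }

        }
    }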
added to SDK example boards/evkbimxrt1050/driver_examples/snvs/snvs_lp_srtc/
 
Fortunately the GPT timers can use the 32 kHz crystal, and two of the Arduino assigned pins are the input capture signals for GPT1. :)

Here's a little program I wrote to try directly measuring the crystal. Going to run it now on all the betas....

Code:
#include "debug/printf.h"

void setup() {
  // Connect GPS 1PPS signal to pin 30 (EMC_24)
  IOMUXC_SW_MUX_CTL_PAD_GPIO_EMC_24 = 4; // GPT1 Capture1
  IOMUXC_SW_PAD_CTL_PAD_GPIO_EMC_24 = 0x13000; //Pulldown & Hyst
  CCM_CCGR1 |= CCM_CCGR1_GPT(CCM_CCGR_ON) | 
    CCM_CCGR1_GPT_SERIAL(CCM_CCGR_ON);
  GPT1_CR = 0;
  GPT1_PR = 0;
  GPT1_SR = 0x3F; // clear all prior status
  GPT1_IR = GPT_IR_IF1IE;
  GPT1_CR = GPT_CR_EN | GPT_CR_CLKSRC(4) | 
    GPT_CR_FRR | GPT_CR_IM1(1) | GPT_CR_IM2(2);
  attachInterruptVector(IRQ_GPT1, capture);
  NVIC_ENABLE_IRQ(IRQ_GPT1);
}

#define LEN  124

void capture() {
  static uint32_t prior=0;
  static uint32_t list[LEN];
  static uint32_t count=0;
  static int index=0;
  uint32_t now = GPT1_ICR1;
  GPT1_SR = GPT_SR_IF1;
  uint32_t n = now - prior;
  prior = now;
  if (index >= LEN) index = 0;
  list[index++] = n;
  count++;
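  if (count <= LEN) {
    // still filling the list: print each 1PPS-to-1PPS interval
    printf("capture %u\n", n);
  } else {
    // list is full: total the last LEN intervals
    uint32_t sum=0;
    for (int i=0; i < LEN; i++) {
      sum = sum + list[i];
    }
    printf("capture=%u, sum=%u\n", n, sum);
  }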
}

void loop() {
}

This probably can't work on NXP's eval boards, unless you cut the EMC24 trace going to the SDRAM chip...
 
Sigh, and the GPT2 capture pins (EMC_40, EMC_41) go to the ENET on the eval board. With less precision, one could free-run the GPTx on the 32 kHz clock source and read the GPTx counter on each GPS PPS interrupt.

EDIT: on the Teensy 4 beta unit, the GPT capture with GPT clocked by the 32 kHz crystal worked on one of my GPS units (PPS pulse width 100 ms) but not on my other GPS (PPS pulse width 1 us). With GPT clocked at 32 kHz, the capture counts are correct only if the pulse width is at least one clock period, about 30.5 us (1/32768 s).

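The capture ISR also needs asm("dsb");, presumably so the write that clears the status flag completes before the ISR returns.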
 
I'm seeing a strange difference between the 2 oscillators. Or maybe not so strange... but still getting used to the little unexpected behaviors of this chip.

We built all the betas with a 24 MHz crystal rated at 12 pF. Except at least 2 got made with the wrong crystal, and maybe 2 more (haven't looked at those under the microscope yet), which makes interpreting the data more interesting! The 32.768 kHz crystals are rated at 12.5 pF.

So far I haven't found any NXP specs on the pin capacitance. Most chips I've used have been in the 5 to 8 pF range per pin (ref to ground). Intuitively I was expecting the two oscillators to be similar. But that's definitely not what I'm seeing in the testing. We built the betas with capacitors ranging from 12 to 20 pF, expecting somewhere in the middle of that range would be needed. 12 pF turns out to be very close to ideal for the 24 MHz oscillator, and 20 pF looks pretty close on the 32.768 kHz. It's almost like NXP put little 7 pF capacitors inside the chip on the 24 MHz oscillator. Or maybe one set of pins really does have ~5 pF capacitance and the others about 13 pF?

I've measured both oscillators on 22 different boards. All those tests were done shortly after powering the board up, about 30 seconds for the 24 MHz and the first 4 minutes of running with the 32 kHz test. Now I'm starting a couple of longer tests. On 2 different boards, I'm definitely seeing the 32 kHz oscillator start out running a bit fast and gradually slow over several minutes. With the board on my desk right now, the slowing from start to 15 minutes looks to be a ~3 ppm change.
 
Was looking for some specs last night on the crystals and came across this post: https://community.nxp.com/thread/478028, RT1050 Crystal Spec. Then it gets interesting in that it states:
since i.MX and Vybrid parts use many same IP modules and particular XTALOSC,
crystals mentioned on that thread also can be used with RT1050. Seems most full
characteristics are given in Table 9. Recommended External Crystal Specifications
i.MX25 Datasheet
https://www.nxp.com/docs/en/data-sheet/IMX25CEC.pdf

Don't know if this helps or not; it's about the only thing that I could find.

EDIT: The original question also references this thread: https://community.nxp.com/thread/308944
 