Teensy 4.0 First Beta Test

Status
Not open for further replies.
Hi @mjs513 - Hope your wife is doing better!

Some of this stuff is still sort of voodoo for me as well. More or less I learned it from example (@Paul's changes back for T3.x, which tried to remove the dependencies on C++ constructor ordering and the like). So the constructor has the keyword constexpr.
I don't know the exact requirements (I have looked them up and the like), and sometimes I have had to do some trial and error to make it work...
But one of the basic things is that all member variables must be initialized or filled in by the constructor...

What typically I would have liked to do with for example SPIClass is to simply pass in the address of the Hardware define as well as the pointer to the register structure... Which sort of works, but then when someone else references it in their class, example SerialFlash, you end up with some cryptic errors... Which happened recently.

So instead we pass in a generic pointer-sized integer (uintptr_t). So in the FlexIOHandler case we have some of this in the class, like:

Code:
class FlexIOHandler {
public:
...
  constexpr FlexIOHandler(uintptr_t myport, uintptr_t myhardware, uintptr_t callback_list)
  	: port_addr(myport), hardware_addr(myhardware), _callback_list_addr(callback_list) {
  }
	IMXRT_FLEXIO_t & port() { return *(IMXRT_FLEXIO_t *)port_addr; }
	const FLEXIO_Hardware_t & hardware() { return *(const FLEXIO_Hardware_t *)hardware_addr; }

...
protected: 
	uintptr_t 		port_addr;
	uintptr_t		 hardware_addr;
	uintptr_t		_callback_list_addr;
	uint8_t         _used_timers = 0;
	uint8_t         _used_shifters = 0;
	bool			_irq_initialized = false;
  
};
So the constructor has three addresses passed in, all of them as uintptr_t, and it stores the passed-in values in member variables, which in my case are in the protected section. So port_addr = myport; ...

Then when we use the registers, we go through the helper function port(), which casts the address back to a reference to the underlying register set.
So in this case we would do something like port().PARAM

Edit: and then to create the objects, I have the lines, like
Code:
static FlexIOHandler flexIO1((uintptr_t)&IMXRT_FLEXIO1_S, (uintptr_t)&FlexIOHandler::flex1_hardware, (uintptr_t)flex1_Handler_callbacks);
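
For anyone who wants to try the pattern off-target, here is a minimal host-side sketch of the same idea. It is not the FlexIOHandler source: FakeRegs, Handler, fixed_handler, make_host_handler and the address 0x40000000 are all made up for illustration. It only shows that a class storing plain uintptr_t values can have a constexpr constructor, with the cast back to a register reference deferred until port() is called.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for a memory-mapped register block (hypothetical, not IMXRT_FLEXIO_t).
struct FakeRegs {
    volatile uint32_t PARAM;
    volatile uint32_t CTRL;
};

class Handler {
public:
    // Only integers are stored and every member is initialized, so the
    // constructor can be constexpr with an empty body.
    constexpr Handler(uintptr_t port, uintptr_t hardware)
        : port_addr_(port), hardware_addr_(hardware) {}

    // The cast back to a typed reference happens at run time, on use.
    FakeRegs &port() const { return *reinterpret_cast<FakeRegs *>(port_addr_); }
    constexpr uintptr_t hardware_addr() const { return hardware_addr_; }

private:
    uintptr_t port_addr_;
    uintptr_t hardware_addr_;
    uint8_t used_timers_ = 0;  // default member initializer, as in the post
};

// With a literal address the object is constant-initialized, so it does not
// depend on constructor ordering. The address here is arbitrary; do not call
// port() on this object on a PC.
constexpr Handler fixed_handler(0x40000000u, 0);

// On a host there is no fixed register address, so point at an ordinary object.
static FakeRegs fake_regs = {0x12345678u, 0};

inline Handler make_host_handler() {
    return Handler(reinterpret_cast<uintptr_t>(&fake_regs), 0);
}
```

Calling make_host_handler().port().PARAM reads fake_regs.PARAM through the stored integer, which mirrors what port().PARAM does on the real hardware.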

Hope that helps
 
Thanks for the explanation Kurt. Like you said, it's voodoo - have to play a bit with it and experiment.

Wife is getting better - walking around now so all is good.

Mike
 
FlexIO SPI - Teensyview/SSD1306 display

Today I played around with a version of the Adafruit_SSD1306 driver, one that I hacked up earlier to allow multiple devices and to not have a hard-coded height...
Hacked it over to use my FlexSPI code to see if it works.

It appears to be working... if anyone wishes to try it out. Again, it requires my FlexIO library code and an up-to-date core (imxrt.h)...

There is one example program here for 32 pixel high display. In my case a Teensyview from sparkfun :D May try later with a 64 pixel high one.

I may also update my SPI code to make MISO optional. This, I believe, would allow me to remove the 2nd ShiftBuf, but may add some complications in knowing when the actual shift has completed...
 

Attachments

  • Adafruit_SSD1306_FlexIO.zip
    13.8 KB · Views: 70
I've uploaded 1.46-beta8 (on msg #2), which rolls up the many recent pull requests into a single installer.

Sorry, I've still been pretty sick here the last several days - not able to concentrate on much dev work. Just now starting to feel a bit better. Might be next week before I'm back up to speed...
 
Thanks Paul, will try it out on a few machines, currently have it on my main PC. The Flex SPI -> SSD1306 example still compiles and runs :D

Glad to hear you are feeling a bit better. Take care of yourself!
 
Ditto to what Kurt said. Glad you are feeling better.
 
Installed td1.46b8, no problem on install or builds. The new 'unused_interrupt_vector()' code is in place and not WEAK. The WEAK code that gets called is HardFault_HandlerC(); it works usably as expected and currently dumps to Beta Serial4. The Debug Teensy Trace [debug_tt] code is testing okay with a user-specified &Serial# for output, and will be published soon, when I get back to coding it.

Paul: I didn't watch first upload verbose - did T_loader update the T4 bootloader by any chance? Any change to T4 USB code?

I got it working again - though T_ports showed then lost the T4 - there is some oddity in T4 connectivity w/T_ports. After T_3.1 TLoader upload [updated EchoBoth Sketch] Did IDE upload to one then button push on the second - had to unplug and rePlug T4 and disable T_3.1's in some fashion for T4 to run first time. IDE_SerMon reconnects after T_Loader Auto upload, but not on Button upload has to be closed and re-opened.

Compile of updated 'EchoBoth' Serial<>USB on the two T_3.1's - saw an orange line flash by in console:
T:\arduino-1.8.8T4_146\hardware\teensy\avr\cores\teensy3\Stream.cpp: In member function 'bool Stream::findUntil(const char*, const char*)':
T:\arduino-1.8.8T4_146\hardware\teensy\avr\cores\teensy3\Stream.cpp:93:10: warning: unused variable 'tlen' [-Wunused-variable]
size_t tlen = (terminator==nullptr)?0:strlen(terminator);
^

FUNNY UPDATE to the following … leaving it as a caveat ...:: I closed IDE/T_Loader and TyComm as T_3.1 Sermon's - Unplugged 3 Teensy's - restarted IDE and powered up Teensy T4 then the T_3.1's and uploaded new code and now it is running unchanged code to catch sequential micros() as it should at 600 MHz. One note before the Degree symbol was showing as '°' now it is again showing as '°C=50.51' in SerMon from T4. Took out ° in the Interval timer version - it is now working with a 5us interval timer as well in that version - so something was odd … <EDIT:> After this restarting T_Ports presented and has maintained Teensy_4 for IDE T_Sermon Has completed 230 loops of 25,000,000 sequential micros no problem.
>> Did something change in run speed? Or what feeds the micros() clock math [systick_isr]? Some new background interrupt? For some reason my Syncron_Micros that could catch sequential micros() {i.e. within 600 cycles} [except on loop() pass #1 would see 2] is now <often> only able to see them every 2 or 3 micros() {i.e. 1200 to 1800 cycles}? It was working with a running 200K/sec IntervalTimer_isr as well - but that github code does not have that. The updated micros() CORES code is there and looks right and is still running with atomic [__LDREXW] protection in about 37-39 cycles as noted by that code. It fails at 600 and 700 MHz and even 800 MHz skips 2 or 3? Just asked first time for T4 for 900 MHz and it failed transition. @Mike - you ran this before to see TempMon - I didn't change github code - you might confirm what I'm seeing assuming yours worked without ERROR before?
More permissive version runs with this line ~67 edit:
Code:
#else
      else if ( dd <= 3 ) // Beta3 edit??? /// 2 && 1 == Lcnt ) // extra us tick may miss loop#1 startup @600 MHz

Updated github GPIOwriteSpeed - added write of !read[fast] - removed debug stuff.
For ref, shows these CycleCnts today (seems unchanged) with FastWrite [times in brackets are non-Fast] { still some CycCnts differ on write 1>0 versus 0>1, both not shown }:
flipped write >16 [192]
Global ! flip write >17 [192]
H L H 'const' write >5 [198]
Three write(!ReadFast) >194 [327]
Three write(!Read) >259 [378]
H H H var write >5 [184]
L L L var write >5 [175]
L H L 'const' write >5 [189]
! flip write >24 [197]
NOTE: The T4 FastWrite seems to act the same for const .vs. var - is there a reason not to always do the FastWrite method?
NOTE: This code could change to read the pin back versus a jumper to second input pin to show it can read an output.
 
Flex-SPI - For the heck of it, I put in code to handle MSBFIRST/LSBFIRST option as part of beginTransaction.

Question to self (and hopefully others) - Wondering how far to take playing with Flex IO pins? Example currently I don't support the different mode values. But if I am only doing some of this, for testing purposes, then probably good enough now. If however we want to pick up parts to maybe be more mainline, then probably need to flesh this out more.

Again assuming there is interest in using things like this, may want to figure out best way to have libraries that are modified to be used on Flex pins. Like the version of the Adafruit_SSD1306 I hacked up yesterday.

Who knows maybe I will try setting up a FlexWire object.. and then have support for the SSD1306...

Or maybe get back to SPI on T4 with DMA copy support. I was sort of holding off here, until I have a better understanding on how some of this stuff will work. Example:
with the SSD1306 code, I may allocate the buffer by using malloc for the screen buffer. But I believe the current configuration for memory allocated with malloc that the DMA operations may not actually transfer the data that you wrote to that memory as it may be cached out... With suggestions from @frank B - I had my DMA output of the ILI9341 code outputting properly, but that was overwritten today...

But if some simple code like:
Code:
uint8_t *buffer = (uint8_t *)malloc(100);
memset(buffer, 0xff, 100);
SPI.transfer(buffer, NULL, 100, event);
Does not actually transfer 0xff bytes, then maybe not worth doing.
Or is there some way to tell the cache to flush out everything in some range of memory?

...

Now back to playing
 
@KurtE - very cool - you've seen/shown enough to know it seems to be that FlexIO for SPI generally works. With the 1062 losing an SPI it seems creating a full alternate/second SPI will be important in the end - as long as 1062 works like 1052 - and doesn't offer some alternate with 3rd FlexIO path or other. Does it make sense to wait for true 1062 hardware to revisit - and look into the DMA sooner?

BTW Kurt: I hacked Serial.attachRts( 1 ) because it has a bool return to know head!=tail, and gotten the Faulted T4 to eject buffered data into the FIFO using the Tx guts of Serial.IRQHandler().

I'm wondering if there would be a reason to incorporate it into Serial.flush()? It would be odd to add an interface given the rare case it is needed. Something in the IRQHandler() fails to see the need to do this, or the manual call to it should have worked.

<edit> Output will run real-time with lots of calls to fill the FIFO with buffered chars. Serial# input works FINE!{only doing single keys + newline} It isn't the FIFO that is the trouble - but lack of interrupts to keep the FIFO filled from buffered data.
Below is what I found to work - called like this -
Code:
    while ( pdbser1->attachRts( 1 ) ) delayMicroseconds(10);
Code:
bool HardwareSerial::attachRts(uint8_t pin) // ABUSED FUNCTION for test
{
  uint32_t head, tail, n;
  uint32_t ctrl;
  head = tx_buffer_head_;
  tail = tx_buffer_tail_;
  ctrl = port->CTRL;
  if (((port->WATER >> 8) & 0x7) < 4)
    //  ((ctrl & LPUART_CTRL_TIE) && (port->STAT & LPUART_STAT_TDRE)) // Excluded this test
  {
    do {
      if (head == tail) break;
      if (++tail >= tx_buffer_total_size_) tail = 0;
      if (tail < tx_buffer_size_) {
        n = tx_buffer_[tail];
      } else {
        n = tx_buffer_storage_[tail - tx_buffer_size_];
      }
      port->DATA = n;
    } while (((port->WATER >> 8) & 0x7) < 4);   // need to compute properly
    tx_buffer_tail_ = tail;
    if (head == tail) {
      port->CTRL &= ~LPUART_CTRL_TIE;
      port->CTRL |= LPUART_CTRL_TCIE; // Actually wondering if we can just leave this one on...
    }
  }
// removed code that was here in the Tx part of IRQHandler()
// if ((ctrl & LPUART_CTRL_TCIE) && (port->STAT & LPUART_STAT_TC)) // …

  return ( head != tail ); // Added to know to keep calling or not
}
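
The drain loop above can be tried on a PC with a small model. This is only a sketch under assumptions: TxRing, pump_tx, kRingSize and kFifoDepth are made-up names, the hardware FIFO is modeled as a std::vector limited to 4 entries (mirroring the "< 4" watermark test), and the two-part tx_buffer_/tx_buffer_storage_ layout is collapsed into one array. Like the hacked attachRts(), pump_tx() returns true while buffered bytes remain, so the caller knows to keep calling.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr uint32_t kRingSize = 16;     // software TX buffer size (arbitrary)
constexpr std::size_t kFifoDepth = 4;  // modeled hardware FIFO depth

struct TxRing {
    uint8_t data[kRingSize] = {};
    uint32_t head = 0;  // index of the most recently queued byte
    uint32_t tail = 0;  // index of the most recently sent byte

    // Queue one byte; returns false if the ring is full.
    bool push(uint8_t c) {
        uint32_t next = head + 1;
        if (next >= kRingSize) next = 0;
        if (next == tail) return false;
        data[next] = c;
        head = next;
        return true;
    }
};

// Move bytes from the ring into the FIFO until the FIFO is full or the ring
// is empty. Returns true while bytes are still waiting in the ring.
inline bool pump_tx(TxRing &ring, std::vector<uint8_t> &fifo) {
    while (fifo.size() < kFifoDepth && ring.head != ring.tail) {
        if (++ring.tail >= kRingSize) ring.tail = 0;
        fifo.push_back(ring.data[ring.tail]);
    }
    return ring.head != ring.tail;
}
```

With 6 bytes queued, the first pump_tx() call fills the 4-entry FIFO and returns true. Once the model FIFO is emptied (standing in for the UART shifting the bytes out), a second call moves the last 2 bytes and returns false.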
 
Good Morning Kurt,
There are some functions to do that - As far as I remember in imxrt.h (at the end of the file) (I'm away from my computer right now) and in CMSIS.
Don't know how helpful they can be for pixel operations like vertical lines. If you spend more than 3 cycles on cache handling per pixel, it will be slower than T36 (am I right?)
For me, disabling the cache for the display buffer is still the best option.
And we will have the same issue with all other DMA operations from OCRAM.
@Manitou - can you measure the speed of a memcpy in normal ram, ocram - and ocram with disabled cache (writethrough and nocache)? and between both?

I'm struggling with I2S, I have output on TX and the clocks but no audible output from SGTL5000. I think I can solve that in the next days.
 
@Kurt, I had the idea to toggle chip selects with DMA writes to GPIO and use scatter-gather. This way it should be possible to transfer a whole CS-DC-command-"dc-off"-data-"cs-off" sequence in pure hardware. Had no time so far to try that. In theory you could use any pin as a chip select.
 
As part of my benchmarking, I had done memcpy/memset measurements, but they were crazy fast. I decided something was amiss with cache or the core assembler code (memcpy-armv7m.S memset.S), although it was evident the data was getting copied/set. Also the 10-us micros() resolution didn't help. I'll revisit. (NXP SDK had a SCB_DisableDCache() )
Early beta test
Code:
      use  16*1024 instead of 1024
      2184.53 mbs  60 us   loop set
      2621.44 mbs  50 us   loop copy
      inf mbs  0 us   memset
      inf mbs  0 us   memcpy
 
Beta8 has 1us resolution on micros()! Of course CycCnt is like 600 times better.

Finally typed in the trace part for the debug lib - it was fun - luckily I had fault handling with debug lib while working on debug lib :) Auto doesn't work for upload when faulted - but you can compile sketch and when prompted hit 'b' on Serial# SerMon and it puts the T4 to bootloader for a fresh upload without button.

Thanks to KurtE patience the Serial has a hacked workaround - need a clean solution to print Serial when Faulted and interrupt priority is -3 (?) for system fault. Doesn't work on USB as it is write only, so no prompting.

Trace currently takes 3 values: extern "C" void dbTrace_tt( uint32_t aa, uint32_t bb, const char *cc); // "C" is as needed. Or include the debug_tt.h
For this test it was called like :: dbTrace_tt( ARM_DWT_CYCCNT, __LINE__, __func__ );
Call from anywhere - even PJRC startup or any interrupt like systick_isr [ 1k/sec is a lot to see though ].
Printed this log with printf rules where 2000 is max display count (pass 0 to clear the log):: dbTraceShow_tt( 2000, "CycCnt %u", "\tline %u", "\tfunc %s\n" );
You can see the HardwareSerial::IRQHandler is Very Busy. And you can see where the CycleCounter is enabled - a second Trace would have shown it not zero in configure_systick.
Last calls are on the top and the line 62 calls are in a loop - shows 28 cycles between calls so not real slow. I had to put an "if ( !TraceOn ) return;" in because the Trace calls were updating the list as I was printing … like I said - good to have a helpful Fault handler - the dump info didn't help me - but I knew what happened when output stopped - I could correct and hit 'b' in TyComm. Need to add user Stop/Start of trace so it can be idled - I could have hidden the IRQHandler.
Code:
#1: (ii=112): CycCnt 510634065	line 62	func setup
#2: (ii=111): CycCnt 510634037	line 62	func setup
#3: (ii=110): CycCnt 510634009	line 62	func setup
#4: (ii=109): CycCnt 510633981	line 62	func setup
#5: (ii=108): CycCnt 510633938	line 62	func setup
#6: (ii=107): CycCnt 510633901	line 60	func You r here
#7: (ii=106): CycCnt 510018188	line 77	func foo
#8: (ii=105): CycCnt 509879640	line 74	func foo
#9: (ii=104): CycCnt 509765329	line 71	func foo
#10: (ii=103): CycCnt 509765294	line 69	func foo
#11: (ii=102): CycCnt 509656769	line 67	func foo
#12: (ii=101): CycCnt 182153308	line 479	func IRQHandler
#13: (ii=100): CycCnt 182153114	line 479	func IRQHandler
  // … lots more … func IRQHandler
#97: (ii=16): CycCnt 175445118	line 479	func IRQHandler
#98: (ii=15): CycCnt 175344221	line 479	func IRQHandler
#99: (ii=14): CycCnt 175341118	line 479	func IRQHandler
#100: (ii=13): CycCnt 175299898	line 479	func IRQHandler
#101: (ii=12): CycCnt 175299481	line 479	func IRQHandler
#102: (ii=11): CycCnt 175296446	line 101	func Ser.begin
#103: (ii=10): CycCnt 175293201	line 479	func IRQHandler
#104: (ii=9): CycCnt 175292721	line 479	func IRQHandler
#105: (ii=8): CycCnt 175292265	line 479	func IRQHandler
#106: (ii=7): CycCnt 175289185	line 101	func Ser.begin
#107: (ii=6): CycCnt 12487560	line 56	func delay
#108: (ii=5): CycCnt 2768161	line 253	func reset_PFD
#109: (ii=4): CycCnt 6542	line 207	func usb_pll_start
#110: (ii=3): CycCnt 0	line 123	func configure_systick
#111: (ii=2): CycCnt 0	line 161	func configure_cache
#112: (ii=1): CycCnt 0	line 73	func ResetHandler
#113: (ii=0): CycCnt 0	line 253	func reset_PFD

Does this seem useful? Ideas? There is other stuff in the lib from debug_t3 that I find helpful too - I suppose it all still works. How much memory in the buffer array( it is circular ) and how many vars to pass? a signed int? Other debug_tt logging calls func()'s with a macro that auto adds the __func__ or __LINE__ to the calls. I could do that here too.
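
To make the buffer questions concrete, here is a minimal sketch of a circular trace log of the kind described. It is not the debug_tt code: TraceEntry, TraceLog, record(), newest() and kTraceCapacity are hypothetical names, and the capacity of 4 is deliberately tiny. It stores three values per call, overwrites the oldest entry once full, reads back newest-first, and has an enable flag so that printing does not add entries while the log is being walked.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical names throughout; this is not the debug_tt source.
struct TraceEntry {
    uint32_t a;        // e.g. a cycle count
    uint32_t b;        // e.g. __LINE__
    const char *text;  // e.g. __func__
};

constexpr std::size_t kTraceCapacity = 4;  // tiny on purpose, for the example

struct TraceLog {
    TraceEntry entries[kTraceCapacity] = {};
    std::size_t next = 0;   // slot the next record will overwrite
    std::size_t count = 0;  // number of valid entries, at most kTraceCapacity
    bool enabled = true;    // set false while printing so the log stays stable

    void record(uint32_t a, uint32_t b, const char *text) {
        if (!enabled) return;
        entries[next] = TraceEntry{a, b, text};
        next = (next + 1) % kTraceCapacity;
        if (count < kTraceCapacity) ++count;
    }

    // age 0 is the most recent entry, age count-1 the oldest one still held.
    const TraceEntry &newest(std::size_t age) const {
        return entries[(next + kTraceCapacity - 1 - age) % kTraceCapacity];
    }
};
```

Recording the two CycCnt values 510634037 and 510634065 from the log above and subtracting newest(1).a from newest(0).a gives 28, the spacing between the line 62 calls.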
 
OK, with beta8, using 16K buffers here are data rates for RAM and OCRAM (malloc), all aligned on 32-byte
Code:
         src 20005280 dst 20001280   DTCM
         2383.13 mbs  55 us   loop set
         2427.26 mbs  54 us   loop copy
         16384.00 mbs  8 us   memset
         18724.57 mbs  7 us   memcpy

          malloc aligned 32
          src 20200020 dst 20204040   OCRAM
          1191.56 mbs  110 us   loop set
          799.22 mbs  164 us   loop copy
          16384.00 mbs  8 us   memset
          10922.67 mbs  12 us   memcpy

reorder
          src 0x20200020 dst 0x20204040   malloc
          4681.14 mbs  28 us   memset
          4369.07 mbs  30 us   memcpy
          1202.50 mbs  109 us   loop set
          1065.63 mbs  123 us   loop copy

          src 0x20005280 dst 0x20001280  DTCM RAM
          14563.56 mbs  9 us   memset
          18724.57 mbs  7 us   memcpy
          2427.26 mbs  54 us   loop set
          2427.26 mbs  54 us   loop copy
loop set/copy using byte indexing. dst = src
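
A note on reading the numbers: the "mbs" column appears to be megabits per second, since the posted figures are reproduced by taking the 16K buffer (16384 bytes), multiplying by 8 bits, and dividing by the elapsed microseconds. The helper below is only a sketch of that arithmetic; mbits_per_sec is a made-up name and is not taken from the benchmark sketch. It also shows where the "inf mbs 0 us" rows in the earlier test come from: an elapsed time of 0 us, below the timer resolution.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Megabits per second for `bytes` moved in `micros` microseconds.
// Bits per microsecond is numerically the same as megabits per second.
inline double mbits_per_sec(uint32_t bytes, uint32_t micros) {
    if (micros == 0) {
        // Elapsed time was below the timer resolution, so no rate can be given.
        return std::numeric_limits<double>::infinity();
    }
    return (8.0 * bytes) / micros;
}
```

For example, mbits_per_sec(16384, 55) is about 2383.13 and mbits_per_sec(16384, 7) is about 18724.57, matching the DTCM "loop set" and "memcpy" rows.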
 
Wow thank you. You're fast :)
So OCRAM has 1/2 speed. Not that bad!

The difference between loop and memcpy is amazing.
Because.. memcpy uses a loop, too.
This was with enabled cache, right?
Can you run that in reversed order? First memcpy, then loop? :)
I think we just see a cache doing its job, here..
 
Okay.. I don't understand the results :)
edit: ah.. you swapped the order of ocram-dtcm, too! :)

Now the same without cache, and with cache write-through?(ocram only)
 
Thread closed???
---------------------------------------------------------------


DMA and memory cache - Thanks Frank, yes there are functions like:
arm_dcache_flush(addr, size) - Flush the data but keep in cache
arm_dcache_delete(addr, size) - deletes stuff out of cache without updating ram
arm_dcache_flush_delete(addr, size) - flushes stuff from cache into memory and deletes the stuff from cache...

So probably the safest thing in something like SPI.transfer(buf, retbuf, cnt, event) is to do an arm_dcache_flush(buf, cnt)
Which will slow things down a bit as this code does:
Code:
__attribute__((always_inline, unused))
static inline void arm_dcache_flush(void *addr, uint32_t size)
{
	uint32_t location = (uint32_t)addr & 0xFFFFFFE0;
	uint32_t end_addr = (uint32_t)addr + size;
	asm("dsb");
	do {
		SCB_CACHE_DCCMVAC = location;
		location += 32;
	} while (location < end_addr);
	asm("dsb");
	asm("isb");
}
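
To get a feel for the cost, the loop above can be mirrored on a PC to count how many 32-byte cache lines a given buffer touches. This is a host-side sketch only: flush_iterations, is_line_aligned and kCacheLine are made-up names, and nothing here touches a real cache. The alignment check is included on the assumption that a buffer sharing a cache line with unrelated data is a risk for the delete variant, since invalidating a line would also discard the neighbor's unwritten data.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint32_t kCacheLine = 32;  // line size implied by the 0xFFFFFFE0 mask

// Counts the iterations of the do/while loop in arm_dcache_flush() for a
// buffer at `addr` of `size` bytes, i.e. the number of cache lines written back.
inline uint32_t flush_iterations(uint32_t addr, uint32_t size) {
    uint32_t location = addr & ~(kCacheLine - 1);
    uint32_t end_addr = addr + size;
    uint32_t n = 0;
    do {
        ++n;  // stands in for SCB_CACHE_DCCMVAC = location
        location += kCacheLine;
    } while (location < end_addr);
    return n;
}

// True if the buffer starts on a line boundary and covers whole lines only.
inline bool is_line_aligned(uint32_t addr, uint32_t size) {
    return (addr % kCacheLine == 0) && (size % kCacheLine == 0);
}
```

A 16K buffer at 0x20200020 takes 512 iterations, a 100-byte buffer at the same address takes 4, and an 8-byte buffer at 0x2020003C straddles two lines and takes 2.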

---------------------
DMA screen update, using scatter/gather to touch IO pins, like CS/DC...
Maybe? But would need to properly handle the timings of when this happens.

That is with or without trying to use DMA for these pins. There is already an issue of the pin being updated while the SPI transfer is still happening.
That is if you do something like:
Assert CS
Start DMA write transfer from your memory to SPI output, and have it interrupt/complete at end of transfer.
In ISR - Deassert CS.

The deassert will happen while SPI is still running on your data, that is your ISR is called when the output queue needs more data and your DMA has completed, but before the SPI shift register has actually output the last of the data.

To handle this on T3.x - I ended up also setting up DMA from SPI to memory, and in the case above SPI.transfer, if retbuf is NULL, to some dummy memory address, with the increment turned off. Then when this DMA completes, it says I have retrieved the Count bytes, which is when your transfer is done, and at that time it is safe to update the CS/DC settings... Not sure if that makes sense?
 
hm. or just transfer one more byte that gets ignored .. with watermark 0 there will be only that last dummy that gets ignored when cs deasserts? just an idea..
 
Simple sketch - started with Paul's #1205 last simple write - saw that using global or static versus local var versus high/low in func call changed the time - then noticed it was different whether the write was HIGH or LOW?

T_Loader verbose at this point save and cleared: View attachment 15722 - had to ZIP ...

Now My T4 again in the ODD state - even holding the button does nothing - after 20 secs it does a quick blink - but won't do the ON and Reset at 15 seconds?

Uploaded a few variations as this grew evolved to where I was going to try #define to replace WriteFast with Write but it failed upload and here I am. Had a T-3.1 on - watching the Debug Serial port - no code uploads to it this time.

Fails Auto and Button is T4 wholly ignored. Just realized the other T_3.1 was of course on Serial1 - from the same hub. Unplugged it and then the T4 LED blinked some - but no upload.

Red LED on (with a pulse near the off side) a short second then off about 2 secs?

Sermon won't connect - closed IDE - now no T_ports on ports - only two Serial ports on IDE are the two disconnected - LED is flashing like it is running the code now - but it isn't visible. TyComm doesn't show the device online - it is lost somewhere ...

Held button - saw flash - seemed over 15 secs - it did the long red flash? Did not present the 'unknown USB' - back to the flash 1 on 2 off cycle?

Okay pull off PC put on USB battery pack and 15 seconds blinked and then reset - put back on PC same port and nothing - moved to another port and it came up and is now working again.

Just so you know I'm not ignoring this... I've tried reading this 3 times now and I still can't make any sense of it.

I would really like to know a way to reproduce this "T4 again in the ODD state" problem.
 
Sounds like something that might be fun to experiment with. Probably to some fixed hardware, example ILI9341, where you can hopefully detect if it works for that hardware or not. That is if the SPI has output lets say 3 bits out of the 16 bits (in word mode for pixels), and the CS (or DC) pin is deasserted? How will the device respond?
Likewise if after this, it starts up new DMA SPI output, after changing DC how will this look and how will the device respond:
<DMA output of pixels SPI> <DMA change DC - IO PIN><DMA output command to SPI>

Might work great.... Or device might get totally confused? Maybe worth experiment...
 
From the NXP community: https://community.nxp.com/thread/469064. It says just toggling a pin runs at about 4.4 MHz.

@Paul.
Did you try changing the pad speed to 200 MHz, SPEED_3?

Tried a couple more tests just now.

The pad settings don't affect the toggle rate at all. On my 200 MHz scope with a normal ground clip (no effort for good high speed probing) I see massive overshoot and undershoot - which get slightly worse if the 200 MHz option is used. But even 50 MHz looks pretty bad. Sorry, I'm so far behind due to being sick these last few weeks, so not going to test the pad settings more than this quick check.

Here's the code I ran just now.

Code:
void setup() {
  pinMode(1, OUTPUT);
  CORE_PIN1_PADCONFIG |= IOMUXC_PAD_SRE; // fast slew rate
  //CORE_PIN1_PADCONFIG &= ~IOMUXC_PAD_SPEED(3); // 50 MHz
  CORE_PIN1_PADCONFIG |= IOMUXC_PAD_SPEED(3); // 200 MHz
}

void loop() {
  while (1) {
    digitalWriteFast(1, HIGH);
    digitalWriteFast(1, LOW);
    digitalWriteFast(1, HIGH);
    digitalWriteFast(1, LOW);
    digitalWriteFast(1, HIGH);
    digitalWriteFast(1, LOW);
    digitalWriteFast(1, HIGH);
    digitalWriteFast(1, LOW);
  }
}

Other note... the output is a 9.38 MHz square wave.

My best guess is writes go to some sort of buffer, which takes only 2 cycles if the buffer is empty. But if a previous write is still pending, it seems to take a very long time. Disappointingly slow. :(
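
For scale, the measured frequency can be turned into CPU cycles per write. The sketch below assumes the CPU was running at 600 MHz (the speed mentioned elsewhere in this thread; it is not stated for this test) and ignores the branch at the end of the while loop. cycles_per_write is a made-up helper name.

```cpp
#include <cassert>
#include <cmath>

// CPU cycles consumed per pin write, given the measured square-wave frequency.
// One period of the square wave contains two writes (one HIGH, one LOW).
inline double cycles_per_write(double cpu_hz, double square_wave_hz) {
    return cpu_hz / (2.0 * square_wave_hz);
}
```

Under that assumption, cycles_per_write(600e6, 9.38e6) comes out at roughly 32 cycles per digitalWriteFast(), compared with the 2 cycles guessed for a write into an empty buffer.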
 
Unfortunately Paul has overlooked my earlier question (https://forum.pjrc.com/threads/54711-Teensy-4-0-First-Beta-Test?p=195131&viewfull=1#post195131) on the intended uSD connection.

No, I don't have any recommendation for SD card (SDIO, not SPI) connection at this time.

Truth is, I put those 8 pads on the bottom side to make sure I could get the 6 signals accessible, but no real work has gone into them so far.

Sorry, I'm just not able to think of every detail in advance...
 
Thanks for checking the pad speed. Interesting that it doesn't really affect the toggle at all. Like you said something else is going on, e.g., the buffering or something else. That may also explain the response to the nxp forum question - limit is imposed by bus arbitration, etc.... They never did explain the etc. part of the response.

The more important thing is that you are feeling better. Just don't overdo it in trying to catch up - you may wind up with a relapse.

Mike
 