Teensy 4.0 First Beta Test

Status
Not open for further replies.
Just got mine - it works with TV remote - connected to open pin #9 with Demo.
Hi Tim, glad you got it working. It will be good to decode remotes :) Already thinking ahead.

Got the Talkie updated - don't know if you saw the post. A lot of posts since then :). It's on my WIP page :)

Just tried EasyTransfer, Serial1 to Serial1, from a T3.2 to the T4; it didn't work. I then tried a T3.5 to the T3.2 and vice versa, and that didn't work either. The structures on both sides use int16's, so that should be OK. Have to dig a little deeper now.

By the way - do you know why we didn't just copy Stream.cpp over from the T3 core to the T4? Going to test I2C etc. later or tomorrow; getting tired now.
 
Taking a closer look through SPI and thinking of ili9341_t3 like code...

I (or we) may need to play around some more. Hopefully tomorrow morning I will have the updated Transfer(buf, retbuf, cnt) version implemented, plus a transfer16, that does it in one transfer...

Also taking a closer look at the SPI registers and FIFO queues. I believe both the TX and RX queues are 16 entries deep and 32 bits wide, which should be nice for the above transfer.

I think I may have been somewhat wrong about not having some T3-like hardware CS support.

Whereas the T3.x boards allowed you to encode the state of these registers as part of a PUSH operation, the new T4 processor does give you some support that will be interesting to try out.

To control this you use the Transmit Command Register (TCR), which controls several things, including the transfer speed and the word width (here is where I will experiment changing from 8 to 16).

In addition to this you can control a PeripheralChipSelect, and also whether you wish it to be Continuous... So I'm not sure if we can do some hacks to turn it on and off on demand.

Note: in the ILI9341 case, I would not try to use this for the CS pin, but hopefully for the DC pin. My ili9341_t3n library works with only one CS pin for DC, and I found that using a standard IO pin for CS did not impact performance much.

The way this TCR is used: the FIFO transmit queue is 16 units deep, and each unit can be either a command (TCR) or data (Transmit Data Register, TDR). Again, this looks like some stuff that might be fun to experiment with.
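As a rough illustration of the "command word in the FIFO" idea, here is a minimal host-side sketch that builds a TCR value from a frame size, a chip-select number and a continuous flag. The field positions (FRAMESZ in the low 12 bits holding size minus 1, CONT at bit 21, PCS at bits 25:24) are my reading of the i.MX RT1060 reference manual and should be checked there; `makeTcr` and the `kTcr*` names are hypothetical helpers, not part of the core.

```cpp
#include <cstdint>

// Assumed LPSPI TCR field positions; verify against the reference manual.
constexpr uint32_t kTcrFrameSzMask = 0x0FFFu;   // FRAMESZ: frame size in bits, minus 1
constexpr uint32_t kTcrCont        = 1u << 21;  // CONT: keep PCS asserted between frames
constexpr unsigned kTcrPcsShift    = 24;        // PCS: which peripheral chip select to use

// Hypothetical helper: build a command word that could be queued ahead of data.
constexpr uint32_t makeTcr(unsigned frameBits, unsigned pcs, bool continuous) {
    return ((frameBits - 1) & kTcrFrameSzMask)
         | ((pcs & 0x3u) << kTcrPcsShift)
         | (continuous ? kTcrCont : 0u);
}
```

Switching from 8-bit to 16-bit frames would then be a matter of queuing `makeTcr(16, ...)` before the data words, assuming the other TCR fields (prescale, clock mode) are OR-ed in as the driver already does.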
 
Tim, would you like to try alignment for .BSS with our other targets (all Teensy 3.x and LC models)? I could well imagine that all targets would benefit (more free memory for variables). I'm at work, and can't test anything at the moment. Maybe it will work, maybe I'm wrong.. I would try ALIGN(4) in the *.ld linker files.
Code:
 .bss ALIGN(4) : {
My theory is, without that, the alignment is 4096 (default).
 
Frank: I find this in the T_3.6.ld; it seems already done? Or does the top line need to be edited?

I added that and the builder reports the same RAM on the sketch I ran. I also saw the same when I ran this MEMORYcheck. Is there one that would show the difference?

T:\arduino-1.8.8T4_146\hardware\teensy\avr\cores\teensy3\mk66fx1m0.ld ::
.bss : {
. = ALIGN(4);
_sbss = .;
__bss_start__ = .;
*(.bss*)
*(COMMON)
. = ALIGN(4);
_ebss = .;
__bss_end = .;
__bss_end__ = .;
} > RAM
 
< I just hit edit post on the last and this one :) - but I stopped myself >

It may be working - I'm not sure what it would take to see the change? My sketch was small, and I wasn't sure if it would report on the Builder Console line?
 
There are more issues:
Code:
200015f0 00000028 B __malloc_current_mallinfo
20001800 000002c0 B _VectorsRam
20002000 00000020 B endpoint0_transfer_data
more gaps.
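To put numbers on those gaps, here is a small sketch that subtracts "end of one symbol" from "start of the next" using the three lines listed above. It assumes the three symbols really are neighbours in RAM (the listing may have skipped entries); `Symbol` and `gapAfter` are names made up for this example.

```cpp
#include <cstdint>

struct Symbol { uint32_t addr; uint32_t size; };

// Bytes left unused between the end of `a` and the start of the next symbol `b`.
constexpr uint32_t gapAfter(Symbol a, Symbol b) {
    return b.addr - (a.addr + a.size);
}

// Values copied from the listing above.
constexpr Symbol kMallinfo   {0x200015f0, 0x28};   // __malloc_current_mallinfo
constexpr Symbol kVectorsRam {0x20001800, 0x2c0};  // _VectorsRam
constexpr Symbol kEndpoint0  {0x20002000, 0x20};   // endpoint0_transfer_data
```

Under that assumption the first gap is 0x1E8 (488) bytes and the second is 0x540 (1344) bytes.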

And I have the gut feeling that the "free RAM" display for Teensy 3.x is wrong. I think that at least the vectortable (maybe more) is not considered.
Platform.txt:
Code:
recipe.size.regex.data=^(?:\.usbdescriptortable|\.dmabuffers|\.usbbuffers|\.data|\.bss|\.noinit)\s+([0-9]+).*
I will investigate this when I'm back home.
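For anyone wanting to see what that regex actually counts, here is a host-side sketch that applies the same pattern line by line and sums the captured sizes. The sample section listing in the usage note is invented for illustration; only the pattern comes from platform.txt. `sumDataSections` is a hypothetical name.

```cpp
#include <regex>
#include <sstream>
#include <string>

// Sum the sizes of the sections matched by the recipe.size.regex.data pattern.
unsigned long sumDataSections(const std::string& sizeOutput) {
    static const std::regex pattern(
        R"(^(?:\.usbdescriptortable|\.dmabuffers|\.usbbuffers|\.data|\.bss|\.noinit)\s+([0-9]+).*)");
    unsigned long total = 0;
    std::istringstream lines(sizeOutput);
    std::string line;
    std::smatch m;
    while (std::getline(lines, line)) {
        // Only lines that start with one of the listed section names contribute.
        if (std::regex_match(line, m, pattern)) total += std::stoul(m[1].str());
    }
    return total;
}
```

Fed a made-up listing with `.text 20000`, `.usbdescriptortable 160`, `.data 100` and `.bss 2000`, it adds up 160 + 100 + 2000 and skips `.text`. Anything that is not one of the named sections, including padding between them, never enters the total.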
 
You can actually set the TCR frame size to 32 bits. I hacked the TCR for Paul's current T4 SPI, and the data rate increases. Otherwise the inter-frame gap dominates for 8-bit transfers. If byte order is an issue with 16-bit and 32-bit transfers, there is a control bit to reverse the byte order.

The EVKB eval board tests showed the same kind of speed up.
https://forum.pjrc.com/threads/5426...B-(600-Mhz-M7)?p=192387&viewfull=1#post192387
The EVKB SDK SPI example also used FIFO and ISR; adjusting the hiwater FIFO mark affected the data rate. It also used hardware-controlled CS. I never played with that on T3 or T4, so with a scope I was surprised to see the CS signal toggling for every frame. In my tests, the SDK SPI DMA example would outperform the non-DMA example. CS remained low for the duration of the transfer for the SDK DMA example.
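To make the 32-bit frame idea concrete, here is a minimal sketch of packing a byte buffer into 32-bit words so that four bytes go out per frame. It assumes the shifter sends the most significant bit first, so the first buffer byte is placed in the top byte of each word; if the hardware byte order differs, that is where the byte-swap control bit would come in. `packFrames32` is a hypothetical helper, not something in the SPI library.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Pack bytes into 32-bit words, first byte in the most significant position,
// so an MSB-first shifter would send them in buffer order (assumption).
std::vector<uint32_t> packFrames32(const uint8_t* data, size_t count) {
    std::vector<uint32_t> words;
    for (size_t i = 0; i + 4 <= count; i += 4) {
        words.push_back((uint32_t(data[i])     << 24) |
                        (uint32_t(data[i + 1]) << 16) |
                        (uint32_t(data[i + 2]) << 8)  |
                         uint32_t(data[i + 3]));
    }
    // Any 1..3 trailing bytes are left for the caller to send as 8-bit frames.
    return words;
}
```

With this packing a 64-byte buffer becomes 16 frames instead of 64, so there are far fewer inter-frame gaps to pay for.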
 
I've asked Robin to send you a replacement. You should hear from her tomorrow. After you've got a new one, I want to get the board back here so I can look into what went wrong.

I just tested the replacement board here. It definitely auto-reboots on Ubuntu 18.04 64-bit.
:(
Sigh, T4 won't load at all this morning. After pushing the program button, I get a slow flashing red LED, but no progress bar, and uploads fail. No /dev/ttyACM0. Holding down the program button for 13 secs gets a steady red LED for a few seconds ... but doesn't fix anything.

It was working last night with beta7. I think the last thing I did was an I2C test of setClock() with a scope -- looked good at 100 kHz, 400 kHz, and 1 MHz.

5v on Vin, 3.3v on 3v3 pin.

I've tried on other desktops, other USB cables. T3* stuff still works.
 
Teensy 3.x:
The displayed RAM needed for variables is too small (the error depends on the USB descriptor length; for my test sketch with "USB-Serial" it is 160 bytes too small), because:
The usb-descriptors and the vectortable are both aligned to 512. The usb-descriptors are smaller than 512 bytes, but because of the vectortable's alignment there is a gap which is not taken into account.
Fix: not easy, or impossible with the way it works.

Edit: HEX 0x160, not 160 decimal.
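The size of such a gap follows directly from the alignment rule. Below is a minimal sketch of the arithmetic; `alignUp` and `paddingBefore` are hypothetical helper names. The 160-byte table length in the usage note is a made-up figure chosen only because it yields a 0x160 gap; the real descriptor length is not given in the post.

```cpp
#include <cstdint>

// Round addr up to the next multiple of align (align must be a power of two).
constexpr uint32_t alignUp(uint32_t addr, uint32_t align) {
    return (addr + align - 1) & ~(align - 1);
}

// Padding inserted after a block ending at `end` when the next block
// must start on an `align`-byte boundary.
constexpr uint32_t paddingBefore(uint32_t end, uint32_t align) {
    return alignUp(end, align) - end;
}
```

For example, a block that starts on a 512-byte boundary and is 160 bytes long would be followed by `paddingBefore(160, 512)` = 352 = 0x160 bytes of padding before the next 512-aligned block, and none of that padding shows up in either block's size.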
 
@manitou
Had this happen to me on and off depending on what I was testing. It was usually something in my sketch. After doing the 13 sec reset I compile the Blink sketch (it won't autoload because I lose the USB connection in most cases) and then hit the program button once, and it works. Is this what you did?
 
Yep, tried that and more ... no luck.
 
EASYTRANSFER LIBRARY
Tested RX and TX on the T4 on Serial1,2,3,4 and 5. Worked no problem.

EASYTRANSFER-I2C
Tested TX on the T4 using Wire and Wire1. Worked like a charm.
Couldn't test RX on the T4, as I saw that Wire.begin(slave address) is on the to-do list.
 
This morning I started playing with reworking the SPI.transfer(buff, rxbuf, cnt).

Have a first pass:
Code:
void SPIClass::transfer(const void * buf, void * retbuf, size_t count)
{

	if (count == 0) return;
	const uint8_t *p_write = (const uint8_t*)buf;  // keep the caller's const
	uint8_t *p_read = (uint8_t*)retbuf;
	size_t count_read = count;

	// Pass 1 keep it simple and don't try packing 8 bits into 16 yet..
	// Lets clear the reader queue
	//port->CR = LPSPI_CR_RRF;

	while (count > 0) {
		// Push out the next byte; 
		port->TDR = p_write? *p_write++ : _transferWriteFill;
		count--; // how many bytes left to output.
		// Make sure queue is not full before pushing next byte out
		do {
			if ((port->RSR & LPSPI_RSR_RXEMPTY) == 0)  {
				uint8_t b = port->RDR;  // Read any pending RX bytes in
				if (p_read) *p_read++ = b; 
				count_read--;
			}
		} while ((port->SR & LPSPI_SR_TDF) == 0) ;

	}

	// now lets wait for all of the read bytes to be returned...
	while (count_read) {
		if ((port->RSR & LPSPI_RSR_RXEMPTY) == 0)  {
			uint8_t b = port->RDR;  // Read any pending RX bytes in
			if (p_read) *p_read++ = b; 
			count_read--;
		}
	}
}
Note: in this code, instead of looking at how many entries are in the FIFO, I had the init code set the write watermark to queue size - 1, so I just check the flag that gets set/cleared when we hit the watermark. Likewise, on the read side I am using the FIFO empty status...
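For anyone who wants to step through the loop without hardware, here is a rough host-side sketch of the same loop shape run against a loopback model. `MockSpi` is entirely hypothetical: it only mimics "TX at or below the watermark" and "RX empty", and pretends one frame completes every third status poll. It says nothing about real LPSPI timing.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>

// Hypothetical loopback model of a 16-entry FIFO pair; not the real peripheral.
struct MockSpi {
    static constexpr size_t kDepth = 16;
    static constexpr size_t kWatermark = kDepth - 1;  // "queue size - 1"
    std::deque<uint8_t> tx, rx;
    unsigned polls = 0;

    // Pretend the shifter finishes one frame every third poll.
    void poll() {
        if (++polls % 3 == 0 && !tx.empty()) {
            rx.push_back(tx.front());  // loopback: what goes out comes back
            tx.pop_front();
        }
    }
    bool txHasRoom() { poll(); return tx.size() <= kWatermark; }
    bool rxEmpty() const { return rx.empty(); }
    uint8_t read() { uint8_t b = rx.front(); rx.pop_front(); return b; }
    void write(uint8_t b) { tx.push_back(b); }
};

// Same structure as the first-pass transfer() above, against the mock.
void transferSketch(MockSpi& spi, const uint8_t* out, uint8_t* in,
                    size_t count, uint8_t fill = 0) {
    size_t countRead = count;
    while (count > 0) {
        spi.write(out ? *out++ : fill);
        count--;
        do {  // drain RX while waiting for room in TX
            if (!spi.rxEmpty()) {
                uint8_t b = spi.read();
                if (in) *in++ = b;
                countRead--;
            }
        } while (!spi.txHasRoom());
    }
    while (countRead > 0) {  // collect the frames still in flight
        spi.poll();
        if (!spi.rxEmpty()) {
            uint8_t b = spi.read();
            if (in) *in++ = b;
            countRead--;
        }
    }
}
```

Because writes only happen after the watermark check passes, the mock TX queue never holds more than 16 entries, which is the point of setting the watermark to queue size - 1.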

Note: this code has shown up more things needing fixing/setting...
screenshot.jpg

That is, we are not initializing any of the delay values, like SCK-to-PCS and PCS-to-SCK... Those may not be needed, as so far we don't have SPI controlling the CS pin. But we may need the DBT (delay between transfers). You would not have noticed it before, when the FIFO was not used. You can see it in the first few bytes here, which were sent as individual transfers, but in the later ones in this image there is no gap (1 clock) between the clocks at the end of one byte and the start of the next...
 
ResponsiveAnalogRead Library

Would say it's working. Attached it to my Sharp distance sensor and changed some of the parameters.

But there is an error in the library. getRawValue should return the result of the analogRead but doesn't. It's a simple fix, but the library has to be updated. The error does not affect the output or the validity of the output.

EDIT: Just filed an issue (https://github.com/dxinteractive/ResponsiveAnalogRead/issues/19) with the proposed fix on the GitHub repository https://github.com/dxinteractive/ResponsiveAnalogRead.
 
32 bit audio library?

Moving forward with the SAI interface, I wanted to put data on the record queue.
The only thing needed was to attach the software_isr, as in
Code:
bool mAudioStream::update_setup(int prio)
{
  if (update_scheduled) return false;
  attachInterruptVector(IRQ_SOFTWARE, software_isr);
  NVIC_SET_PRIORITY(IRQ_SOFTWARE, prio*16); // 255 = lowest priority
  NVIC_ENABLE_IRQ(IRQ_SOFTWARE);
  update_scheduled = true;
  return true;
}

Note: my application wants 32-bit audio data blocks, so I forked my own AudioStream object (called mAudioStream, etc.).
But it should also work with the stock 16-bit AudioStream.

This leads me to the question:
Should there be an Audio32 library, or should 32-bit continue to be custom?
The 16-bit DSP advantage is not really an issue for a T4.
 
@Walter: IRQ_SOFTWARE is already defined (70).
I think the DSP advantage is the same as on 3.0..3.6; there is no difference (in fact the Cortex-M7 should be even a bit faster with that). You can still process two samples with one instruction.
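To show what "two samples with one instruction" means, here is a portable emulation of a dual 16x16 multiply-add on packed samples, which is what the ARM SMUAD instruction computes. This is plain C++ for illustration, not the intrinsic itself; `pack16` and `dualMac` are names made up for this sketch.

```cpp
#include <cstdint>

// Pack two signed 16-bit samples into one 32-bit word (first sample in the low half).
constexpr uint32_t pack16(int16_t lo, int16_t hi) {
    return uint32_t(uint16_t(lo)) | (uint32_t(uint16_t(hi)) << 16);
}

// Emulate a dual 16x16 multiply-add: lo(a)*lo(b) + hi(a)*hi(b).
constexpr int32_t dualMac(uint32_t a, uint32_t b) {
    return int32_t(int16_t(a & 0xFFFF)) * int16_t(b & 0xFFFF)
         + int32_t(int16_t(a >> 16))    * int16_t(b >> 16);
}
```

With 16-bit blocks, one 32-bit load brings in two samples and one such operation handles both; with 32-bit samples each multiply handles one, which is the trade-off being discussed.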
 