Teensyduino 1.56 Beta #1

Status
Not open for further replies.
My new test:
Code:
//Use a Terminal that can log to a file.
//Check if the line numbers fit.
//Windows: There is a memory bug in the USB CDC driver, so you'll probably see strange results.

const int n = 100000;
volatile uint32_t t;

void setup() {
  Serial.begin(9600);
  while (!Serial);
  delay(1000);
}

void loop() {
  t = micros();
  for (int i = 1; i < n + 1; i++) {
    while ( Serial.availableForWrite() < 10 ) {}
    Serial.printf("%09d\n", i);
  }
  t = micros() - t;
  Serial.printf("%d Bytes in %d us = %0.2f Megabits per second", n * 10, t, (double)n * 10 * 8 / t);
  delay(3000); //If your computer is not that fast, increase this delay! Windows seems to need it, as the Teensy is too fast for the Windows CDC driver
}

H-Term shows stable 26MBits, without misssing lines or garbage.

So, this makes your speedtests work.
 
Last edited:
Shutting down the ~2 day running test to restart machine.

FWI : Here are two stats notes from USB1 send debug stream. The first Just stopped Autoscroll and took last print I could capture - pasted into TyComm USB1 and sent.

Then did Clear output and did the same which seems to have advanced the shown count value below at 2,783,168,014 versus 2,835,543,625
Code:
174721.51==secs last lps calc USB1::	count=[B][U]2783168014[/U][/B], lines/sec=420363
count=2830016997, lines/sec=403628	Send sec =174722.08
 >> Call by USB==1 or USB1==2 :: 2
lps hits to =25000 = 155502
lps hits to =50000 = 316
lps hits to =75000 = 301
lps hits to =100000 = 549
lps hits to =125000 = 384
lps hits to =150000 = 386
lps hits to =175000 = 355
lps hits to =200000 = 345
lps hits to =225000 = 313
lps hits to =250000 = 239
lps hits to =275000 = 250
lps hits to =300000 = 252
lps hits to =325000 = 282
lps hits to =350000 = 226
lps hits to =375000 = 468
lps hits to =400000 = 1478
lps hits to =425000 = 3279
lps hits to =450000 = 9759
lps hits to =475000 = 37
174752.51==secs last lps calc USB1::	CLEARED :: count=[B][U]2835543625[/U][/B], lines/sec=364755
count=2842474527, lines/sec=430898	Send sec =174752.73
 >> Call by USB==1 or USB1==2 :: 2
lps hits to =25000 = 155502
lps hits to =50000 = 316
lps hits to =75000 = 301
lps hits to =100000 = 549
lps hits to =125000 = 384
lps hits to =150000 = 386
lps hits to =175000 = 355
lps hits to =200000 = 345
lps hits to =225000 = 313
lps hits to =250000 = 239
lps hits to =275000 = 250
lps hits to =300000 = 252
lps hits to =325000 = 282
lps hits to =350000 = 226
lps hits to =375000 = 473
lps hits to =400000 = 1488
lps hits to =425000 = 3279
lps hits to =450000 = 9775
lps hits to =475000 = 37

Seems hitting Clear output helps to remove the lagging buffer ...
 
My understanding is that there should be a ""dmb:::memory" (why is dsb used ?) after each tx_noautoflush = 1; in the code. this is not the case.
Likewise, certainly before each tx_noautoflush = 0;

Let me ask again, is this OK, as it is? For me, it looks like there may races be possible. I'm not that sure, though... an timer interrupt can happen before tx_noautoflush got written? Not sure about the inner workings of the pipelines... and yes, that's a question, no offense
Tim, what do you think? (usb_serial.c)
 
Let me ask again, is this OK, as it is? For me, it looks like there may races be possible. I'm not that sure, though... an timer interrupt can happen before tx_noautoflush got written? Not sure about the inner workings of the pipelines... and yes, that's a question, no offense
Tim, what do you think? (usb_serial.c)

found that code and Glanced when posted - but didn't look enough to see what the effect on what might be? Time to log off now ...
 
The intention is to block autoflush, as bad things can happen to the serial tx if it runs at the wrong time. Missing sent chars, etc. Could be tested by disabling autoflush completely and see if it changes something.
I'm not that interested to dig into this, maybe someone else wants. Just wanted to show a (in my eyes) possible issue. Quite possible that everything is OK, too.
 
Sounds like a good theory. :)

I tried moving the memory barriers after each write to tx_noautoflush = 0. I'm running the original usb_serial_print_speed.ino without the availableForWrite() check. Haven't seen any garbage yet, but it's only been a few minutes. Will leave it running several hours...
 
Sounds like a good theory. :)

I tried moving the memory barriers after each write to tx_noautoflush = 0. I'm running the original usb_serial_print_speed.ino without the availableForWrite() check. Haven't seen any garbage yet, but it's only been a few minutes. Will leave it running several hours...

If that does anything good, it might be useful in yield too:
Code:
	if (running) return; // TODO: does this need to be atomic?
	running = 1;
after running = 1.


The garbage comes from windows. But it might help against missing lines and similar issues.
The garbage was seen by STM users, too.
 
This double negative is irritating.. :)
Shouldn't it be after tx_noautoflush=1 ?

Ok, tried it too. Indeed, no missing lines with the dsb:::memory and without the availbleForWrite - I added the barrier everywhere after tx_noautoflush=1.
Tested with the 10MB sketch, h-term and looking at the line numbers.

Code: https://github.com/FrankBoesing/cdcbench/blob/main/usb_serial.c

I still saw garbage - but that's the Windows problem..
Added some more barriers after ...flush=0, partly without dsb, only to make sure gcc does not move them around..
 
Last edited by a moderator:
I played with the memory barriers here. At first it seemed to solve the garbage problem. But then I undid them one by one, and even with everything the way is was earlier this week, I couldn't reproduce the garbage spew like I saw only a few days ago.

Then I tried just commenting them all out. Again, similar results on Windows, at least visually looking at the serial monitor.

I plugged it back into my Linux desktop, and wow, what an amazing speedup! With the transmit timeout code disabled, so we never drop any data, and with memory barriers as on github today, I was getting around 520K lines/sec. Without the memory barriers, I get about 665K lines/sec and no signs of any problems.

If I overclock to 912 MHz, without the memory barriers, I see over 1 million lines/sec received on Linux!
 
I played with the memory barriers here. At first it seemed to solve the garbage problem. But then I undid them one by one, and even with everything the way is was earlier this week, I couldn't reproduce the garbage spew like I saw only a few days ago.

Then I tried just commenting them all out. Again, similar results on Windows, at least visually looking at the serial monitor.

I plugged it back into my Linux desktop, and wow, what an amazing speedup! With the transmit timeout code disabled, so we never drop any data, and with memory barriers as on github today, I was getting around 520K lines/sec. Without the memory barriers, I get about 665K lines/sec and no signs of any problems.

If I overclock to 912 MHz, without the memory barriers, I see over 1 million lines/sec received on Linux!

No skipped lines?
 
More experimenting.... with MacOS receiving, and Teensy 4.1 running at 912 MHz without memory barriers on writes to tx_noautoflush, the transmit buffer size make a huge difference. I see about 704K lines/sec sustained with 4 2K buffers. Using 6 8K buffers lets MacOS on a low-powered Macbook Air receive 933K lines/sec... almost as fast as Linux on running on a fast desktop.

But I am seeing some signs of trouble on MacOS at these speeds. Difficult to know what's causing this.

sc.png
 
That p#111 MacOS image is what shows often on TyComm ... but in recent testing T_sermon never showed lines like that.

It was scrolling fast, sluggish, or dumping garbage. Garbage seemed reduced with the .availableForWrite() usage.
 
Paul, Yes... it skips lines. With 100 lines skipped like in your screenshot, it's easy to reach those high numbers.
As I said, it overwrites buffers that are currently transmitting.
You can reach 150MHZ UART if you ignore that the FIFO wasn't sent and instead keep writing to the registers as fast as possible.

Can't imagine that you want this.
Test with a tool that logs to a file. The Arduino MOnitor is not the right tool - unless you add the log capability.

Again, we have TWO effects:

1) Windows Bug - garbage
2) Skipped Data - two possible reasons: a) Not everything is sent b) PC throws data away.
The effect that it gets faster wihtout the barriers is a strong indicator that a) is the issue.

Edit: and when this is solved, you need a add flow control to make sure the the Teensy slows down when the PC-SOftware can't keep up.
Otherwise you can measure the speed of the NULL Device - has the same significance.


Tim: Yes, because it gives Windows a little more time for memory handling. But I was wrong when I said that a delay is bad. It seems to be even better than .availableForWrite()
But that's a Windows problem...
Someone should report this (again).
 
Last edited:
Took out .Avail4write() against last posted T_sermon, BUMPED up IDE memory INI to:
Code:
-Xms512M
-Xmx1024M

Looking fast typically fast with well formed lines and no garbage observed:
tsermon_noAv4Wr.png

The USB instant count shows:
Code:
390.64==secs last lps calc USB1::	no garbage yet ... typically fast scroll : This and Screenshot again
count=234302760, lines/sec=677961	Send sec =390.99
 >> Call by USB==1 or USB1==2 :: 2

Image taken a sec after the USB dump shows counts : 233287196 .vs. 234302760

Output of lps stats after about 12 minutes: Not seen garbage and nothing under 50K lps - and speed it much faster:
Code:
21.65==secs last lps calc USB1::	start no avail4write
count=11963808, lines/sec=107881	Send sec =22.62
 >> Call by USB==1 or USB1==2 :: 2
lps hits to =75000 = 2
lps hits to =100000 = 2
lps hits to =125000 = 5
lps hits to =150000 = 2
lps hits to =175000 = 3
lps hits to =200000 = 1
65.64==secs last lps calc USB1::	start no avail4write
count=23718245, lines/sec=466336	Send sec =66.58
 >> Call by USB==1 or USB1==2 :: 2
lps hits to =75000 = 2
lps hits to =100000 = 3
lps hits to =125000 = 9
lps hits to =150000 = 2
lps hits to =175000 = 3
lps hits to =200000 = 5
lps hits to =225000 = 2
lps hits to =250000 = 4
lps hits to =275000 = 10
lps hits to =300000 = 6
lps hits to =325000 = 6
lps hits to =350000 = 1
lps hits to =375000 = 2
lps hits to =400000 = 1
lps hits to =425000 = 1
lps hits to =450000 = 1
lps hits to =475000 = 1
114.64==secs last lps calc USB1::	SLOW ????
count=53183312, lines/sec=808229	Send sec =115.40
 >> Call by USB==1 or USB1==2 :: 2
lps hits to =75000 = 2
lps hits to =100000 = 3
lps hits to =125000 = 9
lps hits to =150000 = 2
lps hits to =175000 = 3
lps hits to =200000 = 5
lps hits to =225000 = 3
lps hits to =250000 = 4
lps hits to =275000 = 11
lps hits to =300000 = 6
lps hits to =325000 = 6
lps hits to =350000 = 1
lps hits to =375000 = 2
lps hits to =400000 = 2
lps hits to =425000 = 3
lps hits to =450000 = 4
lps hits to =475000 = 7
lps hits to =500000 = 1
lps hits to =525000 = 2
lps hits to =550000 = 3
lps hits to =575000 = 8
lps hits to =600000 = 2
lps hits to =625000 = 1
lps hits to =675000 = 1
lps hits to =700000 = 3
lps hits to =825000 = 14
214.65==secs last lps calc USB1::	no garbage yet ... typically fast scroll
count=120413044, lines/sec=544302	Send sec =215.11
 >> Call by USB==1 or USB1==2 :: 2
lps hits to =75000 = 2
lps hits to =100000 = 3
lps hits to =125000 = 9
lps hits to =150000 = 2
lps hits to =175000 = 3
lps hits to =200000 = 5
lps hits to =225000 = 3
lps hits to =250000 = 4
lps hits to =275000 = 11
lps hits to =300000 = 6
lps hits to =325000 = 6
lps hits to =350000 = 1
lps hits to =375000 = 2
lps hits to =400000 = 2
lps hits to =425000 = 3
lps hits to =450000 = 4
lps hits to =475000 = 7
lps hits to =500000 = 3
lps hits to =525000 = 3
lps hits to =550000 = 6
lps hits to =575000 = 17
lps hits to =600000 = 21
lps hits to =625000 = 6
lps hits to =650000 = 10
lps hits to =675000 = 2
lps hits to =700000 = 10
lps hits to =725000 = 5
lps hits to =750000 = 7
lps hits to =775000 = 7
lps hits to =800000 = 9
lps hits to =825000 = 29
372.64==secs last lps calc USB1::	no garbage yet ... typically fast scroll : This and Screenshot
count=222654964, lines/sec=745612	Send sec =372.66
 >> Call by USB==1 or USB1==2 :: 2
lps hits to =75000 = 2
lps hits to =100000 = 3
lps hits to =125000 = 9
lps hits to =150000 = 2
lps hits to =175000 = 3
lps hits to =200000 = 5
lps hits to =225000 = 3
lps hits to =250000 = 4
lps hits to =275000 = 11
lps hits to =300000 = 6
lps hits to =325000 = 6
lps hits to =350000 = 1
lps hits to =375000 = 2
lps hits to =400000 = 2
lps hits to =425000 = 3
lps hits to =450000 = 4
lps hits to =475000 = 7
lps hits to =500000 = 5
lps hits to =525000 = 5
lps hits to =550000 = 8
lps hits to =575000 = 47
lps hits to =600000 = 49
lps hits to =625000 = 19
lps hits to =650000 = 19
lps hits to =675000 = 15
lps hits to =700000 = 20
lps hits to =725000 = 15
lps hits to =750000 = 15
lps hits to =775000 = 18
lps hits to =800000 = 29
lps hits to =825000 = 29
390.64==secs last lps calc USB1::	no garbage yet ... typically fast scroll : This and Screenshot again
count=234302760, lines/sec=677961	Send sec =390.99
 >> Call by USB==1 or USB1==2 :: 2
lps hits to =75000 = 2
lps hits to =100000 = 3
lps hits to =125000 = 9
lps hits to =150000 = 2
lps hits to =175000 = 3
lps hits to =200000 = 5
lps hits to =225000 = 3
lps hits to =250000 = 4
lps hits to =275000 = 11
lps hits to =300000 = 6
lps hits to =325000 = 6
lps hits to =350000 = 1
lps hits to =375000 = 2
lps hits to =400000 = 2
lps hits to =425000 = 3
lps hits to =450000 = 4
lps hits to =475000 = 7
lps hits to =500000 = 6
lps hits to =525000 = 6
lps hits to =550000 = 10
lps hits to =575000 = 50
lps hits to =600000 = 52
lps hits to =625000 = 20
lps hits to =650000 = 19
lps hits to =675000 = 15
lps hits to =700000 = 21
lps hits to =725000 = 15
lps hits to =750000 = 16
lps hits to =775000 = 19
lps hits to =800000 = 33
lps hits to =825000 = 29
711.64==secs last lps calc USB1::	Posting summary - not seen garbage yet
count=437601676, lines/sec=764160	Send sec =712.44
 >> Call by USB==1 or USB1==2 :: 2
lps hits to =75000 = 2
lps hits to =100000 = 3
lps hits to =125000 = 9
lps hits to =150000 = 2
lps hits to =175000 = 3
lps hits to =200000 = 5
lps hits to =225000 = 3
lps hits to =250000 = 4
lps hits to =275000 = 11
lps hits to =300000 = 6
lps hits to =325000 = 6
lps hits to =350000 = 1
lps hits to =375000 = 2
lps hits to =400000 = 2
lps hits to =425000 = 3
lps hits to =450000 = 7
lps hits to =475000 = 9
lps hits to =500000 = 10
lps hits to =525000 = 13
lps hits to =550000 = 28
lps hits to =575000 = 117
lps hits to =600000 = 103
lps hits to =625000 = 47
lps hits to =650000 = 44
lps hits to =675000 = 33
lps hits to =700000 = 37
lps hits to =725000 = 36
lps hits to =750000 = 29
lps hits to =775000 = 35
lps hits to =800000 = 66
lps hits to =825000 = 29
 
Just got a Long Pause in Tsermon - then BLANK before I could screenshot - and stayed that way -but USB1 showed still running.
AutoScroll Off didn't help, but 'Clear output' did seem to wake and return scrolling.
Still not seen garbage? All displayed lines are 'well formed' full expected string to a new line.
Still running ...
Code:
1103.65==secs last lps calc USB1::	PAUSED ??? Now blank
count=706002448, lines/sec=776156	Send sec =1104.34
 >> Call by USB==1 or USB1==2 :: 2
lps hits to =75000 = 2
lps hits to =100000 = 3
lps hits to =125000 = 9
lps hits to =150000 = 2
lps hits to =175000 = 3
lps hits to =200000 = 5
lps hits to =225000 = 3
lps hits to =250000 = 4
lps hits to =275000 = 11
lps hits to =300000 = 6
lps hits to =325000 = 6
lps hits to =350000 = 1
lps hits to =375000 = 4
lps hits to =400000 = 2
lps hits to =425000 = 4
lps hits to =450000 = 10
lps hits to =475000 = 19
lps hits to =500000 = 24
lps hits to =525000 = 24
lps hits to =550000 = 41
lps hits to =575000 = 149
lps hits to =600000 = 127
lps hits to =625000 = 60
lps hits to =650000 = 63
lps hits to =675000 = 54
lps hits to =700000 = 51
lps hits to =725000 = 48
lps hits to =750000 = 38
lps hits to =775000 = 44
lps hits to =800000 = 251
lps hits to =825000 = 29

Paul, Yes... it skips lines. With 100 lines skipped like in your screenshot, it's easy to reach those high numbers.
As I said, it overwrites buffers that are currently transmitting.
You can reach 150MHZ UART if you ignore that the FIFO wasn't sent ... or incredible speeds on PC if you copy a file to NUL

Can't imagine that you want this.
Test with a tool that logs to a file.

Again, we have TWO effects:

1) Windows Bug - garbage
2) Skipped Data - two possible reasons: a) Not everything is sent b) PC throws data away.
The effect that it gets faster wihtout the barriers is a strong indicator that a) is the issue.

Tim: Yes, because it gives Windows a little more time for memory handling. But I was wrong when I said that a delay is bad. It seems to be even better than .availableForWrite()
But that's a Windows problem...

It must be losing/skipping data as the screen scroll rate is no faster than ever before - but is doing LPS faster than ever with the 'Stall' message lost fix, and added JavaVM RAM - now settled at 700MB where it was capped at 512 MB before with -Xmx512M versus -Xmx1024M.
 
Started second parallel T_4.1 with TYComm - it is using ~23% CPU verus ~11% for T_sermon.

It started MUCH slower - but is running faster now - No garbage seen and and so far only well formed lines.

Windows 11 was started fresh an hour ago ... TyComm better than it was when last used.

Tsermon still running - no garbage and no more long pauses 'seen'.

This is TyComm stats
Code:
58==secs last lps calc USB1::	
count=12428934, lines/sec=174307	Send sec =21.21
 >> Call by USB==1 or USB1==2 :: 2
lps hits to =75000 = 8
lps hits to =100000 = 2
lps hits to =125000 = 2
lps hits to =175000 = 6
lps hits to =200000 = 1
lps hits to =275000 = 1
70.54==secs last lps calc USB1::	
count=17429803, lines/sec=150102	Send sec =70.76
 >> Call by USB==1 or USB1==2 :: 2
lps hits to =75000 = 24
lps hits to =100000 = 18
lps hits to =125000 = 6
lps hits to =150000 = 6
lps hits to =175000 = 12
lps hits to =200000 = 3
lps hits to =275000 = 1
146.55==secs last lps calc USB1::	
count=25381097, lines/sec=151485	Send sec =146.78
 >> Call by USB==1 or USB1==2 :: 2
lps hits to =75000 = 45
lps hits to =100000 = 45
lps hits to =125000 = 9
lps hits to =150000 = 14
lps hits to =175000 = 23
lps hits to =200000 = 9
lps hits to =275000 = 1
362.53==secs last lps calc USB1::	
count=130838686, lines/sec=639747	Send sec =362.97
 >> Call by USB==1 or USB1==2 :: 2
lps hits to =75000 = 52
lps hits to =100000 = 51
lps hits to =125000 = 10
lps hits to =150000 = 15
lps hits to =175000 = 23
lps hits to =200000 = 9
lps hits to =275000 = 5
lps hits to =300000 = 5
lps hits to =325000 = 2
lps hits to =350000 = 3
lps hits to =375000 = 4
lps hits to =400000 = 7
lps hits to =425000 = 19
lps hits to =450000 = 22
lps hits to =475000 = 34
lps hits to =500000 = 9
lps hits to =525000 = 6
lps hits to =550000 = 14
lps hits to =575000 = 8
lps hits to =600000 = 8
lps hits to =625000 = 10
lps hits to =650000 = 19
lps hits to =675000 = 7
lps hits to =700000 = 3
lps hits to =725000 = 1
lps hits to =750000 = 1
lps hits to =775000 = 1
lps hits to =800000 = 10
lps hits to =825000 = 4
524.53==secs last lps calc USB1::	
count=239786674, lines/sec=479895	Send sec =524.55
 >> Call by USB==1 or USB1==2 :: 2
lps hits to =75000 = 52
lps hits to =100000 = 51
lps hits to =125000 = 10
lps hits to =150000 = 15
lps hits to =175000 = 23
lps hits to =200000 = 9
lps hits to =275000 = 5
lps hits to =300000 = 5
lps hits to =325000 = 2
lps hits to =350000 = 3
lps hits to =375000 = 4
lps hits to =400000 = 7
lps hits to =425000 = 19
lps hits to =450000 = 23
lps hits to =475000 = 39
lps hits to =500000 = 10
lps hits to =525000 = 9
lps hits to =550000 = 24
lps hits to =575000 = 13
lps hits to =600000 = 38
lps hits to =625000 = 20
lps hits to =650000 = 25
lps hits to =675000 = 15
lps hits to =700000 = 12
lps hits to =725000 = 9
lps hits to =750000 = 5
lps hits to =775000 = 7
lps hits to =800000 = 66
lps hits to =825000 = 4
 
Indeed that is the only way to verify - though not easy with millions of lines.
TyComm logs - but it seems to not be keeping up cleanly.

No, quite easy.
Begin with #1 - check after 100million lines if the number matches the line number in your favourite text editor.
If everything is OK, it must match.
 
No, quite easy.
Begin with #1 - check after 100million lines if the number matches the line number in your favourite text editor.
If everything is OK, it must match.

Except TyComm keeps a rolling log across one or more files - so it isn't 'linear' with purged data and jumps.

Tsermon just did a LONG PAUSE and self recovered:
Code:
2770.65==secs last lps calc USB1::	
count=1812817124, lines/sec=590327	Send sec =2771.48
 >> Call by USB==1 or USB1==2 :: 2
lps hits to =75000 = 2
lps hits to =100000 = 3
lps hits to =125000 = 9
lps hits to =150000 = 2
lps hits to =175000 = 3
lps hits to =200000 = 5
lps hits to =225000 = 3
lps hits to =250000 = 4
lps hits to =275000 = 11
lps hits to =300000 = 7
lps hits to =325000 = 8
lps hits to =350000 = 8
lps hits to =375000 = 26
lps hits to =400000 = 56
lps hits to =425000 = 59
lps hits to =450000 = 85
lps hits to =475000 = 104
lps hits to =500000 = 60
lps hits to =525000 = 56
lps hits to =550000 = 74
lps hits to =575000 = 193
lps hits to =600000 = 164
lps hits to =625000 = 103
lps hits to =650000 = 113
lps hits to =675000 = 88
lps hits to =700000 = 84
lps hits to =725000 = 73
lps hits to =750000 = 73
lps hits to =775000 = 884
lps hits to =800000 = 375
lps hits to =825000 = 29

Both Tsermon above and TyComm below showing good data. Though TyComm is flashing horizontal scroll bar at times - suggesting it passed a line longer than narrow display?
Code:
lps hits to =825000 = 4
1071.53==secs last lps calc USB1::	
count=610376290, lines/sec=645372	Send sec =1071.65
 >> Call by USB==1 or USB1==2 :: 2
lps hits to =75000 = 52
lps hits to =100000 = 51
lps hits to =125000 = 10
lps hits to =150000 = 15
lps hits to =175000 = 23
lps hits to =200000 = 9
lps hits to =275000 = 5
lps hits to =300000 = 5
lps hits to =325000 = 2
lps hits to =350000 = 3
lps hits to =375000 = 4
lps hits to =400000 = 7
lps hits to =425000 = 21
lps hits to =450000 = 24
lps hits to =475000 = 44
lps hits to =500000 = 15
lps hits to =525000 = 23
lps hits to =550000 = 46
lps hits to =575000 = 26
lps hits to =600000 = 166
lps hits to =625000 = 37
lps hits to =650000 = 50
lps hits to =675000 = 41
lps hits to =700000 = 43
lps hits to =725000 = 30
lps hits to =750000 = 29
lps hits to =775000 = 26
lps hits to =800000 = 260
lps hits to =825000 = 4

tsermon_noAv4Wr_2ndwTycomm.jpg
 
Then use a tool which is better than TY and does not roll the log.

This thread reaches exactly a point now...
Guys, just make your thing.
 
from a dozn posts ago:
Paul, Yes... it skips lines. With 100 lines skipped like in your screenshot, it's easy to reach those high numbers.
As I said, it overwrites buffers that are currently transmitting.
You can reach 150MHZ UART if you ignore that the FIFO wasn't sent and instead keep writing to the registers as fast as possible.

Can't imagine that you want this.
Test with a tool that logs to a file. The Arduino MOnitor is not the right tool - unless you add the log capability.

Again, we have TWO effects:

1) Windows Bug - garbage
2) Skipped Data - two possible reasons: a) Not everything is sent b) PC throws data away.
The effect that it gets faster wihtout the barriers is a strong indicator that a) is the issue.*

Edit:and when this is solved, you need a add flow control to make sure the the Teensy slows down when the PC-SOftware can't keep up.
Otherwise you can measure the speed of the NULL Device - has the same significance.
* But this doe not mean its the only issuee - there may be more, or a completely different thing.

I'll practice playing guitar now.
 
Last edited:
Indeed relatively small chunks of data are missing on MacOS. At this point I do not know why. But it appears to be a relatively small portion, like a few dozen to about 100 lines, which occurs many tens or hundreds of thousands of lines apart. Yet another bug for my list....

I'm also starting to believe we will probably need an API to make the transmit timeout configurable, or maybe find a smarter way than just a simple timeout. We've had the transmit timeout since the earliest days of Teensy and it's always worked pretty well with 12 Mbit USB speed. The timeout is important because Arduino is a giant ecosystem of code built on the assumption that Serial has no flow control. If a receiver isn't listening, the bits are supposed to transmit anyway and just not be heard. In the early days of Teensy 2.0, many people complained because any sort of problem on the PC side where the software wasn't listening or consuming the data would cause Serial.print() to stall forever. For most of Teensy's history, a timeout in the 70 to 100 ms range has worked very well, even when data was consumed by very slow software like Labview and Puredata programs (not that those systems are necessarily slow - but the practical way people use them tends to incur massive overhead).

Currently we have a 120 ms timeout. If the transmit buffers are full and the PC hasn't consumed any data for 120 ms, we assume the software has crashed or otherwise isn't going to receive our data. We go into a mode where everything given to Serial.print() is immediately discarded if all the transmit buffers are still full. The idea is to mimic the behavior of real hardware serial without flow control, where the data is just lost if transmitted when nobody is listening, but we do enjoy the benefit of flow control as long as the receiver doesn't stop accepting input longer than the timeout. Stalling Serial.print() indefinitely is a behavior (almost) nobody really wants. And if anyone does want that behavior, they can get it easily by polling Serial.availableForWrite() forever.

I really do not know what we should do now that we have 480 Mbit speed. It's been huge challenge and still continues to be difficult. We released Teensy 4.0 in late 2019, and the first couple months of software work after release went into making the Arduino IDE not crash when receiving at the less optimized speed we had at release. Some work went into optimizing the USB speed. Then came Teensy 4.1 which released in May 2020, and over a year of mostly stalled software development. Now we're at this point where we have these sorts of difficult unanswered design questions, and not a lot of precedent from the rest of the Arduino world to follow (eg, the STM32 boards with 480 Mbit USB have inefficient USB serial that can't even manage 12 Mbit speed).

I'm looking for constructive input and feedback on this difficult design decision. I know many of us have strong feelings on these matters, and we're all under some stress from enduring a global pandemic, but I really want and hope to talk about this constructively. Please try to keep this conversation civil and focused. These are difficult problems involving tricky trade-offs, and dev time is going into other work (like MTP) so please understand this is a long journey and not to expect quick fixes. I'm not hopeful anything can be done to make Windows much faster, but I'm sure Teensy can have code which handles its limitations much more gracefully.
 
My two cents is this is not a high-priority issue, and I hate to think of you spending your time on peculiarities of Windows/Mac/Linux rather than Teensy itself. Reliable comm at 12 Mbps, or whatever is the lowest common denominator for the platforms you need to support, is fine by me. Slow and reliable is better, and for my purposes 12 Mbps is not slow! Even for high-frequency events, debug output could be designed to fit within 12 Mbps.
 
Status
Not open for further replies.
Back
Top