Serial.print interferes with Serial2.available on Teensy 4.0

petesoper

Member
Attached find two sketches that when run together with the "sender" Teensy writing to the "receiver" Teensy using Serial2 on both sides, shows Serial2.available never returning a nonzero result if Serial.print is used soon after the call. More details in the header comments of the two files. In a nutshell we can get info out of the program with Serial.print or we can have Serial2 work reliably, but we can't seem to do both in the more complex, target application the test cases were derived from.
 

Attachments

  • availablebug-receiver.ino
    2.2 KB · Views: 61
  • availablebug-sender.ino
    1.1 KB · Views: 84
The 2 second wait is in the sender, spacing out bunches of 20 writes, so not on the same teensy as the one with the issue!

The receiver code is saturating the Serial interface at 500kbaud though, so its likely to have its TX buffer permanently
full, which might be a clue.
 
The 2 second wait is in the sender, spacing out bunches of 20 writes, so not on the same teensy as the one with the issue!

The receiver code is saturating the Serial interface at 500kbaud though, so its likely to have its TX buffer permanently
full, which might be a clue.

If Serial.begin(19200) is substituted on either or both sides the receiver program fails just the same.
 
As far as I can see, there is no failure here. It is doing precisely what you're telling it to do, which is to spend more than 99.9% of the time repeatedly printing W on a line of its own. This happens so fast that you won't be able to see the other output which only occurs once every two seconds.

Pete
P.S. Changing Serial to 19200 won't affect it because the transfer occurs at USB speeds anyway, independently of what is set in Serial.begin().
 
The 2 second wait is in the sender, spacing out bunches of 20 writes, so not on the same teensy as the one with the issue!

The receiver code is saturating the Serial interface at 500kbaud though, so its likely to have its TX buffer permanently
full, which might be a clue.

Yes, though, no matter what the Serial baud rate is, will be constantly overrunning the transmit buffer. But the nature of the bug is that something about Serial operations hosing the behavior of Serial2.available. The way this manifested in the more complex program was that the Serial2 would receive for "a while" and then simply stop. But it was very difficult to get a clue what the cause was until it was made ridiculously explicit with this test case.
 
As far as I can see, there is no failure here. It is doing precisely what you're telling it to do, which is to spend more than 99.9% of the time repeatedly printing W on a line of its own. This happens so fast that you won't be able to see the other output which only occurs once every two seconds.

Pete
P.S. Changing Serial to 19200 won't affect it because the transfer occurs at USB speeds anyway, independently of what is set in Serial.begin().

Yes, this test case is flawed. The original code has while(!Serial2.available()) {} and control never got past the while loop, somehow because of operations with Serial. I'll try again.
 
Your receive test program has a very fast loop with no delays. Could it be that your loop needs an occasional call to yield() or to a delay that will internally call yield? I don't have the hardware with me on my trip to test this, but your might try putting a delay(1) inside the while(!Serial2.available()) loop.

It's also possible that doing Serial.println("W") as fast as possible is saturating whatever program you are using to receive the test data. If that's the case, a delay() in the loop might solve the problem.
 
The test program I submitted is perfectly bogus. It does not demonstrate the failure I observed, which was failure to escape from an empty loop like this: wihle(!Serial2.available()) {} that was somehow related to Serial.print operations (turning them off caused Serial2.available to work as expected).
I've rewritten my application to avoid all use of Serial2.available, just using Serial2.read and arranging for 0xff to not occur in my message packets. With that change I've had no problems. If/when I'm in a position to try again to make a test case I'll be more careful to avoid wasting people's time.
 
I added some port output pulses to show the timing of your receive program. They seem to show that the problem is not that Serial2.available() is not responding, but that the USB Serial output is saturating the receiving display program. Your program, as written, sends about 7 million characters per second while waiting for Serial2.available(). (3.5 million 'w' characters and 3.5 million LF characters). I don't think many serial terminal programs can handle that, especially if they have to scroll the display up a line 3.5 million times per second!

I think it is important to remember that the high-speed USB serial on the T4.X can transmit more than 20 million characters per second (I get file upload rates of about 18MB/second with a carefully-crafted upload program--and that involves the time to read the SD card as well as the time to transmit the data via Serial.

After about 30 minutes of run time, the LEDON pulses show that Serial2.available() is still responding every two seconds. The Arduino serial monitor shows nothing but a column of 'w' characters. You should revisit your original program to make sure that you were not using a similar high-speed output loop that was saturating the Serial output link.

Code:
/*
  This sketch  demonstrates an apparent failure of Serial2.available().
  If the #define is commented out the program prints what it reads from Serial2
  as hex with a cumulative received byte count. If the #define is compiled
  control never escapes from the first call of Serial2.available().

  To reproduce:

  1.Connect two Teensy 4.0 board grounds and pin eight of the sending Teensy to
  pin seven of this receiving Teensy. (It is only necessary that this sketch
  run on a 4.0: anything capable of spewing serial at 9600 can run the sender
  sketch).
  2 Start this sketch. It will put "W" on a new line over and over again.
  3.Run availablebug-sender.ino on the other board. If the failure reproduces
  the stream of "W" output will not stop. If it fails to reproduce a cumulative
  input byte count and the two hex digits of the received byte value come out
  on one line per byte.

  If the Serial2 baud rate is jacked up it's possible some "W" lines might
  come out as well as the expected hex values even after the sender starts.

  Test environment used: Teensyduino 1.53/Arudino 1.8.13/Ubuntu 18.04/AMD CPU
  on "receiver" side and 1.53/1.8.13/Ubuntu 16.04/Intel CPU on "sender" side.

  If the Serial2 baud rate is jacked up it's possible some "W" lines might
  come out as well as the expected hex values even after the sender starts.

  Test environment used: Teensyduino 1.53/Arudino 1.8.13/Ubuntu 18.04/AMD CPU
  on "receiver" side and 1.53/1.8.13/Ubuntu 16.04/Intel CPU on "sender" side.

  Note: This is a distillation of a more complex program that was sabotaged
  by some interaction between Serial and Serial2. This is just the smallest test
  case to demonstrate at least one interaction that stymied progress by making
  "getting debug information" and "running the target application" mutually
  exclusive. It's possible there are more bugs involved, as in some cases it
  was as if Serial2.available had a "bank account of about 100k to 10meg calls
  after which it simply got stuck and the program stopped making progress.

  pete@apexprotofactory.com 8/27/2020

  Added oscilloscope markers  9/2/2020  mborgerson

*/


const int ledpin = 13;
const int wpin = 14;

#define LEDON digitalWriteFast(ledpin, HIGH);
#define LEDOFF digitalWriteFast(ledpin, LOW);

// Marker to show transmission of 'w'
#define WON digitalWriteFast(wpin, HIGH);
#define WOFF digitalWriteFast(wpin, LOW);

// Comment this out to cause program to work properly

#define FAIL

static uint32_t count = 0;

void setup() {
  pinMode(ledpin, OUTPUT);
  pinMode(wpin, OUTPUT);
  Serial.begin(9600);
  while (! Serial) {
  }
  Serial2.begin(9600);
  while (! Serial2) {
  }
}

static uint8_t getByte() {

  LEDOFF
  while (!Serial2.available()) {

#ifdef FAIL
WON
     Serial.println("W"); 
WOFF
#endif

  }  
  LEDON
  // insert a delay of about 0.1 bit time so the
  // pulse is easier to find on the oscilloscope
  delayMicroseconds(10);
  return Serial2.read();
}

void loop() {
  count += 1;
  Serial.print(count);
  Serial.print(" ");
  Serial.println(getByte(), HEX);

}
 
Dear mborgerson,
Thank you for the effort you put into this explanation. I hope to be able to offer you and the others help some day to show my gratitude.
But I am still not communicating properly with you. The test case I submitted was flawed, bogus, not at all a demonstration of the failure that I observed in my application. I was in a hurry, slapped this together without stopping to think that the Serial.print would "hide" correct behavior, etc, etc. As I said above, I may submit another test case showing available either stop returning nonzero values or fail to ever return a nonzero value iff Serial object methods are being invoked *elsewhere* in some code (with a connection to the failure that is of course the big mystery). By "elsewhere" I mean *not in the !available() path*. I actually had a great test case, but didn't commit it and I'm now swept by a different current. I do want to demonstrate the REAL problem I saw, because it was so severe I had to eliminate use of available to get relief. But now I'm plumbing the depths of the IntervalTimer::begin semantics to reach closure on something. Please, don't explain the test case again. I realized the moment I looked at it a second time in conjunction with Kurt's reply that it was probably screwed up, and by the time Paul's message come through I was already convinced it was worthless. Sorry to waste time and energy!
 
@mborgerson - adds some good general 'rate' info. USB Serial to SerMon on a PC runs well limited to about 16MB/sec - ( 500K lines of 32 chars ) - it can surge to 'reported' 800K lines ... in excess of 20MB/sec. Paul saw a bit more on Linux than a PC could maintain - so the T_4.x's HS USB running near half speed. There might be room for T_4.x improvement but without a reliable way to validate that much data from a mere PC it is hard to measure or test Teensy side improvements.

In doing those speed tests the USB must eat noticeably more processor time - but that wasn't measured given it was fitful and hard on the PC (hang/garbage/restart).

@petesoper - I've seen anomalies in Serial.available() which aren't clear - the HS USB buffers deal with blocks of 512 or a few larger buffers so the management of 'what is where' at high speed seems to get a bit confused. But on Serial2.available() there isn't such a reason - except if maybe the USB is interrupting too much? The T_4.x has extra UART Serial - but FIFO size is only 4 bytes so it needs more frequent interrupts to buffer the data. If/when you had a sketch showing the issue that would be interesting to see.
 
I guess this all goes to show the advantages and disadvantage of the "Always post complete source code" warning at the top of the forum. If you condense the source to something that doesn't demonstrate the original problem, responders can't help with your original problem. OTOH, sometimes the original code has proprietary elements or requires special hardware and you have to try to simplify the original code.

I took a look at the Serial.available() function and it just looks at the head and tail positions in the software queue to determine if data has arrived. However, if something blocks interrupts long enough so that the internal FIFO of the uart overflows, data will be lost. I suppose that this could happen if the USB Serial port gets clogged up enough that it takes more than ~4 milliseconds (4 bytes at 9600 Baud) to service the UART2 interrupt. I think the problem isn't intrinsic to using Serial2.available() in conjunction with Serial.write(), but depends a lot on HOW the USB serial is being used. We just didn't have enough information to analyze that problem. Arranging your USB Serial usage to stay out of the way of critical input via Serial2 might solve the problem.\

OTOH my demo code did show that Serial2.available() continued to behave properly--the O'scope flag popped up at regular 2-second intervals despite the saturation of the outgoing USB Serial link. I suppose that you could go back to your original code and use a hardware flag to determine whether the problem was that the while(! Serial2.available()) loop was failing, or whether it was something else that cause you to think that it was failing when the real problem was elsewhere in the code.
 
In a hobby context you have a point. In a professional context your assertion to "always post the complete source code" is a nice goal but sometimes not possible (e.g. one can't always post "everything" without violating a client NDA).
But your point about this possibly being all about receiver overrun and missing bytes in the Serial2 receive FIFO is interesting and might explain how I might have shot myself in the foot so effectively I couldn't see who was holding the gun.
 
As I said above, I may submit another test case showing available either stop returning nonzero values or fail to ever return a nonzero value iff Serial object methods are being invoked *elsewhere* in some code (with a connection to the failure that is of course the big mystery).

There is a chance you've found a previously unknown bug, so I hope you'll find some time to create a test case that reproduces the problem. If this really is a bug in Teensy's hardware serial or USB serial code, I do want to investigate and fix it.

Many times on this forum we've heard similar reports, and sometimes it really has been an obscure bug. One I recall involved Serial1.begin() hanging, but only under a bizarre set of circumstances - after baud rates had been mismatched and certain data had been erroneously received, and then Serial1.end() had been called, and only when a pretty narrow range of time had elapsed. If Serial1.end() and Serial1.begin() were done immediately to recover from mismatched baud rates sooner, or a few milliseconds later, no crash would occur. It would only crash if the attempt to recover was delayed and fell within a window of a couple milliseconds. In the end... after a *lot* of head scratching and tedious investigation, it turned out to be a bug where the UART's FIFO was retaining status of a previous framing error due to the wrong signal level during the a stop bit. Even though we were resetting the FIFO, we weren't properly handling & clearing the error status. The bug would only strike if the error entered the FIFO within a narrow time window, and if not too much time had elapsed for a FIFO receive timeout within the UART. This was on Teensy 3.x, after years of successful use by many thousands of people. It would have been easy to assume such mature code to be bug free. Turned out that bug had been with us for years, but so few people use Serial.end() and delaying error response is so unusual that it went undiscovered all that time. So I try never to assume code is bug free.

But many other times we've seen problems where complex code would have a buffer overflow or other memory management error which corrupts the serial port state in memory. Typically that sort of error manifests *much* later, giving the impression that a bug must exist within the serial code.

The only way to know for sure is really investigating. That has to start with a test case that reproduces the problem, so I hope you'll find a little time to make that test case.
 
I appreciate that, Paul, believe me. But you'll understand that my priorities are not all of my choosing. I just (hopefully) got to the end of an adventure with IntervalTimer that cost me a ton of time I hadn't expected to spend and I haven't actually tested what I hope is the workaround. But my current "don't allow 0xff bytes" workaround for the Serial2 issue is completely silly and I hope to eventually undo it and see the earlier failure again to help with the effort.
 
Back
Top