@Paul @pramilo and ...
Thought I would try to fill in more of the info from that last Logic Trace I showed above. Here is another one, taken showing the whole set of messages.
Again, as this is Serial3, which does NOT have a FIFO, you see an interrupt per character.
The different rows of the LA are: (Serial data, Transmit Enable, Error toggled, brackets the delayMicroseconds wait, ISR, ISR TCIE, ISR TC, brackets serial3_putchar)
I also hacked up the ISR function to remember the S1 and C2 registers into an array for a few cases (TCIE queue not empty, TCIE queue empty, TC). I then had the main sketch that duplicates this listen to an IO pin from the secondary program that looks for these errors; when it sees that an error happened, it prints out the saved data.
So in the error condition I mostly see lines like: Error reported: 00000080 000000ac 000000c0 000000ac 000000c0 0000006c
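To make a line like that easier to read, here is a rough sketch of a decoder that turns each saved value into flag names. It assumes the six values are three (S1, C2) pairs, one per ISR call, and it uses the S1/C2 bit layout as I understand it for the Kinetis UART (worth double checking against the reference manual). The names decode_bits and print_error_record are made up for this example and are not part of the core.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Bit names, MSB first. Layout assumed from the Kinetis UART S1 and C2 registers. */
static const char *const S1_NAMES[8] = {"TDRE", "TC", "RDRF", "IDLE", "OR", "NF", "FE", "PF"};
static const char *const C2_NAMES[8] = {"TIE", "TCIE", "RIE", "ILIE", "TE", "RE", "RWU", "SBK"};

/* Write the names of the bits set in `value` into `out`, separated by spaces. */
void decode_bits(uint8_t value, const char *const names[8], char *out, size_t out_size)
{
    out[0] = '\0';
    for (int bit = 7; bit >= 0; bit--) {
        if (value & (1u << bit)) {
            if (out[0] != '\0') {
                strncat(out, " ", out_size - strlen(out) - 1);
            }
            strncat(out, names[7 - bit], out_size - strlen(out) - 1);
        }
    }
}

/* Print one saved record: three (S1, C2) pairs, one pair per ISR call (assumed order). */
void print_error_record(const uint32_t rec[6])
{
    char s1[48];
    char c2[48];
    for (int i = 0; i < 3; i++) {
        decode_bits((uint8_t)rec[2 * i], S1_NAMES, s1, sizeof s1);
        decode_bits((uint8_t)rec[2 * i + 1], C2_NAMES, c2, sizeof c2);
        printf("ISR %d: S1=0x%02x (%s)  C2=0x%02x (%s)\n",
               i + 1, (unsigned)rec[2 * i], s1, (unsigned)rec[2 * i + 1], c2);
    }
}
```

Feeding it the record above (0x80, 0xac, 0xc0, 0xac, 0xc0, 0x6c) gives one line per ISR call with the flag names spelled out, which is what the walk through below is based on.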
So there is some very interesting timing at issue here. The timing has to be just right: the output is probably just starting the stop bit or the like when our delayMicroseconds completes, and we then call Serial3.write(60).
First ISR happens, with S1=0x80 (TDRE) as expected and C2=0xac (TIE, RIE, TE, RE), so an interrupt for TDRE as expected.
We then take the 60 that was written, put it into the D register, and return...
Second ISR happens: S1=0xc0 - yikes (TDRE AND TC???) Again C2=0xac ... So we go into the TDRE area as expected, find that our queue is empty, turn on the interrupt enable for TC, and return. We did nothing to clear the TC flag in S1, as that should not happen?
So when we return, the ISR code is immediately called again with S1=0xc0 again and C2=0x6c (TCIE, RIE, TE, RE), so we go into the TC path, deassert Transmit Enable, and then change what we interrupt on to probably just (RIE, TE, RE)...
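The three ISR calls above can be replayed on a PC with a small model of just the transmit decisions. This is only a sketch of the logic as I described it, not the actual core code: tx_isr_step and the action names are invented for the example, the bit values are assumed from the Kinetis UART C2/S1 layout, and the receive side is left out completely.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed Kinetis UART bit positions (only the ones used here). */
enum { S1_TDRE = 0x80, S1_TC = 0x40 };
enum { C2_TIE = 0x80, C2_TCIE = 0x40, C2_RIE = 0x20, C2_TE = 0x08, C2_RE = 0x04 };
#define C2_ENABLE (C2_RIE | C2_TE | C2_RE)

/* What the transmit half of the ISR decided to do in one call. */
typedef enum {
    ACT_NONE,     /* nothing transmit related to do */
    ACT_WRITE_D,  /* move the next queued byte into the D register */
    ACT_ARM_TC,   /* queue empty: switch from TIE to TCIE */
    ACT_TX_DONE   /* TC seen: deassert transmit enable, drop TCIE */
} tx_action;

/* Model one ISR call. `s1` is the status read on entry and `*c2` is the
 * control value on entry; `*c2` is updated the way the ISR would write it. */
tx_action tx_isr_step(uint8_t s1, uint8_t *c2, bool queue_empty)
{
    if ((*c2 & C2_TIE) && (s1 & S1_TDRE)) {
        if (!queue_empty) {
            return ACT_WRITE_D;               /* C2 stays at TIE */
        }
        *c2 = (uint8_t)(C2_ENABLE | C2_TCIE); /* wait for transmit complete */
        return ACT_ARM_TC;
    }
    if ((*c2 & C2_TCIE) && (s1 & S1_TC)) {
        *c2 = (uint8_t)C2_ENABLE;             /* just RIE, TE, RE */
        return ACT_TX_DONE;
    }
    return ACT_NONE;
}
```

Stepping it with (0x80, not empty), (0xc0, empty), (0xc0, empty) starting from C2=0xac gives WRITE_D, ARM_TC, TX_DONE, with C2 going 0xac, 0x6c, 0x2c. The model shows why the third call is immediate: nothing in the second call clears TC, so the moment TCIE is set the TC path fires.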
So probably like you (@pramilo), I have no clue if/how we can detect this case and fix it.
For example, how do you tell this apart from the case where the user only outputs one character to the serial port, which would again properly generate a TCIE type interrupt, but by the time the ISR is called (maybe the processor was busy with a higher priority interrupt) the one character has finished going out, and as such TC would also be set?
Also, assume we can detect the difference. Now what? The only ways I know to clear the TC bit in S1 are:
a) Write a new byte out to D register (we don't have anything)
b) Queue a preamble by clearing and then setting TE in C2... What happens to the data already trying to be output?
c) Sending a break character... Again not sure what happens...
So again not sure what to do here?
The only other completely random thought would be: do we detect idle? And if so, can we use that?
Edit: Another random thought is to measure how long it is between the last time you put something in the D register and when you get the TC; it should be >= bit time * 10... more or less...
Edit2: Idle only detects RX idle I think, so it does not work for this...
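The timing idea from the first edit could look something like the following. This is only a sketch under assumptions: it takes a frame as 10 bits (start, 8 data, stop), the function names are invented, and the elapsed time would have to come from something like a micros() timestamp saved when the byte is written to D, which the current ISR does not do.

```c
#include <stdbool.h>
#include <stdint.h>

/* Microseconds to shift out one frame of `bits_per_frame` bits at `baud`.
 * Rounds down; assumes baud > 0. */
uint32_t frame_time_us(uint32_t baud, uint32_t bits_per_frame)
{
    return (uint32_t)(((uint64_t)bits_per_frame * 1000000u) / baud);
}

/* True if a TC seen `elapsed_us` after the last write to D is late enough
 * to belong to that byte (10 bit times, assuming 8N1). */
bool tc_is_plausible(uint32_t elapsed_us, uint32_t baud)
{
    return elapsed_us >= frame_time_us(baud, 10);
}
```

At 115200 baud, for example, frame_time_us comes out at 86 us, so a TC seen only a few microseconds after writing D would be flagged as belonging to the previous byte. Given the "more or less", a real check would probably want some margin on that threshold.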