Possible race Condition on Teensy 3.2 Serial when using Transmit Enable pin

Just want to let you know I have this thread on my list of issues to investigate. At the rate things are going, there's no chance I will manage to seriously look at this until at least October. But it is on my list, so I will not forget it. Sorry, that's the best I can do at this moment.
 
Note: My main usage for these API's is to support for communicating with Robotis Dynamixel Servos at Half duplex. Sometimes I have done it without needing this and simply setting up TX pin to do both TX and RX, but other times I use a buffer or buffer chips to do the half duplex plus up the voltage to 5v.

@KurtE As a side note, my usage is in Dynamixel compatible devices as well.

We had never seen this before because we mostly build our communications packet into an array and then use Serialx.write() to send it all in one go. Although I suspect I had seen this in past, because this approach did not trigger it frequently, I wrote it off as a line issue/noise.

However, we're now working on a high speed usb to bus device (supporting both TTL and RS485). During our stress tests and likely due to the randomness of USB timings this started to show up.
In this case we're using a Nexperia 74HC2G125 chip which outputs 5V signal levels but is extremely tolerant in terms of input levels, for more robust communication.

However, we also use the LOOPS technique on some other boards (using just one wire in teensy serial). We normally modify Serial2.c to switch into TX mode in the call to transmit_assert/ and back to RX on de-assert.

Therefore, even though I haven't tested with the LOOPS mode, I would expect to see the same communication problems using this implementation. This would be because we would follow the same timing issues with Tx-assert.

I am quite happy with how far I've come in fixing this on Serial2.c and we'll port it back to the, diminishing, number of boards that still use the LOOPS technique. This should improve performance.
(we're moving towards the tri state buffers bc the LOOPS mode outputs only 3.3V)

All this is to say that, even when using LOOPS (one wire for serial), it still worth getting this fixed, I believe.

As a final note, from the way I write, I think it's clear we don't use Serialx.flush(); we do hard real time computing and just rely on teensy and teensyduino library to manage data in and out in the "background" (if one can call it that).

@Paul glad to know this is getting in your list of issues to look into.

All the best,
Pedro
 
Last edited:
@pramilo - Yes mostly before I used the loops stuff as well. But then wondered if maybe some of the issues with AX-12 servos losing their ID was due to 3.3v TTL instead of 5v, so started adding support for TX Enable pin. Code for all of this is up at: https://github.com/KurtE/BioloidSerial
This was based off of the earlier Bioloid library for TrossenRobotics/Interbotix...

Note: When I do LOOPS, the code does call flush....

@Paul and ... I have also hooked up a quick and dirty test program to see if I can find when/if we hit this problem:
Code:
#define TX_INPUT_PIN 2
#define TX_ENABLE_INPUT_PIN 3
uint32_t error_count = 0;
uint32_t tx_enabled_count = 0;
void setup() {
  while(!Serial && millis() < 5000) ;
  Serial.begin(115200);
  Serial.println("Serial TXEnable checker");
  pinMode(TX_INPUT_PIN,INPUT); // Hook up to TX pin
  pinMode(TX_ENABLE_INPUT_PIN, INPUT); // hook up to TX Enable pin
  pinMode(13, OUTPUT);
}

void loop() {
  // Quick and dirty test.  
  bool error_reported = false;
  while (!digitalRead(TX_ENABLE_INPUT_PIN)) {
    // TX enable not set, error if we see TX pin go low
    if (!digitalRead(TX_ENABLE_INPUT_PIN)&& !digitalRead(TX_INPUT_PIN) && !error_reported) {
      error_count++;
      Serial.printf("Error: %d, TX enabled count: %d\n", error_count, tx_enabled_count);       
      digitalWrite(13, !digitalRead(13));
      error_reported = true;
    }
    delayMicroseconds(1);
  }
  if (digitalRead(TX_ENABLE_INPUT_PIN)) {
    tx_enabled_count++;
    while (digitalRead(TX_ENABLE_INPUT_PIN)) delayMicroseconds(1);

  }
}
By connecting up the teensy being tested to another Teensy.

And confirmed that the disable interrupt and assert plus set C2 approach may help but does not fully solve...
I did this for Serial3 as I left the Software I am in write approach in on Serial2...
Code:
void serial3_putchar(uint32_t c)
{
	uint32_t head, n;

	if (!(SIM_SCGC4 & SIM_SCGC4_UART2)) return;
	if (transmit_pin) transmit_assert();
	head = tx_buffer_head;
	if (++head >= SERIAL3_TX_BUFFER_SIZE) head = 0;
	while (tx_buffer_tail == head) {
		int priority = nvic_execution_priority();
		if (priority <= IRQ_PRIORITY) {
			if ((UART2_S1 & UART_S1_TDRE)) {
				uint32_t tail = tx_buffer_tail;
				if (++tail >= SERIAL3_TX_BUFFER_SIZE) tail = 0;
				n = tx_buffer[tail];
				if (use9Bits) UART2_C3 = (UART2_C3 & ~0x40) | ((n & 0x100) >> 2);
				UART2_D = n;
				tx_buffer_tail = tail;
			}
		} else if (priority >= 256) {
			yield(); // wait
		}
	}
	tx_buffer[head] = c;
	transmitting = 1;
	tx_buffer_head = head;
	if (transmit_pin) {
		__disable_irq();
		transmit_assert();
		UART2_C2 = C2_TX_ACTIVE;
		__enable_irq();
	}
	else UART2_C2 = C2_TX_ACTIVE;

}
The Serial3 code above was giving an error maybe every second. Note: my other test program only waits 2ms between groups...
I have not brought in your full Serial2 yet, just the changes I mentioned earlier (just changed putchar, write, and the TCIE...

That test has not shown any failures for a few minutes. Similar one ran for maybe an hour without seeing any. But did it leave anything in queue? Not sure this would catch yet.

Could do another test, which checks that the output of the TX pin matches the output on the DXL pin, which again is the whole point...

@Paul and others. When instead of running this serial code from T3.x I tried it on T4, figuring it would fail the same and could fix it for all Serial ports in one place there...

But the interesting thing, is the T3.2 I have hooked up running the test above, kept giving me bad results. It would think that the TX Enable pin went low, even though the code and Logic analyzer clearly showed it high.... So lots and lots of false positives...

Wonder if maybe another issue on T4 with Drive Strength? Will try more later.

Kurt

EDIT: just got a few errors ...
 
Yes mostly before I used the loops stuff as well. But then wondered if maybe some of the issues with AX-12 servos losing their ID was due to 3.3v TTL instead of 5v, so started adding support for TX Enable pin.

@Kurt, I'm getting off topic here, but I'll be brief:

From my experience, the loss of ID occurs when you have a brownout (even though ATMEL, the processor in AX-12, claim they have brownout detection)
There are a few things that may contribute to this. I wasn't able to pin down the exact reason but here are our suspect culprits:
- The connectors, in they way they are designed, don't always guarantee that GND is connected before Vcc and Data. This is not great and may cause some form of ghost powering if Vcc and Data connect before GND.
- The other - very frequent - cause is cables becoming damaged and the copper exposed. If you have a metal frame, whenever the copper touches the frame, it gets grounded with all sorts of weird issues (from shorts, to data communication interruption,....). Damaged cables touching each other or crimps protruding from the connectors also yield similar results.
Most drama I've experienced comes from cabling issues.

These are just some thoughts from my own personal experience. Nevertheless, going for 5V buffers should definitely help with communication reliability, so it's a best practice in any case.
 
@pramilo - Agreed, again had similar issues back using Trossen Robotics Arbotix boards (AVR Atmega644p).... But as mentioned decided to go with the 5v to be driving the servos as the documented voltages...
 
@PaulStoffregen and @pramilo - Quick update.

I am running two Arduino sketches: One more or less the same as @pramilo posted:
Again nothing special, only main differences is larger delayMicroseonds range as to also get into range for T4... And I set Pin 13 high when entering wait and low after to get idea of where that is in relation to output stream.
Code:
#define TTL_SERIAL Serial3
void setup()
{
  Serial.begin(115200);
  while (!Serial && millis() < 4000) ;
  Serial.println("Start test");
  Serial.flush();
  pinMode(13, OUTPUT);
  TTL_SERIAL.begin(1000000);
  TTL_SERIAL.transmitterEnable(5);
}
void loop()
{
  while (1) {

    for (uint16_t delay_us = 50; delay_us < 120; delay_us++) {
      // send a burst of bytes
      for (byte i = 50; i < 60; i++) {
        TTL_SERIAL.write(i);
      }

      // add a small interval so that we get next byte in, when the UART is already finsishing
      // the transmission of the block above (i.e. UART1_C2 is in state TX_COMPLETING)
      // On a Teensy 3.2 at 96Mhz 75 at 1Mbps did it for me.
      // this is not deterministic so you need to look at several samples in the logic analyser
      // some will be OK, others will exhibit the behaviour described
      digitalWriteFast(13, HIGH);
      delayMicroseconds(delay_us);
      digitalWriteFast(13, LOW);

      TTL_SERIAL.write(60);

      delay(2); // wait a bit to separate samples in the Logic analyser
    }
  }
}
Second sketch, again just checks for the Transmitter Enable pin being not asserted and see if transmitter pin goes low....
Code:
#define TX_INPUT_PIN 2
#define TX_ENABLE_INPUT_PIN 3
uint32_t error_count = 0;
uint32_t tx_enabled_count = 0;
void setup() {
  while (!Serial && millis() < 5000) ;
  Serial.begin(115200);
  Serial.println("Serial TXEnable checker");
  pinMode(TX_INPUT_PIN, INPUT); // Hook up to TX pin
  pinMode(TX_ENABLE_INPUT_PIN, INPUT); // hook up to TX Enable pin
  pinMode(13, OUTPUT);
}

void loop() {
  // Quick and dirty test.
  while (1) {
    bool error_reported = false;
    while (!digitalRead(TX_ENABLE_INPUT_PIN)) {
      // TX enable not set, error if we see TX pin go low
      if (!digitalRead(TX_INPUT_PIN) && !error_reported) {
        // double check that maybe we paused between checks...
        if (!digitalRead(TX_ENABLE_INPUT_PIN)) {
          digitalWrite(13, !digitalRead(13));
          error_count++;
          Serial.printf("Error: %d, TX enabled count: %d\n", error_count, tx_enabled_count);
          error_reported = true;
        }
      }
    }
    if (digitalRead(TX_ENABLE_INPUT_PIN)) {
      tx_enabled_count++;
      while (digitalRead(TX_ENABLE_INPUT_PIN)) delayMicroseconds(1);

    }
  }
}

When I ran these with the first one on T4, did not get any failures. I would think it should be able to, but maybe hard to get just the right timing for it to happen...
With T3.6 which I am currently running, I get errors with all of the approaches I have tried, including yours (@pramilo) Even ones I would think should be somewhat foolprof.

Example I tried this on Serial3. That is that in addition to asserting transmit in putchar function, Also do so in status isr. Like:
Code:
void uart2_status_isr(void)
{
	uint32_t head, tail, n;
	uint8_t c;

	d[COLOR="#008000"]igitalWriteFast(0, HIGH);[/COLOR]
	if (UART2_S1 & UART_S1_RDRF) {
		if (use9Bits && (UART2_C3 & 0x80)) {
			n = UART2_D | 0x100;
		} else {
			n = UART2_D;
		}
		head = rx_buffer_head + 1;
		if (head >= SERIAL3_RX_BUFFER_SIZE) head = 0;
		if (head != rx_buffer_tail) {
			rx_buffer[head] = n;
			rx_buffer_head = head;
		}
		if (rts_pin) {
			int avail;
			tail = tx_buffer_tail;
			if (head >= tail) avail = head - tail;
			else avail = SERIAL3_RX_BUFFER_SIZE + head - tail;
			if (avail >= RTS_HIGH_WATERMARK) rts_deassert();
		}
	}
	c = UART2_C2;
	if ((c & UART_C2_TIE) && (UART2_S1 & UART_S1_TDRE)) {
		digitalWriteFast(1, HIGH);
		head = tx_buffer_head;
		tail = tx_buffer_tail;
		if (head == tail) {
			UART2_C2 = C2_TX_COMPLETING;
		} else {
			[COLOR="#FF0000"]if (transmit_pin) transmit_assert();[/COLOR]
			if (++tail >= SERIAL3_TX_BUFFER_SIZE) tail = 0;
			n = tx_buffer[tail];
			if (use9Bits) UART2_C3 = (UART2_C3 & ~0x40) | ((n & 0x100) >> 2);
			[COLOR="#FF0000"]uint32_t tmp __attribute__((unused)) = UART2_S1;[/COLOR]
			UART2_D = n;
			tx_buffer_tail = tail;
		}
		[COLOR="#008000"]digitalWriteFast(1, LOW);[/COLOR]
	}
	if ((UART2_TCFIFO == 0) && (c & UART_C2_TCIE) && (UART2_S1 & UART_S1_TC)) {
		[COLOR="#008000"]digitalWriteFast(2, HIGH);[/COLOR]
		transmitting = 0;
		if (transmit_pin) transmit_deassert();
		UART2_C2 = C2_TX_INACTIVE;
		[COLOR="#008000"]digitalWriteFast(2, LOW);[/COLOR]
	}
	[COLOR="#008000"]digitalWriteFast(0, LOW);[/COLOR]
	asm("dsb");
}
Just to make sure that the S1 had been read before UART2_D was set, I made that second change in RED as well.
Here shows example error:
screenshot.jpg

I may punt for a bit, or until I think up other solutions... Note: these changes were tried on 1.8.10 Teensyduino Beta

EDIT: updated to show more instrumented ISR and the like, note: I digitalWriteFast(4 in the putchar function as well).

Updated LA output, showing these:
screenshot.jpg
It is interesting that the last 4 lines of trace are the 0-3 pins. Where the TC interrupt is right after the new last byte has started output and no other ISR call for this group.
 
Last edited:
@Paul @pramilo and ...

Thought I would try to fill in more of the info from that last Logic Trace I showed above, Here is another one taken showing the whole set of messages.
screenshot.jpg

Again as this is Serial3 which does NOT have FIFO, you see an interrupt per character.
The different ROWS of the LA are(Serial data, Transmit Enable, Error toggled, brackets the wait Microseconds, ISR, ISR TCIE, ISR TC, brackets serial3_putchar)

I also hacked up the ISR function to remember the S1 and C2 register into an array for a few cases (TCIE queue not empty, TCIE queue empty, TC), I then had two main sketch that duplicates this, listen to IO pin that secondary program that looks for these errors, then when it sees an error happened prints out the saved data.

So mostly in error condition I see lines like: Error reported: 00000080 000000ac 000000c0 000000ac 000000c0 0000006c

So there is some very interesting timing and issue. The timing has to be just right that it is probably just starting the stop bit of the output or the like, when our delayMicrososends completes, we then call Serial3.write(60).

First ISR happens, with S1=0x80 TDRE as expected, C1=0xac (TIE, RIE, TE, RE) - SO interrupt for TDRE as expected).
We then remove the 60 that was written out and put it into D register and return...

Second ISR happens: S1=0xc0 - yikes (TIE AND TC???) Again C1=ac ... So we go into TDRE area as expected, Find that our queue is empty, we then turn on the enable interrupt for TC and return. We did nothing to clear the TC flag from S1, as that should not happen?

So when we return, the ISR code is immediately called again with: S1=0xc0 again and C1=0x6c(TCIE, RIE, TE, RC) - so we go into the TC path and disable the Transmitter enable and then change what we interrupt on to: probably just (RIE, TE, RE)...


So probably like you (@pramilo) I have no clue on if/how to be able to detect this case and fix.

For example how do you detect the difference between, the user only output one character to the serial port, and by the time the user got back, which would again properly generate a TCIE type interrupt, but by the time the ISR was called (maybe processor was busy with higher priority interrupt), it finished outputting the one character and as such TC would also be set?

Also assume we can detect the difference? Now what? The only ways I know to clear the TC bit in S1 is:
a) Write a new byte out to D register (we don't have anything)
b) Queue a preambe by clearing and setting TE in C2... What happens to the data already trying to be output?
c) Sending a break character... Again not sure what happens...

So again not sure what to do here?

The only other completely random thought would be do we detect idle? And if so can we use that?

Edit: Other random thought is to detect how long between the last time you put something in D register and when you get the TC, should be >= bit time * 10... more or less...

Edit2: Idle only detects RX idle I think, does not work for this...
 
Last edited:
Back
Top