Teensy Hardware Flow Control RTS/CTS

duff

I'm starting this thread as a continuation of where an earlier thread has gone. I have the Teensy 3.1 Serial1 RTS/CTS feature working and wanted to share my results and code changes to the Teensy core. Hardware flow control can be used as a way to tame high-speed serial communication between modules. Specifics can be found in K20P64M72SF1RM.pdf, section 47.3.14, UART Modem Register, page 1224. I haven't tried Serial2 or Serial3 on the 3.1, or looked at whether this is possible on the Teensy LC yet. I want to use this as a starting point for integrating the feature into the Teensy core.

To show a specific use, I have two Teensy 3.1 boards using Serial1 at 6 Mbaud. Teensy 1 sends to Teensy 2 as fast as possible, and Teensy 2 then sends that data to the Arduino serial monitor. Without the CTS feature on Teensy 1 (sender), Teensy 2's USB can't keep up printing the received data. These examples don't use the hardware RTS feature but a software hack, because the hardware RTS feature doesn't stall comms for the USB printing. Here are two sketches and the mods to serial1.c, HardwareSerial.h and Kinetis.h:

Kinetis.h - add to
Code:
#define UART0_MODEM     (KINETISK_UART0.MODEM) // UART Modem Register - already defined!
#define UART_RXRTSE     0x08                   // Receiver request-to-send enable
#define UART_TXRTSPOL   0x04                   // Transmitter request-to-send polarity
#define UART_TXRTSE     0x02                   // Transmitter request-to-send enable
#define UART_TXCTSE     0x01                   // Transmitter clear-to-send enable

HardwareSerial.h add to "Class HardwareSerial and c implementation part"
Code:
// C language implementation
void serial_set_cts(uint8_t pin);
void serial_set_rts(uint8_t pin, uint8_t polarity);

// HardwareSerial Class
virtual void ctsEnable(uint8_t pin) { serial_set_cts(pin); }
virtual void rtsEnable(uint8_t pin, uint8_t polarity) { serial_set_rts(pin, polarity); }

serial1.c - add
Code:
void serial_set_cts(uint8_t pin)
{
  if (!(SIM_SCGC4 & SIM_SCGC4_UART0)) return;

  if (pin == 18) CORE_PIN18_CONFIG = PORT_PCR_MUX(3);
  else if (pin == 20) CORE_PIN20_CONFIG = PORT_PCR_MUX(3);
  else return;
  UART0_MODEM = UART0_MODEM | UART_TXCTSE;
}

void serial_set_rts(uint8_t pin, uint8_t polarity)
{
  if (!(SIM_SCGC4 & SIM_SCGC4_UART0)) return;

  if (pin == 6) CORE_PIN6_CONFIG = PORT_PCR_MUX(3);
  else if (pin == 19) CORE_PIN19_CONFIG = PORT_PCR_MUX(3);
  else return;

  if (polarity == HIGH) UART0_MODEM = UART0_MODEM | UART_TXRTSPOL | UART_RXRTSE; //UART_TXRTSE;
  else UART0_MODEM = UART0_MODEM | UART_RXRTSE;
  //else UART0_MODEM = UART0_MODEM | UART_TXRTSE;
}

Teensy 1 (sender) - This has the CTS feature enabled.
Code:
#define HWSERIAL Serial1
char data[1000];
char character = 0x41;


void setup() {
  pinMode(LED_BUILTIN, OUTPUT);
  memset(data, character, 998);
  data[998] = '\n';
  HWSERIAL.begin(6000000);
  HWSERIAL.ctsEnable(18);
  Serial.println("Starting High Speed Test (Sender)");
}


void loop() {
  HWSERIAL.write(data);
  //Serial.write(data);
  if (++character > 0x5A) character = 0x41;
  memset(data, character, 998);
  data[998] = '\n';
  //delay(100);
}

Teensy 2 (receiver) - This uses a software RTS signal to stall Teensy 1's (sender) Serial1 transmission while Teensy 2 prints over USB.
Code:
#define HWSERIAL Serial1
#define SOFT_RTS 19 // Using as a software rts signal


void setup() {
  pinMode(LED_BUILTIN, OUTPUT);
  pinMode(SOFT_RTS, OUTPUT);
  digitalWriteFast(SOFT_RTS, HIGH);
  while(!Serial);
  delay(200);
  HWSERIAL.begin(6000000);
  Serial.println("Starting High Speed Test (Receiver)");
  digitalWriteFast(SOFT_RTS, LOW);
}


void loop() {
  if (HWSERIAL.available() > 0) {
    char buffer[64];
    int bytes = HWSERIAL.readBytesUntil(0xFF, buffer, 64);
    digitalWriteFast(SOFT_RTS, HIGH);// comment out to not use software rts 
    Serial.write(buffer, bytes);
    digitalWriteFast(SOFT_RTS, LOW);// comment out to not use software rts 
  }
}

This is a pic of the setup I'm using. On the Teensy 3.1, CTS for Serial1 is only available on pin 18 or 20.
FullSizeRender copy.jpg

This is a pic of the data being printed to the Arduino serial monitor without signaling Teensy 1's CTS. Note that the data does not look clean:
Screen Shot 2015-08-20 at 12.32.32 PM.png

This is with CTS signaling. You can see the data looks clean now:
Screen Shot 2015-08-20 at 12.32.01 PM.png


RTS can also be configured to replace the transmitterEnable code, but I'll talk about that later.
 
Hi Duff, this looks good.

I realize this may be a dumb question, but I have to ask it anyway: Is there a hardware-only implementation?

If there is, could you post an example of how to set it up?
 
Can you explain what you need and I can maybe point you in the right direction? There are a couple of different configs you can do.

I started a question on Arduino's developer list, about the API.

https://groups.google.com/a/arduino.cc/d/msg/developers/l6pafASZpQU/xbesjlpWBgAJ

My goal is to stay compatible with whatever Arduino might do. Let's give that a few days to see if Cristian, Federico or other Arduino devs have any clear opinions.
Sounds good to me.
 
What I want to do is publish a ROS topic from the Teensy to the ROS master machine (a BeagleBone Black). It is laser scan data, so it might be bulky. I'm not a ROS guru and so haven't quite figured out how the data is processed at the receiving end. I'm imagining that the serial protocol handler receives that data into a buffer, and that the buffer might get full, in which case CTS would be turned off, signaling the Teensy to stop sending data.

It may be that the SoftCTS will be adequate; I need to get a better grip on how ROS handles the incoming data.
 
Ok, I've added C language flow control functions.

https://github.com/PaulStoffregen/cores/commit/c2f550d3780cba3c49cfc34bab724b268a11b7a6

Let's wait a little while longer to see if the Arduino devs have any opinions on the API before adding this to the C++ side. My guess is they don't care and probably won't ever do it on Arduino... but you never know?

Sweet. If you haven't already tried running Serial1 at 6 Mbaud, it really does work quite well, at least between two Teensys fairly close together, and I bet that hardware flow control plus some form of error checking would make for a nice high-speed link between Teensys. Another thing to check out is setting the transmitter request-to-send enable bit (TXRTSE) for RS485 operation in the serial_set_transmit_pin code. With this bit set, the UART asserts the RTS pin 1 bit time before the first start bit and deasserts it 1 bit time after all the characters are sent. I tested it and it works great.
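
To make the two uses of the RTS pin concrete, here is a rough sketch of how the MODEM register value could be composed. The bit masks are the ones from the Kinetis.h additions in the first post; the `RtsMode` enum and `modem_bits` helper are made-up names for illustration, and applying the polarity bit only in transmitter-enable mode is my assumption about how TXRTSPOL behaves, so check it against the reference manual.

```cpp
#include <cassert>
#include <cstdint>

// Bit masks copied from the Kinetis.h additions in the first post.
constexpr uint8_t UART_RXRTSE   = 0x08; // Receiver request-to-send enable
constexpr uint8_t UART_TXRTSPOL = 0x04; // Transmitter request-to-send polarity
constexpr uint8_t UART_TXRTSE   = 0x02; // Transmitter request-to-send enable
constexpr uint8_t UART_TXCTSE   = 0x01; // Transmitter clear-to-send enable

// Hypothetical: the two ways the RTS pin is used in this thread.
enum class RtsMode {
    None,                // RTS pin not driven by the UART
    ReceiverFlowControl, // RXRTSE: RTS follows the receive FIFO level
    TransmitterEnable    // TXRTSE: RTS frames each transmission (RS485 driver enable)
};

// Hypothetical helper: returns the MODEM register value for a given setup.
uint8_t modem_bits(bool use_cts, RtsMode mode, bool rts_active_high) {
    uint8_t value = 0;
    if (use_cts) value |= UART_TXCTSE;
    if (mode == RtsMode::ReceiverFlowControl) value |= UART_RXRTSE;
    if (mode == RtsMode::TransmitterEnable) {
        value |= UART_TXRTSE;
        // Assumption: the polarity bit only matters together with TXRTSE.
        if (rts_active_high) value |= UART_TXRTSPOL;
    }
    return value;
}
```

For RS485 with an active-high driver enable this gives 0x06, and for CTS plus receiver RTS flow control it gives 0x09.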

Also I don't see where the LC has any hardware flow control?
 
If you have some links to point to, maybe we can help; I've never heard of "ROS" before. It's probably good to start a separate thread for it, to keep this one focused on flow control.
 
duff, I used your RTS/CTS code to send 2 KB of data to an ESP8266 at 4608000 baud. This is what the signal looks like. I set the Teensy RX and TX buffers back to the default 64 bytes.

It seems the CTS signal rises about every 102 characters and drops again after 7.3 µs: 20 CTS pulses over 2048 bytes of data.

I am able to send at 4608000 baud now with no errors.
2015-09-19_14-07-38.png
 
I see the diffs from post #6 in the sources for Teensyduino 1.25.

If I read them correctly, a call to

int serial_set_cts(uint8_t pin) or int serial_set_rts(uint8_t pin)

returns 1 for a recognized pin on the chosen serial port, with the CTS or RTS function enabled, and
returns 0 for any pin not listed below, disabling the CTS or RTS function on that port.

The working port and pins I see are as follows:

Serial1: RTS ( 6 or 19 ) CTS ( 18 or 20 )
Serial2: RTS ( 22 ) CTS ( 23 )
Serial3: RTS ( 2 ) CTS ( 14 )
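
As a quick way to sanity-check a wiring plan against that list, here is a small lookup sketch. The function names are made up for illustration and are not part of the core; only the pin numbers come from the list above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers; port is 1, 2 or 3 for Serial1, Serial2, Serial3.
// The pin numbers are the ones listed in the post above.
bool is_rts_pin(int port, uint8_t pin) {
    assert(port >= 1 && port <= 3);
    switch (port) {
        case 1:  return pin == 6 || pin == 19;
        case 2:  return pin == 22;
        default: return pin == 2;
    }
}

bool is_cts_pin(int port, uint8_t pin) {
    assert(port >= 1 && port <= 3);
    switch (port) {
        case 1:  return pin == 18 || pin == 20;
        case 2:  return pin == 23;
        default: return pin == 14;
    }
}
```

For example, `is_cts_pin(1, 20)` is true while `is_cts_pin(2, 22)` is false, because 22 is the RTS pin for Serial2.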
 
duff, I think the UART_TXRTSPOL should be set on the set_cts. It tells the teensy the polarity of the sender's RTS line, which is connected to the teensy's CTS.
 
You could be right about that. I haven't looked at this since I last posted about it. I think Paul said he is going to implement the high-level code in the next release, and the code for the low-level drivers is already in place in 1.25 beta 2. It's probably best to work from that, since that is going to be official.
 
Transmission with CTS flow control works properly, and the receiver does not lose any data.
However, the Teensy serial receiver does not work properly with RTS flow control. With flow control there should be no loss of data, FIFO or no FIFO. Yet the current Serial library (I think including the latest one) can still lose data, because the RDRF handler drains the FIFO and, if head catches up to tail, simply tosses out the data. It should work without losing any data even with no RX buffer. I think the serial library needs an update for it to really support hardware handshaking. Simply enabling the RTS and CTS pins is not enough.

The datasheet shows this diagram. I think the serial library needs to do the same.
2015-09-20_11-22-38.png
 
I believe you may be correct on this.

As a practical matter, a reliable test case needs to be made before this work can begin on the serial code. Any ideas?
 
I agree a test case is needed. I am still trying to understand how the UART works. I have read the UART chapter about 10 times already.

The only test case I have now involves an ESP8266. If I run the UDP NTP test program, it receives 48 bytes + 28 bytes of overhead, so that's over 63 bytes, and without flow control I lose the last 13 bytes every time. That is exactly 48+28-63=13, as the receive buffer size of 64 really only has a capacity of 63. I'm sure if you have 2 teensies connected, and one burst sends >63 bytes, anything above 63 will be lost without hardware handshake.
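
The 63-of-64 arithmetic can be reproduced with a minimal head/tail ring buffer. This is only a sketch in the style of a serial RX buffer, not the actual Teensy driver code, and `RxRing` and `dropped_in_burst` are names made up for the example.

```cpp
#include <cassert>
#include <cstdint>

// Minimal head/tail ring buffer (illustrative, not the Teensy driver).
struct RxRing {
    static constexpr int SIZE = 64;
    uint8_t buf[SIZE] = {};
    int head = 0;  // index of the most recently written byte
    int tail = 0;  // index of the most recently read byte

    // Stores a byte, or returns false and drops it when the buffer is full.
    bool push(uint8_t c) {
        int next = (head + 1) % SIZE;
        if (next == tail) return false;  // head would catch up to tail
        buf[next] = c;
        head = next;
        return true;
    }

    // Number of bytes waiting to be read.
    int available() const { return (head - tail + SIZE) % SIZE; }
};

// Number of bytes dropped when `burst` bytes arrive and nothing is read.
int dropped_in_burst(int burst) {
    assert(burst >= 0);
    RxRing ring;
    int dropped = 0;
    for (int i = 0; i < burst; ++i) {
        if (!ring.push(static_cast<uint8_t>(i))) ++dropped;
    }
    return dropped;
}
```

Because one slot always stays unused so that head never equals tail when data is present, `dropped_in_burst(48 + 28)` comes out as 13, matching the loss seen with the NTP packet.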

The MK20 chip that the Teensy uses has an 8-word FIFO, right?
Is there any particular reason RWFIFO is set to 4 and not 7? I suppose with hardware handshake enabled, it can be set to 7.
 
I have my dual serial sketch on a pair of T_3.1's and also a pair of T_3.2's, starting from my latest qBlink base example. At 6 Mbaud it was very stable at 4,800 messages per second of 26 characters on both Serial1 and Serial2. I bumped up the output message size to exceed the default buffers. Much more and it fails, so I backed off to 86 chars on Serial1 and 60 chars on Serial2 [I assume blocking on transmit causes the other, untended port's receive to overflow]. Running at Serial1.begin(6000000), I am still seeing 1,100+ messages per second on each port, which makes sense, as the wait to transfer to the transmit buffer is blocking, in addition to the larger data stream (146 B vs. 52 B).

This seems stable, so I will wire in RTS/CTS to see if I can get an example with data loss.

RE: I'll wire up CTS and RTS

Q: I enabled CTS/RTS with no added wires and it runs the same as without them enabled. Is this right?

Q: To confirm, when I wire it up, the CTS on one port is crossed to the RTS on the other, and vice versa?

Testing RTS/CTS to prevent and not cause data loss:
Once this is working, to test RTS/CTS integrity I can delay() the incoming byte read, causing the receive buffer to fill and stall the transmit?
Also, I expect I can increase the data transmit size, and when it blocks to empty on one Serial port, I should not see the problem I did with "Much more and it fails", which happened because I wasn't pulling receive bytes from the second serial port?
 
Yes, RTS and CTS are crossed, just like RX and TX. If you look at duff's test program, the second Teensy does not use hardware RTS; instead the program sets the RTS pin high before it does a Serial.write, then sets RTS low after the write.

I tested with RTS enabled and set the RX watermark to 7, and you can see an RTS pulse occurs every 7 characters. That is when the RDRF interrupt occurs. I think once the first character is read, the RTS signal goes down. So in this case, there was no interruption of the sender. I think in a real application, the program could be doing other stuff and not be able to consume the buffer as fast. I'll have to create a test sketch that does that, to see the RTS pulse stay high longer.
rts.png

This is at 4,608,000 baud.
 
Perhaps run with interrupts disabled for a bit (such as when writing to a long stream of ws2812/neopixel LEDs) to see whether the hardware really is honoring rts/cts.
 
is there any particular reason RWFIFO is set to 4 and not 7?

Yes, there is indeed a reason.

Like most engineering decisions, FIFO thresholds involve making a trade-off. A higher threshold allows the FIFO to receive more bytes before triggering an interrupt, which is more CPU efficient. But if the FIFO is nearly full, you get less resilience to interrupt latency. If the interrupt is triggered when the FIFO is completely full, you end up with the same sensitivity to latency as a non-FIFO UART, where delaying the interrupt response by 1 byte time will cause the next arriving byte to be lost.

Common strategies are setting the threshold at half the FIFO size, or setting the threshold at 2 bytes before it's full. The 2 byte case optimistically assumes other code isn't blocking interrupts much, which can be a pretty good way to go for a monolithic embedded project (where you write or are deeply familiar with the code in all parts of your project).

For Arduino sketches and projects, where people combine lots of different libraries from numerous sources, I felt the half-FIFO threshold would be more appropriate. But to keep things in perspective, even a fairly high threshold is still better than a no-FIFO serial port.
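
To put rough numbers on that trade-off, here is a back-of-the-envelope sketch. It assumes 10 bits per character (8N1) and a simplified model where the interrupt can be late by the free FIFO slots above the threshold plus the one character still being shifted in; the function names are made up for the example.

```cpp
#include <cassert>

// Time to receive one character in microseconds, assuming 10 bits per character (8N1).
double byte_time_us(double baud) {
    assert(baud > 0);
    return 10.0 * 1e6 / baud;
}

// Byte times the interrupt response can be delayed before a byte is lost
// (simplified model): free FIFO slots above the threshold, plus the
// character that is still arriving in the shift register.
int tolerated_byte_times(int fifo_depth, int watermark) {
    assert(watermark >= 1 && watermark <= fifo_depth);
    return fifo_depth - watermark + 1;
}

// Latency budget in microseconds for a given FIFO depth, threshold and baud rate.
double latency_budget_us(int fifo_depth, int watermark, double baud) {
    return tolerated_byte_times(fifo_depth, watermark) * byte_time_us(baud);
}
```

Under these assumptions an 8-entry FIFO with the threshold at 4 tolerates 5 byte times, roughly 8.3 µs at 6 Mbaud, while a threshold of 7 leaves only 2 byte times.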


I'm sure if you have 2 teensies connected, and one burst sends >63 bytes, anything above 63 will be lost without hardware handshake.

Any chance you could set up a 2nd Teensy to confirm this?

If there is a bug or limitation in the serial driver code, a reliable 2-Teensy test will prompt me to dig into the problem and fix it.
 
I don't have 2 teensies to set up a test right now. I will have to remove one from a board and rewire my current test circuit.

Once I get my current test setup to work and not lose data when rx buffer is full with rts/cts, then I can rewire it into a two teensy setup.
 
as the receive buffer size of 64 really only has a capacity of 63. I'm sure if you have 2 teensies connected, and one burst sends >63 bytes, anything above 63 will be lost without hardware handshake.
If the receiving Teensy is not blocked doing other things and can move the incoming data through the read() function into a buffer you set up in the sketch, it will be able to handle a burst of any size, even at 6 Mbaud. Also, since the priority of the Serial ISR is higher (64) than almost all others, it shouldn't be preempted for long, so I think if you set up your receiving code carefully this won't be a problem either. The problem comes when you don't read the buffer fast enough because your code is doing something else, but running at 24-96 MHz is plenty fast enough to process the data as long as you are reading it from the buffer. Where I see the use of RTS/CTS is when you can't read the buffer because you're doing other processing, and the buffer can overrun. In the example I showed, I was sending much more than the receive buffer holds (64 bytes, ok 63). The problem was that printing that data to the serial monitor (USB) kept the sketch from reading the Serial buffer fast enough, but if I didn't print the received data the Teensy could handle it. I was running at 96 MHz.
 
ok, ^^^^^^
that was quoted out of context. :)
Obviously, if you read fast, head will never catch up to tail. :)

I tried using "soft" RTS just like duff's experiment, and I can confirm that CTS on the ESP8266 does stop it from sending data. But no matter what I try, I still cannot get the Teensy hardware RTS to do anything similar.
2015-09-23_21-17-15.png

RX and RTS are on the Teensy side, and are connected to TX and CTS of the ESP8266 respectively.

I changed my main program so that, instead of reading and writing 1 character at a time, it loops while Serial1.available() is true, reading from Serial1 and writing to Serial, and RTS is set high during this while loop.
 
I'm thinking maybe I have to do a bit more drastic code change, like disable RDRF interrupt in the status_isr if head catches up to tail, then re-enable RDRF interrupt in the getchar function once some bytes are read.
 
Remember the processor has no knowledge of the HardwareSerial buffer, rx_buffer[RX_BUFFER_SIZE], so it can't possibly know if or when that buffer will overrun. It can only handle the hardware FIFO. So setting the FIFO watermark to 7 really doesn't give you any headroom, like Paul said, just fewer interrupts. As for a possible issue with the serial driver and hardware RTS, I'll try to mock up something tomorrow and figure out what it's actually doing, but I would think disabling the receive ISR is probably not the way to go about it.
 
No, it does not, but the ISR code does. The reason for wanting to set the threshold to the maximum with flow control enabled is to maximize efficiency. Why set it to anything less if you can switch off the sender when you have no more room to store incoming data? And it makes sense that the datasheet says the max value of the threshold is 1 less than the FIFO capacity, which is 7, because at the time the ISR is triggered there can be one character in the shift register. At that time the RTS line will be high, so only one character will be "in flight". So if the RDRF interrupt is disabled, that one character will take the 8th place in the FIFO, and during this time RTS will remain high until space is freed up in the RX buffer via getchar. Then the RDRF interrupt can be re-enabled, and the ISR can resume pulling data out of the FIFO. At least that's the idea.
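
Here is a toy simulation of that idea, just to check the reasoning; it is not the Teensy serial driver. It assumes an 8-entry FIFO with the watermark at 7, a sender that reacts to RTS one byte late (the "in flight" character), and an idle-line flush for the tail of the stream. All names and sizes are made up for the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Toy receiver model (illustrative only).
struct Receiver {
    static constexpr std::size_t FIFO_DEPTH = 8;   // hardware FIFO entries
    static constexpr std::size_t WATERMARK  = 7;   // proposed RWFIFO value
    static constexpr std::size_t RING_CAP   = 63;  // usable bytes in a 64-byte rx buffer

    std::deque<uint8_t> fifo;   // hardware FIFO
    std::deque<uint8_t> ring;   // software rx buffer
    bool rdrf_enabled = true;   // RDRF interrupt enable
    int overruns = 0;           // bytes that arrived with the FIFO full

    // Simplified RTS rule: sending is allowed while the FIFO is below the watermark.
    bool rts_allows_sending() const { return fifo.size() < WATERMARK; }

    void on_byte(uint8_t c) {
        if (fifo.size() >= FIFO_DEPTH) { ++overruns; return; }
        fifo.push_back(c);
    }

    // Proposed ISR: if the rx buffer fills up, disable the interrupt and leave
    // the remaining bytes in the FIFO so RTS keeps the sender stopped.
    void isr(bool line_idle) {
        if (!rdrf_enabled) return;
        if (fifo.size() < WATERMARK && !(line_idle && !fifo.empty())) return;
        while (!fifo.empty()) {
            if (ring.size() >= RING_CAP) { rdrf_enabled = false; return; }
            ring.push_back(fifo.front());
            fifo.pop_front();
        }
    }

    // Proposed getchar: reading frees space, so re-enable the interrupt.
    int get_char() {
        if (ring.empty()) return -1;
        uint8_t c = ring.front();
        ring.pop_front();
        rdrf_enabled = true;
        return c;
    }
};

// Sends `total` bytes while the application reads one byte every `read_period` ticks.
// Returns 0 if every byte arrives once and in order, a positive problem count
// otherwise, or -1 if the run does not finish.
int simulate_lost_bytes(int total, int read_period) {
    assert(total >= 0 && read_period >= 1);
    Receiver rx;
    std::vector<uint8_t> received;
    int sent = 0;
    bool rts_seen = true;  // RTS state as the sender saw it one byte time ago
    for (long tick = 0; tick < 1000000; ++tick) {
        if (static_cast<int>(received.size()) + rx.overruns == total) break;
        bool may_send = rts_seen;
        rts_seen = rx.rts_allows_sending();
        if (sent < total && may_send) rx.on_byte(static_cast<uint8_t>(sent++));
        rx.isr(sent == total);
        if (tick % read_period == 0) {
            int c = rx.get_char();
            if (c >= 0) received.push_back(static_cast<uint8_t>(c));
        }
    }
    if (static_cast<int>(received.size()) + rx.overruns != total) return -1;
    int bad = rx.overruns;
    for (std::size_t i = 0; i < received.size(); ++i) {
        if (received[i] != static_cast<uint8_t>(i)) ++bad;
    }
    return bad;
}
```

In this model the late byte lands in the 8th FIFO slot, so a slow reader (for example `simulate_lost_bytes(2048, 5)`) still gets every byte. Whether the real UART and driver behave the same way is exactly what a two-Teensy test would need to confirm.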
 