Low latency serial communication with Linux

Status
Not open for further replies.

nikolas

Member
I am trying to use Teensy 3.2 for a timing-sensitive application and I face a lag of ~5ms.

Briefly, I do the following:
* A Python program, running on an Ubuntu 14.04 laptop (part of a complicated process), occasionally sends one byte with the following command:
Code:
teensy.write(b'1')
I have also tried calling
Code:
teensy.flush()
after each write command, but it does not help.
* The Teensy is running firmware uploaded with Teensyduino that just compares the byte it received with 49 (ASCII '1') and, on a match, drives one pin high for 5 microseconds.

* In the ideal case, the latency between the (actual) time when Python sends the byte and the moment the pin goes high would be <1ms; here, however, it is ~5ms.

It is worth noting that I get similar latency with the following alternatives:
a. Arduino doing the same
b. Using a serial ExpressCard on the laptop and using the Transmit Data pin as a TTL out (no microcontroller involved).

Note that I also tried this after setting the serial port to low_latency with
Code:
setserial PORT low_latency

All this tells me it might be something specific to Ubuntu, since I saw some comparisons online of Teensy latency that report <1 ms for Windows but ~4-5 ms for Linux. Unfortunately I cannot test this on Windows, because the more complicated process that runs the Python code only works on Linux.

Any ideas about either reducing the latency of serial communication in Ubuntu or any other alternative way of getting this TTL out of the PC in a timely manner are highly appreciated.

Thank you in advance.

Nikolas
 
Believe it or not, USB can be pretty low latency. On my Ubuntu 16.04 test system, I get (using USB serial):
Round trip latency with USB 2 hub:
- 0.15ms average
- 0.9ms max latency
Without hub:
- 2ms average
- 2ms max

The test program sends a number as ASCII string (with line feed) and the Teensy returns it. I didn't check, but the line feed may lower latency and trigger flushing in the Linux serial handling.

The hub serves as a protocol translator: the communication between PC and hub runs at 480 Mbit/s, and between hub and Teensy at 12 Mbit/s.

Otherwise, a parallel port will be fastest in terms of latency, but finding hardware that still has 'real' parallel port support (and not just printer emulation) may be difficult. The ExpressCard ones may actually use USB internally.
 
Hi,
thank you for the answer.
Unfortunately I am nowhere near these values.
I did try using a line feed (if I understand what you mean, I just send a '\r\n' after the trigger byte).
I also tried to explicitly flush and also using different options for the serial communication (rtscts, dtr, xonxoff).

Trying with a hub did not help and indeed the ExpressCard seems to use USB (at least it is recognized as ttyUSB0).

The lag I see is ~6-7 ms and it seems to be somewhat dependent on the CPU load.
 
I have been playing around with different Serial communications between some linux boxes and servo controllers such as Arbotix Pro, or USB2AX, or my Teensy.

Sometimes I have found that some things I have tried help in some cases and hurt in other cases.

First off, ttyUSB0 implies FTDI, which often has latency issues. On Windows I would suggest that you go to the Device Manager, open the properties for the FTDI port, go to the advanced properties and probably set Latency to 1...

But when I am talking to it from a Linux program, say from one of my Odroids, I find it helps to do the equivalent of a flush (tcdrain). However, when my program is talking to other serial ports, such as a Teensy 3.2 or a USB2AX (an ATmega32U2 processor) that show up as a device like ttyACM0, or to one of the actual hardware serial ports on the board, I find that adding the call to tcdrain really slows things down. My gut tells me that tcdrain calls into some ioctl of the device driver. In the case of FTDI, the driver actually knows when it has finished sending the last bits out over the port and returns then; on other drivers that support may not be built in, and the call simply adds a delay of several ms.

The USB delay is probably also influenced by how your machine handles USB. Does it all funnel through one USB hub, or are there multiple hardware paths to the CPU (not sure of the right names for this)...

My advice is to experiment and see what works for you
 
I connected the Teensy via USB (using the USB serial emulation), so the device for that would be /dev/ttyACM0. The driver (cdc_acm) is quite likely different from whatever your serial port adapter uses. So, you should try a direct USB connection. If you want to send something from the Teensy side, make sure to use "Serial.send_now();".

Your USB serial adapter is likely buffering stuff, e.g. FTDI:
https://projectgus.com/2011/10/notes-on-ftdi-latency-with-arduino/
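
If the adapter really is an FTDI one, the buffering mentioned above is governed by a latency timer that the Linux ftdi_sio driver exposes through sysfs. Here is a minimal sketch for reading and changing it, assuming the usual /sys/bus/usb-serial/devices/<port>/latency_timer layout. The helper names are made up for illustration, and the sysfs_root parameter exists only so the helpers can be pointed at a test directory.

```python
from pathlib import Path

# Assumption: the adapter is handled by the Linux ftdi_sio driver, which
# exposes its latency timer (in milliseconds) under this sysfs directory.
SYSFS_USB_SERIAL = Path("/sys/bus/usb-serial/devices")


def latency_timer_path(port, sysfs_root=SYSFS_USB_SERIAL):
    """Return the sysfs latency_timer file for a port such as /dev/ttyUSB0."""
    return Path(sysfs_root) / Path(port).name / "latency_timer"


def read_latency_timer(port, sysfs_root=SYSFS_USB_SERIAL):
    """Read the current latency timer in ms, or None if the file is absent."""
    path = latency_timer_path(port, sysfs_root)
    if not path.exists():
        return None  # not an FTDI adapter, or a different driver
    return int(path.read_text().strip())


def write_latency_timer(port, value_ms, sysfs_root=SYSFS_USB_SERIAL):
    """Set the latency timer in ms (usually needs root privileges)."""
    latency_timer_path(port, sysfs_root).write_text(str(value_ms))
```

Calling read_latency_timer("/dev/ttyUSB0") should show the current value, and write_latency_timer("/dev/ttyUSB0", 1) would be the rough equivalent of the Windows "Latency" setting. For a Teensy on /dev/ttyACM0 the file does not exist, so the read helper returns None.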
 
I also get the /dev/ttyACM0 for Teensy.

The /dev/ttyUSB0 that I mentioned was for the Serial ExpressCard that I tried.

Regarding the USB schema for this laptop, I do not know exactly how it is wired internally, but I did try different usb sockets (and different laptops, as well as Ubuntu 14.04 and 16.04).
 
As I mentioned above, check whether your code on the PC is going through tcdrain or the like. It could be a higher-level flush function. Try removing it and see what happens to your timings. As I mentioned, it really HURT when I was trying to speed up Dynamixel packets. So my code tries to guess whether I am using FTDI (i.e. ttyUSBx) and, if so, calls the drain, otherwise not.
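
A rough Python sketch of that "drain only on ttyUSB*" guess, for anyone who wants to try it. The function names are hypothetical, and the name-based check is only the heuristic described above, not a reliable way to detect FTDI. It works on a raw file descriptor; as far as I know, pyserial's flush() on POSIX ends up in the same termios.tcdrain call.

```python
import os
import termios


def should_drain(port_name):
    """Heuristic from the post above: drain only on ttyUSB* (assumed FTDI-style) ports."""
    return os.path.basename(port_name).startswith("ttyUSB")


def write_with_optional_drain(fd, data, port_name):
    """Write data to an open serial file descriptor, draining only if the heuristic says so."""
    os.write(fd, data)
    if should_drain(port_name):
        # tcdrain blocks until the driver reports all queued output as sent.
        termios.tcdrain(fd)
```

Timing the write with and without the drain on the same port is the quickest way to see which case applies to a given driver.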
 
Hi Kurt,
thanks for the suggestion. I guess that the equivalent in the Python code is the .flush(). I have tried both with and without it but it doesn't seem to make things worse.
 
My application is a closed loop data acquisition - stimulation software.
So I plot a trace with a TTL pulse at the moment Python sends the byte. The Teensy receives this byte and delivers a TTL to the data acquisition device.
So I calculate the latency between the TTL pulse generated by Python and the onset of the TTL pulse generated by the Teensy.
 
I'm not clear from your description how / where the time measurement is triggered / performed. Anyway, the concern is that you may have an issue with your latency measurements.

As a sanity check, have you tried timing a USB serial round trip, with the Teensy simply responding to some sort of ping message (do a flush on the Teensy side with "Serial.send_now();")?
 
My gut tells me it is a Python code (pyserial) issue. At least on some controllers I have used, it appears to eat up lots of resources. Have you tried a C version on the laptop to see what the latency is?
 
The first thing I would do is ensure that you send 64 bytes each time. Since the USB packets for Serial contain 64 bytes, the OS / USB stack might delay for a few milliseconds in the hope of filling up an entire packet, instead of transmitting a packet that contains only one byte.
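
A minimal sketch of that padding idea on the Python side, assuming the firmware simply ignores the fill bytes. The helper name and the zero fill byte are arbitrary choices for illustration.

```python
PACKET_SIZE = 64  # packet size named in the post above


def pad_to_packet(payload, packet_size=PACKET_SIZE, fill=b"\x00"):
    """Right-pad payload with fill bytes up to a whole multiple of packet_size."""
    if len(fill) != 1:
        raise ValueError("fill must be a single byte")
    remainder = len(payload) % packet_size
    if remainder == 0 and payload:
        return payload  # already a whole number of packets
    return payload + fill * (packet_size - remainder)
```

The trigger would then be sent as teensy.write(pad_to_packet(b'1')), with the sketch on the Teensy acting only on the byte 49 and discarding the rest.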
 
I haven't done the test with the ping message yet. Will do soon.

The code I use is indeed pyserial but it is run as Cython so it is in fact compiled.

As for the idea of sending 64 bytes, I have tried that but it didn't help.
 
PySerial is not too slow. Here is some test code.

Sketch:
Code:
void setup() {
    Serial.begin(115200);
    Serial.flush();
    delay(1000);
}

const size_t buf_len = 512;
uint8_t buffer[buf_len];

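void loop() {
    // Echo each line back as soon as its terminator arrives.
    for(size_t buf_pos = 0; buf_pos < buf_len; buf_pos++) {
        while(Serial.available() <= 0) ;  // busy-wait for the next byte
        int c = Serial.read();
        buffer[buf_pos] = c;
        if(c == 13 || c == 10) {  // CR or LF ends the message
            Serial.write(buffer, buf_pos+1);
            Serial.send_now();  // transmit immediately instead of waiting for more data
            break;
        }
    }
}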

Python 3 script:
Code:
#!/usr/bin/env python3

import serial
import sys
import io
import time

message_count = 1000

if len(sys.argv) != 2:
    print("Usage: test-lat <Serial Port>")
    raise SystemExit

port_name = sys.argv[1]

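# The baud rate is ignored by the Teensy's USB serial; it only matters for real UARTs.
ser = serial.Serial(port_name, 9600, xonxoff=False, rtscts=False, dsrdtr=False)
ser.flushInput()
ser.flushOutput()

# Wrap the port in a text layer with minimal buffering so readline() returns promptly.
sio = io.TextIOWrapper(io.BufferedRWPair(ser, ser, 1), encoding='ascii', newline='\n')
sio._CHUNK_SIZE = 1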

max_time = 0
total_time = 0
for i in range(message_count):
    start_time = time.perf_counter()
    out_msg = str(i) + "\n"
    sio.write(out_msg)
    sio.flush()
    in_msg = sio.readline()
    # Stop the clock before checking the reply so error printing is not timed.
    end_time = time.perf_counter()
    if out_msg != in_msg:
        print("Error: invalid response: ", in_msg)
    elapsed = end_time - start_time
    if elapsed > max_time:
        max_time = elapsed
    total_time += elapsed

print("Average time: ", total_time / message_count)
print("Max time: ", max_time)

ser.close()

Result with Ubuntu 16.04 with USB2 hub (I get similar results on Windows 7):
Average time: 0.0002483248630022672
Max time: 0.000531464000005144

Edit:
The Linux kernel is:
4.4.0-34-generic #53-Ubuntu SMP Wed Jul 27 16:06:39 UTC 2016 x86_64

Edit2:
The times are in seconds. So the results from above are 0.25 milliseconds and 0.53 milliseconds.
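
To avoid the seconds/milliseconds confusion, here is a small sketch of a helper that takes the per-message times collected in seconds and reports them in milliseconds. The function name is hypothetical, and the median is an addition beyond what the script above prints.

```python
import statistics


def summarize_ms(samples_s):
    """Convert latency samples from seconds to milliseconds and summarize them."""
    samples_ms = [s * 1000.0 for s in samples_s]
    return {
        "mean_ms": statistics.mean(samples_ms),
        "median_ms": statistics.median(samples_ms),
        "max_ms": max(samples_ms),
    }
```

To use it with the script above, append each elapsed value to a list inside the loop and pass that list to summarize_ms() at the end.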
 
For those of you playing along at home, in Python time.perf_counter() returns fractional seconds.

 
I gave it a try with your code and it seems I get about half the latency you report.

So I guess the issue here is not the latency due to the serial protocol, but rather due to the application itself.
Unfortunately everything I tried (in terms of the serial communication) gives approximately the same latency, so I guess the bulk of this latency is due to another factor.

Thank you very much for all the help.
 
For my application, Teensy is not sending anything to the computer, so there was no need for it.
However, I did try to add a write command (from Teensy to the computer) followed by send_now (as in the sketch by tni above) in the hope it would reduce the latency.
 
The application I am running acquires data through an FPGA and receives it every 3 ms. Python processes each 3 ms batch of data and plots it. When the conditions are fulfilled, Python sends the one byte to the Teensy. At that instant, it plots one of the signals as 1 (the rest of the time it plots it as zero).
So when the byte is sent, the 3 ms piece of data plotted is 1.
Now, the Teensy receives this byte and switches one pin on for 1 μs, which triggers another device that delivers a 5 V, 10 ms TTL pulse to another input of my acquisition system, so the moment this happens another plotted channel shows this pulse.
So by comparing the offset of the "python modified signal" and the onset of the "ttl input signal", I see there is a ~6-7 ms delay on average. Looking at the distribution of the latencies, it can be as low as 2 ms and as high as 10 ms.

P.S. I have already measured and there is no latency between the triggering of the other device and the onset of the TTL.
 
I dug up a PCI serial card with a real PCI UART 16650 chipset. It has more latency than USB (Ubuntu 16.04):

Average time: 0.0007613420510006108s
Max time: 0.0007923850000679522s

(This result is for sending single character messages, for longer messages it's worse.)
 