Serial USB Tx/Rx interval

Status
Not open for further replies.

nox771

Well-known member
I have a project which has a PC-side program issuing commands over a virtual COM port to a Teensy3 (handled by Serial on the device side). The commands trigger some I2C activity, and return a byte of data indicating error status (Slave not responding, and such). On the PC-side it will send a command, await the error byte, then immediately send the next command. There are no delays on the PC-side other than presumably some OS scheduling for USB traffic.

I noticed on my device that the commands are slower than expected, and there appears to be a ~100ms delay between each command. I'm trying to locate the source of these 100ms delays. Here is an example picture of what I'm seeing: it is a series of commands, and you can see that each burst of I2C activity is separated from the next by anywhere from 102ms to 115ms:
screenshot.149.jpg

My guess is that this is due to the USB traffic. I've done some low-level USB work before, and I know some devices can have descriptors set to indicate a polling rate, but I've not pulled apart the Serial code to figure out what is going on.

Is Serial the problem here? And if so, is there anything I can tweak to modify this behavior?
 
The commands trigger some I2C activity, and return a byte of data indicating error status (Slave not responding, and such). On the PC-side it will send a command, await the error byte, then immediately send the next command.

This type of design is the simplest but slowest way. A much better approach involves transmitting all messages with some sort of identifier on each command, and then parsing all the replies. Ideally, as many messages as possible should be composed into a huge buffer and written all at once (see below).

However, something else is very wrong, because it should not be nearly as slow as you're seeing. Even with this type of slow protocol, you ought to see latency of about 1 to 2 ms if Teensy replies quickly, and on Windows occasionally up to about 15 ms due to some sort of operating system scheduling delays.

I noticed on my device that the commands are slower than expected, and each command appears to have ~100ms delay between it. I'm trying to locate the source of these 100ms delays.

My best guess is your code might be writing the data 1 byte at a time. Or perhaps you're using a library which does so? I've heard some reports that at least one library for Visual Basic (probably written in the days before USB, maybe back in the days of DOS where libraries accessed the 8 UART registers) does this even if you give it an array of bytes. But most times this has come up, it's been pretty apparent the code was looping over an array and generating the output 1 byte at a time.

There are two problems with transmitting data 1 byte at a time.

#1: Single byte USB packets have a huge amount of overhead. Only Mac OS X is smart enough to recognize when dozens or hundreds of single-byte transfers are queued up and consolidate the data into larger, more efficient packets. Windows and Linux will happily send every write you issued in its own transfer. Each USB packet has about 13 bytes of overhead, so placing only 1 byte in a packet is very inefficient.

#2: Windows has a terrible driver design, where only a single transfer can happen in each USB frame. If you send 1 byte at a time, that 1 byte uses up your one and only opportunity to transmit within that 1 ms frame. You can never send more than 1 kbyte/sec this way on Windows. Mac and Linux do not have this limit, but Windows does.

The solution is simply to write your entire message all at once. The entire message becomes only a single transfer to the Windows driver. You can write very large transfers. The USB host controller chip will automatically split the transfer into USB packets (this happens entirely in hardware without any interrupt overhead), where all but the last packet will use the maximum 64 byte size.
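A minimal POSIX sketch of the idea (assuming the port is already open as a plain file descriptor — with a real device this would be an opened serial port such as a /dev/tty* node):

```cpp
#include <unistd.h>
#include <cstring>
#include <cassert>
#include <string>

// Hand the driver the whole message in one call; the USB host controller
// splits it into maximum-size 64-byte packets in hardware.
ssize_t sendWholeMessage(int fd, const std::string& msg) {
    // One syscall = one USB transfer request for the entire message.
    return write(fd, msg.data(), msg.size());
}
```

By contrast, a loop calling `write(fd, &msg[i], 1)` generates one transfer per byte — which on Windows means one byte per 1 ms frame.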

My guess is that this is due to the USB traffic. I've done some low-level USB work before, and I know some devices can have descriptors set to indicate a polling rate, but I've not pulled apart the Serial code to figure out what is going on.

Serial uses USB bulk protocol for data transfer, so polling intervals aren't an issue. All of the USB bandwidth not used by other devices is available.
 
A much better approach involves transmitting all messages with some sort of identifier on each command, and then parsing all the replies.

I did use this approach on some of the commands, but only when it did not involve a turnaround in data direction (e.g. writes only). Unfortunately on some of them I need to do a read-modify-write sequence, so I really need to get the data back before sending the next command.

My best guess is your code might be writing the data 1 byte at a time. Or perhaps you're using a library which does so?

I am indeed using a comm library. I send data to it in blocks, but I have no idea if it is pushing it a byte at a time (or if it has an internal 100ms delay just for the heck of it).

At one point (a year or so ago) I switched the comm lib to Boost::Asio, but ran into all kinds of problems with Arduino boards (when Boost would open a port, it would inadvertently trigger the bootloader sequence on the Arduino). I could never figure out how to fix that.

I see in one of your links the code for your latency test program. I'll study that and see if I can rework my comm system. It certainly sounds like this could be an artifact of the comm lib.

A long time ago I used to use native USB comm, via libusb, and that worked well, but I went to virtual COM ports thinking it would spare end-users from having to mess with driver installs (I think only LUFA ever gave me a solution which didn't require a driver install). I notice in your code library there are other USB protocols, specifically these two:
http://www.pjrc.com/teensy/usb_serial.html
http://www.pjrc.com/teensy/rawhid.html

Are either of those better for this type of back and forth communication?
 
Or perhaps you're using a library which does so?

You nailed it. Sorry for wasting your time here; this is actually my fault. A long time ago I was working on a project that had latency between commands, and on a write command followed by a read, the PC-side code would invariably attempt to read before data was present. Since the read routine was non-blocking, I put an arbitrary delay (cough *100ms*) and a retry wrapper on the read routine.

I fixed it and get this now:
screenshot.150.jpg

I suppose I should ask if 10-20ms is reasonable or still too slow (or stated another way, would the other USB comm methods improve this any)?
 
No delays should be necessary. Instead of waiting a fixed delay and then reading all the data, you can read the data with a timeout, so your code can respond the instant it's available.
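For example, on Linux or Mac a select()-based read with a timeout avoids the fixed delay entirely (a POSIX sketch only; on Windows the rough equivalent is SetCommTimeouts on the port handle):

```cpp
#include <sys/select.h>
#include <unistd.h>

// Read up to 'len' bytes from fd, but give up after 'timeout_ms'
// milliseconds if nothing arrives -- no fixed sleep-then-retry needed.
// Returns bytes read, 0 on timeout, -1 on error.
ssize_t readWithTimeout(int fd, void* buf, size_t len, int timeout_ms) {
    fd_set readfds;
    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);
    struct timeval tv;
    tv.tv_sec  = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    int r = select(fd + 1, &readfds, nullptr, nullptr, &tv);
    if (r <= 0) return r;       // 0 = timeout, -1 = error
    return read(fd, buf, len);  // data is ready: this returns immediately
}
```

The caller gets the reply the instant it arrives, and the timeout only ever triggers on a genuinely missing response.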

That latency benchmark code might be a good reference?

One thing that does matter is the send_now() function on Teensy. It's in the code for the latency test. When you write data, normally the Serial code on Teensy waits up to a few milliseconds, in case you write more, so it can pack as many bytes efficiently into USB packets (like Mac does, but Linux and Windows can't). The send_now() function causes it to transmit any partial packet ASAP. Without that, you'll get a few extra milliseconds of latency.
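On the Teensy side that amounts to a couple of lines. A sketch only — handleCommand() here is a hypothetical stand-in for the I2C work, while Serial.write() and Serial.send_now() are the real Teensy USB Serial calls:

```cpp
// Teensy-side sketch (Arduino C++). handleCommand() is a hypothetical
// placeholder for the I2C read/modify/write described above.
uint8_t handleCommand(uint8_t cmd) {
  // ...perform the I2C work here, return an error/status byte...
  return 0;  // 0 = success (placeholder)
}

void setup() {
  Serial.begin(115200);  // baud rate is ignored for USB serial
}

void loop() {
  if (Serial.available()) {
    uint8_t cmd = Serial.read();
    uint8_t status = handleCommand(cmd);
    Serial.write(status);
    Serial.send_now();  // flush the partial USB packet immediately
  }
}
```

Without the send_now(), the reply byte sits in the transmit buffer for a few milliseconds waiting to be packed with more data.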
 
I looked through your latency test code. I like the brevity of it, simple and effective. I pulled the comm lib I was using out and recoded routines based on your latency test code (I like the way it can work for all platforms too, nice!).

The result is a good improvement, now it's down to 5-9ms between commands. I'm keeping this one I think:
screenshot.151.jpg

You should post that latency program code on your USB pages. That's good stuff!
 
One thing that does matter is the send_now() function on Teensy. It's in the code for the latency test. When you write data, normally the Serial code on Teensy waits up to a few milliseconds, in case you write more, so it can pack as many bytes efficiently into USB packets (like Mac does, but Linux and Windows can't). The send_now() function causes it to transmit any partial packet ASAP. Without that, you'll get a few extra milliseconds of latency.

Hah, thanks for that tip. I put some send_now() functions into the Teensy side and now it's even better: 2-4ms between commands!

screenshot.153.jpg

Thanks for all the help, this is a huge improvement :)
 
Hey guys, I know I'm a little late on this post. However, I'm doing work with USB ports and protocols. nox771, what program are you using in the pictures? It would be very helpful. Thanks.
 
I'm using a Zeroplus 16128+ logic analyzer. The program is the interface software that comes with it (I believe it is called Smart+ or something like that, they probably use it on all their logic analyzer models now). If you search for Zeroplus on Amazon you will see their products. The software only works with their hardware as far as I know.
 