USB-Packet size and buffering/streaming

Status
Not open for further replies.

maelh

I read in other threads that the USB packet size is 64 bytes, and therefore it would be ideal to send at least 64 bytes in one go.
Is this byte count the payload, or the size of the entire packet, including USB protocol metadata?

I tried setting reading/writing buffer sizes using SetupComm() under Windows, but that seemed to have no effect. Is that to be expected?
I am aware that the baud rate setting is ignored, but does that apply to all settings of the virtual COM port of the Teensy?

Manually buffering by collecting bytes in an array and then sending that array in one call (as suggested in other threads) did have an effect and significantly improved the speed. What confuses me is why increasing the buffer beyond 64 bytes still had a quite noticeable effect. Each function call carries some overhead, sure, but it is still surprising that it makes such a big difference. Maybe entire memory pages get copied (1024/4096 bytes) from user space to the USB driver.
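
A minimal sketch of the manual buffering described above, for illustration only. The names chunk_writer, cw_write, cw_flush, counting_sink and the 512-byte CHUNK_SIZE are assumptions made up for this example; in a real program the sink callback would wrap whatever write call the host application uses on the COM port handle.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CHUNK_SIZE 512  /* assumption: pick the real value by benchmarking */

/* Hypothetical sink: stands in for the actual port write call. */
typedef void (*sink_fn)(const uint8_t *data, size_t len, void *ctx);

typedef struct {
    uint8_t buf[CHUNK_SIZE];
    size_t  used;   /* bytes currently held in buf */
    sink_fn sink;
    void   *ctx;
} chunk_writer;

void cw_init(chunk_writer *w, sink_fn sink, void *ctx)
{
    assert(w != NULL && sink != NULL);
    w->used = 0;
    w->sink = sink;
    w->ctx  = ctx;
}

/* Hand everything collected so far to the sink in one call. */
void cw_flush(chunk_writer *w)
{
    if (w->used > 0) {
        w->sink(w->buf, w->used, w->ctx);
        w->used = 0;
    }
}

/* Collect bytes; the sink is only called when a full chunk is ready. */
void cw_write(chunk_writer *w, const uint8_t *data, size_t len)
{
    while (len > 0) {
        size_t room = CHUNK_SIZE - w->used;
        size_t n = (len < room) ? len : room;
        memcpy(w->buf + w->used, data, n);
        w->used += n;
        data += n;
        len -= n;
        if (w->used == CHUNK_SIZE)
            cw_flush(w);
    }
}

/* Demo sink that only counts calls and bytes, to show the call reduction. */
typedef struct { size_t calls; size_t bytes; } sink_stats;

void counting_sink(const uint8_t *data, size_t len, void *ctx)
{
    sink_stats *s = (sink_stats *)ctx;
    (void)data;
    s->calls++;
    s->bytes += len;
}
```

With this layout, many small cw_write() calls turn into a few large sink calls, and cw_flush() must be called once at the end so the final partial chunk is not left behind.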

I guess my point is that I'd like to base the design on some rational decisions, on how transfers actually take place. Benchmarks are a good tool, but they may just reflect quirks of particular OS versions, which may no longer hold in the future.
 
Maybe entire memory pages get copied (1024/4096 bytes) from user space to the USB driver.

Actually, up to 5 full 4K pages, or any 16K that's contiguous within the virtual address space, starting at any arbitrary offset within a 4K page.

For details, here's the EHCI spec. Look at section 3.5 "Queue Element Transfer Descriptor (qTD)" on pages 40-41. There's also a detailed explanation in section 4.10.6 "Buffer Pointer List Use for Data Streaming with qTDs" on pages 86-88.

https://www.intel.com/content/www/us/en/io/universal-serial-bus/ehci-specification.html

This data doesn't actually get copied to the driver. Instead the driver configures the bus-master EHCI controller to fetch it directly from memory as needed, using the qTD structure (which is also fetched as needed, via a copy in the QH structure).
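
The "5 full pages, or any contiguous 16K" figure follows from simple arithmetic on the five buffer pointers in a qTD. The sketch below only illustrates that arithmetic; qtd_max_bytes and the macro names are made up for this example and are not taken from any driver.

```c
#include <assert.h>
#include <stdint.h>

#define EHCI_PAGE_SIZE  4096u  /* each qTD buffer pointer addresses one 4K page */
#define QTD_BUFFER_PTRS 5u     /* a qTD carries five buffer pointers */

/* Upper bound on the bytes one qTD can describe when the buffer starts
 * at start_addr. Only the offset inside the first page matters: the
 * first pointer covers the rest of that page, the other four cover
 * whole pages. */
uint32_t qtd_max_bytes(uint32_t start_addr)
{
    uint32_t offset = start_addr % EHCI_PAGE_SIZE;
    uint32_t bytes  = QTD_BUFFER_PTRS * EHCI_PAGE_SIZE - offset;

    /* Even in the worst case (offset 4095) more than 16K remains, which
     * is why any contiguous 16K fits regardless of alignment. */
    assert(bytes > 4u * EHCI_PAGE_SIZE);
    return bytes;
}
```

A page-aligned buffer gives the full 20480 bytes, while a buffer starting at the last byte of a page still gives 16385.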


I guess my point is that I'd like to base the design on some rational decisions, on how transfers actually take place.

Here's a copy of the USB 2.0 spec.

https://www.pjrc.com/teensy/beta/usb20.pdf

This is a huge PDF, but the critically important info is in chapter 4 and the first part of chapter 5, especially the stuff about data flow and packet overhead.

USB is a complex protocol with many very specific terms. To really talk meaningfully, it's critical to be familiar with the lingo explained in chapter 4, so we're speaking the same language.
 
Thanks for your reply. Looks like I have quite some reading to do.

Could you comment on SetupComm() in the meantime?
I tried setting reading/writing buffer sizes using SetupComm() under Windows, but that seemed to have no effect. Is that to be expected?
I am aware that the baud rate setting is ignored, but does that apply to all settings of the virtual COM port of the Teensy?
 
Could you comment on SetupComm() in the meantime?

That's really a question for Microsoft. I don't have any special insight into how their WIN32 functions work internally.

What little info I can see looks like it pertains to traditional serial, not USB virtual serial. If this function even returns true when used on a HANDLE for modern USB communication (you're checking the return value?), I'd be surprised if it actually had any effect.
 
However, I can tell you the amount of possible buffering on the Teensy side is controlled by NUM_USB_BUFFERS in usb_desc.h. If you try editing this, please be aware there are many copies, each corresponding to one of the options in the Tools > USB Type menu. As a quick sanity check, make a note of the "Global variables use X bytes" message Arduino prints before and after you change the number of buffers. You should also know the USB memory code uses a 32 bit mask to manage the memory, so no more than 32 of these buffers can actually be used.
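
To illustrate why a 32 bit mask caps the usable buffer count at 32, here is a generic bitmask pool sketch. It is not the Teensy USB memory code; buffer_pool, pool_init, pool_alloc and pool_free are names invented for this example, and the only point is that one bit per buffer in a uint32_t leaves no way to track a 33rd buffer.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_TRACKED_BUFFERS 32  /* one bit per buffer in a uint32_t */

typedef struct {
    uint32_t free_mask;  /* bit i set means buffer i is free */
} buffer_pool;

/* Mark the first `count` buffers as free. */
void pool_init(buffer_pool *p, unsigned count)
{
    assert(count <= MAX_TRACKED_BUFFERS);
    /* Shifting a 32 bit value by 32 is undefined, so handle it apart. */
    p->free_mask = (count == 32) ? 0xFFFFFFFFu : ((1u << count) - 1u);
}

/* Return the index of a free buffer and mark it used, or -1 if none. */
int pool_alloc(buffer_pool *p)
{
    for (int i = 0; i < MAX_TRACKED_BUFFERS; i++) {
        uint32_t bit = 1u << i;
        if (p->free_mask & bit) {
            p->free_mask &= ~bit;
            return i;
        }
    }
    return -1;
}

/* Give a buffer back to the pool. */
void pool_free(buffer_pool *p, int index)
{
    assert(index >= 0 && index < MAX_TRACKED_BUFFERS);
    p->free_mask |= 1u << index;
}
```

In a scheme like this, declaring more than 32 buffers would only cost RAM, since the extra ones have no bit in the mask and can never be handed out.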
 
I can also tell you the Teensy side tries to efficiently pack your outgoing data into the 64 byte packets. This code can be found in usb_serial.c. Look for "TRANSMIT_FLUSH_TIMEOUT". The actual timing is based on the SOF interrupt, in usb_dev.c.
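
As a rough illustration of that packing, the sketch below counts how many packets a write of a given size needs. It assumes the 64 byte figure is the maximum data payload of a full-speed bulk endpoint as defined in the USB 2.0 spec, with the token, PID and CRC overhead coming on top of that. The helper names packets_for and last_packet_len are made up for this example and do not appear in usb_serial.c.

```c
#include <assert.h>
#include <stddef.h>

#define FS_BULK_MAX_PACKET 64u  /* assumed max data payload per packet */

/* Number of data packets needed to carry len payload bytes. */
size_t packets_for(size_t len, size_t max_packet)
{
    assert(max_packet > 0);
    return (len + max_packet - 1) / max_packet;  /* ceiling division */
}

/* Payload size of the final packet; it is short unless len is an exact
 * multiple of max_packet. Returns 0 when there is nothing to send. */
size_t last_packet_len(size_t len, size_t max_packet)
{
    assert(max_packet > 0);
    if (len == 0)
        return 0;
    size_t rem = len % max_packet;
    return rem ? rem : max_packet;
}
```

For example, 1000 bytes need 16 packets, the last one carrying 40 bytes, whereas 1000 separate one-byte writes could need up to 1000 packets if each were sent before more data arrived.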


On the Windows side, I'm guessing you probably won't find anyone from Microsoft who will meaningfully answer any of these sorts of technical questions, not to mention share their source code or point you to the specific locations relevant to your question.
 
SetupComm() succeeds as long as the buffers are set to be at least 2 bytes large.

I did more testing with an Arduino UNO and a Teensy 3.6. Surprisingly, they behave the same regarding buffer sizes: no matter what values I set, the default values stay in place.
GetCommProperties() allows you to query the current Tx and Rx buffer sizes used by the drivers.
In my tests dwCurrentTxQueue is always 0, and dwCurrentRxQueue is always 16384 (again no matter what I set using SetupComm()).
Maybe the results would be different on a real serial port, though the following link reports the same behavior for the transmit buffer, i.e., dwCurrentTxQueue is always 0.
https://stackoverflow.com/questions/7881061/serial-port-output-buffer-size-in-windows-7

To quote from SetupComm():
The device driver receives the recommended buffer sizes, but is free to use any input and output (I/O) buffering scheme, as long as it provides reasonable performance and data is not lost due to overrun (except under extreme circumstances). For example, the function can succeed even though the driver does not allocate a buffer, as long as some other portion of the system provides equivalent functionality.

My conclusion:

Based on my own experiments, and the returned values for dwCurrentTxQueue, I think it is safe to assume that the buffers for transmission are always of size 0. Non built-in drivers might differ. (see also italic part on bottom of the post)


Interestingly, GetCommProperties() also returns this value:
wPacketLength = 64 // Doc says: The size of the entire data packet, regardless of the amount of data requested, in bytes.

While this is just a guess, it might reflect the size of a USB packet. A real serial port and a micro that uses a non-builtin driver (both of which I currently don't have) would be useful to test this assumption.

I researched other sources, including Wine and the IOCTL_ requests, but ultimately, as you said, you don't get much further without driver source code. The contract defined by the documentation leaves the driver a lot of freedom. In the end that means you have to assume no buffering is done, since you can't know which driver will end up on a given system, nor can you know them all.
 