Teensy USB Host connecting to multiple devices


I've made enough progress in my USB Test and Measurement Class (USBTMC) host and device drivers to start the initial hardware testing with one T4.1 host connected to one T4.1 device. My initial tests will skip the implementation of the USBTMC protocol and just transfer packets of variable size at variable rates to establish the USB link bandwidth constraints.

Sometime in the future, I hope to expand the host driver to talk to multiple T4.1 devices attached to a powered hub. A recent post on X by @PaulStoffregen indicates that this should be possible.
USBHost to 3 devices

I have some questions about the multiple device setup:

1. How do the devices enumerate?
2. What is the syntax for defining the multiple devices?
3. Are the connected devices limited by the six available host endpoints (2..7), or does each device get its own set of endpoints?
4. If the three devices are all sending data to the host, how does the host figure out which data comes from which device?
(My first guess is that the host can look at the incoming transfer->Pipe->Device field to figure it out).
5. Can the USB host handle dynamic connection and disconnection of devices?
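To make the guess in question 4 concrete, here is a minimal sketch of routing an incoming transfer to the driver instance that owns the device behind it. The `Device`, `Pipe`, `Transfer`, and `TmcDriver` types are hypothetical stand-ins written for this example, not the real host library structures, and the lookup is only one way it could be done:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for the host stack's structures (not the real types).
struct Device   { uint8_t address; };
struct Pipe     { Device* device; };
struct Transfer { Pipe* pipe; const uint8_t* buffer; uint32_t length; };

// Hypothetical driver class: one instance per attached device.
struct TmcDriver {
    Device*  device = nullptr;   // set when this instance claims a device
    uint32_t packetsSeen = 0;

    void onPacket(const Transfer& t) {
        // Invariant: we only get packets that belong to our own device.
        assert(t.pipe != nullptr && t.pipe->device == device);
        ++packetsSeen;
    }
};

constexpr std::size_t kMaxDrivers = 4;   // matches a 4-port hub
using DriverTable = std::array<TmcDriver, kMaxDrivers>;

// Return the driver that owns the device behind this transfer, or nullptr.
inline TmcDriver* findDriver(DriverTable& drivers, const Transfer& t) {
    if (t.pipe == nullptr || t.pipe->device == nullptr) return nullptr;
    for (auto& d : drivers) {
        if (d.device == t.pipe->device) return &d;
    }
    return nullptr;
}

// Shared receive callback: hand the packet to the owning driver.
// Returns false when no driver has claimed the sending device.
inline bool dispatch(DriverTable& drivers, const Transfer& t) {
    TmcDriver* d = findDriver(drivers, t);
    if (d == nullptr) return false;
    d->onPacket(t);
    return true;
}
```

A linear search is fine for four instances. Clearing `device` when a device disconnects would keep stale packets from being routed to a driver that no longer owns anything.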

With several devices sending lots of data at once, the host would need a separate circular queue buffer for each device to ride out delays of up to 150-200 mSec while SD card writes complete. I'd like to relieve the user of the buffer management chores by putting buffer management in each device instantiation. That way, there is less demand on host memory, and the host simply stops reading from the devices while it waits for the SD card. The device code then has to handle buffering, but only for itself, so each device can use as much DTCM and DMAMEM as it needs. Using EXTMEM for device buffers complicates things, because the USB controller can't send data directly from EXTMEM; EXTMEM apparently can't keep up with the DMA requirements for USB transfers.
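As an illustration of the device-side buffering idea, here is a minimal sketch of a circular queue of fixed-size blocks that refuses new data when full, so the caller can count an overrun or hold off. The class name `BlockRing` and its methods are made up for this example. It uses `std::vector` for brevity and is not interrupt-safe; on a Teensy the storage would more likely be a static array in DTCM or DMAMEM, with the push and pop sides guarded against each other:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical circular queue of fixed-size blocks (single-threaded sketch).
class BlockRing {
public:
    BlockRing(std::size_t blockSize, std::size_t blockCount)
        : blockSize_(blockSize), blockCount_(blockCount),
          storage_(blockSize * blockCount) {
        assert(blockSize > 0 && blockCount > 0);
    }

    bool full()  const { return used_ == blockCount_; }
    bool empty() const { return used_ == 0; }
    std::size_t used() const { return used_; }

    // Producer side: copy one block in. Returns false when the ring is
    // full, which the caller can treat as an overrun or as back-pressure.
    bool push(const uint8_t* src) {
        if (full()) return false;
        std::memcpy(&storage_[head_ * blockSize_], src, blockSize_);
        head_ = (head_ + 1) % blockCount_;
        ++used_;
        return true;
    }

    // Consumer side: copy the oldest block out. Returns false when empty.
    bool pop(uint8_t* dst) {
        if (empty()) return false;
        std::memcpy(dst, &storage_[tail_ * blockSize_], blockSize_);
        tail_ = (tail_ + 1) % blockCount_;
        --used_;
        return true;
    }

private:
    std::size_t blockSize_;
    std::size_t blockCount_;
    std::vector<uint8_t> storage_;
    std::size_t head_ = 0;   // next block to write
    std::size_t tail_ = 0;   // oldest block not yet read
    std::size_t used_ = 0;   // blocks currently queued
};
```

Sizing is a matter of multiplying the data rate by the longest expected stall: at a few MB/second per device, a 200 mSec SD card delay needs on the order of several hundred KB of queue per device.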

I'd appreciate any answers you might have to the questions above.
Here's an update on the USBTMC project:
After a few weeks lost to vacation time in a place sunnier than Western Oregon, I've gotten back to the USBTMC project. Progress has been better than I expected:

1. I have connected three Teensy USBTMC devices (1 T4.1 and 2 T4.0) to a T4.1 USBTMC Host through a powered USB Hub.

2. The data transfers follow the USBTMC transfer mode where the Host sends requests to each attached device and the devices respond with a data packet of the requested size (plus a 12-byte header mandated by the USBTMC protocol).

3. Each attached device communicates with a separate instance of the host USBTMC driver. My current host software allows up to four host instances to run at one time. (My powered hub has 4 ports.)

4. Each host driver USBTMC instance uses about 2.5KB for variables, the largest of which is a 2KB command/request transmit buffer for each instance. A larger chunk of memory is used for the buffers for the USBSerial_BigBuffer drivers which come along with the USBTMC drivers. The USBSerial link is used primarily for sending debug information to the host.

Here's the memory usage for the test program:

Memory Usage on Teensy 4.1:
FLASH: code:125432, data:14396, headers:8648 free for files:7977988
RAM1: variables:50816, code:120632, padding:10440 free for local variables:342400
RAM2: variables:28800 free for malloc/new:495488
EXTRAM: variables:8388608

5. The device USBTMC driver programs are pretty small in the current test configuration:

Memory Usage on Teensy 4.0:
FLASH: code:43028, data:7116, headers:8220 free for files:1973252
RAM1: variables:27552, code:41240, padding:24296 free for local variables:431200
RAM2: variables:12416 free for malloc/new:511872

The device uses two 8KB buffers in ping-pong fashion to send requested data blocks to the host. The USB Serial port on the device also requires some buffer space.

6. The device programs use a simulated data collection function to return data as requested by the host. A sample can be from 16 to 48 bytes. The program uses the first four uint32_t values in each sample to carry values for verifying sample collection integrity: sample number, CPU clock cycles since the last sample, and two simulated sine waves. Sample collection occurs at the rate requested by the host (usually from 10K to 100K samples/second). The samples are generated in response to an IntervalTimer interrupt at priority 64.

7. The device transmits data when an 8KB buffer is filled and a host request is pending. Transmitting the data to the host takes about 350 microseconds. The transfer is done by the USB controller using DMA. The processor time to set up the transfer is only a few tenths of a microsecond. That leaves most of the 10 uSec between samples at 100KHz for such chores as filtering inputs or recognizing events.

8. The host USB driver receives device response packets and stores them in a circular buffer using DMA. As each packet is received, the host USB received callback function checks for sequence errors and queues a new data request packet, using a buffer pointer from the circular buffer handler. The callback function uses the incoming transfer data to determine which driver queues the request packet.

9. My goal a few weeks ago was to collect from two devices sending 4MB/second each. That goal has been met and exceeded. I can now collect from three devices sending 100,000 samples/second each with a sample size of 36 bytes. The aggregate data rate saved to the SD card is 3 * 3.625MB/second, or about 10.9MB/second. Larger sample sizes or higher sample rates start to produce overrun errors in the circular buffer. (One circular buffer is used to store packets from all devices. The header information in the packets will allow sorting them out in post-processing.)
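For readers unfamiliar with the 12-byte header mentioned in item 2, here is a minimal sketch of building a USBTMC read request and checking that a response carries the matching tag. The byte layout follows my reading of the USBTMC 1.0 bulk header (MsgID, bTag, bTagInverse, reserved, 32-bit little-endian TransferSize, attributes, TermChar, two reserved bytes). The function names are invented for this example, and the tag check is only a guess at the kind of test behind the host's "Response Errors" counter:

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// MsgID values from USBTMC 1.0: the read request and its response share 2.
constexpr uint8_t kMsgRequestDevDepMsgIn = 2;
constexpr uint8_t kMsgDevDepMsgIn        = 2;

using TmcHeader = std::array<uint8_t, 12>;

// Build the 12-byte header asking the device for up to maxBytes of data.
inline TmcHeader makeReadRequest(uint8_t bTag, uint32_t maxBytes) {
    assert(bTag != 0);                           // valid tags are 1..255
    TmcHeader h{};                               // reserved bytes stay zero
    h[0] = kMsgRequestDevDepMsgIn;
    h[1] = bTag;
    h[2] = static_cast<uint8_t>(~bTag);          // bTagInverse
    h[4] = static_cast<uint8_t>(maxBytes);       // TransferSize, little-endian
    h[5] = static_cast<uint8_t>(maxBytes >> 8);
    h[6] = static_cast<uint8_t>(maxBytes >> 16);
    h[7] = static_cast<uint8_t>(maxBytes >> 24);
    // h[8] = 0: no termination character; h[9] unused; h[10..11] reserved
    return h;
}

// True when a response header carries the tag of the outstanding request.
inline bool responseMatches(const TmcHeader& response, uint8_t expectedTag) {
    return response[0] == kMsgDevDepMsgIn
        && response[1] == expectedTag
        && response[2] == static_cast<uint8_t>(~expectedTag);
}

// Advance the tag, skipping 0, which is not a valid bTag.
inline uint8_t nextTag(uint8_t tag) {
    return tag == 255 ? uint8_t{1} : static_cast<uint8_t>(tag + 1);
}
```

Since every response echoes the tag of the request it answers, a per-device tag counter gives a cheap sequence check without looking inside the payload.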

Here's the host output from a recent test. I've added some comments after '//' markers:

100KSamples/second  36 bytes per sample  New Sandisk ultra 128GB SD Card
Using full 8MB of EXTMEM for circular buffer
Device 1 Collection complete.   // T4.1
238.174 MBytes read from 29074  blocks
Read time:  65709 mSec for  3.625 MB/sec
Response Errors: 0 // response errors occur if the host
                   // receives a packet with markers not
                   // matching its last request

Device 2 Collection complete.    // T4.0
238.232 MBytes read from 29081  blocks
Read time:  65725 mSec for  3.625 MB/sec
Response Errors: 0

Device 3 Collection complete.     // T4.0
238.232 MBytes read from 29081  blocks
Read time:  65725 mSec for  3.625 MB/sec
Response Errors: 0
OverRun Errors: 0  // Over run errors are noted by the
                   // host circular buffer driver
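The figures in the log can be cross-checked with a little arithmetic. The sketch below assumes that each "block" is one 8192-byte transfer and that "MBytes" means 10^6 bytes. Neither is stated in the output, but together they reproduce the reported numbers. The function names are invented for this example:

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Assumption: one logged block is one 8KB (8192-byte) transfer.
constexpr uint32_t kBlockBytes = 8192;

// Total data in MBytes (10^6 bytes) for a given block count.
inline double megabytes(uint32_t blocks) {
    return static_cast<double>(blocks) * kBlockBytes / 1e6;
}

// Average rate in MB/sec for a block count and elapsed time in mSec.
inline double mbPerSec(uint32_t blocks, uint32_t elapsedMs) {
    assert(elapsedMs > 0);
    return megabytes(blocks) / (elapsedMs / 1000.0);
}

// Aggregate rate of the three devices in the log above.
inline double aggregateMbPerSec() {
    return mbPerSec(29074, 65709)    // Device 1
         + mbPerSec(29081, 65725)    // Device 2
         + mbPerSec(29081, 65725);   // Device 3
}
```

With these assumptions, 29074 blocks work out to 238.174 MBytes, each device averages about 3.625 MB/sec, and the three together come to roughly 10.9 MB/sec.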