Thread: USB 2.0 Teensy port at 8000 Hz poll rate?

  1. #1

    USB 2.0 Teensy port at 8000 Hz poll rate?

    Hello,

    Is there a way to modify the Teensy USB implementation to permit an 8000 Hz poll rate for HID transmissions?

    USB 2.0 supports this poll rate with Windows 10 (0.125ms per poll), and Teensy 4.0 has a USB 2.0 port.

  2. #2
    Senior Member
    Join Date
    Apr 2014
    Location
    Germany
    Posts
    1,213
    The descriptor in usb_desc.h already sets the poll interval to its minimum value of 1, which corresponds to 8 kHz (125 µs) for a USB 2.0 high-speed device. To test it, I did the following quick experiment with a Win10 host. The Teensy firmware (below) provides Raw HID reports to be polled by the PC as fast as possible and writes the start time into the first 4 bytes of the report.

    (T4.0, Mode RAW HID):
    Code:
    uint8_t report[64];
    
    void setup(){
    }
    
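    void loop()
    {
        uint32_t ticks = micros();                 // send time in µs
        memcpy(report, &ticks, sizeof(uint32_t));  // store it in the first 4 bytes of the report
    
        usb_rawhid_send(report, 100);              // queue the report; second argument is the timeout
    }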
    The PC software (here C#) reads in reports as fast as possible, extracts the send time from the data and displays the duration between two reports (µs). Ideally this time would be 125 µs.
    Code:
    using HidLibrary;
    using System;
    using System.ComponentModel;
    using System.Diagnostics;
    using System.Linq;
    
    namespace receiver
    {       
        class Program
        {
            static HidFastReadDevice teensy;
            static BackgroundWorker worker = new BackgroundWorker();               
            static UInt32 oldTime = 0;
            static UInt32 dtTeensy = 0;
                    
            static void Main(string[] args)
            {
                // set up a  worker thread to read in HID frames as quickly as possible. 
                // read out the data (send time, filled in by the Teensy) and calculate dt
    
                worker.DoWork += async (o, s) =>
                {                                    
                    while (true)
                    {                  
                        var report = await teensy.FastReadReportAsync();                          
                                                             
                        var newTime = BitConverter.ToUInt32(report.Data, 0);
                        dtTeensy = newTime - oldTime;
                        oldTime = newTime;
                    }
                };
                            
                // open the RAW Hid interface
                var enumerator = new HidFastReadEnumerator();
                teensy = (HidFastReadDevice)enumerator.Enumerate(0x16C0, 0x0486)    // 0x486 -> usb type raw hid
                         .Where(d => d.Capabilities.Usage == 0x200)              // usage 0x200 -> RawHID usage
                         .FirstOrDefault();
    
                if (teensy != null)
                {
                    teensy.OpenDevice();
                    worker.RunWorkerAsync();
    
                    while (!Console.KeyAvailable)
                    {
                        Console.WriteLine($"dt: {dtTeensy} µs");
                    }
                    teensy.Dispose();
                }
            }
        }
    }
    Here is the output:
    Code:
    dt: 125 µs
    dt: 125 µs
    dt: 375 µs
    dt: 125 µs
    dt: 125 µs
    dt: 250 µs
    dt: 250 µs
    dt: 250 µs
    dt: 250 µs
    dt: 1875 µs
    dt: 250 µs
    dt: 500 µs
    dt: 125 µs
    dt: 133 µs
    dt: 374 µs
    dt: 124 µs
    dt: 250 µs
    dt: 375 µs
    dt: 375 µs
    dt: 250 µs
    dt: 375 µs
    dt: 250 µs
    dt: 125 µs
    dt: 125 µs
    dt: 250 µs
    dt: 375 µs
    dt: 250 µs
    dt: 375 µs
    You can see that the maximum HID report rate is indeed 8 kHz (125 µs between HID reports). However, sometimes a report takes more than one USB microframe (the intervals are multiples of 125 µs). It is difficult to tell whether this is caused by the Teensy or by the PC. I suspect that the PC is not polling fast enough rather than the Teensy sending too slowly, but this is only a gut feeling.
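    To make the "multiples of 125 µs" observation easier to check against the numbers above, here is a minimal sketch that rounds a measured interval to whole microframes. The helper names (`microframesElapsed`, `idleMicroframes`) are made up for illustration and are not part of the test programs in this thread.

```cpp
#include <cstdint>

// Length of one USB 2.0 high-speed microframe in microseconds.
constexpr std::uint32_t kMicroframeUs = 125;

// Round a measured interval (in microseconds) to the nearest whole
// number of microframes. Hypothetical helper for illustration only.
constexpr std::uint32_t microframesElapsed(std::uint32_t dtUs)
{
    return (dtUs + kMicroframeUs / 2) / kMicroframeUs;
}

// Number of microframes in that interval in which no report was transferred.
constexpr std::uint32_t idleMicroframes(std::uint32_t dtUs)
{
    return microframesElapsed(dtUs) > 0 ? microframesElapsed(dtUs) - 1 : 0;
}
```

    With the values from the listing, `microframesElapsed(375)` gives 3 and `microframesElapsed(1875)` gives 15, while the slightly off readings 133 and 124 both round to a single microframe.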
    Last edited by luni; 10-20-2020 at 10:54 AM.

  3. #3
    Senior Member PaulStoffregen's Avatar
    Join Date
    Nov 2012
    Posts
    23,039
    I'm so glad you tested this! Whether 125us response is actually possible from PCs has been on my list of things to someday check. Almost all of my testing has involved looking at the USB packets with a protocol analyzer displaying the communication on another machine, which only confirms that it should theoretically be possible, but not whether it can actually be achieved given all the limitations the PC's operating system imposes.

    To get more consistent performance, you would probably need to do something special so this program runs with "real time" or otherwise higher priority scheduling.

  4. #4
    Senior Member
    Join Date
    Apr 2014
    Location
    Germany
    Posts
    1,213
    To get more consistent performance, you would probably need to do something special so this program runs with "real time" or otherwise higher priority scheduling.
    I did some more experiments on that, with a (at least for me) surprising result: I always assumed that a PC application needs to actively read the HID reports to 'remove them from the bus'. This is not the case.
    Instead, the HID driver always reads available reports in the background and buffers them for potential user applications.

    I therefore rewrote the test in a more reasonable way. The firmware now simply copies a running number into the report and prints the time between sending of two reports. The running number is used to check for missed reports on the PC side.
    Code:
    uint8_t report[64];
    
    void setup(){
    }
    
    uint32_t t0 = 0;
    uint32_t cnt = 0;
    
    void loop()
    {
        memcpy(report, &cnt, sizeof(uint32_t));  // copy the report number into the report
        usb_rawhid_send(report, 1000);
    
        uint32_t now = micros();                 // measure time between sending of two reports
        Serial.println(now - t0);
        t0 = now;
        cnt++;
    }
    This results in a very stable rate of 125 µs per loop.

    I also rewrote the Windows part of the test. It now reads 5000 reports (5000*64 = 320'000 bytes) from the bus without printing and analyzes the results later. Here is the analysis result:
    Code:
    Reports:    5000 
    Data:       320000 bytes
    Total time: 0.62 s
    Throughput: 501.3 kB/s  (target: 64B/125µs = 500 kB/s)
    t_min:      3.5 µs (@ 12)
    t_max:      1689.7 µs (@ 2177)
    It is interesting to see that the throughput (501 kB/s) is a little bit higher than the theoretical rate of 500 kB/s, presumably because the first reports are served from a buffer (see below). No reports were lost.
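    As a quick sanity check of the 500 kB/s target: it only comes out as 500 if "kB" means 1024 bytes, which is how the test program scales its result. The sketch below just redoes that arithmetic; the function names are hypothetical and it assumes exactly one 64-byte report per 125 µs microframe.

```cpp
// Report size and microframe length used in the thread.
constexpr double kReportBytes = 64.0;
constexpr double kMicroframeSeconds = 125e-6;

// Bytes per second if exactly one report is transferred per microframe
// (about 512000 B/s).
constexpr double targetBytesPerSecond()
{
    return kReportBytes / kMicroframeSeconds;
}

// Same scaling as the test program: bytes / seconds / 1024.
constexpr double toKiBPerSecond(double bytes, double seconds)
{
    return bytes / seconds / 1024.0;
}

// Time needed for a given number of reports at one report per microframe.
constexpr double idealSeconds(double reports)
{
    return reports * kMicroframeSeconds;
}
```

    `toKiBPerSecond(targetBytesPerSecond(), 1.0)` comes out at about 500, and `idealSeconds(5000)` at about 0.625 s, which is in line with the measured total time of 0.62 s.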

    Looking at the beginning of the time series one sees the buffering effect:
    Code:
    Nr: 1 delta: 60.3 µs
    Nr: 2 delta: 4.9 µs
    Nr: 3 delta: 4.1 µs
    Nr: 4 delta: 3.8 µs
    Nr: 5 delta: 4.1 µs
    Nr: 6 delta: 3.6 µs
    Nr: 7 delta: 3.7 µs
    Nr: 8 delta: 3.9 µs
    Nr: 9 delta: 3.9 µs
    Nr: 10 delta: 3.9 µs
    Nr: 11 delta: 3.7 µs
    Nr: 12 delta: 3.6 µs
    Nr: 13 delta: 3.5 µs
    Nr: 14 delta: 3.6 µs
    Nr: 15 delta: 3.5 µs
    Nr: 16 delta: 3.6 µs
    Nr: 17 delta: 3.8 µs
    Nr: 18 delta: 108.7 µs
    Nr: 19 delta: 123.4 µs
    Nr: 20 delta: 125.5 µs
    Nr: 21 delta: 125.0 µs
    The first reports obviously come from a buffer, which is quite fast. Once the buffer is drained, they arrive at the expected data rate. Buffering also works if the application can't read the current reports in time. See #2178, after which the code catches up on the large delay of 1689.7 µs.

    Code:
    Nr: 2174 delta: 135.7 µs
    Nr: 2175 delta: 107.4 µs
    Nr: 2176 delta: 59.9 µs
    Nr: 2177 delta: 125.9 µs
    Nr: 2178 delta: 1689.7 µs  <======
    Nr: 2179 delta: 6.9 µs
    Nr: 2180 delta: 3.9 µs
    Nr: 2181 delta: 11.8 µs
    Nr: 2182 delta: 78.9 µs
    Nr: 2183 delta: 5.7 µs
    Nr: 2184 delta: 3.9 µs
    Nr: 2185 delta: 3.7 µs
    Nr: 2186 delta: 3.7 µs
    Nr: 2187 delta: 3.6 µs
    Nr: 2188 delta: 3.5 µs
    Nr: 2189 delta: 3.7 µs
    Nr: 2190 delta: 3.6 µs
    Nr: 2191 delta: 3.6 µs
    Nr: 2192 delta: 130.5 µs
    Nr: 2193 delta: 106.5 µs
    Nr: 2194 delta: 107.9 µs
    Nr: 2195 delta: 161.7 µs
    Nr: 2196 delta: 108.6 µs
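    The catch-up after #2178 can be cross-checked with a little arithmetic. This is only a rough sketch under the assumption that the driver queues one report per 125 µs poll while the application is stalled; the function names and the 62.5 µs threshold (half a poll period) are my own choices for illustration.

```cpp
#include <cstddef>

// One report can arrive per 125 us microframe at the 8 kHz poll rate.
constexpr double kPollPeriodUs = 125.0;

// Rough number of reports that pile up in the buffer while the application
// is stalled for stallUs microseconds (assumes one report per poll).
constexpr std::size_t expectedBacklog(double stallUs)
{
    return static_cast<std::size_t>(stallUs / kPollPeriodUs);
}

// Count read intervals shorter than thresholdUs, i.e. reads that were most
// likely served from the buffer rather than from a fresh poll.
constexpr std::size_t countBufferedReads(const double* deltasUs,
                                         std::size_t n,
                                         double thresholdUs)
{
    std::size_t count = 0;
    for (std::size_t i = 0; i < n; ++i)
    {
        if (deltasUs[i] < thresholdUs)
        {
            ++count;
        }
    }
    return count;
}
```

    For the 1689.7 µs stall, `expectedBacklog` gives 13, which roughly matches the run of short deltas from #2179 to #2191 in the listing (13 reports, 12 of them below the 62.5 µs threshold).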
    All in all: on Win10, the HID transfer runs very stably at the advertised rate of 8000 reports per second (500 kB/s), even with a high-level language like C#, which is not famous for speed :-). I also changed the poll interval in the descriptor to 2 and 4, which throttles the transmission exactly as it should.
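    For reference, this is how the descriptor value maps to a poll period, assuming it is the endpoint's bInterval and the link runs at USB 2.0 high speed, where the period is 2^(bInterval-1) microframes of 125 µs. The function names below are hypothetical, just a sketch of the formula.

```cpp
#include <cstdint>

// Polling period in microseconds of a high-speed interrupt endpoint:
// 2^(bInterval - 1) microframes of 125 us each, with bInterval in 1..16.
// Returns 0 for values outside that range.
constexpr std::uint32_t highSpeedPollPeriodUs(std::uint8_t bInterval)
{
    return (bInterval >= 1 && bInterval <= 16)
        ? (std::uint32_t{1} << (bInterval - 1)) * 125u
        : 0u;
}

// Resulting maximum number of reports per second (0 for invalid bInterval).
constexpr std::uint32_t reportsPerSecond(std::uint8_t bInterval)
{
    return highSpeedPollPeriodUs(bInterval) != 0
        ? 1000000u / highSpeedPollPeriodUs(bInterval)
        : 0u;
}
```

    With this mapping, a value of 1 gives 125 µs (8000 reports/s), 2 gives 250 µs (4000 reports/s) and 4 gives 1000 µs (1000 reports/s).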


    Here is the C# code in case someone wants to repeat the experiments:
    Code:
    using HidLibrary;
    using System;
    using System.Collections.Generic;
    using System.Diagnostics;
    using System.Linq;
    
    namespace receiver
    {
        class Program
        {
            class dataPoint
            {
               public UInt32 count;
               public TimeSpan time;
            }
    
            const int nrOfReports = 5000;
                    
            static void Main(string[] args)
            {                                  
                var enumerator = new HidFastReadEnumerator();                      
    
                var teensy = (HidFastReadDevice)enumerator.Enumerate(0x16C0, 0x0486)   // Get all devices with vid/pid 0x16C0/0x486 -> usb type raw hid
                             .Where(d => d.Capabilities.Usage == 0x200)                // filter by usage 0x200 -> RawHID device (not serEmu!)
                             .FirstOrDefault();                                        // take the first of the found devices or return null
    
                if (teensy != null)
                { 
                    var data = new List<dataPoint>();                             
                    var stopwatch = new Stopwatch();
                    stopwatch.Start();
                    teensy.OpenDevice();                             // open the device
                                
                    for (int i = 0; i < nrOfReports; i++)            // read the reports, store report number and timestamp
                    {                    
                        var report = teensy.FastReadReport();
                        data.Add(new dataPoint
                        {
                            time = stopwatch.Elapsed,
                            count = BitConverter.ToUInt32(report.Data, 0)
                        }); 
                    }
    
                    int errors = 0;
                    
                    for (int i = 1; i < data.Count; i++)             // print data
                    {
                        var dt = data[i].time - data[i - 1].time;
                        var cnt = data[i].count;
                        if (data[i].count - data[i - 1].count != 1) errors++; // check for missed reports
    
                        Console.WriteLine($"Nr: {cnt - data.First().count} delta: {dt.TotalMilliseconds*1000:F1} µs");
                    }
    
                    var totalTime = data.Last().time - data.First().time;
                    var totalData = data.Count * 64;
                    var throughput = totalData / totalTime.TotalSeconds / 1024;   // in units of 1024 bytes per second
                    
                    Console.WriteLine();
                    Console.WriteLine($"Reports:    {data.Count}");
                    Console.WriteLine($"Data:       {totalData} bytes");
                    Console.WriteLine($"Total time: {totalTime.TotalSeconds:F2} s");
                    Console.WriteLine($"Throughput: {throughput:F1} kB/s  (target: 64B/125µs = 500 kB/s)");
    
                    var times = Enumerable.Range(1, data.Count - 1).Select(i => data[i].time - data[i - 1].time).ToList();
                    var tMin = times.Min();
                    var tMax = times.Max();
                    var maxIdx = times.IndexOf(tMax);
                    var minIdx = times.IndexOf(tMin);
    
                    Console.WriteLine($"t_min:      {tMin.TotalMilliseconds*1000:F1} µs (@ {minIdx})");
                    Console.WriteLine($"t_max:      {tMax.TotalMilliseconds*1000:F1} µs (@ {maxIdx})");
    
                    teensy.Dispose();
                    while (Console.ReadKey().Key != ConsoleKey.Escape);
                }
            }
        }
    }

  5. #5
    Quote Originally Posted by PaulStoffregen View Post
    I'm so glad you tested this! Whether 125us response is actually possible from PCs has been on my list of things to someday check.
    It certainly is. Reliable 0.125 ms polling is possible with an optimized Windows 10 computer (cherry-picked hardware, clean install and setup). Some commercial gaming peripherals are already testing 4000 Hz and 8000 Hz operation!

    I'll have to figure out where the buffering is occurring, though. For best real-time performance, I find it best to put the high-poll-rate device on its own dedicated USB chip. For example, if you use a PCI Express USB card, make the 8 kHz device the only USB device plugged into it for maximum reliability in timing-critical applications, since poll reliability can be sensitive to contention from other USB devices.

  6. #6
    Senior Member
    Join Date
    Apr 2014
    Location
    Germany
    Posts
    1,213
    It certainly is. Reliable 0.125 ms polling is possible with an optimized Windows 10 computer (cherry-picked hardware, clean install and setup). Some commercial gaming peripherals are already testing 4000 Hz and 8000 Hz operation!
    Actually I tested this on my old computer, which I bought used some 3 years ago (i5, 2.9GHz, 6GB). It doesn't even have USB3 and is crammed with stuff accumulated over the years. Zoom and TeamViewer were running in the background, and the Teensy was connected to a cheap (<5 EUR) hub together with another Teensy... Along with the HID transfer, the same Teensy spat out the time stamps over USB-Serial to TyCommander. So, nothing special at all.

  7. #7
    Quote Originally Posted by luni View Post
    Actually I tested this on my old computer, which I bought used some 3 years ago (i5, 2.9GHz, 6GB). It doesn't even have USB3 and is crammed with stuff accumulated over the years. Zoom and TeamViewer were running in the background, and the Teensy was connected to a cheap (<5 EUR) hub together with another Teensy... Along with the HID transfer, the same Teensy spat out the time stamps over USB-Serial to TyCommander. So, nothing special at all.
    Interesting!

    Either way, the computer side can sustain 0.125 ms polling (with acceptable microjitter) for long periods without multi-poll clumping, provided you have a recent USB implementation and optimize the computer properly (keep only one high-poll-rate branch per USB trunk and no background software).

    In theory the Teensy 4.0 is powerful enough that it should not be the weak link if it already supports 0.125 ms polling on USB 2.0. Some gaming peripherals testing out 4000 Hz and 8000 Hz use weaker microcontrollers than the Teensy 4.0.

    You may have heard of the Razer 8000 Hz gaming mouse and the AtomPalm Hydrogen 8000 Hz gaming mouse. (Some testers report a worthwhile human-visible difference: a higher poll rate works better, with less jitter and rounding, with new high-refresh-rate monitors, whose refresh rates are getting too close to the 1000 Hz mouse poll rate.)
