Project Advice - USB Tester

Looking for any advice on my project to help improve it in any way.

The project uses the USBHost_t36 library to build a USB HID latency tester. It works by driving a pin connected to a button pad on a joystick, mouse, or keyboard, and then timing how long the USB host takes to register the button change. A pin interrupt tied to the same line as the HID pad starts the timer, and the button change stops it. I compared the results with a USB protocol analyzer and found a specific skew for joystick and for mouse (I don't have a gaming keyboard yet); with the skew subtracted, the results are within 1us of the analyzer.
My assumption that the skew was consistent for each device type was wrong. I built another T41 to act as various HID types, and the skew for each one differed from what I saw with my test devices. I'm stuck on what exactly causes the skew in the first place. Between pin changes the loop completes several million iterations, yet the measurements are still off by 120us to 900us (game pads are the worst) from the Beagle.

The next idea I had was to use TeensyThreads (branch here), but the results there come back essentially the same as without threading.

If anyone would like to try it for themselves, the test setup is to connect pins 6 and 9 together and to the button on a keyboard, mouse, or game pad.

With a Razer Viper 8KHz mouse the testing sketch measures 261us but the Beagle is at 131us.

Data comes to your Teensy via USB only when the host requests it. The host breaks USB transfers up into microframes, which request data from the connected devices at 125us intervals. I'm not sure of the difference between your two devices, but it may be that the 8KHz mouse adds one microframe of delay before it signals the USB host at the next interval, whereas the Beagle sets up the USB in time to have the host detect the input at the next microframe.
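
To put numbers on the microframe idea, here is a rough sketch in plain C++ (it runs on a PC, not on the Teensy). It assumes the host polls on a fixed 125us grid; the function names and the phase parameter are made up for illustration and are not part of USBHost_t36 or of the tester sketch.

```cpp
#include <cassert>
#include <cmath>
#include <cstdio>

// Assumption: one high-speed USB microframe, in microseconds.
constexpr double kMicroframeUs = 125.0;

// Hypothetical helper: time (us) of the first host poll at or after an event,
// assuming polls happen at phase_us + n * interval_us for n = 0, 1, 2, ...
double next_poll_us(double event_us, double interval_us, double phase_us = 0.0) {
    double n = std::ceil((event_us - phase_us) / interval_us);
    if (n < 0.0) n = 0.0;  // an event before the first poll waits for the first poll
    return phase_us + n * interval_us;
}

// Hypothetical helper: gap between two latency readings, in polling intervals.
double gap_in_intervals(double a_us, double b_us, double interval_us) {
    return (a_us - b_us) / interval_us;
}

// Prints the gap between the two readings quoted in this thread.
void print_example() {
    double gap = gap_in_intervals(261.0, 131.0, kMicroframeUs);
    assert(gap > 1.0 && gap < 1.1);
    std::printf("gap = %.2f microframes\n", gap);
}
```

With the two readings above, (261 - 131) / 125 = 1.04, so the difference is very close to one microframe. That fits the "one extra microframe" guess but does not prove it. Note also that an event landing just after a poll waits almost a full interval for the next one, so on its own the polling grid can add anywhere from 0 to 125us.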