This question is about the minimum achievable latency of a Teensy device.

From my measurements it seems that the minimum lag I can achieve is 1-2x the polling interval.
I am trying to reduce this to about 0-1x the polling interval.

Here is the (pseudo) code:

void loop() {
  // Endless loop
  while(true) {
    // There should be exactly zero packets in the queue.
    ASSERT(usb_tx_packet_count(JOYSTICK_ENDPOINT) == 0);
    // Prepare USB packet (note: this should return immediately as the packet queue is empty at this point).
    usb_joystick_send();
    // Wait on USB poll: spin until the host has fetched the packet
    while(usb_tx_packet_count(JOYSTICK_ENDPOINT) > 0) {
    }

    // Wait until the poll interval is almost over (this is a routine which waits on a timer)
    wait_until_poll_almost_over();  // placeholder name for the timer routine

    // Get joystick buttons and axes (sets the usb_joystick_data[])
    read_joystick_inputs();  // placeholder name for the input routine
  }
}

For example, if the USB poll interval (JOYSTICK_INTERVAL) is 10 ms, the joystick buttons are read 2 ms before the poll.
This should result in a total lag of 2 ms to 12 ms, depending on when the joystick button is pressed.

But what I'm measuring is 12 to 22 ms, so exactly one poll interval more.
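
To make the arithmetic explicit, here is a minimal sketch of the lag window. The names lag_window, LagWindow and extra_queued are made up for illustration and are not part of the Teensy core. Treating the extra delay as "one more report queued ahead of the new one" is only a hypothesis that happens to match the numbers, not a confirmed property of the USB stack.

```cpp
// Lag window in milliseconds: time from a button press until the host
// receives the report that contains it.
struct LagWindow {
    int min_ms;  // button pressed just before the inputs are sampled
    int max_ms;  // button pressed just after the previous sampling
};

// interval_ms:  USB poll interval (10 in the example)
// lead_ms:      how long before the poll the inputs are sampled (2 in the example)
// extra_queued: ASSUMED number of reports already waiting ahead of the new one
constexpr LagWindow lag_window(int interval_ms, int lead_ms, int extra_queued) {
    return LagWindow{lead_ms + extra_queued * interval_ms,
                     lead_ms + interval_ms + extra_queued * interval_ms};
}

// Expected window with nothing queued ahead: 2 to 12 ms.
static_assert(lag_window(10, 2, 0).min_ms == 2, "expected minimum");
static_assert(lag_window(10, 2, 0).max_ms == 12, "expected maximum");

// With one report assumed to be queued ahead: 12 to 22 ms, as measured.
static_assert(lag_window(10, 2, 1).min_ms == 12, "measured minimum");
static_assert(lag_window(10, 2, 1).max_ms == 22, "measured maximum");
```

With extra_queued = 0 this reproduces the 2 to 12 ms I expect, and with extra_queued = 1 it reproduces the 12 to 22 ms I measure. That is why I suspect one additional stage of buffering somewhere.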

Is this to be expected? Is the USB packet maybe queued once more inside the hardware?