Just a blind guess, but is it possible this Bluefruit module also depends on short delays between commands? Maybe the AVR chips always pause for several microseconds between commands simply because their code executes slowly, while Teensy 3.6 runs the same code much faster and starts the next communication sooner.
We saw this sort of problem last year with a popular Lidar module. If you waited 20 ms (as the default examples did), it always worked. But if you read its status register to detect when it was ready, so you could get the data as soon as possible, it turned out the Lidar had a bug: it wasn't really ready for a few more microseconds, and reading it immediately after it reported ready would return bad data. As I recall, that particular Lidar has another, even worse bug, where asking it to make another measurement immediately after reading the data (without the 20 ms wait) would crash its internal processor. Adding a delay of several microseconds solved the problem. My guess is they only ever tested with a slow chip like AVR issuing the I2C commands, where delays of a dozen microseconds are common while the AVR code does things like printing to Serial buffers.
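To make the workaround concrete, here is a rough sketch of "poll the status register, then wait a few microseconds before reading". Everything in it is an assumption for illustration: `SensorBus`, `readWhenReady`, and the callback names are hypothetical, not part of any Lidar or Bluefruit library, and the right settle time would have to be found by experiment. On a Teensy the callbacks would wrap the real `Wire` transactions and `delayMicroseconds()`.

```cpp
#include <cstdint>
#include <functional>

// Hypothetical bus abstraction (not a real library API). On Arduino/Teensy
// these callbacks would wrap Wire reads and delayMicroseconds().
struct SensorBus {
    std::function<bool()> isReady;               // reads the device's status register
    std::function<uint16_t()> readData;          // reads the measurement registers
    std::function<void(uint32_t)> delayMicros;   // busy-waits for the given microseconds
};

// Polls the status register up to maxPolls times. Once the device reports
// ready, waits settleMicros before reading, to cover a device that sets its
// ready flag slightly too early. Returns false if it never reports ready.
bool readWhenReady(const SensorBus& bus, uint32_t settleMicros,
                   uint32_t maxPolls, uint16_t& out) {
    for (uint32_t i = 0; i < maxPolls; ++i) {
        if (bus.isReady()) {
            bus.delayMicros(settleMicros);  // assumed settle time; tune by experiment
            out = bus.readData();
            return true;
        }
    }
    return false;  // timed out without seeing the ready flag
}
```

The same `delayMicros` call could also be placed before starting the next measurement or command, which is the kind of gap a slow AVR adds without anyone noticing.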