Does anyone know if the Teensy 3.6 USB is faster?

Status
Not open for further replies.
Isn't the Due discontinued ?

For some time, Arduino.cc was calling Due "retired". Arduino.org continued to sell it.

Arduino.org has never contributed any meaningful software work for Due, and they probably never will. Even after discontinuing Due, the Arduino.cc devs published some small updates to the Due's core library, but it's pretty clear they're now focusing on SAMD architecture for Arduino Zero & MKR1000.

Even if the hardware is still selling, for all practical purposes, Arduino Due's software support appears to be frozen.

No update, I was expecting some help to find where the code dealing with this secondary USB port is

I've actually been working on this quite a lot in the last few weeks, but so far it's just not working. There haven't been public updates, because there just isn't anything very newsworthy. But if you want to look at the work-in-progress, here it is:

https://github.com/PaulStoffregen/k66_usbhost

Believe me, this lack of progress isn't for lack of trying. The EHCI hardware is incredibly powerful, but very complex, and when things aren't right it gives very little indication why. I am working on this and I hope to have useful news to share soon. But to be honest, at this moment I'm really stuck. As you can see if you read that code, it's trying to send a control transfer, but when I test the EHCI seems to just ignore it. There's probably some minor detail not right.


"regular Arduinos" does not mean anything to me, please consider the fact that the Arduino Due (Sam3X) has a native High Speed USB port (480 Mb/s) and the Arduino Leonardo (ATmega32U4, which is used on the Teensy 2 too) has native Full Speed USB (12 Mb/s).

Indeed Arduino has many boards now with native USB ports.

However, for native USB, if you need good performance you'll quickly discover the quality of software makes a tremendous difference. A few years ago (before Teensy 3.1) when people were just starting to do large LED projects, I wrote a benchmark to measure how fast the native USB could actually receive data. The software on the PC side also can make a huge difference. Here are those old benchmarks:

https://www.pjrc.com/teensy/benchmark_usb_serial_receive.html

It'd be really interesting to see how all the boards now available actually perform with this benchmark. I've done a few casual tests that show the situation is still pretty similar, with improvements in Due's transmit speed but not receive, and Arduino Zero has some substantial USB performance issues.

Maybe I'll get them all and redo the tests... but last time I spent quite a bit of time doing those tests and writing up the results. It probably doesn't make sense to put time into that when I should be (and am) spending my time on getting EHCI to work.
 
Thanks for the update Paul.
To be honest, the mess between Arduino.cc and Arduino.org has annoyed me very much, especially when I discovered the story behind Arduino (https://arduinohistory.github.io/). This story, the lack of software evolution on the cores and the IDE of the Arduino, the quality of Teensy boards, of your work and of the community's work on Teensy's cores made me decide to give Teensy boards priority in all my future projects.
I found a bug in the Due PWM functions and submitted a pull request about it, but so far nothing has happened and it has never been fixed (it's just a line that needs to be deleted!)...

Anyway I'll have a look at your code Paul and will try to help, thanks for sharing.
I believe you, and I'm glad you've been working on this, as High Speed USB device and host support is a very useful feature in my opinion.
High speed USB device support is a key feature for some of the projects I'm working on at the moment (Force feedback steering devices and motion simulators control).
I've been working on several USB stacks myself in the past few months, and I know it's very hard to make them work, due to the lack of documentation and samples, and because they're very hard to debug as well (for example, stopping the code during the device recognition phase in order to debug leads to errors on the host).

Can I ask you what tool(s) you are using for USB debugging ? I've been considering using USBlyzer software, it's quite expensive but seems to be the only "serious" software for that kind of problems.

Your benchmarks on USB serial are very interesting, I wouldn't have expected so much difference between Teensy 2.0 and Leonardo (considering it's using the same hardware)!
It would be interesting to do the same kind of benchmarks with raw HID and with gaming devices HID.

For sure you should be spending your time on getting EHCI to work :)
Even if I have a lot of other projects in parallel at the moment, I'll try to help and hope I can (but to be honest I'm really a noob when it comes to USB protocols, and it's sooo complex).
(I've designed a few boards based on Teensy LC and 3.2 and I'm going to open an online shop to sell them in a few weeks, I'll probably post about that here when it's ready ;)
 
Can I ask you what tool(s) you are using for USB debugging ?

I have a Beagle480 from Total Phase. It makes the USBlyzer software seem very affordable!

Of course there's no PC-only software way to monitor the USB communication between Teensy 3.6 and a USB device. A hardware analyzer is required. Here's a photo of it on my workbench right now:

usb.jpg

I'm using the Teensy 3.6 beta board, and that other little board is a USB switch based on an FSUSB30 chip. You can see the USB comes out of the Teensy, goes through the Beagle 480, then the switch, to a 1GB USB memory stick. Pin 32 controls the FSUSB30 switch, so I can cause the device to be physically connected under software control.


I wouldn't have expected so much difference between Teensy 2.0 and Leonardo (considering it's using the same hardware)!

Careful design, optimization, and testing make an incredible difference.

It would be interesting to do the same kind of benchmarks with raw HID and with gaming devices HID.

Usually the HID protocol is limited by polling, which happens at most once per USB frame.
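
To put rough numbers on that ceiling, here is a small sketch. It assumes the USB 2.0 limits for interrupt endpoints as I understand them: one packet of up to 64 bytes per 1 ms frame at full speed, and up to three packets of 1024 bytes per 125 µs microframe at high speed. The function name is made up for illustration.

```cpp
#include <cstdint>

// Upper bound on interrupt-endpoint throughput, in bytes per second.
//   max_packet:           wMaxPacketSize in bytes
//   packets_per_interval: transactions per (micro)frame
//   intervals_per_second: 1000 frames/s at full speed, 8000 microframes/s at high speed
constexpr uint32_t interrupt_ceiling(uint32_t max_packet,
                                     uint32_t packets_per_interval,
                                     uint32_t intervals_per_second) {
    return max_packet * packets_per_interval * intervals_per_second;
}

// Full speed: 64 bytes once per 1 ms frame.
constexpr uint32_t full_speed_hid = interrupt_ceiling(64, 1, 1000);

// High speed: 1024 bytes, three times per 125 us microframe.
constexpr uint32_t high_speed_hid = interrupt_ceiling(1024, 3, 8000);
```

So a full speed HID endpoint tops out at 64,000 bytes/s no matter how good the software is, while a high speed one could in theory reach about 24.5 MB/s.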

Then again, benchmarking often shows that areas originally thought to be optimal aren't....


For sure you should be spending your time on getting EHCI to work :)
Even if I have a lot of other projects in parallel at the moment, I'll try to help and hope I can (but to be honest I'm really a noob when it comes to USB protocols, and it's sooo complex).

Indeed, EHCI is quite difficult. I have a *lot* of experience with USB, and over the last few weeks I've re-read much of the USB 2.0 spec and of course the EHCI 1.0 spec many times over.

EHCI is also quite incredible. It allows you to construct and modify at runtime pretty much arbitrarily complex work queues in RAM (no slow register stuff) which it services autonomously, keeping the 480 Mbit/sec hardware fully utilized.
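
For anyone following along, this is roughly what one entry in such a work queue looks like, going by the transfer descriptor (qTD) layout in the EHCI 1.0 spec. It is only a sketch: the struct, enum, and function names are invented for this example and are not taken from the k66_usbhost code.

```cpp
#include <cstdint>

// Queue Element Transfer Descriptor (qTD), laid out as in the EHCI 1.0 spec.
// The controller follows these links in RAM without CPU involvement.
struct alignas(32) Qtd {
    uint32_t next;       // address of the next qTD, bit 0 set = end of list
    uint32_t alt_next;   // alternate next qTD, taken on a short packet
    uint32_t token;      // status, PID, error counter, byte count, data toggle
    uint32_t buffer[5];  // up to five 4 KB buffer page pointers
};

enum : uint32_t { PID_OUT = 0, PID_IN = 1, PID_SETUP = 2 };

// Build the token word for a transfer that is ready to be executed.
constexpr uint32_t qtd_token(uint32_t pid, uint32_t total_bytes,
                             bool data_toggle, bool interrupt_on_complete) {
    return (static_cast<uint32_t>(data_toggle) << 31)            // bit 31: data toggle
         | ((total_bytes & 0x7FFFu) << 16)                       // bits 30:16: bytes to transfer
         | (static_cast<uint32_t>(interrupt_on_complete) << 15)  // bit 15: interrupt on complete
         | (3u << 10)                                            // bits 11:10: error counter = 3
         | ((pid & 3u) << 8)                                     // bits 9:8: PID code
         | 0x80u;                                                // bit 7: Active
}
```

With this, the 8-byte SETUP stage of a control transfer would get the token `qtd_token(PID_SETUP, 8, false, false)`, which works out to `0x00080E80`.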
 
Did you get in touch with an NXP engineer to get some help ?
I have a friend working at Freescale, maybe I can ask him to get us an appropriate contact. After all, the guys that designed the hardware should know how to use it, right ?
Host USB support is probably much more complex than Device support, and looking at your code I'm afraid I won't be able to help much.
But maybe I can have a look at the HS device code in the Due core and adapt it to your code, do you think it would be a good and feasible approach ?
 
Did you get in touch with an NXP engineer to get some help ?

No, I haven't talked with them.


But maybe I can have a look at the HS device code in the Due core and adapt it to your code, do you think it would be a good and feasible approach ?

I looked briefly. It seems to be all based on endpoint FIFOs accessed with register I/O, similar to the AVR USB chips. Some of the code is in the core lib, the rest in libsam.

You're welcome to try playing with it, but unless I missed something major, it's probably pointless. They don't even seem to make any attempt to use Due's DMA, which of course is nothing like EHCI.
 
I know that feeling :)
Last week, I wrote my fourth attempt at a C64 video chip emulation (I had almost given up...). Well, all previous versions worked somehow, but they were too slow. I hope that I'm on the right path now. It's unbelievable how much work it is to make the emulation of this 80's silicon fast enough with all the sprite features and graphics modes.
 
At least this USB stuff isn't real-time like C64 emulation. That really makes everything so much harder.

Just last night I finally got the EHCI to send a setup token. Very exciting! Until that moment, it had only ever automatically sent SOFs.

Of course it's taken many weeks of false starts and redoing things and reading the EHCI spec and Freescale's rather minimal info. I've also re-read quite a lot of the USB 2.0 spec in the last month. Until now I had pretty much ignored how split transactions work, and a number of finer protocol details.

On the plus side, I've put a lot of thought into how to simplify the structure over the last couple weeks. For now, I'm focusing only on control, with plans to add bulk and interrupt, but not isochronous.
 
Now that's excellent news ! Congrats Paul :)
I suppose this was the hardest part wasn't it ?
Do you think it will be difficult to modify your code to support device mode ? As soon as there's interrupt transfer support, I can dig into the code and try to add the HID stuff.
(sorry to hassle you with that, but if I can have a HS HID device on this board, a new era is beginning for me :))
 
I suppose this was the hardest part wasn't it ?

I sure hope it was, but it's far too early to say...


Do you think it will be difficult to modify your code to support device mode ?

Why don't you jump in and give it a try, then you can tell me what you think?

Or yes. The answer is yes, I do believe it will be difficult. How's that?
 
Just a quick update on EHCI. While the last couple days have been a breakthrough for me, there are still many critically important parts missing. Queuing transfers is only half the job of successfully communicating. After the controller completes them, you need to find the leftover transfer descriptors and actually do something with them. I started working this morning on adding another linked list that (hopefully) can be used to gain access to them after they're no longer linked from the queue heads.
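
One way to picture such a second list is sketched below. It is not the code from the repository: the `Transfer` struct is a stand-in with only two fields, all names are hypothetical, and the only hardware detail assumed is that bit 7 of the token stays set while the controller still owns the descriptor (the Active bit in the EHCI spec).

```cpp
#include <cstdint>

// Stand-in for a transfer descriptor, reduced to what this sketch needs.
struct Transfer {
    uint32_t token = 0;                 // bit 7 set while the controller owns it
    Transfer* next_followup = nullptr;  // software-only link, never read by hardware
};

struct FollowupList {
    Transfer* first = nullptr;
    Transfer* last = nullptr;
};

constexpr uint32_t TOKEN_ACTIVE = 0x80;

// Remember a transfer at the moment it is handed to the hardware.
constexpr void followup_add(FollowupList& list, Transfer& t) {
    t.next_followup = nullptr;
    if (list.last) list.last->next_followup = &t; else list.first = &t;
    list.last = &t;
}

// After a completion interrupt: unlink every transfer the controller has
// finished and return how many were found. A real driver would call the
// completion callback and recycle the descriptor where the counter is bumped.
constexpr int followup_reap(FollowupList& list) {
    int done = 0;
    Transfer* prev = nullptr;
    Transfer* t = list.first;
    while (t) {
        Transfer* next = t->next_followup;
        if (!(t->token & TOKEN_ACTIVE)) {
            if (prev) prev->next_followup = next; else list.first = next;
            if (list.last == t) list.last = prev;
            t->next_followup = nullptr;
            ++done;
        } else {
            prev = t;
        }
        t = next;
    }
    return done;
}

// Self-check: a and c are finished, b is still active and must stay listed.
constexpr int followup_demo() {
    Transfer a{}, b{}, c{};
    FollowupList list{};
    b.token = TOKEN_ACTIVE;
    followup_add(list, a);
    followup_add(list, b);
    followup_add(list, c);
    int done = followup_reap(list);
    return (list.first == &b && list.last == &b) ? done : -1;
}
```

The point of the extra link is that the software keeps its own path to each descriptor, so it does not matter that the queue head has already moved past it.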

Handling disconnect events is another area where things are still not working. Like not at all. The hardware doesn't even give an interrupt when the device unplugs, so I suspect something is wrong at the PHY level. Even when that's working, the EHCI process for dynamically removing queues is very complex.

There's also quite a lot of stuff needed to support periodic transfers for interrupt endpoints, including what looks like a lot of thorny details for managing transaction translators when communicating with 12 & 1.5 Mbit/sec devices.

I will be working on this stuff and so much more. Eventually I will also work on device mode, but for now the focus is on running in host mode.

It's easy to get excited by this sort of news, and indeed it is exciting, but it's also easy to form unrealistic expectations. Please understand usable EHCI support is a huge project and I'm only at the very beginning. It's going to take time. I would urge anyone who's interested to click "Watch" on the GitHub repository to get updates as progress is made on this long journey.
 
Yes, it's easy to get excited.
I've begun to try to modify your code, but as expected it's far from straightforward, and I can't afford to spend weeks or even months reading thousands of pages of documentation about EHCI and K66, building a stack from scratch as you do. I'm focusing more on higher-level layers and applications (FFB firmware, motion simulator hardware, video game engines, robotics and more).
I've downloaded Kinetis Design Studio and will probably have a look at the SDK, as it's supposed to have a ready-to-use USB HID stack.
For the STM32 I've been using ST tools and did manage to get a working FFB HID device with a few weeks of work, but it only runs at 12 Mbit/s and the ADCs are 12-bit, which is why I wanted to use the Teensy 3.6 instead.
I'll keep you informed if I get any positive results with this approach.
I hope you'll be able to reach your goals and I understand that you focus on host mode.
 
Today I got the keyboard driver to receive 8-byte HID reports. It's not parsing them yet, and hubs aren't working, and even if they were, lots of other infrastructure to support more than 1 device is still missing, and there's absolutely no support yet for disconnecting devices (all resources leak if you do), and interrupt endpoint intervals aren't being used, and a ton of other work is still needed. But it's at least now to the point where keyboards following the common boot protocol (almost all do) are getting their data printed in the serial monitor, together with a huge pile of minutiae from tons of other debugging code. Much remains to be done, but it's finally starting to come together....
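
Since the driver isn't parsing the reports yet, here is a minimal sketch of what parsing an 8-byte boot protocol report could look like. It assumes the usual boot keyboard layout (byte 0 modifier bits, byte 1 reserved, bytes 2 to 7 key usage codes) and a US key layout, and it only covers letters, digits, Enter, and space. The function names are invented for this example and are not part of the driver.

```cpp
#include <cstdint>

// Translate one HID keyboard usage code to ASCII (US layout).
// Returns 0 for anything this sketch does not handle.
constexpr char usage_to_ascii(uint8_t usage, bool shift) {
    if (usage >= 0x04 && usage <= 0x1D)            // a..z
        return static_cast<char>((shift ? 'A' : 'a') + (usage - 0x04));
    if (!shift && usage >= 0x1E && usage <= 0x26)  // 1..9
        return static_cast<char>('1' + (usage - 0x1E));
    if (!shift && usage == 0x27) return '0';
    if (usage == 0x28) return '\n';                // Enter
    if (usage == 0x2C) return ' ';                 // space
    return 0;
}

// True if either Shift bit is set in the modifier byte.
constexpr bool shift_held(const uint8_t (&report)[8]) {
    return (report[0] & 0x22) != 0;  // bit 1 = left Shift, bit 5 = right Shift
}

// ASCII for the key in slot 0..5 of the report, or 0 if empty or unmapped.
constexpr char key_in_slot(const uint8_t (&report)[8], unsigned slot) {
    return slot < 6 ? usage_to_ascii(report[2 + slot], shift_held(report)) : 0;
}
```

For example, a report of `00 00 28 00 00 00 00 00` has usage 0x28 in slot 0, which is the Enter key.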
 
Congrats !
Can you upload the code to Github ? I want to test if the USB-Conn. on my "flexi"-boards works.
 

Great, it's detecting keystrokes!
Code:
USB Host Testing
sizeof Device = 28
sizeof Pipe = 64
sizeof Transfer = 64
power up USBHS PHY
Plug in device...


ISR: 408C
 Port Change
port change: 14001403
    connect


ISR: 1004088
 Timer0
timer
  begin reset


ISR: 408C
 Port Change
port change: 14001405
  port enabled


ISR: 1004080
 Timer0
timer
  end recovery
new_Device: 1.5 Mbit/sec
new_Pipe
new_Control_Transfer


ISR: 4C081
 USB Async
Async Followup
enumeration:
new_Control_Transfer


ISR: 4E081
 USB Async
Async Followup
enumeration:
new_Control_Transfer


ISR: 4E081
 USB Async
Async Followup
enumeration:
new_Control_Transfer


ISR: 4E081
 USB Async
Async Followup
enumeration:
Config data length = 59
new_Control_Transfer


ISR: 4E081
 USB Async
Async Followup
enumeration:
bNumInterfaces = 2
bConfigurationValue = 1
new_Control_Transfer


ISR: 4E081
 USB Async
Async Followup
enumeration:
USBHub claim_device this=1FFF32B8
USBHub claim_device this=1FFF3308
USBHub claim_device this=1FFF3358
KeyboardController claim this=1FFF33A8
Descriptor 4 = INTERFACE
KeyboardController claim this=1FFF33A8
ep = 81
packet size = 8
polling interval = 10
new_Pipe
allocate_interrupt_pipe_bandwidth
 min_bw = 3
, at offset = 0, shift= 0
init periodictable with 1FFF3502
new_Data_Transfer
Descriptor 33 = HID
Descriptor 5 = ENDPOINT
Descriptor 4 = INTERFACE
Descriptor 33 = HID
Descriptor 5 = ENDPOINT


ISR: 8C089
 USB Periodic
Periodic Followup
KeyboardController Callback (static)
KeyboardController Callback (member)
  KB Data: 00 00 00 00 00 00 00 00 
new_Data_Transfer


ISR: 8E089
 USB Periodic
Periodic Followup
KeyboardController Callback (static)
KeyboardController Callback (member)
  KB Data: 00 00 00 00 00 00 00 00 
new_Data_Transfer

...

ISR: 8E089
 USB Periodic
Periodic Followup
KeyboardController Callback (static)
KeyboardController Callback (member)
  KB Data: 00 00 28 00 00 00 00 00
new_Data_Transfer

Awesome...
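
For readers trying to decode the log: the hex values after `ISR:` and `port change:` look like raw status register contents. Below is a minimal sketch for picking them apart. It assumes the values are the controller's USBSTS and PORTSC registers, and it uses the standard EHCI bit positions plus the NXP-specific ones (async and periodic interrupt, Timer0, and the port speed field in bits 27:26) as I understand them from the K66 documentation. All names are made up for this example.

```cpp
#include <cstdint>

// Status register bits that match the labels printed in the log.
constexpr uint32_t STS_PORT_CHANGE  = 1u << 2;   // "Port Change"
constexpr uint32_t STS_ASYNC_INT    = 1u << 18;  // "USB Async"    (assumed NXP extension)
constexpr uint32_t STS_PERIODIC_INT = 1u << 19;  // "USB Periodic" (assumed NXP extension)
constexpr uint32_t STS_TIMER0       = 1u << 24;  // "Timer0"       (assumed NXP extension)

// Port status bits defined by the EHCI spec.
constexpr uint32_t PORT_CONNECTED      = 1u << 0;
constexpr uint32_t PORT_CONNECT_CHANGE = 1u << 1;
constexpr uint32_t PORT_ENABLED        = 1u << 2;

// Port speed field, assumed to sit in bits 27:26 on this controller.
enum class PortSpeed : uint32_t { Full = 0, Low = 1, High = 2, Unknown = 3 };

constexpr bool has_flag(uint32_t reg, uint32_t mask) {
    return (reg & mask) != 0;
}

constexpr PortSpeed port_speed(uint32_t portsc) {
    return static_cast<PortSpeed>((portsc >> 26) & 3u);
}
```

Under those assumptions, `14001403` reads as connected with the connect-change flag set at low speed, and `14001405` as connected and enabled, which lines up with the `connect`, `port enabled`, and `1.5 Mbit/sec` messages in the log.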
 
@Paul: That's almost exactly what i need :)
Can I use my own callback for the incoming keyboard data ? The raw keyboard data would be perfect for my emu.

Edit: Perhaps a better question: Will it be possible to use a custom keyboard-driver instead of yours ?
 
All the APIs are likely to keep changing over the next several weeks as I continue supporting more USB functionality. But one thing I'm determined to avoid is interrupt-context execution of Arduino user code. I know that would be the "easy" way for me, and it would give excellent performance (unless the called code delays). But the usability cost is far too high.

If you want an ISR context callback, the path will (probably) involve forking the driver code. My intention is to allow device drivers to be their own separate libs, much like audio objects like your MP3 work can be in their own libs, but integrate with the rest of the system.

Of course, this is still a very early stage in development. Much is likely to change...
 
But one thing I'm determined to avoid is interrupt-context execution of Arduino user code. I know that would be the "easy" way for me, and it would give excellent performance (unless the called code delays). But the usability cost is far too high.

Yes, I think the same.
Your code is already good enough for me to use in my current project (I don't know if the speed is OK, let's see... :) ) and I don't need more features at the moment.
Great work. I'll fork it and see what can be done :) I can still switch to the "official" lib when it is ready.

My "TODO" is to map the keyboard-data to the C64 hardware keyboard-matrix (two 8 bit ports).
 
I've added the beginning of a USB MIDI device driver. Like the keyboard driver, it detects & claims the interface, and incoming MIDI data is received and printed to the serial monitor.

This means bulk endpoints are working, in addition to control and (partially working) interrupt endpoints. :) Well, at least IN direction... will do a serial device next to test OUT direction....
 
Really great work so far Paul, I'm enjoying watching your GitHub progress.

Obviously extremely early days yet so this feedback has little value aside from encouragement - I'm happy to report your sketch runs quite happily for me too:

ISR: 8E089
USB Periodic
Periodic Followup
KeyboardController Callback (static)
KeyboardController Callback (member)
KB Data: 00 00 1F 20 21 1E 00 00
new_Data_Transfer
(Keys: 1234)
 
Paul, I get

KB Data: 00 00 01 01 01 01 01 01

with keys "3" + "4" + "5" pressed.

This is a bit strange. Or is it ok ? If not, take it as a bug-report ;-)

Will you add a joystick driver, too ? :rolleyes::p (must-have :)))
 
Will you add a joystick driver, too ? :rolleyes::p (must-have :)))

Eventually I will add a generic HID driver, but parsing HID reports is a huge task. It seems not even the well-established USB Host Shield library has this.

The EHCI layer is also still missing several features to properly support interrupt type endpoints/pipes. Much, much work remains....
 
KB Data: 00 00 01 01 01 01 01 01

with keys "3" + "4" + "5" pressed.

My keyboard also has trouble with 3, 4, 5. I get this if I press them in order.

KB Data: 00 00 20 21 00 00 00 00

or this if I press 5, 4, 3:

KB Data: 00 00 22 21 00 00 00 00

Seems it won't recognize more than any 2 of those 3 keys.
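
For what it's worth, the HID usage tables define keyboard usage 0x01 as ErrorRollOver, which a boot protocol keyboard puts in all six key slots when it cannot resolve the combination being held. So `00 00 01 01 01 01 01 01` may well be the keyboard itself reporting a rollover error rather than a host-side bug. A minimal sketch of telling the cases apart, with hypothetical function names:

```cpp
#include <cstdint>

constexpr uint8_t USAGE_ERROR_ROLLOVER = 0x01;

// True when all six key slots carry ErrorRollOver, meaning the keyboard
// could not work out which keys are down.
constexpr bool is_rollover_error(const uint8_t (&report)[8]) {
    for (unsigned i = 2; i < 8; ++i)
        if (report[i] != USAGE_ERROR_ROLLOVER) return false;
    return true;
}

// Number of real keys in the report. Usages 0x00 (no key) and
// 0x01..0x03 (error codes) are not counted.
constexpr unsigned keys_held(const uint8_t (&report)[8]) {
    unsigned n = 0;
    for (unsigned i = 2; i < 8; ++i)
        if (report[i] > 0x03) ++n;
    return n;
}
```

By this reading, the first keyboard flags an error for 3 + 4 + 5, while the second one simply drops the third key and reports only two.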

Any chance you could try your keyboard on a PC with a keyboard test utility? Maybe something like this?

http://www.softpedia.com/get/System/System-Info/Keyboard-Test-Utility.shtml

Keyboard-Test-Utility_3.png

Would be really good to know if the Windows USB host and HID drivers can correctly see those keys pressed with your keyboard.
 