USB Host Ethernet Driver

Well, if you uncomment the STATS define at the top, you can see how many times each thread loops per second, so you can guess what kind of effect it would have if you wanted to run other stuff at the same time in each thread. Plus it's multi-thread capable, so I figured I might as well, since the Teensy can be multi-threaded.
 
And like I noted, once it has an IP address the USB can be unplugged and replugged, but currently in that example if you start with the USB unplugged or the Ethernet cable unplugged you won't make it past setup.
 
Is there a way to detect no activity in usbthread() and give up the time slice with:: threads.yield();

There are 4 calls in that func() - not sure if any of them return something that says 'I'm busy, don't leave early' or 'no activity, okay to leave early'?

I did this quick HACK that gives extra time on first entry to make the connection {11 cycles before the first yield} - then AFAIK it is leaving sooner {after 3 cycles on each later entry} - but it allows PING to work once the first connect is made in the longer run. I offered a debug edit to TeensyThreads to monitor thread cycle allocation - that could be hooked up to monitor time spent in loop() versus usbthread()::
Code:
void usbthread() {
  uint32_t cc = 0;
  while (1) {
    myusb.Task();
    asix1.read();
    fnet_poll();
    fnet_service_poll();
#ifdef STATS
    LoopedUSB++;
#endif
    cc++;
    if ( cc > 10 ) {
      cc=8;
      threads.yield();
    }
  }
}

Didn't see a reply or revisit the TeensyThreads thread recently - but timing showed about 10% overhead from fast thread switching.
 
Turned stats on again now knowing what it does:

With HACK noted above {updated below} - with STATS had to change number for initial connect to 20/18 (as below) - these numbers do not change during PING:
Code:
Looped: 8087042
LoopedUSB: 498
Looped: 8087495
LoopedUSB: 501
Looped: 8087476
LoopedUSB: 501
Looped: 8087474
LoopedUSB: 498

Without that HACK it looks like this:
Code:
Looped: 1548831
LoopedUSB: 109205
Looped: 3793681
LoopedUSB: 275274
Looped: 3771043
LoopedUSB: 278455

Updated HACK code:
Code:
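#define HACKED 1
void usbthread() {
  uint32_t cc = 0;  // pass counter; restarts at 18 after each yield
  while (1) {
    myusb.Task();
    asix1.read();
    fnet_poll();
    fnet_service_poll();
#ifdef STATS
    LoopedUSB++;
#endif
#ifdef HACKED
    // The first yield comes after 21 passes (time to make the connection);
    // after that cc restarts at 18, so each later yield comes after 3 passes.
    cc++;
    if ( cc > 20 ) {
      cc=18;
      threads.yield();
    }
#endif
  }
}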
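
To see what the threshold/reset pair actually does, here is a minimal host-side sketch that replays only the counter logic from usbthread(). It uses no Teensy libraries, and simulateYieldPattern is a hypothetical helper written for this illustration, not part of TeensyThreads or the driver.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Host-side replay of the cc counter in usbthread(). Returns how many loop
// passes run before each of the first `yields` calls to threads.yield().
// Hypothetical helper for illustration only.
std::vector<uint32_t> simulateYieldPattern(uint32_t threshold,
                                           uint32_t resetValue,
                                           std::size_t yields) {
  std::vector<uint32_t> passesPerSlice;
  uint32_t cc = 0;      // same role as cc in usbthread()
  uint32_t passes = 0;  // passes since the previous yield
  while (passesPerSlice.size() < yields) {
    // myusb.Task(), asix1.read(), fnet_poll() and fnet_service_poll()
    // would run here on the real hardware.
    passes++;
    cc++;
    if (cc > threshold) {
      cc = resetValue;
      passesPerSlice.push_back(passes);  // threads.yield() happens here
      passes = 0;
    }
  }
  return passesPerSlice;
}
```

With the 20/18 pair this gives 21 passes before the first yield and 3 before each later one; the earlier 10/8 pair gives 11 and then 3. Setting the reset value to 0 would make every slice run the full count.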
 
Yeah, it probably doesn't need to run that often. I thought about putting in the threads.yield(), but I haven't even looked into speeding anything up besides just giving it its own thread. I know FNET only has to be polled about once every 100ms, so the number of times it runs right now is definitely overkill. On the 3.6 it gets about 600,000 in the main loop and 80,000 in the USB loop.
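
If the polls only need to run every so often, one alternative to counting passes is to gate them on elapsed time. The sketch below is only an illustration of that idea: PollGate is a hypothetical class, the interval is whatever you pick, and on a Teensy the timestamp would presumably come from millis().

```cpp
#include <cstdint>

// Hypothetical helper: reports when a fixed interval has elapsed so the
// network polls can be run, and otherwise lets the caller yield.
class PollGate {
 public:
  explicit PollGate(uint32_t intervalMs) : intervalMs_(intervalMs) {}

  // Returns true (and restarts the interval) when the polls are due.
  // The first call is always due. Unsigned subtraction keeps the
  // comparison correct when the millisecond counter rolls over.
  bool due(uint32_t nowMs) {
    if (!started_ || nowMs - lastMs_ >= intervalMs_) {
      started_ = true;
      lastMs_ = nowMs;
      return true;
    }
    return false;
  }

 private:
  uint32_t intervalMs_;
  uint32_t lastMs_ = 0;
  bool started_ = false;
};
```

In usbthread() this might look like: if gate.due(millis()) run the four calls, otherwise call threads.yield(). Whether myusb.Task() and asix1.read() tolerate the same interval as the FNET polls is an assumption that would need testing.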
 
I am using T4 as noted - Great to know it is working on T_3.6 - I was wondering about digging one out - now I can skip that. Expected it would as the other Bluetooth/devices and MSC Disk USB was tested on both during T4 beta and that 'MCU USB' hardware is the same I hear.

Well, that HACK gives the idea that .yield() will help prevent killing performance on loop() or other threads. Interesting that PING didn't change the counts - so the transaction is FAST.
 
Yeah the most I've tested it with is 8 pings running concurrently each sending 464 bytes of data to it and it didn't seem to change that much.
 
Ok - finally getting back to this after doing a few things including my nap. What can I say, I am old :) After updating the TT library, it worked out of the box with the example sketch - thanks for converting the IP into plain text.

On the first ping test it lost the first packet with a request timeout, but on the next ping pass there were no missed packets:
Code:
C:\Users\Merli>ping -l 400 -n 10  192.168.1.191

Pinging 192.168.1.191 with 400 bytes of data:
Request timed out.
Reply from 192.168.1.191: bytes=400 time=1ms TTL=64
Reply from 192.168.1.191: bytes=400 time<1ms TTL=64
Reply from 192.168.1.191: bytes=400 time<1ms TTL=64
Reply from 192.168.1.191: bytes=400 time=1ms TTL=64
Reply from 192.168.1.191: bytes=400 time=2ms TTL=64
Reply from 192.168.1.191: bytes=400 time<1ms TTL=64
Reply from 192.168.1.191: bytes=400 time<1ms TTL=64
Reply from 192.168.1.191: bytes=400 time<1ms TTL=64
Reply from 192.168.1.191: bytes=400 time<1ms TTL=64

Ping statistics for 192.168.1.191:
    Packets: Sent = 10, Received = 9, Lost = 1 (10% loss),
Approximate round trip times in milli-seconds:
    Minimum = 0ms, Maximum = 2ms, Average = 0ms

C:\Users\Merli>ping -l 400 -n 10  192.168.1.191

Pinging 192.168.1.191 with 400 bytes of data:
Reply from 192.168.1.191: bytes=400 time<1ms TTL=64
Reply from 192.168.1.191: bytes=400 time<1ms TTL=64
Reply from 192.168.1.191: bytes=400 time=1ms TTL=64
Reply from 192.168.1.191: bytes=400 time=1ms TTL=64
Reply from 192.168.1.191: bytes=400 time<1ms TTL=64
Reply from 192.168.1.191: bytes=400 time=1ms TTL=64
Reply from 192.168.1.191: bytes=400 time=1ms TTL=64
Reply from 192.168.1.191: bytes=400 time=1ms TTL=64
Reply from 192.168.1.191: bytes=400 time<1ms TTL=64
Reply from 192.168.1.191: bytes=400 time=1ms TTL=64

Ping statistics for 192.168.1.191:
    Packets: Sent = 10, Received = 10, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 0ms, Maximum = 1ms, Average = 0ms

With STATS on you tend to lose a packet. Might gain some by playing with time slices as @defragster mentioned.
 
Welcome back @mjs513 :)

The only lost PING packet I've seen is the first after restart.

I now have three CMD windows open doing :: ping -l 440 -n 100 192.168.0.23

Oops … one of them did just get a timeout.

But with those running and my HACK in place here are loop STATS - working the same as if nothing going on:
Code:
Looped: 8087502
LoopedUSB: 498
Looped: 8087489
LoopedUSB: 501
Looped: 8087412
LoopedUSB: 501
Looped: 8087412
LoopedUSB: 498

Three sets of 100 pings completed - only 1 LOST. Max ms in the three are 2, 6, 8 and an average of 0 ms.
Code:
Reply from 192.168.0.23: bytes=440 time<1ms TTL=64

Ping statistics for 192.168.0.23:
    Packets: Sent = 100, Received = 99, Lost = 1 (1% loss),
Approximate round trip times in milli-seconds:
    Minimum = 0ms, Maximum = 2ms, Average = 0ms
 
Really cool that this is working on USB and it works on the T3.6 as well. Besides the threads.yield() you could play with the time slices, but I don't think it would do much for performance or for the missing packets.
 
As far as the missed packets, I don't know if they are missing or just passing the timeout period, because the ASIX chip has its own buffers for packets it accepts for its MAC address. So maybe they are getting lost somewhere, or the USB processing just takes too long.
 
With above HACK code in place putting loop() count at 8M versus 500 in USB_loop()::

I did another 3 sets of the same 100 in parallel with no loss.

Then 3 sets of 1,000 with sizes of 44, 340, 140 - same numbers for loopedUSB 498 or 501.
Code:
    Packets: Sent = 1000, Received = 999, Lost = 1 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 0ms, Maximum = 12ms, Average = 0ms

    Packets: Sent = 1000, Received = 999, Lost = 1 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 0ms, Maximum = 10ms, Average = 0ms

    Packets: Sent = 1000, Received = 1000, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 0ms, Maximum = 8ms, Average = 0ms

Not sure what the Win 10 default ping timeout is? I set to 100ms and got a timeout after the first dozen with : '>ping -l 440 -n 1000 -w 100 192.168.0.23'

Started again with 460 byte packet and 300ms timeout :: >ping -l 460 -n 1000 -w 300 192.168.0.23
So that finished with no loss here - with the loss above I'm assuming it was lost not timed out with 100ms wait - given the highest wait I've seen is 8ms. With bigger packets maybe that will point to something if there is an issue and not just NET noise?
Code:
    Packets: Sent = 1000, Received = 999, Lost = 1 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 0ms, Maximum = 8ms, Average = 0ms
 
I myself don't know what any of the tools do since I haven't felt like booting over to Windows to find out.
 
That answers that question :) At least it will have OS coverage :)

The other two looked useless as it didn't show, and the other had to do with making a filesystem image or something?
 
I believe one of the things it has available with the original port is it can transfer files to the flash memory or something and host it. But that probably doesn't work because I've bypassed a lot of the low level cpu support that the stack originally had.
 
Alright, fixes are implemented for large packets and the new versions are on GitHub; pinging up to 1472 bytes seems fine. A 1472-byte ping makes up a full 1514-byte Ethernet frame; any larger and the ping is split into multiple packets (IP fragments), which I don't think this stack supports. Also it seems like pinging with certain sizes makes the packet get dropped a lot of the time and I'm not sure why that is; maybe I have some code wrong somewhere that I haven't seen yet.
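
For reference, the 1472 and 1514 figures fall out of the standard header sizes. This is a small worked check, assuming a plain IPv4 ping with no VLAN tag, no IP options, and the 4-byte Ethernet FCS not counted; the constant and function names are made up for the illustration.

```cpp
#include <cstdint>

// Header sizes for a plain IPv4 ICMP echo request.
constexpr uint32_t kEthernetHeader = 14;  // dst MAC + src MAC + EtherType
constexpr uint32_t kIpv4Header = 20;      // minimum IPv4 header, no options
constexpr uint32_t kIcmpEchoHeader = 8;   // type, code, checksum, id, seq
constexpr uint32_t kEthernetMtu = 1500;   // standard Ethernet payload limit

// Largest "ping -l" size that still fits in one unfragmented frame.
constexpr uint32_t maxPingPayload() {
  return kEthernetMtu - kIpv4Header - kIcmpEchoHeader;
}

// Frame length (without FCS) for a given ping payload size.
constexpr uint32_t frameLength(uint32_t pingPayload) {
  return kEthernetHeader + kIpv4Header + kIcmpEchoHeader + pingPayload;
}

static_assert(maxPingPayload() == 1472, "largest unfragmented ping payload");
static_assert(frameLength(1472) == 1514, "full-size Ethernet frame");
```

Anything above 1472 bytes pushes the IP packet past the 1500-byte MTU, which is where fragmentation starts.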
 
Honestly I forgot how much of a speed improvement you get just by putting while(1) in the main thread, ~45-50% increase. Also I added defragster's HACK and changed the numbers a little bit so the thread loops more on the T3.6.
 
Pulled the newest … will run soon.

Always BIG numbers when nothing else is going on! Every loop() exit calls a yield() func that checks whether serialEvent() needs calling on the Serial? ports. Even a dummy "void yield(){}" gets called for nothing if they aren't needed. So staying in loop() makes a big diff.

A RAW T4 can do just over 20M cnt/sec and that overhead drops it quickly. It is even worse when doing a T4 test with 5 or 7 active Serial ports. I did that testing in Beta - and since - and cutting the serialEvent() checks to 1 in 100 still cycles loop() 7M/sec, more than enough to keep up even when 5Mbaud ports are active - calling more often just wastes time. AFAIK that is the one big compromise for Arduino - everything else just calls your code, the libs used, or the PJRC optimized core code.


About the HACK - that was just a quick test - I assumed the functions called might return a value or have a status that would allow them to assert when they should run again before leaving.
 
Updated both lib from ZIP. Was a way cooler and hackier hack with the reset value not the same as the initial value :) It only pushed the LoopedUSB count up a small bit doing all 20 cycles per call from 500 to 3500 - versus 19.8M counts in loop().

T4's Loops count Stats ::
Code:
Looped: 198517353
LoopedUSB: 3507
That shows very low impact for running ethernet over USB - good find on the starting library and putting it on Teensy!
This is the SAME with three CMD's doing : ping -l 1472 -n 200 192.168.0.23::
Code:
Looped: 198502137
LoopedUSB: 3507
Looped: 198509529
LoopedUSB: 3486

doing a PING of 1440 bytes works - avg 2ms:
Code:
Reply from 192.168.0.23: bytes=1440 time=3ms TTL=64

Ping statistics for 192.168.0.23:
    Packets: Sent = 100, Received = 99, Lost = 1 (1% loss),
Approximate round trip times in milli-seconds:
    Minimum = 2ms, Maximum = 4ms, Average = 2ms

A second run:
Code:
    Packets: Sent = 100, Received = 99, Lost = 1 (1% loss),
Approximate round trip times in milli-seconds:
    Minimum = 2ms, Maximum = 3ms, Average = 2ms

And 3 copies of 1472 PING together - followed with another 3 and also NO LOSS:
Code:
    Packets: Sent = 200, Received = 200, Lost = 0 (0% loss),
    Minimum = 2ms, Maximum = 7ms, Average = 2ms

    Packets: Sent = 200, Received = 200, Lost = 0 (0% loss),
    Minimum = 2ms, Maximum = 5ms, Average = 2ms

    Packets: Sent = 200, Received = 200, Lost = 0 (0% loss),
    Minimum = 2ms, Maximum = 11ms, Average = 2ms
 
As far as I know the fnet polls don't return anything; you just have to call them periodically when you are running it in a multi-threaded environment. The normal port of fnet would use the built-in timers of the processor to update the stack when not in a multi-threaded environment, but the way I have it set up, it's always running as multi-threaded, so you just poll it every so often. It could be done with a timer in the main loop, but this seemed like a better solution to me, so I could isolate any issues if the one thread slowed down without locking up the whole processor.
 
@vjmuzik @All - Is this library running on the T4? I am getting the following compile error:
Code:
/home/wwatson/arduino-1.8.9/hardware/teensy/avr/libraries/TeensyThreads/TeensyThreads.cpp: In member function 'int Threads::setMicroTimer(int)':
/home/wwatson/arduino-1.8.9/hardware/teensy/avr/libraries/TeensyThreads/TeensyThreads.cpp:260:46: error: 'IRQ_PIT_CH0' was not declared in this scope
   int number = (IRQ_NUMBER_t)context_timer - IRQ_PIT_CH0;
                                              ^
Using library USBHost_t36 at version 0.1 in folder: /home/wwatson/arduino-1.8.9/hardware/teensy/avr/libraries/USBHost_t36 
Using library TeensyASIXEthernet in folder: /home/wwatson/Arduino/libraries/TeensyASIXEthernet (legacy)
Using library TeensyThreads at version 1.0 in folder: /home/wwatson/arduino-1.8.9/hardware/teensy/avr/libraries/TeensyThreads 
Using library FNET at version 4.6.4 in folder: /home/wwatson/Arduino/libraries/FNET 
Error compiling for board Teensy 4.0.

I checked intervalTimer.h in T4 cores and there is no reference to 'IRQ_PIT_CH0'.
Tried to compile 'Tests.ino' from TeensyThreads and received the same error message.
Using Arduino 1.8.9 and TD 1.48 B1.
 
@wwatson - Yes on Teensy 4.0 :: I've only run this on T4 so far.

Perhaps a newer version of TeensyThreads is needed. It was recently updated for T4 support - that was the last copy I got in the past weeks.
 
Yeah TeensyThreads was updated recently for Teensy 4.0 support and I had forgotten I’ve already updated mine.
 