Any suggestions for maximising ethernet data rates on Teensy 4.1

AndyA

I have a project where I need to reliably receive a lot of data on the Ethernet port of a Teensy 4.1. Ideally I'd like to hit around 20 Mbit/s, which is not a lot on some scales but is a lot for something like a Teensy.

Since I need reliability I'm using TCP rather than UDP. Since I need high data rates I'm using a raw TCP socket stream rather than any higher level protocols.

I'm currently using the Teensy41_AsyncTCP library to handle the network layer. I am then using TeensyThreads.h to run separate threads that take the received network data and push it out on the FlexIO interface.

Logically my data is being sent in 1k packets from a PC application. As soon as the data rate goes above a few kbytes/s, the Windows IP stack starts combining and splitting those packets across TCP segments. My Teensy code handles this, recombining and splitting the received network packets back into my logical packets as they arrive. This code all works correctly for moderate data rates.
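
For anyone unfamiliar with the recombining/splitting step, here is a minimal sketch of the idea, not the actual code from this project. It assumes every logical packet is exactly 1024 bytes with no length field, which the post does not confirm; `PacketReassembler`, `feed` and the handler signature are hypothetical names made up for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Assumption: fixed-size 1024-byte logical packets (the real format is not given).
constexpr std::size_t kPacketSize = 1024;

class PacketReassembler {
public:
    using PacketHandler = void (*)(const std::uint8_t* packet, std::size_t len, void* ctx);

    PacketReassembler(PacketHandler handler, void* ctx) : handler_(handler), ctx_(ctx) {
        assert(handler != nullptr);
    }

    // Feed one received TCP segment of any length. TCP is a byte stream, so a
    // segment may hold part of a packet, several packets, or both.
    void feed(const std::uint8_t* data, std::size_t len) {
        while (len > 0) {
            if (filled_ == 0 && len >= kPacketSize) {
                // A whole packet sits inside this segment: hand it over without copying.
                handler_(data, kPacketSize, ctx_);
                data += kPacketSize;
                len -= kPacketSize;
                continue;
            }
            // Otherwise top up the partial packet carried over from earlier segments.
            const std::size_t want = kPacketSize - filled_;
            const std::size_t n = (len < want) ? len : want;
            std::memcpy(partial_ + filled_, data, n);
            filled_ += n;
            data += n;
            len -= n;
            if (filled_ == kPacketSize) {
                handler_(partial_, kPacketSize, ctx_);
                filled_ = 0;
            }
        }
    }

    // Bytes of an incomplete packet currently waiting for the next segment.
    std::size_t pendingBytes() const { return filled_; }

private:
    PacketHandler handler_;
    void* ctx_;
    std::uint8_t partial_[kPacketSize] = {};
    std::size_t filled_ = 0;
};
```

The point is only that segment boundaries carry no meaning: the receiver has to keep the leftover bytes between calls and must never assume one receive callback equals one logical packet.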

However once I dial the data rate up to around 4 Mbits/s I start to see errors in the data that would imply that packets are getting lost. Exactly where/how I'm not entirely sure at this point.

I have a few tricks left up my sleeve to try to improve this performance but don't have much hope for them making a huge difference.

Has anyone else tried to do something like this or have any ideas or tips on how to get more data through the network port? Even something along the lines of "Don't use that library, it's really slow, use this one instead." would be appreciated.
 
Just for the benefit of anyone coming across this later: I did manage to hit 70 Mbits/s in on the Ethernet port and out over the FlexIO port.
The two changes I made were to ditch the threading library (it's now all interrupt based with a single background loop) and to minimise the Ethernet packet Rx callback.
When a packet arrives it is dumped into a buffer with no processing.
The background loop checks the network receive buffer, splits it into logical packets, removes the headers and pushes the rest into a second buffer of data waiting for the FlexIO interface.
It then checks the FlexIO status and FlexIO buffer and starts a new block of FlexIO transfers if needed.
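
As a rough illustration of the "dump it into a buffer and return" idea, here is a minimal single-producer/single-consumer byte ring, written as a sketch rather than the code actually used here. `ByteRing`, `push` and `pop` are hypothetical names; it assumes the Rx callback is the only writer, the background loop is the only reader, and that `std::atomic<std::size_t>` is lock-free on the target.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

template <std::size_t N>
class ByteRing {
    static_assert(N > 0 && (N & (N - 1)) == 0, "N must be a power of two");

public:
    // Producer side (Rx callback): copy the bytes in and return straight away.
    // Returns how many bytes were accepted; anything beyond that did not fit.
    std::size_t push(const std::uint8_t* src, std::size_t len) {
        assert(src != nullptr || len == 0);
        const std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t tail = tail_.load(std::memory_order_acquire);
        const std::size_t space = N - (head - tail);
        if (len > space) len = space;
        for (std::size_t i = 0; i < len; ++i) buf_[(head + i) & (N - 1)] = src[i];
        head_.store(head + len, std::memory_order_release);
        return len;
    }

    // Consumer side (background loop): take up to maxLen bytes out, oldest first.
    std::size_t pop(std::uint8_t* dst, std::size_t maxLen) {
        assert(dst != nullptr || maxLen == 0);
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        const std::size_t head = head_.load(std::memory_order_acquire);
        std::size_t n = head - tail;
        if (n > maxLen) n = maxLen;
        for (std::size_t i = 0; i < n; ++i) dst[i] = buf_[(tail + i) & (N - 1)];
        tail_.store(tail + n, std::memory_order_release);
        return n;
    }

    // Number of bytes currently stored.
    std::size_t size() const {
        return head_.load(std::memory_order_acquire) - tail_.load(std::memory_order_acquire);
    }

private:
    std::uint8_t buf_[N] = {};
    std::atomic<std::size_t> head_{0};  // total bytes ever written
    std::atomic<std::size_t> tail_{0};  // total bytes ever read
};
```

With something like this the callback body shrinks to a single `push`, and all the splitting, header removal and FlexIO handling happens in the background loop via `pop`.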

Just goes to show how much difference in performance you can get by doing the same thing in a slightly different way.
 
That 70 Mbps is promising, and good effort to get from errors at 4 Mbps.

Other tests sending packets between a PC and a Teensy with other libraries have shown 70 Mbps and more ... I haven't looked in a while, but one post heading said 90 Mbps.
 
I have managed to get over 90 Mbps at times but hit trouble avoiding buffer overflows/underruns at that point. I need to transfer data from the Ethernet port to the FlexIO port in a way that maintains a very constant data rate on that IO port. Basically, when the external device indicates it is ready for more data, I had better be ready to supply it. I'm using an external QSPI RAM chip to give me a nice large buffer, but once you have any sort of network latency the flow control needed to keep that buffer at a nice level starts getting tricky.

At 70 Mbps I can maintain a reasonable buffer level and as far as I can tell all the data is getting passed through as expected without any missing blocks.
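
To give a feel for why latency makes the buffer level hard to hold, here is a small sketch of one common approach, high/low watermarks with hysteresis. It is not necessarily what is done in this project, and `inFlightBytes` and `WatermarkGate` are hypothetical names. The key number is how much data can still arrive after the sender has been asked to pause, which is roughly data rate times latency.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Bytes that can still arrive after a pause request, given a link rate and a
// round-trip latency. The high watermark needs at least this much headroom.
constexpr std::size_t inFlightBytes(std::uint64_t bitsPerSecond, std::uint64_t latencyMicros) {
    return static_cast<std::size_t>(bitsPerSecond / 8 * latencyMicros / 1000000);
}

// Hysteresis gate: stop the sender at the high mark, restart it at the low mark.
class WatermarkGate {
public:
    WatermarkGate(std::size_t low, std::size_t high) : low_(low), high_(high) {
        assert(low < high);
    }

    // Call with the current buffer fill level; returns true while the sender
    // should be allowed to keep sending.
    bool update(std::size_t level) {
        if (sending_ && level >= high_) {
            sending_ = false;
        } else if (!sending_ && level <= low_) {
            sending_ = true;
        }
        return sending_;
    }

private:
    std::size_t low_;
    std::size_t high_;
    bool sending_ = true;
};
```

As a worked figure, at 70 Mbit/s with 10 ms of latency about 87,500 bytes are in flight, so the high watermark has to sit at least that far below the top of the buffer, and the low watermark at least that far above empty to avoid underruns on the output side.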
 