Teensy 4.1 - W5500 using SPI DMA

I'd like to report absolutely dismal performance with QNEthernet/W5500/IPerfServer example (note: the driver uses MACRAW mode):
I'm seeing about 350 Kibps. That's at 30MHz SPI. At 14MHz, < 200 Kibps.
I tried with both interrupts enabled (not the default) and without (kSocketInterruptsEnabled in driver_w5500_config.h).

The link is connected directly to my Mac laptop (Belkin USB-C Ethernet adapter, macOS 15.7.3), and I'm using static IPs.

I haven't looked at this too closely yet, but it's either how I wrote the code, or something to do with raw mode (MACRAW), or both. Sigh.

I'd love to see others do these measurements, though, to rule out my specific hardware. (I only have one W5500 adapter.)
 
Steps:
1. Uncomment QNETHERNET_DRIVER_W5500 in qnethernet_opts.h
2. Change any parameters you need in qnethernet/drivers/driver_w5500_config.h
3. Build and run the IPerfServer example as normal

Depending on your setup (for example, my tests were done with a direct connection to my laptop via an Ethernet USB-C adapter), you might want to modify the example slightly to use a static IP: Ethernet.begin(ip, subnet, gateway)
 
Some additional guidance: If you happen to look at driver_w5500.c:
1. PHY initialization in driver_init() (and then low_level_init())
2. driver_proc_input() polls for input frames
3. driver_poll() is called regularly to check link status
4. driver_output() is used to send frames (sendFrame())

Those Register classes are used to actually write and read the registers, similar to how you made those classes to make what look like AVR-compatible registers.
 
Ran it just now. Confirm slow speed here.

Probably not worth a lot of effort since the normal Ethernet.h library is well optimized.

@istrateandrei26 - Please let us know what speed you get with the Ethernet.h iperf code (msg #25)?

Code:
iperf -i 1 -c 192.168.195.240
------------------------------------------------------------
Client connecting to 192.168.195.240, TCP port 5001
TCP window size: 16.0 KByte (default)
------------------------------------------------------------
[  1] local 192.168.194.2 port 35120 connected with 192.168.195.240 port 5001
[ ID] Interval       Transfer     Bandwidth
[  1] 0.00-1.00 sec  65.6 KBytes   537 Kbits/sec
[  1] 1.00-2.00 sec  54.2 KBytes   444 Kbits/sec
[  1] 2.00-3.00 sec  68.4 KBytes   561 Kbits/sec
[  1] 3.00-4.00 sec  62.7 KBytes   514 Kbits/sec
[  1] 4.00-5.00 sec  51.3 KBytes   420 Kbits/sec
[  1] 5.00-6.00 sec  51.3 KBytes   420 Kbits/sec
[  1] 6.00-7.00 sec  51.3 KBytes   420 Kbits/sec
[  1] 7.00-8.00 sec  51.3 KBytes   420 Kbits/sec
[  1] 8.00-9.00 sec  51.3 KBytes   420 Kbits/sec
[  1] 9.00-10.00 sec  51.3 KBytes   420 Kbits/sec
[  1] 10.00-11.27 sec  25.7 KBytes   166 Kbits/sec
[  1] 0.00-11.27 sec   585 KBytes   425 Kbits/sec
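
A quick unit check on the output above, as a minimal sketch. It assumes iperf2's usual convention that KBytes are 1024-byte units while Kbits are 1000-bit units; the helper name `kbitsPerSec` is made up for illustration.

```cpp
#include <cassert>

// Converts an iperf2-style transfer figure into a bandwidth figure.
// Assumption: 1 KByte = 1024 bytes and 1 Kbit = 1000 bits.
double kbitsPerSec(double kbytes, double seconds) {
  assert(seconds > 0.0);  // an interval must have a positive duration
  return kbytes * 1024.0 * 8.0 / 1000.0 / seconds;
}
```

With those assumptions, `kbitsPerSec(51.3, 1.0)` comes out at roughly 420, which lines up with the repeated "51.3 KBytes   420 Kbits/sec" rows.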
 
Thank you for doing this benchmark.
I managed to test QNEthernet before I posted the question, and I also got slow, odd speed results (I also used a static IP and a USB-C to Ethernet adapter). I didn't have much time to dig into the problem, so I dropped QNEthernet from my tests; I then saw that the W5500 ioLibrary achieved speeds in the Mbps range...
Sure, I'll test it today and get back with the results. Thanks a lot for the dedication and help! :)
 
Yep. Paul’s Ethernet library is the way to go for speed, with a W5500. I think the reason the QNEthernet library’s W5500 driver is so slow is because it uses that MACRAW mode. I use it for raw frames only and provide the TCP/IP stack on the CPU, as well. (I’m still going to dig in some more, though.)

Funny enough, I don’t think I ran speed tests before…

With the Teensy’s native Ethernet PHY, I’ve seen speeds up to about 95 Mibps.
 
@PaulStoffregen
@shawn

I have tested the code you provided and achieved ~20 Mbps at 30 MHz SPI.

Given the earlier comments that QNEthernet may be slowing the transfer because of MACRAW mode, I have some additional questions.

In my project I want to simulate a router (Teensy 4.1 with 2 Ethernet interfaces: native + W5500).
The main goal is to receive packets/frames on one interface and forward them on the other:
- Is it possible to configure both interfaces? I managed to configure them individually: the native interface using FNET, and the W5500 interface using the Ethernet library.
- Is it possible to parse the packets/frames so that I can make decisions based on the frame info?
- Is it possible to receive frames on one interface and forward them on the other (and vice versa)?

I ask because I am now wary of MACRAW mode, which is possibly needed for this kind of task.
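
To make the frame-parsing question concrete, here is a minimal sketch of reading the 14-byte Ethernet II header from a raw frame, independent of which library delivers the bytes. The struct and function names are hypothetical, VLAN tags (EtherType 0x8100) are not handled, and the forwarding rule is only an example policy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Parsed view of an Ethernet II header (hypothetical helper type).
struct EthernetHeader {
  uint8_t dst[6];      // destination MAC
  uint8_t src[6];      // source MAC
  uint16_t etherType;  // host byte order, e.g. 0x0800 = IPv4, 0x0806 = ARP
};

// Returns true and fills `out` if `frame` holds at least a full header.
bool parseEthernetHeader(const uint8_t *frame, size_t len,
                         EthernetHeader *out) {
  assert(out != nullptr);
  if (frame == nullptr || len < 14) {
    return false;
  }
  std::memcpy(out->dst, frame, 6);
  std::memcpy(out->src, frame + 6, 6);
  // EtherType is big-endian on the wire
  out->etherType = static_cast<uint16_t>((frame[12] << 8) | frame[13]);
  return true;
}

// Example policy only: forward IPv4 and ARP, drop everything else.
bool shouldForward(const EthernetHeader &h) {
  return h.etherType == 0x0800 || h.etherType == 0x0806;
}
```

A forwarding loop would call `parseEthernetHeader()` on each frame received on one interface and, if `shouldForward()` agrees, hand the same bytes to the other interface's send routine.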

Besides that, can we conclude that Nucleo-F412 benchmark code is providing a greater throughput because of hardware differences (the way the W5500 chip is physically connected)?
 
can we conclude that Nucleo-F412 benchmark code is providing a greater throughput because of hardware differences

They're using 50 MHz SPI clock. If you edit w5100.h for 50 MHz, you'll see similar performance with Teensy's Ethernet.h library.

EDIT: Teensy will actually use 48 MHz clock because the default config creates the SPI clock by dividing 240 MHz by an integer clock divisor. The base clock available to SPI is controlled by the CCM_CBCMR hardware register.
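
The 50 MHz to 48 MHz rounding can be sketched like this. It is a simplified model, not the SPI library's actual code: it assumes the hardware picks the smallest integer divisor that keeps the clock at or below the request, and it ignores any minimum or maximum divisor limits of the real peripheral. The function name is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Returns the clock obtained by dividing `baseHz` by the smallest integer
// divisor that does not exceed `requestedHz`.
uint32_t actualSpiClockHz(uint32_t baseHz, uint32_t requestedHz) {
  assert(baseHz > 0 && requestedHz > 0);
  // Ceiling division: round the divisor up so the result never overshoots
  uint32_t divisor = (baseHz + requestedHz - 1) / requestedHz;
  return baseHz / divisor;
}
```

With a 240 MHz base, a 50 MHz request gives a divisor of 5 and therefore 48 MHz, while a 30 MHz request divides evenly (divisor 8).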

Both benchmarks are achieving data rates approx 85% to 86% of the actual SPI clock speed.

Even though 48 MHz seems to work, please keep in mind that's considered overclocking. NXP's datasheet says 30 MHz is the maximum rated speed for SPI. Likewise, Wiznet claims "up to 80 MHz" but their datasheet says the parts are only tested to work at 33 MHz.
 
Actually I have the same clock for SPI (SPISettings(50000000, MSBFIRST, SPI_MODE0)), 50 MHz, and the throughput is still 20 Mbps...
I also checked that I'm using the correct Ethernet library, according to your previous insights :)

I noticed that I'm using Teensyduino 1.56. Do you think updating to the latest version would bring changes that affect this?
 
Definitely update to Teensyduino 1.60-beta6. There were lots of fixes. I’m not actually sure if it will impact your project, offhand, but try it.
 
I tried it with Arduino IDE 2.3.7 and Teensy software 1.56.2 and w5100.h edited to 30 MHz. Ran iperf on Linux desktop machine. Test communicates on a LAN through 2 ethernet switches.

Result is 26.44 Mbit/sec.

I don't know why you're seeing a slower result, but this should at least confirm it isn't anything about the old 1.56 software.

 
Here is one more test, with old 1.56.2 Teensy software and w5100.h edited to 48 MHz SPI.

Result is 40.98 Mbit/sec, which is 85% of the SPI clock speed.
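
The percentage quoted above is just throughput divided by SPI clock. As a minimal sketch (the helper name is hypothetical):

```cpp
#include <cassert>

// Throughput as a percentage of the raw SPI bit rate.
// One SPI clock cycle moves one bit, so MHz and Mbit/sec compare directly.
double spiEfficiencyPercent(double throughputMbps, double spiClockMHz) {
  assert(spiClockMHz > 0.0);
  return 100.0 * throughputMbps / spiClockMHz;
}
```

`spiEfficiencyPercent(40.98, 48.0)` works out to a little over 85, matching the figure in the post.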

 
Here's an update with the QNEthernet W5500 driver: I've been trying to improve it by focusing on receive buffering. Instead of reading one frame at a time from the chip, I'm now reading and buffering the chip's whole RX buffer, which may contain multiple frames. This requires fewer accesses over SPI.

The iperf speeds are now about half what they were before. Argh.
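
For readers following along, the "buffer the whole RX area, then split it" idea described above can be sketched as below. This is not the QNEthernet driver code. It assumes each MACRAW record is a 2-byte big-endian length (which counts the 2 header bytes themselves) followed by the frame; check that layout against the W5500 datasheet before relying on it. `FrameView` and `splitFrames` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Points at one frame inside the buffered RX data (no copy is made).
struct FrameView {
  const uint8_t *data;
  size_t len;
};

// Walks a block of bytes read from the chip in one SPI burst and returns
// a view of every complete frame found in it.
std::vector<FrameView> splitFrames(const uint8_t *buf, size_t len) {
  assert(buf != nullptr || len == 0);
  std::vector<FrameView> frames;
  size_t pos = 0;
  while (len - pos >= 2) {
    // Assumed record header: big-endian size that includes these 2 bytes
    size_t recLen = (static_cast<size_t>(buf[pos]) << 8) | buf[pos + 1];
    if (recLen < 2 || recLen > len - pos) {
      break;  // malformed or partial record: stop and leave it for later
    }
    frames.push_back(FrameView{buf + pos + 2, recLen - 2});
    pos += recLen;
  }
  return frames;
}
```

The point of this design is that one long SPI read replaces a size-register read plus a data read per frame; whether that wins in practice depends on the per-transaction overhead, which the numbers above suggest is not the whole story.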
 
…but they both require at least one or two register reads (without an actual interrupt line): either just the size available or, with interrupts, the interrupt register and then the size register. We probably have the same W5500 module; I’ll see if the interrupt line is connected…
 
Well, I was using a media converter adapter to connect the W5500 interface directly to the PC. I swapped it for another one and managed to achieve 40 Mbps, even using ioLibrary (at 48 MHz SPI).

Something is still strange, though: with the Ethernet library it reaches up to 33 Mbps.

IMPORTANT: This test was done without upgrading the Teensyduino version.

Thanks a lot for your suggestions until here. 🙏

As far as the setup with both interfaces configured is concerned, is it possible to initialize them and somehow receive and send raw frames?
I'm looking forward to seeing this.
 
The tested code is the same as yours (from msg #25). I actually used Arduino IDE 1.8.13 and Teensyduino 1.57 (I said 1.56 earlier by mistake).
The hardware setup is simple and classic: a Teensy 4.1 board with a W5500 SPI-attached Ethernet chip. The link between the W5500 interface and the PC was made with an Ethernet cable through an Ethernet USB-C adapter.

I kept increasing the SPI CLK value incrementally and observed how the throughput changed.
 
I finally found the issue!! :)
Well, issues.

...five days later... [In French accent]

I tried everything I could think of, from different Ethernet-USB-C adapters to a multitude of adjustments to the code, to different packet sizes, to examining the iperf protocol (the iperf2 code is hard to go through, and there's no documentation about the protocol that I could find; I went through it a bit when I first wrote the IPerfServer example). So then I got the bright idea of trying out something similar to the simple "just read data" code that Paul showed. And wouldn't you know it...

...I started seeing 10Mbps speeds (via wireless->eero node->Teensy; more on that later). So that's the first problem. Something's up with the IPerfServer example. When I first wrote it, it worked great. Then the protocol started changing as the future became the present. It's on my to-do list to fix this. But for now, I'm not doing something correctly there, and I was seeing mostly "0" speeds because of how I'm doing the state machine for the iperf2 protocol. I didn't dive too deeply into that.

Not really "next", more "concurrently": I was testing in two different ways:
1. Teensy->eero node->wireless->Intel Mac laptop w/15.7.3 (this is where I started seeing the 10+Mbps speeds, sometimes up to ~13.3Mbps)
2. Teensy->Belkin Ethernet USB-C adapter (F2CU040) ->laptop

The second configuration was what I was using for my "major speed tests", thinking that traffic<->wireless would be lots slower than a direct connection. When I compared the two a while back using the Teensy 4.1's native Ethernet, this assumption was correct. I was seeing ~95Mbps with that adapter and < 20Mbps over wireless. But...

...here's where it gets more interesting. Now that I thought I found the problem, I switched over to setup #2, and lo and behold... I was seeing those 350kbps numbers again! Aha! It's the adapter! I tried rebooting, finding new drivers (I didn't), and finally two other adapters: An Anker one and a CalDigit TS3 Plus dock. Both exhibited the same behaviour as the Belkin adapter: 'round about 350-400kbps or so. I also tried rebooting and leaving the Belkin connector plugged in, thinking something needed "a good shaking out". But I was still seeing sub-1Mbps speeds.

It also turned out that my driver was mostly fine. I still improved it some, though.

I also found a mistake in how I was using the iperf command. I thought that the -l 1460 set the maximum packet payload size. Well, it doesn't. For TCP, it sets the read/write buffer size. I think the default is 128k, so after I removed that -l option, the speed-over-wireless jumped by about 3Mbps to ~13Mbps. Niiiicccce.

TLDR:
1. The IPerfServer example is faulty; a simple "read all the data from the socket" program showed much higher speeds than "mostly zero".
2. It was the USB-C network adapter(s) (I tried three different ones). They had over 10x worse performance than over wireless, which is usually slower than a direct connection: ~350-450kbps; wireless was getting > 10Mbps.
3. The driver was fine; I still improved it over this process, though.
4. I'm going to include a new SimpleIPerfServer example that just reads from the socket until closed.
5. I can get even better performance by not setting the TCP write buffer to 1460 bytes. :(
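
For anyone curious what "just read from the socket until closed" boils down to, here is a minimal host-side sketch of the idea. It is not the SimpleIPerfServer source: `FakeClient` is a hypothetical stand-in for a connected TCP client whose `read()` returns 0 once the peer has closed, and `drain` and `rateKbps` are made-up names. A real sketch would also need to tell "no data yet" apart from "closed".

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a connected client with `remaining` bytes left.
struct FakeClient {
  size_t remaining;

  // Pretends to fill `buf`; returns the byte count, or 0 once closed.
  size_t read(uint8_t *buf, size_t cap) {
    (void)buf;
    size_t n = remaining < cap ? remaining : cap;
    remaining -= n;
    return n;
  }
};

// Reads and discards everything the client sends; returns the byte total.
template <typename Client>
uint64_t drain(Client &client) {
  uint8_t buf[1460];
  uint64_t total = 0;
  size_t n;
  while ((n = client.read(buf, sizeof(buf))) > 0) {
    total += n;
  }
  return total;
}

// Average rate in kbps, assuming 1 kbit = 1000 bits.
double rateKbps(uint64_t bytes, double seconds) {
  assert(seconds > 0.0);
  return bytes * 8.0 / 1000.0 / seconds;
}
```

There is no protocol state machine at all here, which is the point: it measures how fast bytes can be pulled off the socket and nothing else.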

I have an ask:

Surprisingly, I don't have near me any machines with a direct Ethernet connection. Would someone be willing to try the new SimpleIPerfServer example with iperf (not iperf3) and see what speeds you're getting without using an Ethernet adapter? Note that you may need to set a static IP; to do this, just initialize the `kStaticIP`, `kSubnet`, and `kGateway` IPAddresses to something. The `iperf` command:
Bash:
iperf -c <IP_address> -i 1

You can simply get the latest on GitHub (https://github.com/ssilverman/QNEthernet) and overwrite everything in the Arduino libraries/ folder. I haven't yet released a v0.35.0 that you could get via the Arduino IDE.

Thanks for reading all this!
 
Would someone be willing to try the new SimpleIPerfServer example with iperf (not iperf3) and see what speeds you're getting without using an Ethernet adapter?

Sure. With the native ethernet port on Teensy 4.1:


Code:
iperf -c 192.168.195.236 -i 1
------------------------------------------------------------
Client connecting to 192.168.195.236, TCP port 5001
TCP window size: 16.0 KByte (default)
------------------------------------------------------------
[  1] local 192.168.194.2 port 57334 connected with 192.168.195.236 port 5001
[ ID] Interval       Transfer     Bandwidth
[  1] 0.00-1.00 sec  11.3 MBytes  94.4 Mbits/sec
[  1] 1.00-2.00 sec  11.4 MBytes  95.4 Mbits/sec
[  1] 2.00-3.00 sec  11.2 MBytes  94.4 Mbits/sec
[  1] 3.00-4.00 sec  11.4 MBytes  95.4 Mbits/sec
[  1] 4.00-5.00 sec  11.4 MBytes  95.4 Mbits/sec
[  1] 5.00-6.00 sec  11.2 MBytes  94.4 Mbits/sec
[  1] 6.00-7.00 sec  11.4 MBytes  95.4 Mbits/sec
[  1] 7.00-8.00 sec  11.2 MBytes  94.4 Mbits/sec
[  1] 8.00-9.00 sec  11.4 MBytes  95.4 Mbits/sec
[  1] 9.00-10.00 sec  11.2 MBytes  94.4 Mbits/sec
[  1] 0.00-10.01 sec   113 MBytes  94.9 Mbits/sec

Code:
Starting program...
MAC = 04:e9:e5:1a:34:11
Starting Ethernet with DHCP...
[Ethernet] Link ON, 100 Mbps, Full duplex
[Ethernet] Address changed:
    Local IP = 192.168.195.236
    Subnet   = 255.255.254.0
    Gateway  = 192.168.194.1
    DNS      = 192.168.194.1
Accepted connection: 192.168.194.2:57334
Processing input...
Rate = 94925.1 kbps
 
With a Wiznet W5500 and src/qnethernet_opts.h edited to uncomment #define QNETHERNET_DRIVER_W5500

Code:
iperf -c 192.168.195.236 -i 1
------------------------------------------------------------
Client connecting to 192.168.195.236, TCP port 5001
TCP window size: 16.0 KByte (default)
------------------------------------------------------------
[  1] local 192.168.194.2 port 55168 connected with 192.168.195.236 port 5001
[ ID] Interval       Transfer     Bandwidth
[  1] 0.00-1.00 sec  91.2 KBytes   748 Kbits/sec
[  1] 1.00-2.00 sec  48.5 KBytes   397 Kbits/sec
[  1] 2.00-3.00 sec  51.3 KBytes   420 Kbits/sec
[  1] 3.00-4.00 sec  51.3 KBytes   420 Kbits/sec
[  1] 4.00-5.00 sec  51.3 KBytes   420 Kbits/sec
[  1] 5.00-6.00 sec  51.3 KBytes   420 Kbits/sec
[  1] 6.00-7.00 sec  51.3 KBytes   420 Kbits/sec
[  1] 7.00-8.00 sec  51.3 KBytes   420 Kbits/sec
[  1] 8.00-9.00 sec  51.3 KBytes   420 Kbits/sec
[  1] 9.00-10.00 sec  51.3 KBytes   420 Kbits/sec
[  1] 10.00-11.08 sec  25.7 KBytes   194 Kbits/sec
[  1] 0.00-11.08 sec   576 KBytes   426 Kbits/sec

Code:
Starting program...
MAC = 04:e9:e5:1a:34:11
Starting Ethernet with DHCP...
[Ethernet] Link ON, 100 Mbps, Full duplex
[Ethernet] Address changed:
    Local IP = 192.168.195.236
    Subnet   = 255.255.254.0
    Gateway  = 192.168.194.1
    DNS      = 192.168.194.1
Accepted connection: 192.168.194.2:55168
Processing input...
Rate = 425.839 kbps
 