wiz820io max speed ?

Status
Not open for further replies.

nlecaude

I was wondering if there are optimizations beyond the 24 MHz setting for the wiz820io that could be done to achieve better performance?
People using the Artnet library keep reporting that the setup struggles when transferring more than 7-8 universes of data.
8 universes would represent this amount of data:
8 universes * 530 bytes * 44 frames per second = 186560 bytes per second = 0.187 MB/s (about 1.5 Mbit/s), which doesn't seem like that much...
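
As a quick check of the arithmetic above, here is a minimal sketch in C++. The function names are made up for illustration; the 530-byte packet size and 44 frames per second are the figures quoted in this post.

```cpp
#include <cassert>
#include <cstdint>

// Payload rate of an Art-Net stream in bytes per second.
// All three inputs are the poster's figures, not values taken from a spec.
inline std::uint64_t artnetBytesPerSecond(std::uint32_t universes,
                                          std::uint32_t bytesPerPacket,
                                          std::uint32_t framesPerSecond) {
    assert(framesPerSecond > 0 && "frame rate must be positive");
    return static_cast<std::uint64_t>(universes) * bytesPerPacket * framesPerSecond;
}

// Convert bytes per second to megabits per second (1 Mbit = 1e6 bits).
inline double megabitsPerSecond(std::uint64_t bytesPerSecond) {
    return static_cast<double>(bytesPerSecond) * 8.0 / 1e6;
}
```

Calling `artnetBytesPerSecond(8, 530, 44)` gives the 186560 bytes per second quoted above, which `megabitsPerSecond` turns into roughly 1.49 Mbit/s. This counts payload only and ignores Ethernet/IP/UDP framing.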

Should we expect better performance from the wiz820io or is that a normal behaviour ?

Thanks !
 
In tests conducted more than a year ago, I measured more than 10 megabits/sec (UDP and TCP) on the wiz820io with the Teensy 3. See
https://github.com/manitou48/DUEZoo/blob/master/wizperf.txt
24 MHz is the fastest that the SPI clock can run on the Teensy 3. I'm not sure what SPI library the current Ethernet libraries use, but the SPI implementation will dictate max speed. Of course, delay/bandwidth over the Internet can vary.
 
My ISP's upstream bandwidth net yield at IP layer is 5Mbps. 56Mbps downstream.
It's fairly typical. And mine is top-tier on beloved TimeWarner.
 
The Arduino Ethernet library is filled with spectacularly inefficient code. I've tried to optimize parts, especially for the W5200 chip.

UDP used by Artnet should be using the faster code fairly well, if the entire packet is transmitted or received in one large transfer. TCP should be better too, if data is moved in chunks. But if it's moved 1-byte-at-a-time, there's no per-socket buffering within the Ethernet library, so each request turns into a lot of Wiznet register overhead.

If I ever look into UDP receive performance, it's going to need to be with a small Linux-based test program (written in C) to send the UDP packets, and of course a corresponding sketch on the Teensy side to receive them and check something like an incrementing number within the data, to detect how many were received and how many were missed. (hint... hint....)
 
If I ever look into UDP receive performance, it's going to need to be with a small Linux-based test program (written in C) to send the UDP packets, and of course a corresponding sketch on the Teensy side to receive them and check something like an incrementing number within the data, to detect how many were received and how many were missed. (hint... hint....)

Though not fresh data, the UDP receive rates in the wizperf.txt in reply 2 above are based on such a Linux-based test program: a rate-based UDP transmitter, where I manually adjust the rate until I get no losses at the receiver.
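
The receive-side bookkeeping for such a test can be sketched as below. This is not the code behind wizperf.txt; the packet layout (a 4-byte big-endian sequence number at the start of each UDP payload) and the names `LossCounter` and `writeSequence` are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Sender side: stamp a sequence number into the first 4 bytes (big-endian).
inline void writeSequence(std::uint8_t* data, std::uint32_t seq) {
    data[0] = static_cast<std::uint8_t>(seq >> 24);
    data[1] = static_cast<std::uint8_t>(seq >> 16);
    data[2] = static_cast<std::uint8_t>(seq >> 8);
    data[3] = static_cast<std::uint8_t>(seq);
}

// Receiver side: count packets seen and infer losses from sequence gaps.
// Ignores 32-bit wraparound; late or duplicate packets count as received
// but never reduce the missed total.
struct LossCounter {
    std::uint32_t received = 0;
    std::uint32_t missed = 0;
    std::uint32_t nextExpected = 0;
    bool started = false;

    void onPacket(const std::uint8_t* data, std::size_t len) {
        assert(data != nullptr);
        if (len < 4) return;  // too short to carry a sequence number
        const std::uint32_t seq = (static_cast<std::uint32_t>(data[0]) << 24) |
                                  (static_cast<std::uint32_t>(data[1]) << 16) |
                                  (static_cast<std::uint32_t>(data[2]) << 8) |
                                  static_cast<std::uint32_t>(data[3]);
        if (!started) {           // first packet defines the starting point
            started = true;
            nextExpected = seq;
        }
        if (seq >= nextExpected) {
            missed += seq - nextExpected;  // every skipped number is a loss
            nextExpected = seq + 1;
        }
        ++received;
    }
};
```

On the Teensy side, `onPacket` would be called with whatever `Udp.read()` filled in; the sender raises its rate until `missed` stops being zero.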
 
One curious thing is that on the Art-Net thread, someone said performance was way better when he switched his Macbook ethernet speed from 100baseT to 10baseT...
 
One curious thing is that on the Art-Net thread, someone said performance was way better when he switched his Macbook ethernet speed from 100baseT to 10baseT...

That's not unreasonable for a single-hop path. As noted, I measured that the Wiznet could receive at around 10 Mbit/s, so a 10baseT link would effectively rate-limit transmitters and match the Wiznet receive rate. That could have a noticeable effect on UDP performance. TCP should adapt to available bandwidth, but even Wiznet TCP apps might run slightly faster.

Multi-hop paths are harder to analyze and performance could vary depending on available buffer space at the bottleneck hop(s).

The Wiznet can receive some small number of packets at link speed. As I recall, one can configure the Wiznet buffers, so you might get additional performance for a given app/path.
 
Use the Wiznet buffers in a streaming fashion. Let the Wiznet processor have enough buffer to catch at least 2 MTU sized packets in its buffer. Your reads can be byte-streams not packet at a time.

If your data source is across the Internet on a cumulatively slow multi-hop path, on cellular, or on some other high-latency path, be aware that the Wiznet doesn't support TCP window size negotiation. This bit me on a cellular link where the delays were hundreds of milliseconds in the busy hour. The TCP ACKs arrived too late, and frequent needless retransmissions occurred. The cure was to disable windowing, at the expense of ideal throughput on long-delay paths.
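
To see why latency matters so much with a small fixed window: a TCP sender can have at most one receive window of unacknowledged data in flight per round trip, so throughput is bounded by window / RTT. A minimal sketch (the function name and the example numbers are illustrative assumptions, not measurements):

```cpp
#include <cassert>

// Upper bound on TCP throughput for a fixed receive window:
// at most one full window can be delivered per round-trip time.
inline double maxTcpBytesPerSecond(double windowBytes, double roundTripSeconds) {
    assert(roundTripSeconds > 0.0);
    return windowBytes / roundTripSeconds;
}
```

For example, a 4096-byte window over a 200 ms round trip cannot exceed about 20 KB/s, no matter how fast the link or the SPI bus is.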
 
The Wiznet can receive some small number of packets at link speed.

It should be able to do so, but can it really?

This is actually on my very long list of low priority stuff to investigate (so low I'll probably never get to it). I've seen conversations where some people insisted this might not actually work in practice, with Arduino's Ethernet library code.


As I recall, one can configure the Wiznet buffers, so you might get additional performance for a given app/path.

Indeed, on my port of the Ethernet library, I kept the number of sockets at only 4, so each would get double the buffer memory. Whether that actually helps performance is a good question... and one that will take probably a few solid days to investigate.

If only there were a lot more hours in every day!
 
Your reads can be byte-streams not packet at a time.

This, I can assure you, performs terribly with the Arduino Ethernet library. It matters little whether the microcontroller is fast. The Wiznet protocol simply has massive overhead. I have indeed tested this very thoroughly.

Each read operation has very substantial overhead. If you only read a single byte, it's incredibly slow. If you read in even modestly sized chunks, like 100 bytes, the performance is much improved. But it's horribly slow for the simple examples (which are common in the Arduino world) which read 1 byte at a time.
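
The difference can be sketched with a generic drain loop. It works with any type that has `available()` and `read(buf, len)`, which is the shape of Arduino's `Client` and `EthernetUDP` read calls; `FakeSource` below is a made-up stand-in for the hardware so the loop can run on a PC, and it only counts calls rather than modelling the real Wiznet register overhead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a socket holding `remaining` unread bytes.
struct FakeSource {
    std::size_t remaining;
    int available() const { return static_cast<int>(remaining); }
    int read(std::uint8_t* buf, std::size_t len) {
        const std::size_t n = len < remaining ? len : remaining;
        for (std::size_t i = 0; i < n; ++i) buf[i] = 0;
        remaining -= n;
        return static_cast<int>(n);
    }
};

// Drain everything available, handing each chunk to `sink`.
// Returns the number of read() calls made; each call is where the
// fixed per-request overhead is paid, so fewer calls is better.
template <typename Source, typename Sink>
std::size_t drainInChunks(Source& src, Sink&& sink,
                          std::uint8_t* buf, std::size_t bufSize) {
    assert(buf != nullptr && bufSize > 0);
    std::size_t readCalls = 0;
    while (src.available() > 0) {
        const int n = src.read(buf, bufSize);
        if (n <= 0) break;  // nothing delivered; stop rather than spin
        ++readCalls;
        sink(buf, static_cast<std::size_t>(n));
    }
    return readCalls;
}
```

Draining 1000 bytes with a 100-byte buffer takes 10 read calls; with a 1-byte buffer it takes 1000, and each of those pays the same fixed overhead.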

This too is on my very low priority list of stuff to someday improve. Unlike the multiple packet buffering issue, I have looked into this in great detail and I do have a plan for how to dramatically improve this. But it's so low on the priority list, seems unlikely I'll get to it until 2016 or maybe even 2017.
 
Re: The Wiznet can receive some small number of packets at link speed.

It should be able to do so, but can it really?

This is actually on my very long list of low priority stuff to investigate (so low I'll probably never get to it). I've seen conversations where some people insisted this might not actually work in practice, with Arduino's Ethernet library code.

I can't find my notes/experiments confirming that the wiz820io can receive at link speed (up to the configured stream buffer capacity). The implication was that the wiz chip could receive a few packets at link speed, not that you would get link speed through the SPI exchanges to the MCU. So a couple of back-to-back UDP packets would not be lost, or similarly, if the TCP window size was 2K, then a couple of MTU-sized TCP packets might be received on the wiz chip at link speed without loss.

i'll try to verify...
 
Re: wiz820io receiving at link speed

The W5200 chip has 32KB for socket buffers. In w5100.h Paul has MAX_SOCK_NUM 4, so each socket gets a 4KB receive buffer and a 4KB transmit buffer. For UDP, I can have my Linux box send a burst of four 1000-byte UDP packets to the wiz820io; the Linux C program reports it's sending at 80 Mbit/s. From the Teensy 3.1, I successfully receive the 4 packets. If I blast 5 packets, only 4 are received. If I blast 8 packets, only 5 are received. So the wiz chip can receive at link speed, if buffers are available. As noted above, the SPI data rate into the Teensy will be limited by the SPI CLK (12 MHz default, 24 MHz max). I measured the sustained (lossless) UDP 1000-byte packet receive rate at about 10 Mbit/s.
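
The burst result lines up with simple buffer arithmetic. A rough sketch follows; the function name is made up, and the 8 bytes of per-packet bookkeeping (sender IP, port, length) that the chip stores ahead of each UDP payload is an assumption based on how the Ethernet library parses received packets.

```cpp
#include <cassert>
#include <cstddef>

// How many back-to-back UDP packets fit in one socket's receive buffer
// if nothing is read out while the burst arrives.
// perPacketHeaderBytes: assumed per-packet info stored with each payload.
inline std::size_t udpPacketsThatFit(std::size_t rxBufferBytes,
                                     std::size_t payloadBytes,
                                     std::size_t perPacketHeaderBytes = 8) {
    assert(payloadBytes + perPacketHeaderBytes > 0);
    return rxBufferBytes / (payloadBytes + perPacketHeaderBytes);
}
```

With a 4096-byte buffer, `udpPacketsThatFit(4096, 1000)` is 4, matching the burst test. Under the same assumptions a burst of 530-byte packets fits 7 times, though whether that is what limits Artnet in practice would need measuring.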

wiznet TCP advertises an initial window of 4096 bytes.
 
manitou, that's good news !
So in case of Artnet is there something that can be done within the library to optimize speed ?
Basically Artnet clients send 530 bytes of data, usually at 44 times a second.

On my side here's what I do:

Code:
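const int MAX_BUFFER_ARTNET = 530;        // largest Art-Net packet expected, in bytes
uint8_t artnetPacket[MAX_BUFFER_ARTNET];  // receive buffer (declaration assumed, not shown in the original post)
Udp.read(artnetPacket, MAX_BUFFER_ARTNET);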
 
A stream of 530-byte packets at 44 per second is less than 1 Mbit/s (about 0.19 Mbit/s per universe), BUT if (1) the packets come in bursts, then the peak rate might exceed the buffer space on the Wiznet chip and you would experience dropped packets, or (2) the amount of work you are doing on the Teensy is keeping you from reading the UDP packets at an "acceptable rate". I don't have an Artnet "packet generator", but if you could plot/analyze the Artnet transmissions (rate, density ...), then you might know where to look for tuning points.

In w5100.h you could reduce the MAX sockets from 4 to 2; that might/should double the receive buffers, which might smooth out bursty packet arrivals.

You could also increase the SPI speed from 12 to 24 MHz (w5100.h, w5100.cpp).
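
For a feel of what the SPI clock alone costs, here is a lower-bound estimate. It counts only the raw bit time for the payload and ignores Wiznet command bytes, register accesses and library overhead, so real numbers will be worse; the function name is made up.

```cpp
#include <cassert>
#include <cstddef>

// Minimum time to clock `bytes` over SPI at `clockHz`, in microseconds.
// One bit per clock cycle; protocol and software overhead not included.
inline double spiTransferMicros(std::size_t bytes, double clockHz) {
    assert(clockHz > 0.0);
    return static_cast<double>(bytes) * 8.0 / clockHz * 1e6;
}
```

A 530-byte packet needs at least about 353 microseconds at 12 MHz and about 177 microseconds at 24 MHz.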
 
The amount of the Wiz 5xxx chip's RAM assigned to any one socket is configured by a command over the SPI interface. There's a hardware-defined upper limit. The lower limit is the MTU size in use, or, if you're brave, the expected max IP packet size (going below the MTU is not a good idea).
Changing the number of sockets thus relies on driver code to recalculate the buffer size, which it may not do. Instead, it may rely on code or constants that define the buffer size for each socket, and the RX and TX sizes for a socket may differ.
 
...thus relies on some driver code to calculate the buffer size - which it may not do. Instead, it may rely on code or constants to define the buffer size for each socket, and the RX and TX for that socket may differ.

Yep, you're right, w5100.cpp sets the buffer size to 4096

Code:
...
    CH_BASE = 0x4000;
    SSIZE = 4096;
    SMASK = 0x0FFF;
    TXBUF_BASE = 0x8000;
    RXBUF_BASE = 0xC000;
    for (i=0; i<MAX_SOCK_NUM; i++) {
      writeSnRX_SIZE(i, SSIZE >> 10);
      writeSnTX_SIZE(i, SSIZE >> 10);
    }
    for (; i<8; i++) {
      writeSnRX_SIZE(i, 0);
      writeSnTX_SIZE(i, 0);
    }
...
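
If someone wanted the sizes to follow MAX_SOCK_NUM instead of the fixed 4096, the calculation might look like the sketch below. This is not code from the library. It assumes the 32 KB is split as 16 KB RX plus 16 KB TX and that per-socket sizes must be powers of two in KB; `SocketBufferLayout` and `layoutFor` are hypothetical names. Other parts of the driver may also assume 4096 (address calculations, for instance), so this alone is probably not a complete change.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-socket values corresponding to SSIZE, SMASK and the
// argument passed to writeSnRX_SIZE / writeSnTX_SIZE in the snippet above.
struct SocketBufferLayout {
    std::uint16_t sizeBytes;       // SSIZE
    std::uint16_t mask;            // SMASK = SSIZE - 1
    std::uint8_t  sizeRegisterKB;  // SSIZE >> 10
};

// Largest power-of-two buffer such that maxSockNum sockets fit in 16 KB.
inline SocketBufferLayout layoutFor(unsigned maxSockNum) {
    assert(maxSockNum >= 1 && maxSockNum <= 8);
    const unsigned totalPerDirection = 16u * 1024u;  // assumed RX (or TX) pool
    unsigned size = 1024u;
    while (size * 2u * maxSockNum <= totalPerDirection) {
        size *= 2u;
    }
    SocketBufferLayout layout;
    layout.sizeBytes = static_cast<std::uint16_t>(size);
    layout.mask = static_cast<std::uint16_t>(size - 1u);
    layout.sizeRegisterKB = static_cast<std::uint8_t>(size >> 10);
    return layout;
}
```

`layoutFor(4)` reproduces the constants above (4096, 0x0FFF, 4), and `layoutFor(2)` gives 8192-byte buffers with a mask of 0x1FFF.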
 
Some driver code derived from the gawd-awful Microchip boards just slapped the same concept in a Wiznet version.
 
Hello,

I have the same problem with Artnet over UDP: the WIZ820io doesn't push more than 7 universes on 100MBit.
I have to push 11 universes. That works fine on 10MBit but since I have about 100 universes in total, I need a 100MBit connection.

Is there any solution to the big overhead of the ethernet library?
Or is there any faster library?

Thanks,
Birk.
 
need proper driver with block buffered I/O.

what does it mean?
Is there such a driver around?

I use the Ethernet and EthernetUdp Library from Paul's GitHub right now.
Any other libraries with block buffered I/O available?
 
what does it mean?
Is there such a driver around?

Without specifics this probably doesn't mean much of anything.

Is there any solution to the big overhead of the ethernet library?
Or is there any faster library?

As you can see on github, I've put quite a bit of work into some modest improvements, mostly to avoid redundant reading of the per-socket registers. In a month or two, I'm going to work on the library again to add W5500 support and probably more incremental performance improvements.

I have the same problem with Artnet over UDP: the WIZ820io doesn't push more than 7 universes on 100MBit.
I have to push 11 universes. That works fine on 10MBit but since I have about 100 universes in total, I need a 100MBit connection.

When I work on the ethernet lib again, I'd like to test this case. But I don't have any artnet software, nor any experience with artnet - other than reading some E1.31 protocol specs.

Presumably you're doing something with the incoming data, like retransmitting it to addressable LEDs. That matters. Testing and optimizing for only receive data to Teensy's memory isn't enough. I'd really like to come up with a way I can conveniently test this while working on the library.... but how? Any ideas?
 
When I work on the ethernet lib again, I'd like to test this case. But I don't have any artnet software, nor any experience with artnet - other than reading some E1.31 protocol specs.

Presumably you're doing something with the incoming data, like retransmitting it to addressable LEDs. That matters. Testing and optimizing for only receive data to Teensy's memory isn't enough.

Yes, I do transmit the data to LEDs with the FastLED library. But even when I just receive the Artnet packets and print them out, as done in the library https://github.com/natcl/Artnet in the example https://github.com/natcl/Artnet/tree/master/examples/ArtnetReceive , I have the same speed problems. So the retransmitting to LEDs doesn't matter in this case.

I'd really like to come up with a way I can conveniently test this while working on the library.... but how? Any ideas?

I'll try to write you a Processing sketch later that simply sends out Artnet data on 100 universes.
 