Slow performance with Arduino Ethernet shield

Status
Not open for further replies.

elnino

Member
Hi all,
I have a project that uses the Arduino Ethernet shield (W5100) to control WS2812B RGB LEDs via the sACN/E1.31 lighting protocol. This works exceptionally well on a normal Arduino, but the number of LEDs I can control is seriously limited by the 2k of SRAM on the Arduino.

I bought a Teensy 3.1 thinking faster processor + 64k SRAM = many more LEDs that I could control, but so far I have been disappointed with the performance, i.e. I can't even get the Teensy 3.1 to perform as well as a stock Arduino controlling about 240 LEDs with the same code.

I have it connected as per: http://www.epyon.be/2013/07/06/using-the-teensy-3-0-with-the-arduino-ethernet-shield/ with the exception of IOREF since the 3.1 is 5v tolerant and that part is working fine.

The Teensy 3.1 works fine running about 1500 LEDs by itself, no problems there at all, but when using Ethernet it's as if the SPI bus is 'lagging', can't receive the data fast enough, and is missing packets. If I slow the transmission of the packets to one every ~100 ms, it seems to work better, but at 20 ms (where the Arduino is fine) it basically will only process the first packet each time.

Is there a setting that defines the SPI bus speed or is there an alternative UDP library for Teensy that I am not using?

This is the working code on arduino:
Code:
#include <SPI.h>
#include <Ethernet.h>
#include <EthernetUdp.h>
#include "FastLED.h"

//*********************************************************************************

// enter desired universe and subnet  (sACN first universe is 1)
#define DMX_SUBNET 0
#define DMX_UNIVERSE 15 //**Start** universe

// Set a different MAC address for each...
byte mac[] = { 0x74, 0x69, 0x69, 0x2D, 0x30, 0x18 };

// Uncomment if you want to use static IP
//*******************************************************
// ethernet interface ip address
IPAddress ip(10, 0, 0, 18);  //IP address of ethernet shield
//*******************************************************

EthernetUDP Udp;

// By sacrificing some of the Ethernet receive buffer, we can allocate more to the LED array
// but this is **technically** slower because 2 packets must be processed for all 240 pixels.

/// DONT CHANGE unless you know the consequences...
 #define ETHERNET_BUFFER 540 
 #define CHANNEL_COUNT 360 //because it divides by 3 nicely
 #define NUM_LEDS 240 // can not go higher than this - Runs out of SRAM on Arduino
 #define UNIVERSE_COUNT 2
 #define LEDS_PER_UNIVERSE 120

// The pin the data line is connected to for WS2812b
#define DATA_PIN 7

//********************************************************************************

// Define the array of leds
CRGB leds[NUM_LEDS];

unsigned char packetBuffer[ETHERNET_BUFFER];

void setup() {
  // Using different LEDs or colour order? Change here...
  // ********************************************************
     FastLED.addLeds<WS2812B, DATA_PIN, GRB>(leds, NUM_LEDS);  
  // ********************************************************

 
  // ********************************************************  
  Ethernet.begin(mac,ip);
  Udp.begin(5568);
  // ******************************************************** 
}

void loop() {
   //Process packets
   int packetSize = Udp.parsePacket(); //Check for a UDP packet; returns its size in bytes (0 if none)
   if(packetSize){
    Udp.read(packetBuffer,ETHERNET_BUFFER); //read UDP packet
    int count = checkACNHeaders(packetBuffer, packetSize);
    if (count) {
      sacnDMXReceived(packetBuffer, count); //process data function
    }
  }
}


void sacnDMXReceived(unsigned char* pbuff, int count) {
  if (count > CHANNEL_COUNT) count = CHANNEL_COUNT;
  byte b = pbuff[113]; //DMX Subnet
  if ( b == DMX_SUBNET) {
    b = pbuff[114];  //DMX Universe
    if ( b >= DMX_UNIVERSE && b < DMX_UNIVERSE + UNIVERSE_COUNT ) {  // '<' not '<=': one universe past the end would write beyond leds[]
      if ( pbuff[125] == 0 ) {  //start code must be 0
      int ledNumber = (b - DMX_UNIVERSE) * LEDS_PER_UNIVERSE;
       // sACN packets come in separate RGB but we have to set each led's RGB value together
       // this 'reads ahead' for all 3 colours before moving to the next led.
       //Serial.println("*");
       for (int i = 126;i < 126+count;i = i + 3){
          byte charValueR = pbuff[i];
          byte charValueG = pbuff[i+1];
          byte charValueB = pbuff[i+2];
          leds[ledNumber].setRGB(charValueR,charValueG,charValueB);
          ledNumber++;
        }
      FastLED.show();  //Do it!
      }
    }
  }
}

int checkACNHeaders(unsigned char* messagein, int messagelength) {
  //Do some VERY basic checks to see if it's an E1.31 packet.
  //Bytes 4 to 12 of an E1.31 Packet contain "ASC-E1.17"
  //Only checking for the A and the 7 in the right places as well as 0x10 as the header.
  //Technically this is outside of spec and could cause problems but its enough checks for us
  //to determine if the packet should be tossed or used.
  //This improves the speed of packet processing as well as reducing the memory overhead.
  //On an Isolated network this should never be a problem....
  if ( messagein[1] == 0x10 && messagein[4] == 0x41 && messagein[12] == 0x37) {	
      int addresscount = messagein[123] * 256 + messagein[124]; // number of values plus start code
      return addresscount -1; //Return how many values are in the packet.
    }
  return 0;
}
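
If the shortcut check above ever lets a stray packet through, a stricter variant is cheap to write. The following is a minimal sketch in plain C++, assuming the standard E1.31 layout already used in the code above (preamble size 0x0010 in bytes 0 and 1, the identifier "ASC-E1.17" at bytes 4 to 12, property value count in bytes 123 and 124, start code at byte 125). The name checkAcnHeaderStrict is hypothetical and not part of any library.

```cpp
#include <cstdint>
#include <cstring>

// Returns the number of DMX channel values in an E1.31 data packet,
// or 0 if the packet fails any of the checks below.
int checkAcnHeaderStrict(const uint8_t* msg, int len) {
    static const char kAcnId[] = "ASC-E1.17";        // 9 bytes at offsets 4..12
    if (len < 126) return 0;                         // too short to hold a start code
    if (msg[0] != 0x00 || msg[1] != 0x10) return 0;  // preamble size must be 0x0010
    if (std::memcmp(msg + 4, kAcnId, 9) != 0) return 0;
    int propertyCount = msg[123] * 256 + msg[124];   // channel values plus start code
    // The property values begin at offset 125, so they must fit inside the packet.
    if (propertyCount < 1 || 125 + propertyCount > len) return 0;
    return propertyCount - 1;                        // exclude the start code
}
```

Compared with the three-byte check, this also rejects packets whose declared value count runs past the bytes actually received, which keeps the copy loop from reading stale buffer contents.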
 
I've added this to my list of issues to investigate. Any idea what I should use to send UDP packets to this program?

Usually the W5200 chip is used with Teensy 3.1, typically as the WIZ820io module. It performs much better than the W5100 chip.

But the W5100 chip is supported, and it should run at a similar speed as regular Arduino. I don't know why you're seeing such different performance.
 
I have had the same issue with the UDP library. I traced it down to the Udp.flush() call inside parsePacket(). The flushing took an astonishing 34 ms, while all the physical receiving took only some tens of microseconds.
I have found a working (if preliminary) solution by changing the default flush() function to:

Code:
void EthernetUDP::flush()
{
    byte test[_remaining];
    read(test,_remaining);
    _remaining = 0;
}

Maybe that is of help to someone. For me the thing sped up significantly while still receiving correct data.
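
One caveat with the workaround above is that the scratch array is sized by _remaining, which can be a whole packet and is a lot of stack on a 2k AVR. A variant that drains in fixed-size chunks avoids that. This is only a host-side sketch in plain C++: drainInChunks and CountingSource are hypothetical names, the 64-byte chunk size is an arbitrary assumption, and the only thing assumed about the source is a read(buffer, size) method that returns the number of bytes read.

```cpp
#include <cstddef>
#include <cstdint>

// Drains up to `remaining` bytes from any source exposing
// size_t read(uint8_t*, size_t), using a small fixed scratch buffer
// instead of a variable-length array. Returns the bytes actually drained.
template <typename Source>
size_t drainInChunks(Source& src, size_t remaining) {
    uint8_t scratch[64];                 // chunk size is an arbitrary assumption
    size_t drained = 0;
    while (remaining > 0) {
        size_t want = remaining < sizeof(scratch) ? remaining : sizeof(scratch);
        size_t got = src.read(scratch, want);
        if (got == 0) break;             // source ran dry early
        drained += got;
        remaining -= got;
    }
    return drained;
}

// Hypothetical stand-in for the UDP socket: pretends to hold `left` unread bytes
// and counts how many read() calls were made.
struct CountingSource {
    size_t left;
    size_t calls;
    explicit CountingSource(size_t n) : left(n), calls(0) {}
    size_t read(uint8_t* dst, size_t n) {
        if (n > left) n = left;
        for (size_t i = 0; i < n; ++i) dst[i] = 0;
        left -= n;
        ++calls;
        return n;
    }
};
```

Draining 150 bytes this way takes three block reads (64 + 64 + 22) rather than 150 single-byte reads, while the stack cost stays at 64 bytes regardless of packet size.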
 
I appreciate the input, steckel, but it did not make any difference for me. I changed my code to use OctoWS and it is a little better, but still quite laggy. I think the problem is purely that the Ethernet module cannot handle receiving the packets all at once. The best I could get at even a moderate (~15 fps) speed was 480 LEDs. I even tried a W5500 (which is supposed to have 6 pipes) and it was no better. Ironically, an ENC28J60 module outperformed the Wiznet parts. I was really hoping I would be able to control all of the strips with one MCU, but I will probably have to go back to using 6 individual controllers with 240 LEDs each.
 
I've used the Wiznet modules. Their speed depends on how your code and driver handle the sockets, DMA vs. polling, etc.
Your reference to 'pipes' probably means the sockets: each socket has RAM on the chip for TX and for RX, and for each, the RAM is larger than the default IP packet MTU.
It works best if your code implements a byte stream rather than handling one packet at a time.
Lots of the drivers and code from Arduino-land are rather quick-and-dirty, intended to toss something together for low packet rates and small average packet sizes.

The Wiznet chips/modules are very good in my experience. The TCP/IP/UDP protocol stack off-load from the MCU is very beneficial.
 
Just a quick followup to this old thread, since I've been working recently on Ethernet library improvements.

Quoting the earlier reply: "I have had the same issue with the UDP lib. Traced it down to the Udp.flush() function in the parsePacket() function. The flushing took an astonishing 34ms while all the physical receiving took only some tens of microseconds."

Indeed, the code in parsePacket() which discards any previously unread data was horrible. It would fetch the data 1 byte at a time. Each byte would require re-reading multiple 16-bit socket registers from the chip. Then each single-byte transfer would go through the command-check process to update the Wiznet chip's buffer pointers. So horribly inefficient.

A little over 1 year ago, I changed the socket layer to cache those registers. That eliminates a huge amount of the SPI overhead to re-read them over and over. But the command overhead was still being done.

Yesterday I changed the read(buffer, size) function to allow a NULL pointer. It simply discards the data, but goes through the process of updating the socket state in the chip the same as if you had read all the data. It's much more efficient than even the workaround in msg #3, since it skips all the SPI communication to actually read the data bytes from the Wiznet chip and only does the work of updating the buffer pointers with a single command to the Wiznet chip.
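
To illustrate the idea of a read that accepts a NULL pointer, here is a minimal host-side model in plain C++. It is not the library's code: RxSocketModel, its fields, and the 2048-byte buffer size are all hypothetical, and the counters exist only to show that a discard moves the read pointer without transferring any data bytes.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for one socket's receive buffer on the chip.
struct RxSocketModel {
    uint8_t data[2048];         // socket RX memory (size is an assumption)
    size_t  readPtr = 0;        // offset where the next read starts
    size_t  available = 0;      // unread bytes waiting in the buffer
    size_t  bytesCopied = 0;    // data bytes actually transferred to the caller
    size_t  pointerUpdates = 0; // times the read pointer was committed

    // Reads up to `size` bytes into `dst`. If `dst` is nullptr the bytes are
    // discarded: nothing is copied, only the read pointer advances.
    size_t read(uint8_t* dst, size_t size) {
        if (size > available) size = available;
        if (dst != nullptr) {
            for (size_t i = 0; i < size; ++i)
                dst[i] = data[(readPtr + i) % sizeof(data)];
            bytesCopied += size;
        }
        readPtr = (readPtr + size) % sizeof(data);
        available -= size;
        ++pointerUpdates;       // one commit per call, however many bytes
        return size;
    }
};
```

In this model, skipping the unread tail of a packet with read(nullptr, available) costs one pointer commit and zero byte transfers, whereas a byte-at-a-time discard would cost one commit and one transfer per byte.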

I've updated UDP parsePacket() to use this new efficient NULL block read. I also made several updates in DHCP & DNS to use it, where the same inefficient 1-byte-at-a-time reading was being used only to skip past unwanted data.

This code is on GitHub now. Soon it will become the official Ethernet library for Teensy and Arduino (yes, they're going to accept it soon....) so everyone using any Arduino board will get these updates. :)

https://github.com/PaulStoffregen/Ethernet
 