
Thread: Teensy 3.6 and WIZ820io low performance/slow transfer speed

  1. #1
    Junior Member
    Join Date
    Jun 2017
    Posts
    12


    Hey all,

    I'm working on a large scale LED project with about 190000 LEDs driven by various Teensy 3.6 boards with the OctoWS2811 library each driving 8600 LEDs. This is working very well at max 29fps.
    Now, the LED displays are distributed across different places on a site. Four Teensies are grouped together and drive one wall. Those groups sync over Ethernet due to the long distances between them. But I also need to transfer the videos to them via Ethernet if I want to change the content (not in real time). The videos are about 5 to 10 minutes long, and the Teensies should store them on their µSD cards. That means one Teensy needs to receive about 194 MB for a 5-minute video in the OctoWS2811 binary format. I was hoping that the WIZ820io or WIZ850io could handle this, but the transfer performance is very slow: I only get about 66 KByte/s, and so far I couldn't find any hints about what is going wrong. I also don't understand how I could try different SPI libs, as suggested in other, quite old posts.
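
    For scale, a quick back-of-the-envelope calculation shows what these rates mean for a 194 MB file. This is only a sketch using decimal units (1 MB = 1000 KByte); the function names are made up for illustration.

```python
def transfer_minutes(size_mbyte, rate_kbyte_per_s):
    """Minutes needed to move size_mbyte at rate_kbyte_per_s (decimal units)."""
    seconds = size_mbyte * 1000.0 / rate_kbyte_per_s
    return seconds / 60.0


def mbit_to_kbyte_per_s(rate_mbit_per_s):
    """Convert Mbit/s to KByte/s (1 Mbit = 1000 kbit, 8 bits per byte)."""
    return rate_mbit_per_s * 1000.0 / 8.0


if __name__ == "__main__":
    # Measured rate from the post: about 66 KByte/s
    print(round(transfer_minutes(194, 66), 1), "min at 66 KByte/s")
    # Rate the project needs: 10 Mbit/s
    print(round(transfer_minutes(194, mbit_to_kbyte_per_s(10)), 1), "min at 10 Mbit/s")
```

    At 66 KByte/s one video takes roughly 49 minutes per Teensy; at 10 Mbit/s it would take under 3 minutes.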

    I'm working with Arduino 1.8.2 with Teensyduino 1.36. I connected the WIZ820io directly to the Teensy 3.6 (no adapter).

    I reduced the whole program to a test sketch based on the DHCP Chat Server example. The video is sent by a Python script.

    Teensy sketch
    Code:
    #include <SPI.h>
    #include <Ethernet.h>
    
    // Enter a MAC address and IP address for your controller below.
    // The IP address will be dependent on your local network.
    // gateway and subnet are optional:
    byte mac[] = {
      0x00, 0xAA, 0xBB, 0xCC, 0xDE, 0x02 //0xC3, 0x23, 0x6B, 0xAC, 0xDE, 0x12
    };
    IPAddress ip(192, 168, 1, 177);
    
    // listen to port 60000
    EthernetServer server(60000);
    boolean gotAMessage = false; // whether or not you got a message from the client yet
    
    void setup() {
      pinMode(9, OUTPUT);
      digitalWrite(9, LOW);    // begin reset the WIZ820io
      delay(100);
      digitalWrite(9, HIGH);   // end reset pulse
      
      // Open serial communications and wait for port to open:
      Serial.begin(9600);
    
      // start the Ethernet connection:
      Serial.println("Trying to get an IP address using DHCP");
      if (Ethernet.begin(mac) == 0) {
        Serial.println("Failed to configure Ethernet using DHCP");
      }
      // print your local IP address:
      Serial.print("My IP address: ");
      ip = Ethernet.localIP();
      for (byte thisByte = 0; thisByte < 4; thisByte++) {
        // print the value of each byte of the IP address:
        Serial.print(ip[thisByte], DEC);
        Serial.print(".");
      }
      Serial.println();
      // start listening for clients
      server.begin();
    
    }
    
    void loop() {
      // wait for a new client:
      EthernetClient client = server.available();
    
      // when the client sends the first byte, say hello:
      if (client) {
        if (!gotAMessage) {
          Serial.println("We have a new client");
          client.println("Hello, client!");
          gotAMessage = true;
        }
    
        // read the bytes incoming from the client:
        char thisChar = client.read();
    
        Ethernet.maintain();
      }
    }
    Python Sender
    Code:
    import socket # Import socket module
    import time
    
    s = socket.socket() # Create a socket object
    host = "192.168.188.116" # IP address of the Teensy (as assigned by DHCP)
    port = 60000 # Port the Teensy sketch listens on
    
    s.connect((host, port))
    s.send("Hello server!")
    
    while True:
        filename='VIDEO.BIN'
        f = open(filename,'rb')
        l = f.read(1024)
        bs = 0;
        start = time.time()
        while (l):
            s.send(l)
            bs+=1024
            if(time.time() - start > 1.0):
                print('Byte/s: ', bs)
                bs = 0
                start = time.time()
            l = f.read(1024)
        print()
        f.close()
    
        print('Done sending')
        s.send('Thank you for receiving!')
        s.close()
    I really need AT LEAST 10MBit transfer rates to make this whole thing practical.
    If this is not possible via WIZ820/850io could you make any suggestions for a board to make the transfer and then using Video2Serial to send it to the 4 grouped teensies? Or any other suggestions to make this work?!

    Thanks for your help!

  2. #2
    Senior Member+ manitou's Avatar
    Join Date
    Jan 2013
    Posts
    2,593
    You should set up a 1024-byte buffer on the Teensy and do multi-byte receives in loop(), something like:

    Code:
    #define RECLTH 1024
    uint8_t buf[RECLTH];
    ...
    
        while(client.connected()) {
          if ((n=client.available()) > 0) {
            if (n > RECLTH)  n = RECLTH;
            client.read(buf,n);
            bytes += n;
          }
        }
    You also need to look at the Ethernet lib and set the SPI clock to 24 MHz or 30 MHz.

    Doing prints and Ethernet.maintain() inside your receive loop will only slow things down.

    You may still be limited by how fast the Python script can read the data file and put bytes out on the ether, and/or by how fast you can write to the µSD.
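
    One way to rule out the sender is to run the same kind of send loop against a plain TCP sink on the PC itself. The sketch below is only an illustration: the helper names are made up, and a loopback test measures the Python loop and the local TCP stack, not the network card or the cabling.

```python
import socket
import threading
import time


def run_sink(server_sock, result):
    """Accept one connection and count bytes until the peer closes."""
    conn, _ = server_sock.accept()
    total = 0
    while True:
        chunk = conn.recv(65536)
        if not chunk:
            break
        total += len(chunk)
    conn.close()
    result["bytes"] = total


def measure_loopback(payload, chunk_size=1024):
    """Send payload to a local sink in chunk_size pieces; return (bytes, seconds)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    result = {}
    sink = threading.Thread(target=run_sink, args=(server, result))
    sink.start()

    sender = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sender.connect(server.getsockname())
    start = time.time()
    for offset in range(0, len(payload), chunk_size):
        sender.sendall(payload[offset:offset + chunk_size])
    sender.close()  # closing signals end-of-stream to the sink
    sink.join()
    elapsed = time.time() - start
    server.close()
    return result["bytes"], elapsed


if __name__ == "__main__":
    received, seconds = measure_loopback(b"\x00" * (8 * 1024 * 1024))
    print("received", received, "bytes in", round(seconds, 3), "s")
```

    If this loop already runs far faster than the rate seen with the Teensy, the file-reading and sending code on the PC is unlikely to be the limit.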
    Last edited by manitou; 06-30-2017 at 06:57 PM.

  3. #3
    Senior Member
    Join Date
    Jan 2013
    Posts
    843
    15 Mbit/s should be doable:
    https://github.com/manitou48/DUEZoo/...er/wizperf.txt

    loop() is rather slow to begin with, and you are doing a ton of pointless stuff inside it. You are only reading a single byte per loop() execution.

    Get your EthernetClient once and keep it around. Using read() to get a single character at a time has tons of overhead; use a decently sized buffer to read a block of data:
    https://github.com/PaulStoffregen/Et...Client.ino#L84

  4. #4
    Senior Member+ Frank B's Avatar
    Join Date
    Apr 2014
    Location
    Germany NRW
    Posts
    7,003
    You might want to take care of the speed of SD writes, too.

  5. #5
    Junior Member
    Join Date
    Jun 2017
    Posts
    12
    Ah, okay, I picked the "wrong" example and didn't look carefully enough at what is going on. Thanks for your fast replies!

    So I optimized my code as manitou suggested and set the SPI clock to 30 MHz. That helped quite a bit. I now get 400 KByte/s, or 3.2 Mbit/s.

    Code:
    #include <SPI.h>
    #include <Ethernet.h>
    #include "TeensyID.h"
    
    #define RECLTH 1024
    uint8_t buf[RECLTH];
    unsigned int n = 0;
    
    uint8_t mac[6];
    IPAddress ip(0, 0, 0, 0);
    
    // listen to port 60000
    EthernetServer server(60000);
    EthernetClient client;
    boolean gotAMessage = false; // whether or not you got a message from the client yet
    
    void setup() {
      pinMode(9, OUTPUT);
      digitalWrite(9, LOW);    // begin reset the WIZ820io
      delay(100);
      digitalWrite(9, HIGH);   // end reset pulse
      
      // Open serial communications and wait for port to open:
      Serial.begin(9600);
    
      // read the burned in MAC address
      teensyMAC(mac);
      Serial.printf ("MAC Address: %02X:%02X:%02X:%02X:%02X:%02X \n", mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
    
      // start the Ethernet connection:
      Serial.println("Trying to get an IP address using DHCP");
      if (Ethernet.begin(mac) == 0) {
        Serial.println("Failed to configure Ethernet using DHCP");
      }
      
      // print your local IP address:
      Serial.print("My IP address: ");
      ip = Ethernet.localIP();
      for (byte thisByte = 0; thisByte < 4; thisByte++) {
        // print the value of each byte of the IP address:
        Serial.print(ip[thisByte], DEC);
        Serial.print(".");
      }
      Serial.println();
      
      // start listening for clients
      server.begin();
    }
    
    void loop()
    {
      Ethernet.maintain();
      // wait for a new client:
      client = server.available();
    
      // when the client sends the first byte, say hello:
      if (client)
      {
        if (!gotAMessage)
        {
          Serial.println("We have a new client");
          client.println("Hello, client!");
          gotAMessage = true;
        }
    
        while(client.available())
        {
            n = client.available();
            //Serial.println(n);
            if (n > 0)
            {
                if (n > RECLTH)  n = RECLTH;
                client.read(buf,n);
            }
        }
      }
    }
    I also tested the example WebClient sketch tni linked to. But with that sketch I only get 20 KByte/s?!

    Regarding my 400 KByte/s: according to the linked performance test file, that speed is what I would expect from a 4 MHz SPI clock?! Am I changing the clock in the right file?

    /Applications/Arduino_1.8.3.app/Contents/Java/hardware/teensy/avr/libraries/Ethernet/w5100.h
    I need to change it like this, right?

    Code:
    // Safe for all chips
    //#define SPI_ETHERNET_SETTINGS SPISettings(14000000, MSBFIRST, SPI_MODE0)
    
    // Safe for W5200 and W5500, but too fast for W5100
    // uncomment this if you know you'll never need W5100 support
    #define SPI_ETHERNET_SETTINGS SPISettings(30000000, MSBFIRST, SPI_MODE0)

    Regarding the µSD write speeds, I was quite optimistic based on my experience with the read speeds of Bill's SdFat lib and the test results mentioned in this thread:
    https://forum.pjrc.com/threads/36737...Teensy-3-5-3-6
    But it's a good point to check. I will write a benchmark with my setup...

  6. #6
    Junior Member
    Join Date
    Jun 2017
    Posts
    12
    I forgot to mention that I upgraded to Arduino 1.8.3 and Teensyduino 1.37 this morning...
    Last edited by JanGee; 07-01-2017 at 10:40 AM. Reason: typo

  7. #7
    Senior Member
    Join Date
    Jan 2013
    Posts
    843
    Quote Originally Posted by JanGee View Post
    I also tested the example WebClient Sketch tni linked to. But with that sketch I only get 20kbyte/sec?!
    The intent was to show the buffered read, the same thing manitou posted. You wouldn't want to include the part where the buffer is written to Serial.

  8. #8
    Senior Member+ KurtE's Avatar
    Join Date
    Jan 2014
    Posts
    7,872
    Warning: I have not done much with any of these adapters...

    But as @manitou mentioned, I would check both sides of the equation. At least in my limited Python usage (some ROS modules running on ARM processors), the Python modules did not perform very well, or rather, in my case, they ate a lot more CPU resources than I wanted.

    I would also be curious about things like this in the code:
    Code:
        while(client.available())
        {
            n = client.available();
            //Serial.println(n);
            if (n > 0)
            {
                if (n > RECLTH)  n = RECLTH;
                client.read(buf,n);
            }
        }
    What kind of values are you getting back from client.available()? Also, from my quick look through the library, each of those calls creates an SPI transaction, so I would tend to minimize those kinds of calls. Maybe more like:
    Code:
        while((n = client.available()))
        {
            //Serial.println(n);
            while (n > 0)
            {
                // read at most one buffer's worth per call
                uint16_t read_length = (n > RECLTH) ? RECLTH : n;
                client.read(buf, read_length);
                n -= read_length;
            }
        }
    Or maybe see if I can avoid it completely, something like:
    Code:
        while((n = client.read(buf, RECLTH)))
        {
            // do something with what you read...
            //Serial.println(n);
        }


    I would also throw in a dummy yield() function, so you don't have that additional overhead:
    Code:
    void yield() {
    }

  9. #9
    Junior Member
    Join Date
    Jun 2017
    Posts
    12
    Quote Originally Posted by tni View Post
    The intent was to show the buffered read, the same thing manitou posted. You wouldn't want to include the part where the buffer is written to Serial.
    Yes, got that. I was just astonished that the examples are so inefficient. Even my single-byte read approach at the beginning was 3x faster. I also read in some other threads that Paul wanted to optimize the Ethernet lib for single-byte reads because people use them so often. Now I wonder if it would be more effective to optimize the examples instead, to teach best practices for stable, high performance and for using things "right"?! I am not experienced with low-level hardware programming, so I try to understand the examples and learn from them. Don't get me wrong, this is not criticism. I just want to give some feedback about the traps I fell into as a Teensy noob.

    @KurtE: Thanks for the suggestions. It didn't make a difference. What is that yield() function about? The output from my version looks like this:

    Code:
    MAC Address: 04:E9:E5:04:D6:C1 
    Trying to get an IP address using DHCP
    My IP address: 192.168.188.119.
    We have a new client
    13
    1024
    1011
    1037
    13
    1460
    436
    1460
    436
    1460
    436
    1460
    436
    ...
    I had already wondered about the alternating value 436. After changing RECLTH to 2048, the output is:

    Code:
    MAC Address: 04:E9:E5:04:D6:C1 
    Trying to get an IP address using DHCP
    My IP address: 192.168.188.119.
    We have a new client
    13
    1460
    575
    1460
    1460
    1460
    1460
    1460
    1460
    1460
    1460
    ...
    I just have to pick up the WIZ850io, which arrived today, and will see if it makes a difference. I will also investigate the Python side and do some performance tests. In the production system I want to go down to C level anyway, so maybe this is the right time to start.
    Last edited by JanGee; 07-01-2017 at 04:33 PM. Reason: forgot to ask about yield function

  10. #10
    Senior Member+ manitou's Avatar
    Join Date
    Jan 2013
    Posts
    2,593
    My suggested snippet above used while(client.connected()) (not available()); I'm not sure if that would improve anything. FWIW, the Python script on my Linux desktop can put data on the ether at a rate of 88 Mbit/s, so it wouldn't be a bottleneck. I removed the prints in the Python transmit loop.

    1460 bytes is the max TCP segment size, so a 2048-byte buffer should help.
    Last edited by manitou; 07-01-2017 at 04:48 PM.

  11. #11
    Junior Member
    Join Date
    Jun 2017
    Posts
    12
    Quote Originally Posted by manitou View Post
    My suggested snippet above used while(client.connected()) (not available()); I'm not sure if that would improve anything. FWIW, the Python script on my Linux desktop can put data on the ether at a rate of 88 Mbit/s, so it wouldn't be a bottleneck. I removed the prints in the Python transmit loop. 1460 bytes is the max TCP segment size, so a 2048-byte buffer should help.
    Thanks for checking. I removed the prints too. I couldn't get your snippet working yesterday, but now it worked right away. Maybe it was too late at night. With all three versions of the receiving code on the Teensy side I get equivalent results. The Serial.println() does not have a significant effect:

    Code:
        // VERSION 1
        while((n = client.available()))
        {
            Serial.println(n);
            if (n > 0)
            {
                if (n > RECLTH)  n = RECLTH;
                client.read(buf,n);
            }
        }
    
        // VERSION 2 (manitou)
        while(client.connected()) {
          if ((n=client.available()) > 0) {
            if (n > RECLTH)  n = RECLTH;
            client.read(buf,n);
          }
        }
        
        // VERSION 3 (KurtE)
        while((n = client.read(buf, RECLTH)))
        {
            // do something with what you read...
            //Serial.println(n);
        }
    Python sender output
    Code:
    VERSION 1
      ('FileSize (Mb): ', 25.8)
      ('Transfer time (min): ', 1.1169624169667562)
      ('KByte/s: ', 384.97247061762727)
      ('MByte/s: ', 0.3849722816190169)
    
    VERSION 2 (manitou)
      ('FileSize (Mb): ', 25.8)
      ('Transfer time (min): ', 1.1094992319742838)
      ('KByte/s: ', 387.5620748930524)
      ('MByte/s: ', 0.3875618944475407)
    
    VERSION 3 (KurtE)
      ('FileSize (Mb): ', 25.8)
      ('Transfer time (min): ', 1.1069506486256917)
      ('KByte/s: ', 388.4543709256919)
      ('MByte/s: ', 0.3884541966205364)
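
    The reporting code behind these figures is not shown in the thread. A minimal sketch of how such numbers could be derived from a byte count and an elapsed time is given below; the function name and the choice of decimal units (1 KByte = 1000 bytes) are assumptions made for illustration.

```python
def throughput_report(num_bytes, seconds):
    """Return transfer rates in several units, using decimal prefixes."""
    kbyte_per_s = num_bytes / 1000.0 / seconds
    return {
        "KByte/s": kbyte_per_s,
        "MByte/s": kbyte_per_s / 1000.0,
        "KBit/s": kbyte_per_s * 8.0,
        "MBit/s": kbyte_per_s * 8.0 / 1000.0,
    }


if __name__ == "__main__":
    # Example values only: 1,000,000 bytes sent in 2 seconds
    for unit, value in throughput_report(1000000, 2.0).items():
        print(unit, value)
```

    Converted this way, the roughly 385 KByte/s reported above corresponds to about 3.1 Mbit/s, so all three receive loops are far below the 10 Mbit/s target.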

  12. #12
    Junior Member
    Join Date
    Jun 2017
    Posts
    12
    Tested with the WIZ850io. No difference.

    @manitou: I looked at your performance tests again, and you obviously got something working to reach 24-27 Mbit/s receive speeds.
    So three questions came to mind while reading:
    1.) I browsed the repo for the wiztest sketch, but it doesn't seem to be included?! Do you have it available?
    2.) You say that a second power supply for the W5200 was connected. Did that make a difference? Right now I power the WIZ850io from the Teensy 3.6's 3.3V.
    3.) What lib does PaulSPI stand for?

  13. #13
    Senior Member+ KurtE's Avatar
    Join Date
    Jan 2014
    Posts
    7,872
    Quote Originally Posted by JanGee View Post
    @KurtE: Thanks for the suggestions. It didn't make a difference. What is that yield() function about? The output from my version looks like this:
    The idea of the yield() function is that it lets you put in code that should run every time you logically wait for something to happen, so that other work can be completed. It is also called in the main loop of the main program every time you exit the loop() function. That is:
    Code:
    extern "C" int main(void)
    {
    	// Arduino's main() function just calls setup() and loop()....
    	setup();
    	while (1) {
    		loop();
    		yield();
    	}
    }
    The default implementation (weak linked) checks all 6 serial ports (Serial.available(), Serial1.available() ... Serial6.available()), and for each one that does not return 0, it calls the matching handler:
    serialEvent(), serial1Event() ... serial6Event()

    The default serialEvent implementations all just return. So by including your own version of yield() you bypass all of that code.

  14. #14
    Senior Member+ manitou's Avatar
    Join Date
    Jan 2013
    Posts
    2,593
    Quote Originally Posted by JanGee View Post
    Tested with the WIZ850io. No difference.

    @manitou: I looked at your performance tests again, and you obviously got something working to reach 24-27 Mbit/s receive speeds.
    So three questions came to mind while reading:
    1.) I browsed the repo for the wiztest sketch, but it doesn't seem to be included?! Do you have it available?
    2.) You say that a second power supply for the W5200 was connected. Did that make a difference? Right now I power the WIZ850io from the Teensy 3.6's 3.3V.
    3.) What lib does PaulSPI stand for?
    I have updated the wizpaul.ino sketch at https://github.com/manitou48/teensy3...er/wizpaul.ino
    It's quite a hack, and it uses various tools on my Linux box to send or receive packets (ttcp.c or iperf or things I crafted). Good luck.
    (It also measures the time to read/write the WIZnet buffer area over SPI. Ethernet performance is guaranteed to be no faster than the buffer SPI rates!)

    The various sketch names refer to using SdFat SPI; over time Paul updated his SPI implementation to be competitive with SdFat SPI.

    I do power the WIZ board from a separate power supply (common ground). It can consume 150+ mA.

    The last time I tested was in January, with a 30 MHz clock on a T3.2 @ 120 MHz:
    https://forum.pjrc.com/threads/41151...l=1#post128975
    Last edited by manitou; 07-01-2017 at 11:57 PM.

  15. #15
    Junior Member
    Join Date
    Jun 2017
    Posts
    12
    Thanks for updating manitou.

    So here are my results:

    To make the ttcp_server() test accept client connections I have to remove the wizdump() call in the setup function. Do you see the same behaviour? Otherwise just nothing happens when I try to connect after "server listening"...

    Code:
    My IP address: 192.168.188.122.
    write 326 us mbs 25.13
    read 300 us  mbs 27.31
    wrt/rd errors 1020
    read 18 us   mbs 24.44
         00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f
    0000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0030 00 00 00 00 00 00 00
    socket info
         00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f
    0000 00 00 00 37 00 00 00 00 00 00 00 00 00 00 00 00
    0010 00 00 00 00 08 00 00 00 00 00 00 00 00 00 00 00
    0020 00 00 00 00 00 00 00 00 00 00 00 00
    IP  address:0.0.0.0
    B8000000
    server listening
    The SPI read speeds do not seem to be the bottleneck, if I interpret the figures right?! But "wrt/rd errors 1020" looks suspicious?!

    When I remove the wizdump() call, I get these results. I had to change NBYTES to bytes in the Mbit calculation to get it calculated correctly.
    Code:
    My IP address: 192.168.188.122.
    server listening
    client connected
    recv  25800037 bytes 63541 ms n 364  mbits 3.25
    server listening
    Those results are far away from your results for TCP.

    I also tried activating the W5500 4K buffers in W5100.cpp. But that dropped receive speeds to near zero, and the transfer gets interrupted...
    Code:
    My IP address: 192.168.188.122.
    server listening
    client connected
    recv  3309833 bytes 282665 ms n 0  mbits 0.09
    server listening
    Another thing I tried was reducing the Teensy clock to 120 MHz so that HAS_SPIFIFO gets set. I set SPIFIFO.begin(ss_pin, SPI_CLOCK_24MHz). Unfortunately the Ethernet server then no longer reacts to client connections, just like when wizdump() gets called.

    I also wonder what this code would look like for the Teensy 3.6 at 180 MHz with a 30 MHz SPI clock? And could the SPIFIFO help at all?!
    Code:
    #if F_BUS == 120000000
    #define HAS_SPIFIFO
    #define SPI_CLOCK_24MHz   (SPI_CTAR_PBR(3) | SPI_CTAR_BR(0) | SPI_CTAR_DBR) //(120 / 5) * ((1+1)/2)
    #define SPI_CLOCK_16MHz   (SPI_CTAR_PBR(0) | SPI_CTAR_BR(2))                //(120 / 2) * ((1+0)/4) = 15 MHz
    #define SPI_CLOCK_12MHz   (SPI_CTAR_PBR(3) | SPI_CTAR_BR(0))                //(120 / 5) * ((1+0)/2)
    #define SPI_CLOCK_8MHz    (SPI_CTAR_PBR(3) | SPI_CTAR_BR(4) | SPI_CTAR_DBR) //(120 / 5) * ((1+1)/6)
    #define SPI_CLOCK_6MHz    (SPI_CTAR_PBR(3) | SPI_CTAR_BR(2))                //(120 / 5) * ((1+0)/4)
    #define SPI_CLOCK_4MHz    (SPI_CTAR_PBR(3) | SPI_CTAR_BR(4)) 		    //(120 / 5) * ((1+0)/6)
    By the way, if I drop the SPI clock to 4 MHz I get 200 KByte/s transfer speeds. So the transfer rate only halves even though the clock is divided by 7.5.

    I really wonder where the bottleneck is.
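
    One rough way to read those two measurements is sketched below. It assumes a deliberately simple model that is not taken from the thread: the time per KByte is a fixed part plus a part inversely proportional to the SPI clock. The two data points used are about 400 KByte/s at 30 MHz and about 200 KByte/s at 4 MHz; the function name is made up for illustration.

```python
def fit_fixed_and_clocked(f1_mhz, rate1_kbyte_s, f2_mhz, rate2_kbyte_s):
    """Fit t = a + k / f to two (clock, rate) measurements.

    t is the time in ms per KByte, f the SPI clock in MHz.
    Returns (a, k): a is the clock-independent part in ms per KByte,
    k the clock-dependent coefficient in ms*MHz per KByte.
    """
    t1 = 1000.0 / rate1_kbyte_s  # ms per KByte at clock f1
    t2 = 1000.0 / rate2_kbyte_s  # ms per KByte at clock f2
    k = (t2 - t1) / (1.0 / f2_mhz - 1.0 / f1_mhz)
    a = t1 - k / f1_mhz
    return a, k


if __name__ == "__main__":
    a, k = fit_fixed_and_clocked(30, 400, 4, 200)
    t_30 = a + k / 30
    print("fixed part: %.2f ms of %.2f ms per KByte at 30 MHz" % (a, t_30))
```

    Under that assumed model, roughly 2.1 ms of the 2.5 ms spent per KByte at 30 MHz does not depend on the SPI clock at all, which suggests looking beyond the SPI bus for the main limit.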
    Last edited by JanGee; 07-02-2017 at 07:39 AM.

  16. #16
    Senior Member
    Join Date
    Jan 2013
    Posts
    843
    What's your network latency? 2kB is a tiny TCP window size. With a decent wired network and a latency of a couple of hundred us, it won't matter much. If you get towards milliseconds, it does.
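
    The effect of a small window can be estimated with the usual window/round-trip bound: a sender can have at most one window of unacknowledged data in flight per round trip. A minimal sketch follows; the round-trip times are illustrative assumptions, not measurements from this thread.

```python
def max_throughput_mbit(window_bytes, rtt_seconds):
    """Upper bound on TCP throughput in Mbit/s for a given window and RTT."""
    return window_bytes * 8.0 / rtt_seconds / 1e6


if __name__ == "__main__":
    for window in (2048, 4096, 16384):
        for rtt in (0.0002, 0.005):  # assumed: 0.2 ms wired LAN, 5 ms slower path
            print("window %5d B, RTT %.1f ms: at most %.2f Mbit/s"
                  % (window, rtt * 1000, max_throughput_mbit(window, rtt)))
```

    With a 2 kB window, a 0.2 ms round trip allows up to about 82 Mbit/s, while a 5 ms round trip caps throughput near 3.3 Mbit/s, whatever the SPI clock is.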

    If you are using a router (or anything that potentially messes with TCP/IP headers), try to remove it.

    There can be disastrous performance interactions with Nagle and delayed ACKs, especially with the small TCP window used. Try to disable them.

    In 'EthernetServer::begin()', there is
    sockindex = Ethernet.socketBegin(SnMR::TCP, _port);

    change that to:
    sockindex = Ethernet.socketBegin(SnMR::TCP | SnMR::ND, _port);
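
    The change above covers the WIZnet side. On the sending PC, Nagle's algorithm can be switched off per socket with the standard TCP_NODELAY option. The sketch below only shows that option (the helper name is made up); delayed ACK behaviour is controlled by the operating system and has no portable socket option, so it is not covered here.

```python
import socket


def make_sender_socket():
    """Create a TCP socket with Nagle's algorithm disabled (TCP_NODELAY)."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return s


if __name__ == "__main__":
    s = make_sender_socket()
    # A non-zero value means TCP_NODELAY is active on this socket
    print("TCP_NODELAY:", s.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY))
    s.close()
```

    In the sender script, the socket returned by this helper would take the place of the plain socket.socket() call before connect().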

    \\

    Other people have successfully used W5200 with 16kB buffers:
    https://github.com/alex-Arc/Ethernet...055401536a380b

    \\

    Post a tcpdump, something may jump out.

  17. #17
    Junior Member
    Join Date
    Jun 2017
    Posts
    12
    Yeah! Now I get about 9 Mbit/s. The 4K buffers work now. But 8K and 16K slow things down again.
    Code:
    Done sending
    ('FileSize (Mb): ', 25.8)
    ('Transfer time (min): ', 0.3655051509539286)
    ('KByte/s: ', 1176.4525121626396)
    ('KBit/s: ', 9411.612832609342)
    ('MByte/s: ', 1.1764508494701662)
    ('MBit/s: ', 9.411599940367045)
    But my network seems to be a big part of the problem. When connecting the Teensy directly to my Mac I even get 20+ Mbit/s!!! Still not 27, but it's getting usable for my purpose.
    Code:
    Done sending
    ('FileSize (Mb): ', 25.8)
    ('Transfer time (min): ', 0.15660630067189535)
    ('KByte/s: ', 2745.7303254582152)
    ('KBit/s: ', 21965.80526143985)
    ('MByte/s: ', 2.745721547261544)
    ('MBit/s: ', 21.96573726547405)
    Here is a wireshark screenshot of the first packets of the transfer. How can I copy a readable dump out of it?!
    [Attachment: Screen Shot 2017-07-02 at 12.11.07.jpg]

  18. #18
    Senior Member
    Join Date
    Jan 2013
    Posts
    843
    What did you change?

    From the network capture, it looks like Nagle and delayed ACK are disabled on the Mac side and 'SnMR::ND' is set on the WIZio. It does look like what I would want to see. Can you post one with 16kB buffers and one with a ping?

    Your Mac is sometimes extremely slow to respond. At the 62.600083 TCP window update (the WIZio signals that it has free buffer space), it sends new data within 32 µs; at the 62.603081 window update it takes 8,790 µs. Maybe you can disable some power-saving stuff on the Mac.

    Teensy SPI officially supports 30MHz. TeensyLogicAnalyzer overclocks it to up to 120MHz.

    Quote Originally Posted by JanGee View Post
    Here is a wireshark screenshot of the first packets of the transfer. How can I copy a readable dump out of it?!
    File / Export Packet Dissections / As CSV.

  19. #19
    Junior Member
    Join Date
    Jun 2017
    Posts
    12
    Well, I changed the code according to your suggestions and the linked 16K changes, and connected a cable instead of using WLAN. In the production environment I will have a high-quality LAN and a server dedicated to transferring the LED video BIN files and syncing them during runtime, so I don't care about WLAN right now.

    The Mac is connected to a power supply, but there are a lot of processes active. Maybe the Python sender is put into a wait state while other tasks are running. I'll probably need to clean up.

    Right now it's:
    4Mbit/s - WLAN/2KB buffer
    9Mbit/s - WLAN/4KB buffer
    18MBit/s - LAN/4KB buffer (over a FRITZ! router)
    21Mbit/s - LAN/4KB buffer (direct connection between Mac & Teensy)

    And I just found out that I had a mistake in the code for the 16K buffer (see the comment in the code below):

    25Mbit/s - LAN/16KB buffer (over a FRITZ! router) YEAH!

    It's getting time to take care of the SDCard write speed. I will add this to my test sketch.

    I read that the WIZ850io supports a maximum SPI clock of 84 MHz, so overclocking SPI would be very interesting. But I don't understand how they do it in the LogicAnalyzer. Overclocking the Teensy didn't make a difference?!
    And I read that it has 32 kB of buffer memory. So I even tried a 32K receive buffer, but that didn't work... I don't get an IP from DHCP...
    http://shop.wiznet.eu/wiz850io.html

    Here is the code for the bigger buffer sizes
    Code:
    ...
    else if (isW5500()) {
    		CH_BASE = 0x1000;
            #ifdef W5500_32K_BUFFERS
            SSIZE = 32768;    // 32K buffers
            SMASK = 0x7FFF;
            #elif defined(W5500_16K_BUFFERS)
            SSIZE = 16384;    // 16K buffers
            SMASK = 0x3FFF;
            #elif defined(W5500_8K_BUFFERS)
            SSIZE = 8192;    // 8K buffers
            SMASK = 0x1FFF;
    		#elif defined(W5500_4K_BUFFERS)
    		SSIZE = 4096;    // 4K buffers
    		SMASK = 0x0FFF;
    		#else
    		SSIZE = 2048;    // 2K buffers
    		SMASK = 0x07FF;
    		#endif
            SMASK = SSIZE-1; // Could be removed when SMASK is set implicitly above...
    		TXBUF_BASE = 0x8000;
    		RXBUF_BASE = 0xC000;
    		//#ifdef W5500_4K_BUFFERS <-- FORGOT TO UNCOMMENT THIS IN PREVIOUS 16K TESTS
    		for (i=0; i<MAX_SOCK_NUM; i++) {
    			writeSnRX_SIZE(i, SSIZE >> 10);
    			writeSnTX_SIZE(i, SSIZE >> 10);
    		}
    		for (; i<8; i++) {
    			writeSnRX_SIZE(i, 0);
    			writeSnTX_SIZE(i, 0);
    		}
    ...
    Results for 16K buffers
    Code:
    Done sending
    ('FileSize (Mb): ', 200.0)
    ('Transfer time (min): ', 1.0452636003494262)
    ('KByte/s: ', 3265.522509343923)
    ('KBit/s: ', 26124.173420791387)
    ('MByte/s: ', 3.1889847403503966)
    ('MBit/s: ', 25.51187142480386)
    Here is the tcp dump for 16k buffers
    [Attachment: Screen Shot 2017-07-02 at 15.05.13.jpg]
    [Attachment: Screen Shot 2017-07-02 at 15.06.24.png]
    Last edited by JanGee; 07-02-2017 at 04:06 PM. Reason: currupted attachments

  20. #20
    Senior Member
    Join Date
    Jan 2013
    Posts
    843
    Quote Originally Posted by JanGee View Post
    I read that the WIZ850io supports a maximum SPI clock of 84 MHz, so overclocking SPI would be very interesting. But I don't understand how they do it in the LogicAnalyzer. Overclocking the Teensy didn't make a difference?!
    I'm not sure if LogicAnalyzer has a different clock setup. With the standard Teensy SPI and Teensy 3.6, I can use F_BUS set to 80MHz (40MHz SPI clock, 240MHz CPU). Any higher bus clock results in corrupted data.

    And I read that it has 32 kB buffer mem. So I even tried a 32K receive buffer but that didn't work...
    It's split between RX and TX. So you can't use more than 16kB for a socket.
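
    The constraint can be written down as a small check. This sketch rests on my reading of the W5500 datasheet, so treat the constants as assumptions to verify: 16 KB of RX memory (and, separately, 16 KB of TX memory) shared by 8 sockets, with per-socket sizes of 0, 1, 2, 4, 8 or 16 KB. The helper names are made up for illustration.

```python
VALID_SIZES_KB = (0, 1, 2, 4, 8, 16)  # assumed allowed per-socket buffer sizes
TOTAL_RX_KB = 16                      # assumed total RX memory (TX is a separate 16 KB)


def rx_config_ok(sizes_kb):
    """True if every per-socket RX size is allowed and the sum fits in RX memory."""
    return (len(sizes_kb) <= 8
            and all(size in VALID_SIZES_KB for size in sizes_kb)
            and sum(sizes_kb) <= TOTAL_RX_KB)


def max_sockets(size_kb):
    """How many sockets can each get size_kb of RX buffer."""
    return TOTAL_RX_KB // size_kb


if __name__ == "__main__":
    print(rx_config_ok([2] * 8))        # default layout: eight 2 KB buffers
    print(rx_config_ok([16] + [0] * 7)) # one socket gets everything
    print(rx_config_ok([32]))           # 32 KB is not an allowed size
    print(max_sockets(16), max_sockets(4))
```

    Under these assumptions, a 16 KB receive buffer leaves room for only a single socket, and 4 KB buffers allow four.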

    Quote Originally Posted by JanGee View Post
    It's getting time to take care of the SDCard write speed. I will add this to my test sketch.
    Use SdFat-beta, SdFatSdioEX. Make sure the card isn't busy when you try to write ('sd.card()->isBusy()'). Otherwise the write would block and wait, while you could instead be emptying the WIZio buffer. Teensy 3.6 has a lot of memory for buffering.

    512-byte writes work fine with SdFatSdioEX. The SDIO interface runs at 200 Mbit/s. That's the actual transfer rate you get to the SD card, even if it writes more slowly to its flash (it will signal busy for subsequent sector writes).

  21. #21
    Senior Member
    Join Date
    Jan 2013
    Posts
    843
    Quote Originally Posted by JanGee View Post
    Here is the tcp dump for 16k buffers
    For these, the SPI reading is clearly the bottleneck. The WIZio receive buffer is kept well filled, and going down to an 8 kB buffer size probably wouldn't make a difference.

  22. #22
    Junior Member
    Join Date
    Jun 2017
    Posts
    12
    Okay, here we go. Thanks a lot to you, tni, and everybody else. I get quite decent transfer/write speeds. Here are some for different file sizes:

    Code:
    Done sending
    ('FileSize (Mb): ', 1.1728744506835938)
    ('Transfer time (min): ', 0.006500101089477539)
    ('KByte/s: ', 3079.2941971853024)
    ('KBit/s: ', 24633.404930145676)
    ('MByte/s: ', 3.0068916450035728)
    ('MBit/s: ', 24.054206889301522)
    
    
    Done sending
    ('FileSize (Mb): ', 24.60479736328125)
    ('Transfer time (min): ', 0.16970289945602418)
    ('KByte/s: ', 2474.4468397186592)
    ('KBit/s: ', 19795.545516159356)
    ('MByte/s: ', 2.416444862628856)
    ('MBit/s: ', 19.33152857342053)
    
    
    Done sending
    ('FileSize (Mb): ', 200.0)
    ('Transfer time (min): ', 1.515394151210785)
    ('KByte/s: ', 2252.438627410822)
    ('KBit/s: ', 18019.50585350615)
    ('MByte/s: ', 2.1996463472569037)
    ('MBit/s: ', 17.59716787104682)
    Without writing to the SD card I get
    Code:
    Done sending
    ('FileSize (Mb): ', 1.1728744506835938)
    ('Transfer time (min): ', 0.005503666400909424)
    ('KByte/s: ', 3636.7542392459454)
    ('KBit/s: ', 29092.60569728996)
    ('MByte/s: ', 3.5511947782322393)
    ('MBit/s: ', 28.408266268016323)
    
    
    Done sending
    ('FileSize (Mb): ', 24.60479736328125)
    ('Transfer time (min): ', 0.12643694877624512)
    ('KByte/s: ', 3321.182323911194)
    ('KBit/s: ', 26569.398470139102)
    ('MByte/s: ', 3.2433279448624193)
    ('MBit/s: ', 25.946568924533576)
    
    
    Done sending
    ('FileSize (Mb): ', 200.0)
    ('Transfer time (min): ', 1.0279139002164206)
    ('KByte/s: ', 3320.639884678672)
    ('KBit/s: ', 26565.112196955026)
    ('MByte/s: ', 3.2428108078653324)
    ('MBit/s: ', 25.94248014486254)
    RECLTH = 16384 (that was the fastest; 512 is slow). Both the Ethernet read and the SD write use this size. I couldn't figure out a good way to implement isBusy yet, and there is surely a lot of potential to optimize how the read and write work together. The weekend has been long and my brain is tired. The overclocking does have no effect. Or only very small effects. The data is transferred and written correctly to the SD card. Here is the code so far:

    Code:
    #include <SPI.h>
    #include "Ethernet.h"
    #include <SPIFIFO.h>
    #include "TeensyID.h"
    #include "SdFat.h"
    
    // --- SD ---
    SdFatSdioEX SDIO;
    const char VIDEO_FILENAME[] = "ETHTEST.BIN";
    File videofile;
    
    #define ETH_RCV_LEN 16384
    uint8_t buf[ETH_RCV_LEN];
    unsigned int n = 0;
    
    uint8_t mac[6];
    IPAddress ip(192, 168, 0, 2);
    
    // listen to port 60000
    EthernetServer server(60000);
    EthernetClient client;
    boolean gotAMessage = false; // whether or not you got a message from the client yet
    
    void setup() {
      pinMode(9, OUTPUT);
      digitalWrite(9, LOW);    // begin reset the WIZ820io
      delay(100);
      digitalWrite(9, HIGH);   // end reset pulse
      
      // Open serial communications and wait for port to open:
      Serial.begin(9600);
    
      // Init SD Card
      if (SDIO.begin())
          Serial.println("SD card initialized");
      else
          Serial.println("Could not access SD card");
    
      // read the burned in MAC address
      teensyMAC(mac);
      Serial.printf ("MAC Address: %02X:%02X:%02X:%02X:%02X:%02X \n", mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
    
      //Ethernet.begin(mac, ip);
      // start the Ethernet connection:
      Serial.println("Trying to get an IP address using DHCP");
      if (Ethernet.begin(mac) == 0) {
        Serial.println("Failed to configure Ethernet using DHCP");
      }
      
      // print your local IP address:
      Serial.print("My IP address: ");
      ip = Ethernet.localIP();
      for (byte thisByte = 0; thisByte < 4; thisByte++) {
        // print the value of each byte of the IP address:
        Serial.print(ip[thisByte], DEC);
        Serial.print(".");
      }
      Serial.println();
      
      // start listening for clients
      server.begin();
    }
    
    void loop()
    {
      // wait for a new client:
      client = server.available();
    
      // when the client sends the first byte, say hello:
      if (client)
      {
        if (!gotAMessage)
        {
          Serial.println("We have a new client");
          client.println("Hello, client!");
          gotAMessage = true;
        }
    
        if(client.connected())
        {
            SDIO.remove(VIDEO_FILENAME);
          
            videofile = SDIO.open(VIDEO_FILENAME, FILE_WRITE);
            if(videofile)
                Serial.println("File opened");
            else
                Serial.println("File open failed!");
              
            while((n = client.read(buf, ETH_RCV_LEN)))
            {
                  videofile.write((uint8_t*)buf, n);
                  //Serial.println(n);
            }
        }
    
        if(videofile)
        {
            Serial.println(videofile.size());
            videofile.close();
        }
        
      }
    
      // close the connection:
      if (client)
      {
           client.stop();
           Serial.println("client disconnected");
      }
    
    }
    
    void yield()
    {
        //Ethernet.maintain();
    }
    I think I will find some time next week to dig deeper into this. That the transfer speed drops so dramatically when files get bigger must have something to do with the read and write blocking each other?! Any quick suggestions?
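
    For comparing runs, the rates in the listings above can be recomputed from file size and elapsed time. The sketch below assumes what the numbers themselves imply: "FileSize (Mb)" is in megabytes, and 1 MByte is counted as 1024 KByte. `Rates` and `ratesFrom` are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <cmath>

// Throughput figures in the same units as the listings above.
struct Rates {
    double kbytePerSec;
    double mbytePerSec;
    double mbitPerSec;
};

// sizeMByte: file size as printed under "FileSize (Mb)".
// minutes:   elapsed time as printed under "Transfer time (min)".
Rates ratesFrom(double sizeMByte, double minutes) {
    assert(minutes > 0.0);
    const double seconds = minutes * 60.0;
    Rates r;
    r.mbytePerSec = sizeMByte / seconds;
    r.kbytePerSec = r.mbytePerSec * 1024.0;  // 1 MByte = 1024 KByte, as the listings imply
    r.mbitPerSec = r.mbytePerSec * 8.0;
    return r;
}
```

    For the 200 MByte run with SD writes (1.515 min) this gives roughly 2252 KByte/s, against roughly 3321 KByte/s for the same file without SD writes (1.028 min), so writing to the card costs about a third of the throughput in these measurements.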

  23. #23
    Senior Member
    Join Date
    Jan 2013
    Posts
    843
    Quote Originally Posted by JanGee View Post
    The overclocking does have no effect. Or only very small effects.
    Did you set the SPI clock in w5100.h? Did you change the CPU clock to 240MHz and F_BUS to 80MHz?

    This will choose the highest SPI clock rate (w5100.h):
    #define SPI_ETHERNET_SETTINGS SPISettings(-1, MSBFIRST, SPI_MODE0)

    To check F_BUS:
    Serial.printf("F_BUS: %i\n", F_BUS);

  24. #24
    Junior Member
    Join Date
    Jun 2017
    Posts
    12
    I forgot to set the SPISettings. But still no change. I wonder how the 80 MHz gets reduced to 60 MHz?

    Code:
    F_CPU: 240000000 / F_BUS: 60000000
    SD card initialized
    MAC Address: 04:E9:E5:04:D6:C1 
    Trying to get an IP address using DHCP
    My IP address: 192.168.188.119.
    We have a new client
    File opened
    client disconnected
    
    ---
    
    Done sending
    ('FileSize (Mb): ', 24.60479736328125)
    ('Transfer time (min): ', 0.18279671669006348)
    ('KByte/s: ', 2297.201435690031)
    ('KBit/s: ', 18377.584719570314)
    ('MByte/s: ', 2.2433544374583323)
    ('MBit/s: ', 17.946810921695256)
    kinetis.h
    Code:
    ...
    #if (F_CPU == 240000000)
     #define F_PLL 240000000
     #ifndef F_BUS
     //#define F_BUS 60000000
     #define F_BUS 80000000   // uncomment these to try peripheral overclocking
     //#define F_BUS 120000000  // all the usual overclocking caveats apply...
     #endif
    ...
    W5100.h
    Code:
    ...
    // Safe for all chips
    //#define SPI_ETHERNET_SETTINGS SPISettings(14000000, MSBFIRST, SPI_MODE0)
    
    // Safe for W5200 and W5500, but too fast for W5100
    // uncomment this if you know you'll never need W5100 support
    #define SPI_ETHERNET_SETTINGS SPISettings(-1, MSBFIRST, SPI_MODE0)
    
    #define MAX_SOCK_NUM 1 
    ...
    W5100.cpp
    Code:
    ...
    #define W5500_16K_BUFFERS
    //#define W5500_8K_BUFFERS
    //#define W5500_4K_BUFFERS
    //#define W5200_4K_BUFFERS
    ...

  25. #25
    Junior Member
    Join Date
    Jun 2017
    Posts
    12
    Grrr... I hate it. I edited the wrong kinetis.h version. It was still open from before I upgraded to Arduino 1.8.3.

    But that made things worse. I don't get an IP anymore...
    Code:
    F_CPU: 240000000 / F_BUS: 80000000
    SD card initialized
    MAC Address: 04:E9:E5:04:D6:C1 
    Trying to get an IP address using DHCP
    Failed to configure Ethernet using DHCP
    My IP address: 128.0.0.0.
    When I reduce the SPISettings in W5100.h to 30000000 it works again. 60000000 doesn't work either.
    Last edited by JanGee; 07-02-2017 at 09:50 PM.
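
    One possible reading of these results, sketched in plain C++. It rests on two assumptions that are not stated in this thread and should be checked against the installed SPI.h: that the Teensy 3.x `SPISettings` picks the first divider whose resulting clock (F_BUS / divider) does not exceed the requested clock, and that the divider table starts 2, 3, 4, 5, 6, 8, 10, 12, 16. `effectiveSpiClock` and `kDividers` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Assumed start of the divider table (recalled from the Teensy 3.x SPI
// library, not confirmed in this thread; verify against your install).
static const uint32_t kDividers[] = {2, 3, 4, 5, 6, 8, 10, 12, 16};

// Returns the SPI clock that would actually be used for a requested clock,
// under the assumption that the fastest clock not exceeding the request wins.
uint32_t effectiveSpiClock(uint32_t fBus, uint32_t requested) {
    assert(fBus > 0);
    for (uint32_t div : kDividers) {
        if (fBus / div <= requested) {
            return fBus / div;
        }
    }
    // Request is slower than anything in this shortened table: use the slowest entry.
    return fBus / kDividers[sizeof(kDividers) / sizeof(kDividers[0]) - 1];
}
```

    Under these assumptions, `SPISettings(-1, ...)` (which becomes the largest unsigned value) gives 30 MHz at F_BUS = 60 MHz and 40 MHz at F_BUS = 80 MHz. A request of 30000000 at F_BUS = 80 MHz lands on about 26.67 MHz, and 60000000 again lands on 40 MHz. That would be consistent with the observations above (30000000 works, 60000000 and -1 do not) and would suggest that this particular setup stops responding somewhere between 30 and 40 MHz.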
