Thread: Teensy 4.0 First Beta Test

  1. #2676
    Senior Member+ manitou's Avatar
    Join Date
    Jan 2013
    Posts
    2,005
    FWIW, in https://www.pjrc.com/teensy/usb_serial.html there was a USBsend sketch and host receiver (serial_listen.c or serial_read.c). For T3.6 it reports 1152.09 kbytes/sec to linux host with USB hub and 1152.11 kbytes/sec for T3.2. For T4B2 (latest github no debug printf's) it reports 7729.36 kbytes/sec -- a bit too fast me thinks, so maybe it's not a valid benchmark.

    Code:
    // https://forum.pjrc.com/threads/29078-USB-Transmission-speed
    // https://www.pjrc.com/teensy/usb_serial.html
    // USB Serial Transmit Bandwidth Test
    // Written by Paul Stoffregen, paul@pjrc.com
    // This benchmark code is in the public domain.
    //
    // Within 5 seconds of opening the port, this program
    // will send a message as rapidly as possible, for 10 seconds.
    //
    // To run this benchmark test, use the serial_read.exe (Windows) or
    // serial_listen (Mac, Linux) program, which can read the data
    // efficiently without saving it.
    // http://www.pjrc.com/teensy/serial_listen.c
    // http://www.pjrc.com/teensy/serial_read.c
    // http://www.pjrc.com/teensy/serial_read.exe
    //
    // You can also run a terminal emulator and select the option
    // to capture all text to a file.  However, some terminal emulators
    // may limit the speed, depending upon how they update the screen
    // and how efficiently their code processes the incoming data.  The
    // Arduino Serial Monitor is particularly slow.  Only use it to
    // verify this sketch works.  For actual benchmarks, use the
    // efficient receive tests above.
    //
    // Full disclosure: Paul is the author of Teensyduino. 
    //
    // Results can vary depending on the number of other USB devices
    // connected.  For fastest results, disconnect all others.
    
    
    #define USBSERIAL Serial       // for Leonardo, Teensy, Fubarino
    //#define USBSERIAL SerialUSB  // for Due, Maple
    
    void setup()
    {
      USBSERIAL.begin(115200);
    }
    
    void loop()
    {
      // wait for serial port to be opened
      while (!USBSERIAL) ;
    
      // give the user 5 seconds to enable text capture in their
      // terminal emulator, or do whatever to get ready
      for (int n=5; n; n--) {
        USBSERIAL.print("10 second speed test begins in ");
        USBSERIAL.print(n);
        USBSERIAL.println(" seconds.");
        if (!USBSERIAL) break;
        delay(1000);
      }
    
      // send a string as fast as possible, for 10 seconds
      unsigned long beginMillis = millis();
      do {
        USBSERIAL.print("USB Fast Serial Transmit Bandwidth Test, capture this text.\r\n");
      } while (millis() - beginMillis < 10000);
      USBSERIAL.println("done!");
    
      // after the test, wait forever doing nothing,
      // well, at least until the terminal emulator quits
      while (USBSERIAL) ;
    }
    EDIT:
    win10x64 + USB hub: T3.6 870.32 KBs, T4B2 1197.95 KBs
    macos: T3.6 1168.07 KBs, T4B2 6209.18 KBs
    Last edited by manitou; 05-04-2019 at 08:35 PM.

  2. #2677
    Senior Member PaulStoffregen's Avatar
    Join Date
    Nov 2012
    Posts
    20,173
    Quote: Originally posted by manitou
    For T4B2 (latest github no debug printf's) it reports 7729.36 kbytes/sec -- a bit too fast me thinks, so maybe it's not a valid benchmark.
    Might be a valid result, since T4 is using 480 Mbit/sec USB speed, and that test uses fairly large message sizes with little other work being done.

    I made the lines/sec test to intentionally exercise the commonly used parts of Arduino's Print class and try to optimize this common case of lines assembled from several small fragments. Until the optimization is very good, we can expect to see much slower speeds in the lines/sec test than we get in this older and much "easier" test where a single, fairly large and fixed message is repeatedly sent.

    According to the USB 2.0 spec, with the 64 byte max packet size we're using now, the theoretical best speed is 32,256,000 bytes/sec.

    [Image: speeds.png]

    So 7.7 Mbyte/sec isn't too shocking, only about 24% of what should be possible with perfect optimization.

    My hope is to eventually get into that 40-50 Mbyte/sec range! ... and to achieve that speed when people use the Print class in ordinary ways to print text and numbers.
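
    For anyone who wants to check the arithmetic above, here is a minimal host-side sketch in plain C++ (not a Teensy sketch). It assumes the USB 2.0 high-speed figures of 8000 microframes per second and at most 63 bulk transactions per microframe at a 64 byte payload. The function names are made up for illustration.

```cpp
#include <cstdint>

// USB 2.0 high speed divides each 1 ms frame into 8 microframes.
constexpr std::uint32_t kMicroframesPerSec = 8000;

// Upper bound on bulk payload throughput for a given max packet size.
// transfers_per_microframe is an input taken from the USB 2.0 spec's bulk
// transaction limits (assumed here: 63 for 64 byte packets).
constexpr std::uint64_t bulk_limit_bytes_per_sec(std::uint32_t packet_bytes,
                                                 std::uint32_t transfers_per_microframe) {
  return static_cast<std::uint64_t>(packet_bytes) * transfers_per_microframe *
         kMicroframesPerSec;
}

// Percentage of the limit that a measured rate reaches.
constexpr double percent_of_limit(double measured_bytes_per_sec,
                                  std::uint64_t limit_bytes_per_sec) {
  return 100.0 * measured_bytes_per_sec / static_cast<double>(limit_bytes_per_sec);
}
```

    bulk_limit_bytes_per_sec(64, 63) gives 32,256,000, and percent_of_limit(7729.36e3, 32256000) comes out just under 24%, treating a kbyte as 1000 bytes. With 512 byte packets and 13 transactions per microframe the same formula gives 53,248,000, which is roughly where a 40-50 Mbyte/sec target would sit.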
    Last edited by PaulStoffregen; 05-02-2019 at 11:46 AM.

  3. #2678
    Senior Member PaulStoffregen's Avatar
    Join Date
    Nov 2012
    Posts
    20,173
    I committed a fix for USB serial receive.

    Performance is still far from where I want to get, but it should at least pass the latency test now. Please let me know if you're able to get the test to fail again?

  4. #2679
    Senior Member+
    Join Date
    Jul 2014
    Location
    New York
    Posts
    3,509
    Just ran @defragster's latency test (the one before the last update) and it got through the test with no errors like before. Again, I am on a Win10x64 machine, so...
    Code:
    port COM23 opened
    waiting for board to be ready:
    .ok
    latency @    1 bytes:   3.28 ms average,  13 max hits,   0.00 2nd max,  16.00 maximum
    latency @    2 bytes:   3.29 ms average,  14 max hits,   0.00 2nd max,  16.00 maximum
    latency @   12 bytes:   3.44 ms average,  15 max hits,  15.00 2nd max,  16.00 maximum
    latency @   16 bytes:   3.27 ms average,  13 max hits,  15.00 2nd max,  16.00 maximum
    latency @   30 bytes:   3.44 ms average,  14 max hits,   0.00 2nd max,  16.00 maximum
    latency @   31 bytes:   3.44 ms average,  14 max hits,   0.00 2nd max,  16.00 maximum
    latency @   63 bytes:   3.44 ms average,  14 max hits,   0.00 2nd max,  16.00 maximum
    latency @   64 bytes:   3.43 ms average,  14 max hits,  15.00 2nd max,  16.00 maximum
    latency @   65 bytes:   6.72 ms average,  27 max hits,   0.00 2nd max,  16.00 maximum
    latency @   71 bytes:   6.56 ms average,  26 max hits,   0.00 2nd max,  16.00 maximum
    latency @  126 bytes:   6.72 ms average,  27 max hits,   0.00 2nd max,  16.00 maximum
    latency @  127 bytes:   6.72 ms average,  27 max hits,   0.00 2nd max,  16.00 maximum
    latency @  128 bytes:   6.72 ms average,  27 max hits,   0.00 2nd max,  16.00 maximum
    latency @  129 bytes:   9.84 ms average,  40 max hits,  15.00 2nd max,  16.00 maximum
    latency @  500 bytes:  26.56 ms average,  20 max hits,  31.00 2nd max,  32.00 maximum
    latency @  512 bytes:  26.41 ms average,  23 max hits,  31.00 2nd max,  32.00 maximum
    latency @  640 bytes:  33.08 ms average,  10 max hits,  46.00 2nd max,  47.00 maximum
    latency @ 1000 bytes:  52.85 ms average,  12 max hits,  53.00 2nd max,  63.00 maximum
    latency @ 1278 bytes:  65.99 ms average,  11 max hits,  69.00 2nd max,  79.00 maximum
    latency @ 1279 bytes:  66.07 ms average,  10 max hits,  78.00 2nd max,  79.00 maximum
    latency @ 1280 bytes:  66.09 ms average,   5 max hits,  63.00 2nd max,  79.00 maximum
    latency @ 1281 bytes:  69.21 ms average,  15 max hits,  78.00 2nd max,  79.00 maximum
    latency @ 2000 bytes: 105.46 ms average,  31 max hits,  93.00 2nd max, 110.00 maximum
    latency @ 2047 bytes: 105.61 ms average,  29 max hits, 109.00 2nd max, 110.00 maximum
    latency @ 2048 bytes: 105.45 ms average,  30 max hits, 109.00 2nd max, 110.00 maximum
    latency @ 2049 bytes: 108.73 ms average,  34 max hits,   0.00 2nd max, 110.00 maximum
    latency @ 4000 bytes: 207.62 ms average,  20 max hits,   0.00 2nd max, 219.00 maximum
    latency @ 4095 bytes: 210.87 ms average,   7 max hits, 219.00 2nd max, 223.00 maximum
    latency @ 4096 bytes: 210.85 ms average,  34 max hits, 218.00 2nd max, 219.00 maximum
    latency @ 4097 bytes: 214.23 ms average,  46 max hits, 219.00 2nd max, 224.00 maximum
    latency @ 8000 bytes: 411.52 ms average,  26 max hits, 421.00 2nd max, 422.00 maximum
     UP ----- pass #1        elapsed time 253.120 secs for 4106700 bytes

  5. #2680
    Senior Member+ manitou's Avatar
    Join Date
    Jan 2013
    Posts
    2,005
    Quote: Originally posted by PaulStoffregen
    I committed a fix for USB serial receive.

    Performance is still far from where I want to get, but it should at least pass the latency test now. Please let me know if you're able to get the test to fail again?
    OK, fetched the latest cores from GitHub and disabled the debug printf. T4B2 latency_test on a Linux laptop:
    Code:
    latency @ 1 bytes: 0.14 ms average, 0.35 maximum
    latency @ 2 bytes: 0.13 ms average, 0.35 maximum
    latency @ 12 bytes: 0.16 ms average, 2.50 maximum
    latency @ 30 bytes: 0.12 ms average, 0.15 maximum
    latency @ 62 bytes: 0.14 ms average, 1.21 maximum
    latency @ 71 bytes: 0.25 ms average, 0.31 maximum
    latency @ 128 bytes: 0.25 ms average, 0.33 maximum
    latency @ 500 bytes: 0.38 ms average, 0.48 maximum
    latency @ 1000 bytes: 0.50 ms average, 0.51 maximum
    latency @ 2000 bytes: 0.79 ms average, 0.91 maximum
    latency @ 4000 bytes: 1.36 ms average, 2.25 maximum
    latency @ 8000 bytes: 2.39 ms average, 3.36 maximum
    looks good. zoom zoom

    on windows 10x64
    Code:
    latency @ 1 bytes: 0.16 ms average, 15.57 maximum
    latency @ 2 bytes: 0.31 ms average, 15.62 maximum
    latency @ 12 bytes: 0.31 ms average, 15.77 maximum
    latency @ 30 bytes: 0.16 ms average, 15.63 maximum
    latency @ 62 bytes: 0.31 ms average, 15.63 maximum
    latency @ 71 bytes: 0.16 ms average, 15.62 maximum
    latency @ 128 bytes: 0.31 ms average, 15.62 maximum
    latency @ 500 bytes: 0.31 ms average, 15.62 maximum
    latency @ 1000 bytes: 0.62 ms average, 15.63 maximum
    latency @ 2000 bytes: 0.78 ms average, 15.63 maximum
    latency @ 4000 bytes: 1.41 ms average, 15.68 maximum
    latency @ 8000 bytes: 2.66 ms average, 15.68 maximum
    Last edited by manitou; 05-02-2019 at 06:35 PM.

  6. #2681
    Senior Member+
    Join Date
    Jul 2014
    Location
    New York
    Posts
    3,509
    @manitou
    Reran on my win10x64 machine - you are right, with debug_printf off I get pretty much the same numbers as you:
    Code:
    port COM23 opened
    waiting for board to be ready:
    .ok
    latency @ 1 bytes: 0.22 ms average, 15.57 maximum
    latency @ 2 bytes: 0.16 ms average, 15.62 maximum
    latency @ 12 bytes: 0.31 ms average, 15.62 maximum
    latency @ 30 bytes: 0.31 ms average, 15.62 maximum
    latency @ 62 bytes: 0.22 ms average, 15.62 maximum
    latency @ 71 bytes: 0.16 ms average, 15.62 maximum
    latency @ 128 bytes: 0.31 ms average, 15.62 maximum
    latency @ 500 bytes: 0.31 ms average, 15.62 maximum
    latency @ 1000 bytes: 0.53 ms average, 15.62 maximum
    latency @ 2000 bytes: 0.69 ms average, 15.64 maximum
    latency @ 4000 bytes: 1.32 ms average, 15.65 maximum
    latency @ 8000 bytes: 2.32 ms average, 15.67 maximum
    Do have some variation which I expected based on Paul's earlier comment on machine chip configuration

    EDIT: If I read my notes right it's now better than the T3.6 as well. As @manitou said "vroom vroom vroom"
    Last edited by mjs513; 05-02-2019 at 05:04 PM.

  7. #2682
    Senior Member+ KurtE's Avatar
    Join Date
    Jan 2014
    Posts
    4,903
    @...

    I picked up the latest stuff and tried it on Windows 10 64 bit, and it does complete. It also makes a big difference when I turn off the two printf statements (one in usb.c and the other in usb_serial.c).
    As you can see in:

    Code:
    C:\Users\kurte\Desktop\latency_test>latency_test.exe COM7
    port COM7 opened
    waiting for board to be ready:
    .ok
    latency @ 1 bytes: 3.74 ms average, 7.45 maximum
    latency @ 2 bytes: 3.53 ms average, 7.59 maximum
    latency @ 12 bytes: 3.61 ms average, 7.36 maximum
    latency @ 30 bytes: 3.91 ms average, 7.42 maximum
    latency @ 62 bytes: 4.18 ms average, 7.99 maximum
    latency @ 71 bytes: 7.44 ms average, 11.37 maximum
    latency @ 128 bytes: 7.46 ms average, 11.04 maximum
    latency @ 500 bytes: 27.21 ms average, 31.16 maximum
    latency @ 1000 bytes: 53.78 ms average, 57.18 maximum
    latency @ 2000 bytes: 106.42 ms average, 109.88 maximum
    latency @ 4000 bytes: 208.34 ms average, 211.96 maximum
    latency @ 8000 bytes: 412.80 ms average, 416.85 maximum
    
    C:\Users\kurte\Desktop\latency_test>latency_test.exe COM7
    port COM7 opened
    waiting for board to be ready:
    .ok
    latency @ 1 bytes: 0.26 ms average, 1.03 maximum
    latency @ 2 bytes: 0.26 ms average, 1.18 maximum
    latency @ 12 bytes: 0.26 ms average, 1.22 maximum
    latency @ 30 bytes: 0.26 ms average, 1.24 maximum
    latency @ 62 bytes: 0.25 ms average, 1.15 maximum
    latency @ 71 bytes: 0.27 ms average, 1.21 maximum
    latency @ 128 bytes: 0.25 ms average, 1.24 maximum
    latency @ 500 bytes: 0.36 ms average, 1.35 maximum
    latency @ 1000 bytes: 0.47 ms average, 1.42 maximum
    latency @ 2000 bytes: 0.74 ms average, 1.64 maximum
    latency @ 4000 bytes: 2.01 ms average, 3.83 maximum
    latency @ 8000 bytes: 3.39 ms average, 5.04 maximum
    
    C:\Users\kurte\Desktop\latency_test>

  8. #2683
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    8,826
    Updated cores for the USB fix and results agree with KurtE - no errors on the latency test { of course I'm using the updated version - went back to deprecated gettimeofday } - and much faster with those two debug prints gone.
    ** I removed the TWO bothersome prints - but left on "status =" with the PRINT_DEBUG_STUFF define … oops …
    > But that didn't change the numbers - though for the group below it goes up 0.1 secs in general when the other Teensys are on the hub and active.

    >> lps_test is the same or lower at ~2800 lines per second.

    The updated version uses running 'A-Z' data for verification on receive, and it shows only ~5% of the 100 are at the MAX time. On multiple runs the sizes that go over 5% change, so it isn't a particular transfer size with a problem.
    Code:
    T:\T_Downloads\pjrc_latency_test>latency_test.exe COM25
    port COM25 opened
    waiting for board to be ready:
    .ok
    latency @    1 bytes: 0.25 ms average,   5 max hits,    0.50 2nd max,   0.51 maximum
    latency @    2 bytes: 0.25 ms average,   3 max hits,    0.50 2nd max,   0.51 maximum
    latency @   12 bytes: 0.25 ms average,   3 max hits,    0.50 2nd max,   0.51 maximum
    latency @   16 bytes: 0.25 ms average,   2 max hits,    0.50 2nd max,   0.69 maximum
    latency @   30 bytes: 0.25 ms average,   3 max hits,    0.50 2nd max,   0.51 maximum
    latency @   31 bytes: 0.25 ms average,   3 max hits,    0.50 2nd max,   0.51 maximum
    latency @   63 bytes: 0.25 ms average,   4 max hits,    0.51 2nd max,   0.52 maximum
    latency @   64 bytes: 0.26 ms average,   4 max hits,    0.63 2nd max,   0.70 maximum
    latency @   65 bytes: 0.26 ms average,   4 max hits,    0.50 2nd max,   0.55 maximum
    latency @   71 bytes: 0.25 ms average,   7 max hits,    0.50 2nd max,   0.63 maximum
    latency @  126 bytes: 0.26 ms average,   2 max hits,    0.51 2nd max,   0.51 maximum
    latency @  127 bytes: 0.26 ms average,   3 max hits,    0.50 2nd max,   0.71 maximum
    latency @  128 bytes: 0.26 ms average,   3 max hits,    0.50 2nd max,   0.51 maximum
    latency @  129 bytes: 0.27 ms average,   3 max hits,    0.50 2nd max,   0.75 maximum
    latency @  500 bytes: 0.35 ms average,   4 max hits,    0.59 2nd max,   0.64 maximum
    latency @  512 bytes: 0.35 ms average,   8 max hits,    0.51 2nd max,   0.99 maximum
    latency @  640 bytes: 0.39 ms average,   9 max hits,    0.51 2nd max,   0.73 maximum
    latency @ 1000 bytes: 0.50 ms average,   3 max hits,    0.99 2nd max,   1.00 maximum
    latency @ 1278 bytes: 0.56 ms average,   2 max hits,    1.00 2nd max,   1.01 maximum
    latency @ 1279 bytes: 0.56 ms average,   3 max hits,    1.00 2nd max,   1.01 maximum
    latency @ 1280 bytes: 0.55 ms average,   7 max hits,    0.99 2nd max,   0.99 maximum
    latency @ 1281 bytes: 0.57 ms average,   5 max hits,    0.99 2nd max,   1.04 maximum
    latency @ 2000 bytes: 0.76 ms average,   8 max hits,    1.02 2nd max,   1.14 maximum
    latency @ 2047 bytes: 0.76 ms average,   4 max hits,    1.04 2nd max,   1.04 maximum
    latency @ 2048 bytes: 0.80 ms average,   6 max hits,    1.08 2nd max,   1.26 maximum
    latency @ 2049 bytes: 0.78 ms average,   6 max hits,    1.06 2nd max,   1.10 maximum
    latency @ 4000 bytes: 1.34 ms average,   7 max hits,    1.63 2nd max,   1.65 maximum
    latency @ 4095 bytes: 1.39 ms average,   4 max hits,    1.60 2nd max,   1.61 maximum
    latency @ 4096 bytes: 1.34 ms average,   4 max hits,    1.56 2nd max,   1.75 maximum
    latency @ 4097 bytes: 1.37 ms average,   2 max hits,    1.49 2nd max,   1.96 maximum
    latency @ 8000 bytes: 2.42 ms average,   5 max hits,    2.79 2nd max,   2.86 maximum
     UP ----- pass #1        elapsed time 1.867 secs for 4106700 bytes
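
    The pass summary above can be cross-checked with a little arithmetic. This is a minimal sketch in plain C++, assuming the byte total counts each message once (one direction) and that every size is sent 100 times, which matches the "of the 100" remark above. The names are illustrative only.

```cpp
#include <cstddef>
#include <cstdint>

// Transfer sizes listed in the latency_test output above.
constexpr std::uint32_t kSizes[] = {
    1,    2,    12,   16,   30,   31,   63,   64,   65,   71,   126,
    127,  128,  129,  500,  512,  640,  1000, 1278, 1279, 1280, 1281,
    2000, 2047, 2048, 2049, 4000, 4095, 4096, 4097, 8000};

// Total bytes for one pass: every size repeated `repeats` times.
constexpr std::uint64_t bytes_per_pass(std::uint32_t repeats) {
  std::uint64_t sum = 0;
  for (std::size_t i = 0; i < sizeof(kSizes) / sizeof(kSizes[0]); ++i) {
    sum += kSizes[i];
  }
  return sum * repeats;
}

// Average one-way throughput over a whole pass.
constexpr double bytes_per_sec(std::uint64_t total_bytes, double elapsed_secs) {
  return static_cast<double>(total_bytes) / elapsed_secs;
}
```

    bytes_per_pass(100) reproduces the 4106700 in the summary, and bytes_per_sec(4106700, 1.867) is roughly 2.2 Mbyte/sec, versus roughly 16 kbyte/sec for the 253.120 sec pass posted earlier in the thread.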
    Last edited by defragster; 05-02-2019 at 08:00 PM.

  9. #2684
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    8,826
    Quote: Originally posted by PaulStoffregen
    I committed a fix for USB serial receive.

    Performance is still far from where I want to get, but it should at least pass the latency test now. Please let me know if you're able to get the test to fail again?
    Noted above - no fail on latency_test.

    Paul - did you see the lines per sec sketch edit to keep the wild T_3.x numbers in line, based on Serial.availableForWrite()? Is that the best fix for those puzzling numbers when Serial is non-blocking?

  10. #2685
    Senior Member+
    Join Date
    Jul 2014
    Location
    New York
    Posts
    3,509
    SDFat Library

    Started playing with the SDFat library, made a couple of changes and ran the SDInfo sketch using an external reader (had to reduce the SPI clock to 8 MHz - old reader). It compiles and runs partway:
    Code:
    Card type: SDXC
    
    Manufacturer ID: 0X3
    OEM ID: SD
    Product: SC64G
    Version: 8.0
    Serial number: 0XA2D04DD0
    Manufacturing date: 9/2012
    
    cardSize: 63864.57 MB (MB = 1,000,000 bytes)
    flashEraseSize: 128 blocks
    eraseSingleBlock: true
    OCR: 0XC0FF8000
    
    SD Partition Table
    part,boot,type,start,length
    1,0X0,0XC,63,124735425
    2,0X0,0X0,0,0
    3,0X0,0X0,0,0
    4,0X0,0X0,0,0
    error: 
    File System initialization failed.
    Have zero familiarity with this lib so surprised I got this far. Guess more debugging to do.

    EDIT:
    SDFormatter seems to work no problem on a 128GB card.

    Well FreeCluster seems to be working:
    Code:
    Please edit SdFatConfig.h and set
    MAINTAIN_FREE_CLUSTER_COUNT nonzero for
    maximum freeClusterCount() performance.
    
    Type any character to start
    
    First call to freeClusterCount scans the FAT.
    
    freeClusterCount() call time: 31425528 micros
    freeClusters: 1953455
    freeSpace: 128021.625 MB (MB = 1,000,000 bytes)
    
    Create and write to Cluster.test
    
    Second freeClusterCount call is faster if
    MAINTAIN_FREE_CLUSTER_COUNT is nonzero.
    OK, after reformatting the SD card I reran the SDInfo sketch and this time it ran fine, so it was a matter of formatting:
    Code:
    init time: 3 ms
    
    Card type: SDXC
    
    Manufacturer ID: 0X95
    OEM ID: SU
    Product:      
    Version: 0.2
    Serial number: 0X520AB128
    Manufacturing date: 3/2015
    
    cardSize: 128042.66 MB (MB = 1,000,000 bytes)
    flashEraseSize: 128 blocks
    eraseSingleBlock: true
    OCR: 0XC0FF8000
    
    SD Partition Table
    part,boot,type,start,length
    1,0X0,0XC,8192,250075136
    2,0X0,0X0,0,0
    3,0X0,0X0,0,0
    4,0X0,0X0,0,0
    
    Volume is FAT32
    blocksPerCluster: 128
    clusterCount: 1953456
    freeClusters: 1953454
    freeSpace: 128021.56 MB (MB = 1,000,000 bytes)
    fatStartBlock: 10436
    fatCount: 2
    blocksPerFat: 15262
    rootDirStart: 2
    dataStartBlock: 40960
    I will post the lib if you all want to play - I haven't tried SDIO_ext - not sure how that works
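
    The freeSpace figure in the output above can be reproduced from the cluster numbers. This is a minimal sketch in plain C++, assuming 512 byte blocks (the usual SD sector size, not stated in the output). The function names are hypothetical.

```cpp
#include <cstdint>

// Assumption: 512 byte blocks, the usual SD card sector size.
constexpr std::uint32_t kBytesPerBlock = 512;

// Free space in bytes from the FAT volume numbers SdInfo prints.
constexpr std::uint64_t free_bytes(std::uint32_t free_clusters,
                                   std::uint32_t blocks_per_cluster) {
  return static_cast<std::uint64_t>(free_clusters) * blocks_per_cluster *
         kBytesPerBlock;
}

// Same value in the "MB = 1,000,000 bytes" unit used in the output.
constexpr double free_megabytes(std::uint32_t free_clusters,
                                std::uint32_t blocks_per_cluster) {
  return static_cast<double>(free_bytes(free_clusters, blocks_per_cluster)) / 1e6;
}
```

    free_megabytes(1953454, 128) lands on the 128021.56 MB reported above, so the numbers are at least self-consistent after the reformat.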

    EDIT:
    Just pushed to WIP repository: https://github.com/mjs513/WIP/tree/master/SdFat

    EDIT2: I also tested it on the Audio Shield and it works there as well.
    Last edited by mjs513; 05-02-2019 at 08:49 PM. Reason: Updated Info

  11. #2686
    Senior Member PaulStoffregen's Avatar
    Join Date
    Nov 2012
    Posts
    20,173
    Quote: Originally posted by defragster
    Paul - did you see the lines per sec sketch edit to keep the wild T_3.x numbers in line, based on Serial.availableForWrite()? Is that the best fix for those puzzling numbers when Serial is non-blocking?
    Nope, not yet. Spent the time tracking down that receive size bug, and also looking at ways to minimize the receive latency on T4.

    A lot has been posted lately. Can you point me to the best message / code I should review that makes the problem occur *more* than all others? I'm interested in learning why it's happening, what's really going on to cause the timing to go so far off the rails. Also really want to know if anyone can reproduce this on Linux? For me, testing stuff with Windows takes about 10X longer....

    Until I fully understand what's really happening (which honestly might not happen until long after T4 release), really not so interested in the workarounds. Not going to spend time on figuring out which way is best to avoid the not-yet-understood problem.

  12. #2687
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    8,826
    Quote: Originally posted by PaulStoffregen
    Nope, not yet. Spent the time tracking down that receive size bug, and also looking at ways to minimize the receive latency on T4.

    A lot has been posted lately. Can you point me to the best message / code I should review that makes the problem occur *more* than all others? I'm interested in learning why it's happening, what's really going on to cause the timing to go so far off the rails. Also really want to know if anyone can reproduce this on Linux? For me, testing stuff with Windows takes about 10X longer....

    Until I fully understand what's really happening (which honestly might not happen until long after T4 release), really not so interested in the workarounds. Not going to spend time on figuring out which way is best to avoid the not-yet-understood problem.
    Paul - this is the post #2665 - this sketch change resolves the problem as I see it.

    Quote: Originally posted by defragster
    Answers coded below in LPS_TEST.INO - SerMon's and lps_test.exe agree OverRun Surge is fixed for T_3's.

    Oddity Question:: Why does this not affect Linux the same?

    > T4 does not yet have a working Serial.availableForWrite(), and it already stops when dis-connected.
    > Using this on T_3's to limit the count increment solves the issue:: if ( Serial.availableForWrite() > 15 )

    My test code from before to show the millis is still in place and holds at zero. This code updated on github.com/Defragster/T4_demo/... /pjrc_latency_test

    Code:
    // https://forum.pjrc.com/threads/54711-Teensy-4-0-First-Beta-Test?p=204681&viewfull=1#post204681
    uint32_t count, prior_count;
    uint32_t prior_msec;
    uint32_t count_per_second;
    
    // Uncomment this for boards where SerialUSB needed for native port
    //#define Serial SerialUSB
    
    void setup() {
      Serial.begin(1000000);
      while (!Serial) ;
      count = 10000000;
      prior_count = count;
      count_per_second = 0;
      prior_msec = millis();
    }
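    int blog = 0;  // backlog: how late the 1 second check ran, in units of 10 ms
    void loop() {
      // Build each line from several small Print calls.
      Serial.print("c#");
      Serial.print(count);
      Serial.print(" b#");
      Serial.print(blog);
      Serial.print(", lines/s=");
      Serial.println(count_per_second);
      // On T_3.x Serial writes do not block, so only count a line when the
      // transmit buffer still has room. T4 (IMXRT1062) has no working
      // availableForWrite() yet, so it counts every line.
      #if !defined(__IMXRT1062__)
      if ( Serial.availableForWrite() > 15 )
      #endif
        count = count + 1;
      uint32_t msec = millis();
      // Once per second, record how many lines were counted in the last window.
      if (msec - prior_msec > 1000) {
        prior_msec = prior_msec + 1000;
        blog = (msec - prior_msec) / 10;
        count_per_second = count - prior_count;
        prior_count = count;
      }
    }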
    BTW: Since it has to run a second or two now to get counts ... that next param is important

    ALSO - the counts reported now are more like 20K lines/sec - even when they were working, the free running count++ in loop() was overestimating on T_3's.

    And opening a TyComm second instance to second Teensy drops the count on both - as one would expect with the machine bandwidth getting stretched.
    Here is a fresh run showing it starting at 0 and then going up to expected and believable value with that sketch edit - including the 'b#0' showing time skew from the 1 sec check at 0:
    Code:
    T:\T_Downloads\pjrc_latency_test>lps_test.exe COM8 4
    port COM8 opened
    repeat 80
    surge 0 delay
    #0 : __>> c#225627184 b#0, lines/s=12870 <<__
    #1 : __>> c#225628932 b#0, lines/s=0 <<__
    #2 : __>> c#225630687 b#0, lines/s=0 <<__
    #3 : __>> c#225632443 b#0, lines/s=0 <<__
    #4 : __>> c#225634198 b#0, lines/s=0 <<__
    #5 : __>> c#225635954 b#0, lines/s=0 <<__
    #6 : __>> c#225637709 b#0, lines/s=0 <<__
    #7 : __>> c#225639465 b#0, lines/s=0 <<__
    #8 : __>> c#225641467 b#0, lines/s=12546 <<__
    #9 : __>> c#225643515 b#0, lines/s=12546 <<__
    #10 : __>> c#225645563 b#0, lines/s=12546 <<__
    #11 : __>> c#225647611 b#0, lines/s=12546 <<__
    #12 : __>> c#225649659 b#0, lines/s=12546 <<__
    #13 : __>> c#225651707 b#0, lines/s=12546 <<__
    #14 : __>> c#225653755 b#0, lines/s=12546 <<__
    #15 : __>> c#225655803 b#0, lines/s=12546 <<__
    #16 : __>> c#225657851 b#0, lines/s=12546 <<__
    #17 : __>> c#225659899 b#0, lines/s=12546 <<__
    #18 : __>> c#225661947 b#0, lines/s=12546 <<__
    #19 : __>> c#225663995 b#0, lines/s=12546 <<__
    #20 : __>> c#225666043 b#0, lines/s=25668 <<__
    #21 : __>> c#225668091 b#0, lines/s=25668 <<__
    #22 : __>> c#225670139 b#0, lines/s=25668 <<__

  13. #2688
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    8,826
    Paul - FYI side note - your changes to Cores on T4 took the lps_test from 4k down to 3K lines/sec:

    Today:
    Code:
    #78 : __>> c#10172166 b#0, lines/s=2853 <<__
    Yesterday:
    Code:
    #7 : __>> count=10645383, lines/sec=3957 <<__

  14. #2689
    Senior Member PaulStoffregen's Avatar
    Join Date
    Nov 2012
    Posts
    20,173
    Quote: Originally posted by defragster
    your changes to Cores on T4 took the lps_test from 4k down to 3K lines/sec:
    I still see ~11000 lines/sec with Linux.

  15. #2690
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    8,826
    Quote: Originally posted by PaulStoffregen
    I still see ~11000 lines/sec with Linux.
    YAY for you

    Odd. All debug prints disabled here …

    IDE SerMon, TyComm and CmdLine reports give counts of about 2833, 2856 and 2842 here on Windows 10.

    I just thought I'd mention it while what you changed was fresh because it had an effect. For 1062 that sketch change didn't alter anything, so it would seem to be Teensy USB edits.

    Odd the EXE is a couple lps UNDER TyComm - would expect it to be faster

    Doubled the 64K buffer to 128K for sustained reads and no change, moved the line per buffer print from the inner loop to outside and no change

  16. #2691
    Senior Member+
    Join Date
    Jul 2014
    Location
    New York
    Posts
    3,509
    @defragster
    Just saw these posts and just wanted to let you know I ran your sketch only and am seeing about 4000 lines/sec. This is prior to your latest change and direct from the sketch. If I run your .exe I am seeing 3999-4000 lps. I never got around to updating to your latest GitHub changes. Maybe something in your new push changed? Again this is on my Win10x64 machine with printf's turned on.

  17. #2692
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    8,826
    @mjs513 ...
    odd … I went back to zip (3) and (4) exe's and they are both giving me the 2850 and (2) doesn't have that EXE created yet ...

    Maybe the printf's speed it up

  18. #2693
    Senior Member+
    Join Date
    Jul 2014
    Location
    New York
    Posts
    3,509
    Quote: Originally posted by defragster
    @mjs513 ...
    odd I went back to zip (3) and (4) exe's and they are both giving me the 2850 and (2) doesn't have that EXE created yet ...

    Maybe the printf's speed it up
    Attached is the zip I currently have on the machine:

    pjrc_latency_test.zip

    You inspired me and I wound up installing minGW on the desktop today. Tomorrow I think I will try installing Linux on one of my old laptops gathering dust.

  19. #2694
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    8,826
    Looks like the same size .exe I have, and the same results against my T4-2:

    Code:
    #3 : __>> c#11714742 b#0, lines/s=2849 <<__
    That is against the updated sketch … is that what you see?

  20. #2695
    Senior Member PaulStoffregen's Avatar
    Join Date
    Nov 2012
    Posts
    20,173
    Did you try running yesterday's code to check that it still gives ~4000 lines/sec on your machine? You know, just to rule out the possibility something may have changed with your computer...

    If it sounds as if I'm unconcerned whether the speed is 4000 or 2800 or 11000 or 27000, that's because indeed I really do not care at this point. Testing the performance of utterly unoptimized code may be fun, but it's kind of pointless. It really doesn't mean much. It's not even really an indication Linux is better than Windows.

    As I start optimizing, we're going to see these numbers climb well into the 6 digit range. That's when they'll matter.

    What is really important at this early stage is correctness. This is the time to focus on making sure errors aren't happening. Fixing errors, like the receive bug with the latency test, becomes harder as the code becomes more optimized.

  21. #2696
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    8,826
    @Paul - as noted that was just FYI - in case it rang a bell while those changes were fresh in your mind. I did not run the old sketch - IIRC the change was to add one line - with an ifdef !1062 so I assumed there would not be any change. I'll see if I can confirm.

    IT WAS the old sketch - it shows the 4,000 - found here Teensy-4-0-First-Beta-Test
    > Didn't compare yet - but there was something else …
    I added two more 'Serial.print()' and that was it - back to 4K without them



    it isn't a worry - I'm sure it will all come together in the end. I only wanted you to know that the T_3's were reporting expected numbers and that you saw my change as a valid reason why - since they don't block on serial they were inordinately high on start, and the loop() flow somehow kept over-incrementing the count into following seconds for some time.


    @mjs513 - that link I posted for MinGW had some clear steps and that is all I did - that and making sure another reasonable thing or two were checked IIRC.

  22. #2697
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    8,826
    Well that got to the bottom of that!

    >> Serial.printf( "count= %d, lines/sec=%d \n", count, count_per_second );

    Getting 20K lines/sec in SerMon, TyComm and with the lps_test.exe
    #75 : __>> count= 10203664, lines/sec=19978 <<__
    Serial.print() is WAY slow at this time - changing to Serial.printf().

    Doesn't help T_3.6 @180 or 256, now 19K lps, so T_3.6 is almost 1K faster with .print()

    Code:
    // https://forum.pjrc.com/threads/54711-Teensy-4-0-First-Beta-Test?p=204681&viewfull=1#post204681
    uint32_t count, prior_count;
    uint32_t prior_msec;
    uint32_t count_per_second;
    
    void setup() {
      Serial.begin(1000000);
      while (!Serial) ;
      count = 10000000;
      prior_count = count;
      count_per_second = 0;
      prior_msec = millis();
    }
    
    void loop() {
      Serial.printf( "count= %d, lines/sec=%d \n", count, count_per_second );
    #if !defined(__IMXRT1062__)
      if ( Serial.availableForWrite() > 15 )
    #endif
        count = count + 1;
      uint32_t msec = millis();
      if (msec - prior_msec > 1000) {
        prior_msec = prior_msec + 1000;
        count_per_second = count - prior_count;
        prior_count = count;
      }
    }

  23. #2698
    Senior Member PaulStoffregen's Avatar
    Join Date
    Nov 2012
    Posts
    20,173
    Quote: Originally posted by defragster
    changing to Serial.printf().
    Please don't.

    By doing this, you're changing the benchmark from what it was meant to test to something else, more similar to the very old benchmark Manitou ran yesterday (showing 7.7 Mbyte/sec speed with his fast Linux system).

    FWIW, with this change I get 59370 lines/sec on Linux. It's not "fixing" anything, just changing to measuring something else which runs faster.

    If you do keep running this, please call it something very different. A speed benchmark is meaningful only if we all do it the same way, so the numbers are comparable.
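
    One way to picture why the two variants are not comparable is to count how many separate write requests each style issues per line. This is a toy host-side model in plain C++ with hypothetical names (CountingSink and friends). It assumes a formatted print assembles the whole line first and hands it over in one request, which may not match what the Teensy core actually does.

```cpp
#include <cstdint>

// Hypothetical stand-in for a serial port: it only counts what it is handed.
struct CountingSink {
  unsigned calls = 0;  // number of separate write requests
  unsigned bytes = 0;  // total payload bytes
  constexpr void write(unsigned len) { ++calls; bytes += len; }
};

// Length of a C string, usable in constant expressions.
constexpr unsigned text_len(const char* s) {
  unsigned n = 0;
  while (s[n] != '\0') ++n;
  return n;
}

// Number of decimal digits needed to print v.
constexpr unsigned digits(std::uint32_t v) {
  unsigned n = 1;
  while (v >= 10) { v /= 10; ++n; }
  return n;
}

// Line built from small fragments: each fragment is its own write request.
constexpr CountingSink line_from_fragments(std::uint32_t count, std::uint32_t rate) {
  CountingSink s{};
  s.write(text_len("count="));
  s.write(digits(count));
  s.write(text_len(", lines/sec="));
  s.write(digits(rate));
  s.write(text_len("\r\n"));
  return s;
}

// Same text assembled in one buffer first, then handed over in one request.
constexpr CountingSink line_preformatted(std::uint32_t count, std::uint32_t rate) {
  CountingSink s{};
  s.write(text_len("count=") + digits(count) + text_len(", lines/sec=") +
          digits(rate) + text_len("\r\n"));
  return s;
}
```

    Both functions account for the same 32 bytes for count 10645383 and rate 3957, but the fragment version issues five requests against one. In this model the per-request overhead is what the lines/sec test stresses, and a single formatted write sidesteps it.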

  24. #2699
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    8,826
    @Paul:
    INDEED :: On my machine I already left the old code in place when I made the change for that reason - it is under #ifdef as I just wanted to see the diff since adding two lines dropped it 25%. I'll change it to a comment for reference.

    Amazing it makes that much difference on Linux! Though that is an increase proportional to what the PC saw.

    Just started writing the INVERSE test - so far showing 18K lines/sec if what I have is working:
    > Using PC code to send similar block of text: len = sprintf( buf, "count= %d, lines/sec=%d \n", count, count_per_second );

    For now the sketch just counts '\n' chars per second as it is doing : c = Serial.read();
    > Then once per second print out Serial4 and seeing 18K


    Very crude and no double check yet the data is right as my Teensy doesn't have a display and it just happened ...
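
    The counting step of the inverse test described above boils down to tallying '\n' characters in whatever arrives. This is a minimal sketch in plain C++ that works on a buffer rather than on Serial.read(). count_newlines is a hypothetical helper, not code from the actual test.

```cpp
#include <cstddef>

// Count line terminators in a received chunk, the way the inverse test
// does one character at a time on the Teensy side.
constexpr std::size_t count_newlines(const char* data, std::size_t len) {
  std::size_t lines = 0;
  for (std::size_t i = 0; i < len; ++i) {
    if (data[i] == '\n') ++lines;
  }
  return lines;
}
```

    Summing the result over everything received in a one second window gives the lines/sec figure. A partial line at the end of a chunk is simply counted when its '\n' shows up in the next one.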

  25. #2700
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    8,826
    Paul - just did 7+ 15 sec Restore holds - on computer then on USB battery all failed.

    After 15 secs a blip of the red bootloader LED - then release and nothing.

    Then I realized the two Teensy Debug Serial Rx/Tx pin devices were powered - same USB Hub.

    So the same effect for normal power On button, or plug of powering USB onto T4 halts the CPU startup for bootloader controlled restore.

    <edit>: The entry in MSG #6 updated with added link to this post and put a comment about UART power and MCU startup.
    Last edited by defragster; 05-03-2019 at 08:21 AM.
