Forum Rule: Always post complete source code & details to reproduce any issue!

Thread: Optimization Fast/Faster/Fastest with/without LTO?

  1. #1
    Junior Member
    Join Date
    Jun 2015
    Posts
    15

    Optimization Fast/Faster/Fastest with/without LTO?

    Hi Folks,

    I've tried searching around but wasn't able to find much info on what the difference between the various optimization options are - Fast / Faster / Fastest and there are with LTO options too. Any explanation would be appreciated.

    Thanks

  2. #2
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    11,790
    A good reason is that there isn't a simple answer: it can vary with the code included.

    The compiler's optimizations increase in the order the names imply, reaching for more obscure or extensive changes to generate 'efficient' code. Sometimes Fastest may not actually be the fastest, and code size can be a factor if that is a concern: as code is replicated or altered to be faster, it may get larger. The 'default' level of optimization was chosen as it generally results in expected results and a usable program. Some optimizations may result in odd/unexpected execution.

    LTO is Link Time Optimization, where the totality of the compiled source code is scanned and elements seen as unused are dropped or altered in their placement. On rare occasions this has been seen to produce a non-executable program, in errant cases where a code or compiler expectation is interpreted differently; other times it can be faster.

  3. #3
    Senior Member+ Frank B's Avatar
    Join Date
    Apr 2014
    Location
    Germany NRW
    Posts
    6,906
    Quote Originally Posted by Zaite12 View Post
    Hi Folks,

    I've tried searching around but wasn't able to find much info on what the difference between the various optimization options are - Fast / Faster / Fastest and there are with LTO options too. Any explanation would be appreciated.

    Thanks
    fast, faster, fastest translate to GCC -Ox options, where x= optimization level.
    https://gcc.gnu.org/onlinedocs/gcc/O...e-Options.html
    LTO is link time optimization (GCC's -flto option); you can read about it in the GCC docs, too.
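    To check which kind of build a given menu choice actually produced, GCC's predefined macros can help. This is a minimal sketch, assuming a GCC-based toolchain: `__OPTIMIZE__` and `__OPTIMIZE_SIZE__` are real GCC macros, while the function name `optimizationSummary` is made up for this illustration. It does not attempt to detect LTO.

```cpp
#include <string>

// Describe how this translation unit was optimized, using GCC's predefined
// macros. The function name is hypothetical, chosen only for this example.
std::string optimizationSummary() {
#if defined(__OPTIMIZE_SIZE__)
    // -Os defines both __OPTIMIZE__ and __OPTIMIZE_SIZE__, so test this first.
    return "optimizing for size";
#elif defined(__OPTIMIZE__)
    // Defined for every optimizing compilation.
    return "optimizing for speed";
#else
    return "not optimizing";
#endif
}
```

    On a Teensy, the returned text could be printed once from setup() to confirm that a change in the Optimize menu really reached the compiler.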
    Last edited by Frank B; 04-25-2017 at 07:18 PM.

  4. #4
    Junior Member
    Join Date
    Jun 2015
    Posts
    15
    Thank you very much both, that is most helpful. Appreciated

  5. #5
    Senior Member
    Join Date
    Nov 2012
    Location
    Salt Lake City, UT, USA
    Posts
    276

    Smallest Code causing incorrect execution

    Quote Originally Posted by defragster View Post
    The 'default' level of optimization was chosen as it generally results in expected results and a usable program. Some optimizations may result in odd/unexpected execution.
    What is the default level: Fast?

    I am seeing instability with printf() causing among other things incorrect printf output or the worst so far: cessation of all SPI comms with optimize=Smallest Code. With Optimize=Debug or Fast, execution seems correct.

  6. #6
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    11,790
    Quote Originally Posted by bboyes View Post
    What is the default level: Fast?
    It seems Fast is the general default; I'd have to open the IDE to confirm that for the Teensy at hand. The Teensy LC defaults to Smallest Code, IIRC.

    If you can repro with a general example - posting might help get to the bottom of a problem. Perhaps it is printing too much/fast, printing within an isr(), or trashing memory with bad pointer/array abuse; the behavior of some things can be hidden depending on compile/link options and where things end up being placed.

  7. #7
    Senior Member
    Join Date
    Nov 2012
    Location
    Salt Lake City, UT, USA
    Posts
    276

    Example: not the simplest, but it's repeatable

    Quote Originally Posted by defragster View Post
    If you can repro with a general example - posting might help get to the bottom of a problem.
    I have a reproducible example. It uses Teensy 3.2, WIZ850io, and a TMP102 temperature sensor. We have a custom board with this hardware, but those parts on a breadboard with a TMP102 breakout (SparkFun or Adafruit) will also work. Is that simple enough? (If not, I can try to hack out the TMP102 use and see if the bug still persists.) The example is built on the canonical Ethernet WebServer example. The working example, TempServer, is up now at https://github.com/systronix/W5500_Test, along with an additional Google doc about it. The TMP102 library has been in use for more than a year and is very stable.

    BTW in Arduino 1.8.1 and TD 1.35, with Smallest Code, the printf() calls cause all SPI output to cease, which of course breaks the whole Ethernet interface.
    Last edited by bboyes; 05-03-2017 at 12:32 AM.

  8. #8
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    11,790
    The toolchain in TD 1.35 was the same as before; for TD 1.36 the toolchain (the compiler/linker build specific to Teensy) was wholly changed, with an update to a newer version.

    IDE version 1.8.1 or 1.8.2 should probably be fine, but you should be on TD 1.36, as that is the version that can still be changed. Do you see the problem there?

  9. #9
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    11,790
    How much too small is ID[32]?
    Code:
    	char ID[32];
    
    	sprintf(ID, "%08lX %08lX %08lX %08lX", SIM_UIDH, SIM_UIDMH, SIM_UIDML, SIM_UIDL);
    My guess is 4 bytes ... Things like this "trashing memory with bad pointer/array abuse" will trash something somewhere . . .

  10. #10
    Senior Member PaulStoffregen's Avatar
    Join Date
    Nov 2012
    Posts
    22,104
    Good catch. Looks like ID really needs to be 35 bytes.

  11. #11
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    11,790
    Quote Originally Posted by PaulStoffregen View Post
    Good catch. Looks like ID really needs to be 35 bytes.
    I guessed 36 - three 'spaces' - plus a Null?

    It only takes one thing like that to break stuff - I stopped looking after I saw that. That and not being sure the compiler in use was the TD_1.36 build.
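    The count of 36 can be written down as arithmetic and cross-checked against the formatter itself. This is a sketch under assumptions: the four arguments stand in for SIM_UIDH, SIM_UIDMH, SIM_UIDML and SIM_UIDL and are taken to hold 32-bit values, and the names `kIdChars`, `kIdBufSize` and `formattedIdLength` are made up for this example.

```cpp
#include <cstddef>
#include <cstdio>

// "%08lX %08lX %08lX %08lX" with 32-bit values produces four 8-digit hex
// fields separated by three spaces.
constexpr std::size_t kIdChars = 4 * 8 + 3;       // 35 visible characters
constexpr std::size_t kIdBufSize = kIdChars + 1;  // plus the terminating NUL

static_assert(kIdBufSize == 36, "ID needs 35 characters and a NUL");

// Ask snprintf how long the formatted ID would be without writing anything.
// Passing a null buffer with size 0 is allowed and returns the needed length.
int formattedIdLength(unsigned long h, unsigned long mh,
                      unsigned long ml, unsigned long l) {
    return std::snprintf(nullptr, 0, "%08lX %08lX %08lX %08lX", h, mh, ml, l);
}
```

    With `char ID[32]`, the original sprintf() call therefore writes four bytes past the end of the array.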

  12. #12
    Senior Member PaulStoffregen's Avatar
    Join Date
    Nov 2012
    Posts
    22,104
    Yes, you're right, 36.

  13. #13
    Senior Member
    Join Date
    Nov 2012
    Location
    Salt Lake City, UT, USA
    Posts
    276
    Yes, using TD 1.36. I'm impressed by your cerebral debugger, 36. That's what I get for copying some code without looking at it more closely. I will fix that, rerun it and also push that back to the original author of that ID code. Thanks!

  14. #14
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    11,790
    Hopefully it will work now if there aren't additional gotchas
    That's why "post complete source code" is Paul's rule and seems to pay off - as it is the most efficient way to get fresh eyes on the problem.

    ... I assumed something like that might be the cause, but even posting the code doesn't always get someone to look critically or knowingly enough at code that is so simple as to seem obvious, or that was copied.

  15. #15
    Senior Member PaulStoffregen's Avatar
    Join Date
    Nov 2012
    Posts
    22,104
    There very well could still be subtle bugs in the Ethernet library. Those Wiznet chips might also have bugs. I've seen strange behavior from them which gives me pause.

    Realistically, I can't do much on Ethernet until after Maker Faire. Even then, I'm planning to focus on several Audio lib features and general website improvements. But if anyone does post a test case I can reproduce, I'll be very tempted to dig into it. There are several reports of Ethernet slowing or becoming unreliable after hours of use. So far I have not managed to reproduce any. Most have been reported without full code....

  16. #16
    Senior Member
    Join Date
    Nov 2012
    Location
    Salt Lake City, UT, USA
    Posts
    276

    Moving ahead, thanks... Ethernet library...

    Quote Originally Posted by PaulStoffregen View Post
    There very well could still be subtle bugs in the Ethernet library. Those Wiznet chips might also have bugs. I've seen strange behavior from them which gives me pause.
    The fixed version of Temperature Server is up and running at systronix.hopto.org:8080, you are welcome to hit it once in a while as part of the test.

    There are many things I don't understand about the Ethernet library:
    1. almost no return values from functions, so how do you know if they are succeeding? As a ridiculous test, I can ground the MISO signal (so no data from the WIZnet 850io ever arrives). Guess what? Not a single exception thrown. Can my code tell if I am even still communicating with the WIZnet chip?
    2. Without the ability to detect malfunctions, it's not possible to write code which will recover from them...
    3. Missing abilities such as even discovering the remote IP address of an http requester. This addition was done and rejected by the Arduino keepers. Many things seem to be rejected in order to keep it "simple for beginners" and to keep code small so it will run on 15-year-old 8-bit AVRs (at the expense of support for the host of new ARM versions). I don't know enough about how gcc handles library options or granularity, but crippling the ability of new technology just to maintain backwards compatibility with old technology (if that is indeed what is happening) seems ridiculous.
    4. Can't tell how many sockets are open and other measures of current status of the Ethernet connection(s)
    5. If the core SPI is like Wire, we have reason to worry about that level in the stack of layers under Ethernet: is SPI part of the problem?
    6. It's like Wire vs i2c_t3 all over again except where is the more capable, robust Ethernet library? I've googled around and I don't see one.
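    On point 1, one possible liveness probe is to read a register whose value is fixed. The W5500 datasheet documents a chip version register, VERSIONR, at common-register address 0x0039 that always reads 0x04. The sketch below assumes that register and leaves the SPI access to a caller-supplied function; `ReadCommonRegister` and `wiznetResponds` are hypothetical names, not part of the Ethernet library.

```cpp
#include <cstdint>

// W5500 chip version register (VERSIONR), per the datasheet.
constexpr uint16_t kVersionRegister = 0x0039;
constexpr uint8_t  kExpectedVersion = 0x04;

// Hypothetical signature: read one byte from a W5500 common register over SPI.
using ReadCommonRegister = uint8_t (*)(uint16_t address);

// Returns true only if the chip answered with the expected version byte.
// With MISO stuck low the read comes back as 0x00, so the check fails.
bool wiznetResponds(ReadCommonRegister readRegister) {
    return readRegister(kVersionRegister) == kExpectedVersion;
}
```

    Calling something like this periodically would at least distinguish "the chip has stopped answering" from "the chip answers but the sockets are stuck".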

    So what I observe happening, after some hours (always less than 24): the server stops responding to requests. I can trace the request up to the router which passes it to the WIZnet node, and I can see SPI talking, both MOSI and MISO are still alive but all http requests are getting ignored. I have a Total Phase Beagle SPI sniffer; I just wish it could decode the meaning of the SPI data. I need to stare at the W5500 data sheet for a while and decode it manually.

    I'm going to have to try to add some of these things in order to have code which is robust enough for the task, and we're on a schedule... so my interest in helping on this is keen and urgent.
    Last edited by bboyes; 05-04-2017 at 11:13 PM.

  17. #17
    Senior Member
    Join Date
    Nov 2012
    Location
    Salt Lake City, UT, USA
    Posts
    276

    Thanks!

    Quote Originally Posted by defragster View Post
    Hopefully it will work now if there aren't additional gotchas
    That's why "post complete source code" is Paul's rule and seems to pay off - as it is the most efficient way to get fresh eyes on the problem.
    Thanks so much for your "fresh eyes". If we ever meet, and food or drink are involved, it's on me.

    I need to learn how to turn on compiler options such as keeping around the data map file. And maybe start building with some other tool than Arduino. I don't use the Arduino IDE to edit (I use Sublime text now, and Eclipse in the past) or load hex files (TyQt), just to compile. Any advice there? I'm running on Windows now but have been thinking about moving to a Linux build environment.

    Also I need to find some tool like lint to pass over my files in hopes of catching some things like this...

  18. #18
    Senior Member PaulStoffregen's Avatar
    Join Date
    Nov 2012
    Posts
    22,104
    Quote Originally Posted by bboyes View Post
    Also I need to find some tool like lint to pass over my files in hopes of catching some things like this...
    You could use snprintf() instead of sprintf(). Won't solve other issues, but it's a good habit to always add that extra "n" and give it the buffer size.

  19. #19
    Senior Member+ manitou's Avatar
    Join Date
    Jan 2013
    Posts
    2,531
    Deja Vu. The same snippet of code with the sprintf() stack overflow was also discussed at
    https://forum.pjrc.com/threads/42033...l=1#post133281

  20. #20
    Quote Originally Posted by bboyes View Post
    So what I observe happening, after some hours (always less than 24): the server stops responding to requests. I can trace the request up to the router which passes it to the WIZnet node, and I can see SPI talking, both MOSI and MISO are still alive but all http requests are getting ignored. I have a Total Phase Beagle SPI sniffer; I just wish it could decode the meaning of the SPI data.
    ... so my interest in helping on this is keen and urgent.
    I have no idea if this is related but I experienced a similar behavior.

    I luckily had included a Telnet connection that allowed me to examine the socket status of all the WizNet sockets. I found that slowly all of the available sockets would become "stuck". They would usually be stuck in "Close Wait" status and the library would not assign (re-use) a socket locked in "Close Wait". The library I was using did not have a time-out to forcibly close a socket stuck in "Close Wait" after some suitable period.
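    The missing time-out could be sketched as a small tracker that records when each socket entered "Close Wait" and reports when it has stayed there too long. This is not the library's code: `CloseWaitWatchdog`, the eight-socket count and the 30-second limit are assumptions chosen for illustration. The status value 0x1C is the Sn_SR code for CLOSE_WAIT.

```cpp
#include <cstdint>

constexpr uint8_t  kSockets = 8;               // assumed: W5500 hardware sockets
constexpr uint8_t  kCloseWait = 0x1C;          // Sn_SR value for CLOSE_WAIT
constexpr uint32_t kCloseWaitLimitMs = 30000;  // assumed limit, tune as needed

// Tracks how long each socket has been sitting in CLOSE_WAIT.
struct CloseWaitWatchdog {
    bool     waiting[kSockets] = {};
    uint32_t since[kSockets] = {};

    // Feed the latest status of socket s and the current time in milliseconds.
    // Returns true once the socket has stayed in CLOSE_WAIT beyond the limit,
    // which is the caller's cue to force it closed.
    bool update(uint8_t s, uint8_t status, uint32_t nowMs) {
        if (s >= kSockets) return false;
        if (status != kCloseWait) {
            waiting[s] = false;  // left CLOSE_WAIT, forget the old timestamp
            return false;
        }
        if (!waiting[s]) {
            waiting[s] = true;   // first time seen in CLOSE_WAIT
            since[s] = nowMs;
            return false;
        }
        // Unsigned subtraction stays correct when the millisecond counter wraps.
        return (nowMs - since[s]) > kCloseWaitLimitMs;
    }
};
```

    In a sketch, update() would be fed socketStatus(i) and millis() for each socket on every pass through loop(); what to call when it returns true depends on the close function your version of the library exposes.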

    I could watch my free sockets get stuck one by one until finally the system became unresponsive to any request.

    It required a reboot to free the "stuck" sockets.

    Note that there is/was some consideration for this case in the Teensy 1.6.12 ethernet library. See socket.cpp and note especially the #if 0 block related to releasing a socket in "Close Wait". Since this is a last-ditch attempt to allocate a socket when no others are free, it may be worth the risk of not having all data flushed... In my case, it certainly would have been OK.

    A simple patch to socket.cpp in the ethernet library that enables this option may be a test tool to discover if a similar condition is causing your issue.

    Code:
    uint8_t socketBegin(uint8_t protocol, uint16_t port)
    {
    	uint8_t s, status[MAX_SOCK_NUM];
    
    	//Serial.printf("W5000socket begin, protocol=%d, port=%d\n", protocol, port);
    	SPI.beginTransaction(SPI_ETHERNET_SETTINGS);
    	// look at all the hardware sockets, use any that are closed (unused)
    	for (s=0; s < MAX_SOCK_NUM; s++) {
    		status[s] = W5100.readSnSR(s);
    		if (status[s] == SnSR::CLOSED) goto makesocket;
    	}
    	//Serial.printf("W5000socket step2\n");
    	// as a last resort, forcibly close any already closing
    	for (s=0; s < MAX_SOCK_NUM; s++) {
    		uint8_t stat = status[s];
    		if (stat == SnSR::LAST_ACK) goto closemakesocket;
    		if (stat == SnSR::TIME_WAIT) goto closemakesocket;
    		if (stat == SnSR::FIN_WAIT) goto closemakesocket;
    		if (stat == SnSR::CLOSING) goto closemakesocket;
    	}
    #if 0
    	Serial.printf("W5000socket step3\n");
    	// next, use any that are effectively closed
    	for (s=0; s < MAX_SOCK_NUM; s++) {
    		uint8_t stat = status[s];
    		// TODO: this also needs to check if no more data
    		if (stat == SnSR::CLOSE_WAIT) goto closemakesocket;
    	}
    #endif
    	SPI.endTransaction();
    	return MAX_SOCK_NUM; // all sockets are in use
    In any event, it may be enlightening for your problem to add a socket status monitor feature to watch your free socket count come and go as the system responds to various requests.

    The code snippet I used to monitor and watch this happen is

    Code:
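        // Protocol names, indexed by the low four bits of the Sn_MR register
        static const char *SnMr[] = {"Close", "TCP", "UDP", "IPRAW", "MACRAW"};
        char socStatus[8];  // longest string written below is 6 characters plus NUL

        for (uint8_t i = 0; i < 8; i++) {
          uint8_t status = socketStatus(i);  // read the status register once per socket
          switch (status) {
            case 0x00:  // SnSR::CLOSED
              snprintf(socStatus, sizeof(socStatus), "Closed");
              break;
            case 0x14:  // SnSR::LISTEN
              snprintf(socStatus, sizeof(socStatus), "Listen");
              break;
            case 0x17:  // SnSR::ESTABLISHED
              snprintf(socStatus, sizeof(socStatus), "Establ");
              break;
            case 0x1c:  // SnSR::CLOSE_WAIT
              snprintf(socStatus, sizeof(socStatus), "ClWait");
              break;
            default:    // any other state is shown as its raw hex value
              snprintf(socStatus, sizeof(socStatus), "0x%02x  ", status);
              break;
          }

          // Mask off the Sn_MR flag bits so only the protocol field indexes SnMr[]
          uint8_t mode = W5100.readSnMR(i) & 0x0F;
          const char *modeName = (mode < sizeof(SnMr) / sizeof(SnMr[0])) ? SnMr[mode] : "?";
          TelnetServer.printf("Socket(%d) SnSr = %s SnMR = %s\r\n", i, socStatus, modeName);
        }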
    Using this simple monitoring tool, I also quickly discovered that under some conditions I could run out of available sockets when using the default MAX_SOCK_NUM of four sockets.... This was a separate problem from the "Stuck in Close Wait" issue. It was just nice to watch and verify that the system had the resources to "go the distance" and accomplish what I needed it to do.

    As you can see, the above is only a code snippet as your case is no doubt different. Perhaps you have a local serial port that can periodically spew the socket status.
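    For a serial port that only needs a one-line summary, the per-socket status bytes can be reduced to a tally. This is a sketch under assumptions: `SocketTally` and `tallySockets` are made-up names, and the caller is expected to have already read the Sn_SR value of each socket into an array.

```cpp
#include <cstddef>
#include <cstdint>

// Sn_SR values from the WIZnet datasheets that matter for a TCP server.
constexpr uint8_t kClosed = 0x00;
constexpr uint8_t kListen = 0x14;
constexpr uint8_t kEstablished = 0x17;
constexpr uint8_t kCloseWait = 0x1C;

struct SocketTally {
    uint8_t closed = 0;
    uint8_t listening = 0;
    uint8_t established = 0;
    uint8_t closeWait = 0;
    uint8_t other = 0;  // any state not listed above
};

// Count how many sockets are in each state of interest.
SocketTally tallySockets(const uint8_t *status, std::size_t count) {
    SocketTally tally;
    for (std::size_t i = 0; i < count; i++) {
        switch (status[i]) {
            case kClosed:      tally.closed++;      break;
            case kListen:      tally.listening++;   break;
            case kEstablished: tally.established++; break;
            case kCloseWait:   tally.closeWait++;   break;
            default:           tally.other++;       break;
        }
    }
    return tally;
}
```

    A closeWait count that climbs over the hours while closed and listening fall toward zero would match the "stuck one by one" pattern described above.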

    In my case I was snowed out of the remote site and could only Telnet in over the network until I foolishly created a "test case" that used the last socket and the system was unresponsive until the snow melted this spring.
    Last edited by drmartin; 05-04-2017 at 05:07 AM.

  21. #21
    Senior Member
    Join Date
    Nov 2012
    Location
    Salt Lake City, UT, USA
    Posts
    276
    @drmartin That sounds useful, will try it. I just ran TempServer with about 15 clients, so about 3 requests/second and managed to get it to be unresponsive in less than an hour.

    What hardware are you using? What test code were you running: http server? Or some UDP application?

    It would seem there should be a way to recover stuck sockets in firmware by some command to the W5500 chip...
    Thanks
    Bruce

    How do I tell the version of Paul's Ethernet lib? https://github.com/PaulStoffregen/Ethernet

  22. #22
    Quote Originally Posted by bboyes View Post
    What hardware are you using?
    My application was using Teensy 3.2(s)

    What test code were you running: http server? Or some UDP application?
    My application was remote data monitoring and logging. I have TFT Display, Push Buttons, One-Wire Temperature Sensors, I2C Temperature, Pressure, and Humidity, and multiple I2C Voltage and Current sensors being logged to SD Card. The amount of data being collected and transferred via the Ethernet Network was significant. This effort is supported by a Telnet Server, an FTP Client, and an NTP Client. Multiple sites had been running continuously for over 14 months without a problem. It was only when I was trying to "improve" the back end stuff that I was able to generate some incomplete TCP connect/disconnects that left the Teensy "confused" at one of the sites. Failure analysis resulted in the comments noted above. The issue was reproducible and I was finally able to lock everything up as I was pushing the limits to gather enough data to understand exactly who/what/when and why.

    It would seem there should be a way to recover stuck sockets in firmware by some command to the W5500 chip...
    There most definitely is a way to recover. However, in my case, I created the problem at a time when I could not immediately reload the system with a new code version that included a recovery technique. I was literally "stuck" until road conditions improved. My focus at the time was to understand what had happened so I could plan for a more robust version later and not leave the same trap available in future projects.

    Again, this code had been running fine at multiple sites continuously for over a year. I have several man-years of reliable Ethernet library performance logged. I just finally stepped on a hidden "land mine" and it blew up in my face. I wanted to make sure it didn't happen again.
    Last edited by drmartin; 05-05-2017 at 06:40 AM.

  23. #23
    Senior Member
    Join Date
    Nov 2012
    Location
    Salt Lake City, UT, USA
    Posts
    276
    Quote Originally Posted by drmartin View Post
    ...Multiple sites had been running continuously for over 14 months without a problem. It was only when I was trying to "improve" the back end stuff that I was able to generate some incomplete TCP connect/disconnects that left the Teensy "confused" at one of the sites.
    Just for another data point, I loaded and ran the canonical Ethernet>Webserver example, with only the needed network values changed, and output of the current millis() in the page. Also a couple of printf to the serial monitor. Then I had four local browser tabs open, just on the LAN, so no Internet needed. It became unresponsive after about 18 hours. My github page for that is here. SPI messages seem to have the same pattern as when TempServer failed. My point in doing this is to see if even the most simple possible server example, provided in the official Arduino examples, would fail, and it does.

    This supports the idea that the problem is most likely in the W5500 baked-in firmware, and/or the Ethernet library.

    What's somewhat interesting is that I also have DHCP and NTP tests (code and docs in my W5500_test repo) which run for days (limited by either local power outage or need to reboot Windows).

    So the obvious question is: what's different? Why would http server fail so much more easily than NTP client?

    Next step is to instrument and watch sockets and other W5500 health measures and see if I can zero in on this.

    @drmartin: what is your recovery technique? If I understand you, it was something in your code responsible for the socket loss, not the underlying Ethernet library? If so, what was that cause?

    In the case of TempServer, I detect incomplete http requests and close the connection with client.stop(). In a recent test cycle that code did execute three times.

    Thanks!

  24. #24
    Senior Member+ manitou's Avatar
    Join Date
    Jan 2013
    Posts
    2,531
    Quote Originally Posted by bboyes View Post
    What's somewhat interesting is that I also have DHCP and NTP tests (code and docs in my W5500_test repo) which run for days (limited by either local power outage or need to reboot Windows).

    So the obvious question is: what's different? Why would http server fail so much more easily than NTP client?
    NTP and DHCP use UDP, a stateless protocol that doesn't mind if packets are lost, duplicated, or arrive out of order. HTTP is based on TCP, a stateful protocol with timers and retransmissions. There are some TCP options (KEEP_ALIVE) that attempt to detect if the other end of the connection has died, but I'm not sure if the WIZnet chip supports the option. Several of the handshakes between TCP client and server should have timeouts (SYN, FIN), but depending on how robust the WIZnet implementation is (not the Ethernet lib), there could be scenarios where the socket fails to close. That could be remedied with timers in your sketch and peeking at socket status as suggested in the earlier post, and perhaps clearing socket data in the WIZnet chip or resetting the whole chip. You could also use a UDP port to do remote telemetry, sending back socket status or accepting commands to reset.

    nefarious note: if your http server is reachable from the internet, bad guys are constantly trolling the internet looking for targets or purposely opening but not closing TCP connections, or flooding servers with SYN packets, and other nasty packets that might consume all (4) of the wiznet sockets ... sigh

    Re: you can also get the client IP address from the wiznet chip, see for example https://forum.arduino.cc/index.php?topic=82416.0
    Last edited by manitou; 05-05-2017 at 07:18 PM. Reason: fix url

  25. #25
    Senior Member
    Join Date
    Nov 2012
    Location
    Salt Lake City, UT, USA
    Posts
    276

    Fork this Ethernet topic to a new thread

    Since this original topic is about optimizations, and this resulting Ethernet thread seems of interest, I have made a new thread with a more appropriate topic: Ethernet library socket issues.

    Also @manitou I think you meant to reference this link?: How to obtain the remote client IP address when using the Ethernet Shield
    Last edited by bboyes; 05-05-2017 at 07:14 PM.
