String class warning on Arduino forum

Status
Not open for further replies.

Fluxanode

Well-known member
I would like to use the String class functions on the Teensy but have seen a warning about it. Specifically, I'd like to try using readStringUntil().

Warning was in https://forum.arduino.cc/index.php?topic=495454.0.
"It is not a good idea to use the String (capital S) class on an Arduino as it can cause memory corruption in the small memory on an Arduino. Just use cstrings - char arrays terminated with 0."

Is this an issue with Teensy boards too? It looks like the String class would be handy.
 
Use of dynamic memory allocation (used within Arduino's String class) is a controversial subject. People with very conservative coding views will say to avoid any use of dynamic memory for "embedded" programming. They do have a point, that some risk is involved due to the possibility of memory fragmentation.
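To make the fragmentation risk concrete, here is a toy model in plain hosted C++. It is only a sketch: `ToyHeap` and its first-fit policy are made up for illustration and are not how any real malloc() on Arduino or Teensy works. It shows how the total free space can be large enough for a request while no single contiguous block is.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical toy heap: one flag per byte, first-fit allocation.
// Real allocators are more sophisticated; this only illustrates why
// "total free" and "largest free block" can differ.
class ToyHeap {
public:
  explicit ToyHeap(std::size_t size) : used_(size, false) {}

  // Returns the offset of the allocated block, or -1 if no contiguous run fits.
  long allocate(std::size_t n) {
    if (n == 0 || n > used_.size()) return -1;
    std::size_t run = 0;
    for (std::size_t i = 0; i < used_.size(); ++i) {
      run = used_[i] ? 0 : run + 1;
      if (run == n) {
        const std::size_t start = i + 1 - n;
        for (std::size_t j = start; j <= i; ++j) used_[j] = true;
        return static_cast<long>(start);
      }
    }
    return -1;
  }

  // Marks n bytes starting at offset as free again.
  void release(long offset, std::size_t n) {
    for (std::size_t j = 0; j < n; ++j) {
      used_[static_cast<std::size_t>(offset) + j] = false;
    }
  }

  std::size_t totalFree() const {
    std::size_t count = 0;
    for (bool u : used_) {
      if (!u) ++count;
    }
    return count;
  }

  std::size_t largestFreeBlock() const {
    std::size_t best = 0, run = 0;
    for (bool u : used_) {
      run = u ? 0 : run + 1;
      if (run > best) best = run;
    }
    return best;
  }

private:
  std::vector<bool> used_;
};
```

With a 100-byte heap, allocating four 25-byte blocks and then releasing the first and third leaves 50 bytes free, yet a 30-byte request fails because the largest contiguous hole is only 25 bytes.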

With any sort of risk, not just programming, decisions usually involve estimating the odds of a bad outcome and the likely cost if it does happen, against the benefits that come with taking the risk, or against the cost of an alternative which reduces or eliminates the risk. Life is full of trade-offs...

The highly conservative viewpoint is that no risk is acceptable under any circumstances. Often people who feel this way will exaggerate the risk and dismiss any concept of trade-off between costs & benefits, especially when communicating online with strangers. For example, in reply #9 on that Arduino forum thread, code is given which is claimed "will lock up in under a second". However, the code has at least 2 potential issues unrelated to dynamic memory. First, it prints increasingly long strings, which puts tremendous CPU demand on the Arduino Serial Monitor. Second, it prints without any delay, which can flood your PC with data far too fast if using Teensy 4. Here is a very slightly modified copy, which adds newline characters so lines stay short, and delays 1 millisecond per loop.

Code:
String bigAssString;
int i;
void setup() {
  Serial.begin(9600);
}
void loop(){
  bigAssString += i++;
  bigAssString += '\n';
  Serial.print("bigAssString = ");
  Serial.println(bigAssString);
  delay(1);
}

I tried this on Teensy LC and Teensy 4.1. I also ran it on Arduino Uno. It does not lock up on any of those boards. I ran it for several minutes on each.

I also ran the original code from that message on Arduino Uno (which can't print fast enough to overwhelm the serial monitor, even with very long lines). The original code doesn't lock up either. I'm sure MorganS had the best of intentions in writing that message, but as anyone can clearly see by running it, he didn't actually test his code (or if he did, not with Arduino Uno which has the smallest memory of any current Arduino product). He just assumed it would crash. In fact it keeps running indefinitely. Or at least it has run for about 15 minutes that I've tested.

Of course an infinitely growing string cannot grow beyond the limited memory. Something must go wrong. But when it comes to evaluating the risk, there is a world of difference between your program crashing or locking up as claimed versus "wrong" behavior like the string being truncated or failing to grow any larger. To be honest, I haven't tried to look at how this is actually being handled. I just let the code run and casually watched whether it was still spewing a sea of numerical digits to the serial monitor.

This doesn't prove that using String will never crash or lock up under any circumstances. But I believe it does prove that some of the highly conservative advice you'll hear is based much more on intuition and assumptions than actual testing. There are risks, but perhaps those risks are not as dire as some people assume.
 
If you do use String, one way to mitigate much of the memory fragmentation risk is with the reserve() function. It causes the String to pre-allocate at least that much memory. The other thing you can do is avoid growing the string beyond that length. In the example above, you would check if the string is still less than the amount you reserved and only add another character if the String length is under that threshold.
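The reserve-then-cap idea can be sketched roughly as follows. This is hosted C++ using std::string as a stand-in so it can be compiled on a PC; `appendIfRoom` is a made-up helper name, and the limit is assumed to be the same number passed to reserve(). Arduino's String also offers reserve(), length() and `+=` with a char, so the helper body should carry over once the standard-library includes are dropped, but that is an assumption I have not tested on a board.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: appends c only if the string stays within `limit`
// characters. Works with any string type offering length() and
// operator+=(char). `limit` is assumed to match the reserve() amount.
template <typename Str>
bool appendIfRoom(Str& s, char c, std::size_t limit) {
  if (s.length() >= limit) {
    return false;  // appending would grow past the reserved size
  }
  s += c;
  return true;
}
```

Typical use would be to call `s.reserve(N)` once in setup() and then go through `appendIfRoom(s, c, N)` everywhere the string grows, treating a false return as "buffer full".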
 

Firstly, early versions of String in the Arduino library had bugs, putting a lot of people off using it.

Secondly the standard Uno has 2k of SRAM total, which is very cramped - even with a timeout set, readStringUntil()
could rapidly exhaust memory in a fault situation, and this usually leads to heap and stack colliding and undefined
outcome (crash, random behaviour, badness).

With a lot more RAM it's easier to ensure enough space is available for the worst-case, even allowing for fragmentation
(I think the standard Arduino libraries use a buddy system, IIRC, which has reasonable robustness to fragmentation).

The extreme aversion to dynamic memory allocation stems from mission critical work, where life and limb
are involved - proving dynamic memory allocation is suitably bounded is a hard problem theoretically,
proving static memory fits into the space available is simple and mechanically checkable. You only have to
read about failures and near failures of embedded coding for spacecraft to appreciate the extreme cost of getting
anything wrong.
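As a rough illustration of "mechanically checkable": with only fixed-size, statically allocated buffers, the compiler itself can refuse a build that does not fit. The numbers below are assumptions for the sake of the sketch (2048 bytes of SRAM as on an Uno, 512 bytes held back for stack and other globals), and the buffer names are hypothetical.

```cpp
#include <cstddef>

// Assumed budget: 2K of SRAM, with 512 bytes kept back for the stack
// and other globals. Both figures are illustrative, not measured.
constexpr std::size_t kSramBytes = 2048;
constexpr std::size_t kStackReserve = 512;

// Every buffer has a fixed size known at compile time (names hypothetical).
struct Buffers {
  char rxLine[120];
  char txLine[256];
  char scratch[64];
};

Buffers buffers;  // statically allocated, never grows

constexpr std::size_t staticFootprint() { return sizeof(Buffers); }

// The build fails here if the buffers outgrow the budget.
static_assert(staticFootprint() + kStackReserve <= kSramBytes,
              "static buffers do not fit in SRAM");
```

No comparable one-line check exists for heap use, since the peak depends on the order of allocations at run time.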

Also you have to consider data that's used by an ISR - there are hazards involved when using free() with
volatile pointers.

Once you get to larger systems with more memory you have a chance of monitoring memory allocation in real time
and warn about impending problems before they bite.

So if you're using up a lot of Teensy memory, you might want to avoid things like String, but for a small program it's
not a deal-breaker. From what I understand malloc'd memory on T4's is slower than statically allocated though, so
performance issues are tangled up with the decision for that architecture.
 
I expect that for 99+% of cases using "small" Strings, it works well and creates cleaner code with less bugs.

On the other hand, it's easy to come up with cases that inadvertently fail. For example, you have 2K of heap space, so you think it's OK to use a 1K String. Then you write: Serial.print(string + "\n") and the program crashes. OK, I didn't code that up :).
 
Secondly the standard Uno has 2k of SRAM total, which is very cramped - even with a timeout set, readStringUntil() could rapidly exhaust memory in a fault situation

Is this an assumption or based on actual testing? And what specifically is meant by "fault situation"? If it does "exhaust memory", what is the consequence?

Might also be worth mentioning that readStringUntil() is one of the places where Teensy breaks with Arduino's API somewhat. The Teensy version has a second parameter for the maximum length, which defaults to 120 chars.
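To show what a length cap buys you, here is a hosted C++ sketch of a capped read-until. It is not Teensy's implementation: `FakeStream` and `readUntilCapped` are hypothetical names, timeouts are left out, and std::string stands in for String. The only thing borrowed from the post is the idea of a maximum length with a default of 120.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for a serial stream: read() returns the next byte,
// or -1 when nothing is left.
struct FakeStream {
  std::string data;
  std::size_t pos;
  explicit FakeStream(const std::string& d) : data(d), pos(0) {}
  int read() {
    return pos < data.size() ? static_cast<unsigned char>(data[pos++]) : -1;
  }
};

// Collects bytes until `terminator`, end of input, or `maxLen` characters,
// whichever comes first. The terminator is consumed but not stored.
template <typename Source>
std::string readUntilCapped(Source& src, char terminator,
                            std::size_t maxLen = 120) {
  std::string out;
  out.reserve(maxLen);  // one allocation up front, no regrowth while reading
  while (out.size() < maxLen) {
    const int c = src.read();
    if (c < 0 || static_cast<char>(c) == terminator) break;
    out += static_cast<char>(c);
  }
  return out;
}
```

With the cap in place, a sender that never transmits the terminator can cost at most `maxLen` bytes per call; in this sketch the rest of an over-long line simply stays in the stream for the next read.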
 
On the other hand, it's easy to come up with cases that inadvertently fail. For example, you have 2K of heap space, so you think it's OK to use a 1K String. Then you write: Serial.print(string + "\n") and the program crashes. OK, I didn't code that up :).

Ok then, let's put that to the test.

Code:
unsigned int count = 0;
void setup() {
  Serial.begin(9600);
}
void loop() {
  Serial.print("loop");
  Serial.println(count++);
  String string = F("The extreme aversion to dynamic memory allocation stems from mission critical work, where life and limb are involved - proving dynamic memory allocation is suitably bounded is a hard problem theoretically, proving static memory fits into the space available is simple and mechanically checkable. You only have to read about failures and near failures of embedded coding for spacecraft to appreciate the extreme cost of getting anything wrong.  Also you have to consider data that's used by an ISR - there are hazards involved when using free() with volatile pointers.  Once you get to larger systems with more memory you have a chance of monitoring memory allocation in real time and warn about impending problems before they bite.  I expect that for 99+% of cases using \"small\" Strings, it works well and creates cleaner code with less bugs.  On the other hand, it's easy to come up with cases that inadvertently fail. For example, you have 2K of heap space, so you think it's OK to use a 1K String. Then you write: Serial.print(string + \"\n\") and the program crashes. OK, I didn't code that up :-).\n");
  Serial.println(string + "\n");
}

Here is the result of running this program on Arduino Uno.

(Attached screenshot: sc.png)

Obviously it can't create a copy of the string, but the program definitely is not crashing.

I'm not saying crashes are impossible. There may indeed be cases where the board completely locks up. But now I've put 2 programs to the test on Arduino Uno with only 2K of RAM and the result sure looks like the String class handles out of memory situations much better than assumed.
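If the temporary copy made by `string + "\n"` is the concern, one option is to avoid building it at all and print the two pieces separately. Below is a hosted C++ sketch of that idea. `CountingSink` is a hypothetical stand-in for Serial that only counts bytes, and `printLineNoCopy` is a made-up helper; the sketch assumes an output object whose print() overloads return the number of bytes written.

```cpp
#include <cstddef>
#include <string>

// Hypothetical sink that only counts bytes, standing in for Serial.
struct CountingSink {
  std::size_t total = 0;
  std::size_t print(const std::string& s) {
    total += s.size();
    return s.size();
  }
  std::size_t print(char) {
    total += 1;
    return 1;
  }
};

// Prints the text and the newline as two calls, so no concatenated copy
// of the whole string is ever built. Returns the total bytes reported.
template <typename Out, typename Str>
std::size_t printLineNoCopy(Out& out, const Str& s) {
  std::size_t n = out.print(s);
  n += out.print('\n');
  return n;
}
```

For a 1000-character string this needs no additional heap, whereas the concatenated form needs room for a second buffer of roughly the same size while the print is in progress.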
 
The output in post #7 left me wondering where in the process that was, since the string didn't show, because there isn't enough memory to print it. I bought an UNO once ... somewhere ... but there was a SparkFun RedBoard (Uno compatible) beside me. Bit of a remembering curve to set the baud rate ... Teensy Way more Cool!

Found a note about measuring free RAM and added it below. With the string truncated, it shows 1770 bytes of free RAM:
1770
loop3677
The extreme aversion to dynamic memory allocation stems

Removing the string truncation (and the comment marker) leaves only 725 bytes of free RAM. It runs but 'gracefully fails' and cannot print the string:
725
loop12463

for fun - the code as run:
Code:
unsigned int count = 0;
void setup() {
  Serial.begin(9600);
  while( !Serial);
}
void loop() {
  Serial.print("loop");
  Serial.println(count++);
  String string = F("The extreme aversion to dynamic memory allocation stems");// from mission critical work, where life and limb are involved - proving dynamic memory allocation is suitably bounded is a hard problem theoretically, proving static memory fits into the space available is simple and mechanically checkable. You only have to read about failures and near failures of embedded coding for spacecraft to appreciate the extreme cost of getting anything wrong.  Also you have to consider data that's used by an ISR - there are hazards involved when using free() with volatile pointers.  Once you get to larger systems with more memory you have a chance of monitoring memory allocation in real time and warn about impending problems before they bite.  I expect that for 99+% of cases using \"small\" Strings, it works well and creates cleaner code with less bugs.  On the other hand, it's easy to come up with cases that inadvertently fail. For example, you have 2K of heap space, so you think it's OK to use a 1K String. Then you write: Serial.print(string + \"\n\") and the program crashes. OK, I didn't code that up :-).\n");
  Serial.println(string + "\n");
  Serial.println(freeMemory());
}

#ifdef __arm__
// should use unistd.h to define sbrk but Due causes a conflict
extern "C" char* sbrk(int incr);
#else  // __ARM__
extern char *__brkval;
#endif  // __arm__

int freeMemory() {
  char top;
#ifdef __arm__
  return &top - reinterpret_cast<char*>(sbrk(0));
#elif defined(CORE_TEENSY) || (ARDUINO > 103 && ARDUINO != 151)
  return &top - __brkval;
#else  // __arm__
  return __brkval ? &top - __brkval : &top - __malloc_heap_start;
#endif  // __arm__
}

Putting the bigAssString code from post #2 in with the above, it runs fine and the free RAM decrements until it ends as shown here. After some loops the concatenation stops, but the free RAM stays steady at 162 bytes:
...
404
405

223
loop406
The extreme aversion to dynamic memory allocation stems // LAST PRINT OF THIS - RAM at 223

bigAssString = 0
1
2

...

419
420 // LAST UPDATE OF THIS - RAM at 163

163
loop421


bigAssString = 0
1
2

... // MUCH LATER

419
420


162 // RAM STILL AT 162 BYTES - and running ...
loop775


bigAssString = 0
1
2
 
Sure, we can code up cases that fail and cases that crash. Given that quietly failing may be worse than crashing, the distinction may not be important.

Could one use String more safely? I only glanced at it, but it looks like malloc() reserves 128 bytes of stack space beyond the current stack pointer. This may or may not be enough to avoid stack corruption (leading to crashes or "random" behavior). Perhaps this 128 bytes should be more easily adjustable, along with more ways to detect failed allocations by the String routines?
 
Some cases expected to crash or fail do not, with the current String support.

As noted, I wasn't sure what all of the test output was showing, or whether there might be more to show that it is running in a steady state (it ran 15 minutes ...). That is, RAM is not leaking to death; the sketch is just no longer able to perform some functions in the limited RAM.

Indeed, the cases that run out of RAM fail gracefully. I'd say that is a WIN, because using memory dynamically without bound needs to be prevented, accounted for and tested for. At least here the failure is measurable, by checking free RAM when something seems amiss.

Those cases don't provoke fragmentation or other touchy areas, but general function seems as expected.
 
There is the gcc option "-fstack-protector-strong", but it's not clear what it does on a Teensy.

Linux libraries set a global variable "errno" when a malloc fails, but I don't see anything similar for Teensy. How can one be sure that temporary String use like Serial.print(s + "\n") succeeded?

I expect that there is a lot that could be done - coding practices, libraries, gcc options, etc to get closer to "mission critical" or "hacker proof" reliable. But it's much more complex than "don't use String".
 
Serial.print() returns the number of characters printed, so a failed print can be detected by checking that value.

And when expanding the String, the length can of course be checked: yy = bigAssString.length();
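Those two checks could be wrapped up along these lines. This is a hosted C++ sketch with std::string standing in for String; `LimitedSink`, `printChecked` and `appendChecked` are hypothetical names. The expected byte count is passed in by the caller, on the assumption that it comes from something known in advance (such as the length of a literal plus one) rather than from the temporary being printed, since a failed temporary may itself report a short length.

```cpp
#include <cstddef>
#include <string>

// Hypothetical sink standing in for Serial: it accepts at most `room`
// more bytes and reports how many it actually took.
struct LimitedSink {
  std::size_t room;
  explicit LimitedSink(std::size_t r) : room(r) {}
  std::size_t print(const std::string& s) {
    const std::size_t n = s.size() < room ? s.size() : room;
    room -= n;
    return n;
  }
};

// True only if print() reports exactly the number of bytes the caller
// expected to go out.
template <typename Out, typename Str>
bool printChecked(Out& out, const Str& s, std::size_t expected) {
  return out.print(s) == expected;
}

// True only if the string really grew by one character.
template <typename Str>
bool appendChecked(Str& s, char c) {
  const std::size_t before = s.length();
  s += c;
  return s.length() == before + 1;
}
```

In the combo sketch below, the equivalent of `printChecked` would compare `xx` against 56 (the 55-character text plus the newline), and `appendChecked` corresponds to watching whether `yy` keeps increasing.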

current combo sketch - much faster at 115200 baud:
Code:
// https://forum.pjrc.com/threads/62859-String-class-warning-on-Arduino-forum?p=251753&viewfull=1#post251753
String bigAssString;
int i;
unsigned int count = 0;
void setup() {
  Serial.begin(115200);
  while ( !Serial);
}
void loop() {
  int xx,yy;
  Serial.print("loop");
  Serial.println(count++);
  String string = F("The extreme aversion to dynamic memory allocation stems");// from mission critical work, where life and limb are involved - proving dynamic memory allocation is suitably bounded is a hard problem theoretically, proving static memory fits into the space available is simple and mechanically checkable. You only have to read about failures and near failures of embedded coding for spacecraft to appreciate the extreme cost of getting anything wrong.  Also you have to consider data that's used by an ISR - there are hazards involved when using free() with volatile pointers.  Once you get to larger systems with more memory you have a chance of monitoring memory allocation in real time and warn about impending problems before they bite.  I expect that for 99+% of cases using \"small\" Strings, it works well and creates cleaner code with less bugs.  On the other hand, it's easy to come up with cases that inadvertently fail. For example, you have 2K of heap space, so you think it's OK to use a 1K String. Then you write: Serial.print(string + \"\n\") and the program crashes. OK, I didn't code that up :-).\n");
  xx = Serial.print(string + "\n");
  Serial.print( "long text printed bytes ==");
  Serial.println( xx);
  bigAssString += i++;
  yy=bigAssString.length();
  if ( !(i % 20) )
    bigAssString += '\n';
  else
    bigAssString += ',';

  Serial.print("\tbigAssString = :: length()=");
  Serial.println( yy);
  xx = Serial.println(bigAssString);
  Serial.print( "\tbAS print size==");
  Serial.println( xx);
  Serial.print( "\tRAM freeMemory() == ");
  Serial.println(freeMemory());
}

#ifdef __arm__
// should use unistd.h to define sbrk but Due causes a conflict
extern "C" char* sbrk(int incr);
#else  // __ARM__
extern char *__brkval;
#endif  // __arm__

int freeMemory() {
  char top;
#ifdef __arm__
  return &top - reinterpret_cast<char*>(sbrk(0));
#elif defined(CORE_TEENSY) || (ARDUINO > 103 && ARDUINO != 151)
  return &top - __brkval;
#else  // __arm__
  return __brkval ? &top - __brkval : &top - __malloc_heap_start;
#endif  // __arm__
}

Output snippet showing where "long text printed bytes" goes to ZERO (==0), and then where bigAssString won't grow any further:
Code:
loop386
The extreme aversion to dynamic memory allocation stems
long text printed bytes ==56
	bigAssString = :: length()=1437
0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19
20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39
...
360,361,362,363,364,365,366,367,368,369,370,371,372,373,374,375,376,377,378,379
380,381,382,383,384,385,386,
	bAS print size==1440
	RAM freeMemory() == 219
loop387
long text printed bytes ==0
	bigAssString = :: length()=1441
0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19
20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39
40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59
60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79
80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99
100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119
120,121,122,123,124,125,126,127,128,129,130,131,132,133,134,135,136,137,138,139
140,141,142,143,144,145,146,147,148,149,150,151,152,153,154,155,156,157,158,159
160,161,162,163,164,165,166,167,168,169,170,171,172,173,174,175,176,177,178,179
180,181,182,183,184,185,186,187,188,189,190,191,192,193,194,195,196,197,198,199
200,201,202,203,204,205,206,207,208,209,210,211,212,213,214,215,216,217,218,219
220,221,222,223,224,225,226,227,228,229,230,231,232,233,234,235,236,237,238,239
240,241,242,243,244,245,246,247,248,249,250,251,252,253,254,255,256,257,258,259
260,261,262,263,264,265,266,267,268,269,270,271,272,273,274,275,276,277,278,279
280,281,282,283,284,285,286,287,288,289,290,291,292,293,294,295,296,297,298,299
300,301,302,303,304,305,306,307,308,309,310,311,312,313,314,315,316,317,318,319
320,321,322,323,324,325,326,327,328,329,330,331,332,333,334,335,336,337,338,339
340,341,342,343,344,345,346,347,348,349,350,351,352,353,354,355,356,357,358,359
360,361,362,363,364,365,366,367,368,369,370,371,372,373,374,375,376,377,378,379
380,381,382,383,384,385,386,387,
	bAS print size==1444
	RAM freeMemory() == 215

...

loop401

long text printed bytes ==1
	bigAssString = :: length()=1494
0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19
20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39
40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59
60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79
80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99
100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119
120,121,122,123,124,125,126,127,128,129,130,131,132,133,134,135,136,137,138,139
140,141,142,143,144,145,146,147,148,149,150,151,152,153,154,155,156,157,158,159
160,161,162,163,164,165,166,167,168,169,170,171,172,173,174,175,176,177,178,179
180,181,182,183,184,185,186,187,188,189,190,191,192,193,194,195,196,197,198,199
200,201,202,203,204,205,206,207,208,209,210,211,212,213,214,215,216,217,218,219
220,221,222,223,224,225,226,227,228,229,230,231,232,233,234,235,236,237,238,239
240,241,242,243,244,245,246,247,248,249,250,251,252,253,254,255,256,257,258,259
260,261,262,263,264,265,266,267,268,269,270,271,272,273,274,275,276,277,278,279
280,281,282,283,284,285,286,287,288,289,290,291,292,293,294,295,296,297,298,299
300,301,302,303,304,305,306,307,308,309,310,311,312,313,314,315,316,317,318,319
320,321,322,323,324,325,326,327,328,329,330,331,332,333,334,335,336,337,338,339
340,341,342,343,344,345,346,347,348,349,350,351,352,353,354,355,356,357,358,359
360,361,362,363,364,365,366,367,368,369,370,371,372,373,374,375,376,377,378,379
380,381,382,383,384,385,386,387,388,389,390,391,392,393,394,395,396,397,398,399
400,,
	bAS print size==1497
	RAM freeMemory() == 162

...

loop2790

long text printed bytes ==1
	bigAssString = :: length()=1495
0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19
20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39
40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59
60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79
80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99
100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119
120,121,122,123,124,125,126,127,128,129,130,131,132,133,134,135,136,137,138,139
140,141,142,143,144,145,146,147,148,149,150,151,152,153,154,155,156,157,158,159
160,161,162,163,164,165,166,167,168,169,170,171,172,173,174,175,176,177,178,179
180,181,182,183,184,185,186,187,188,189,190,191,192,193,194,195,196,197,198,199
200,201,202,203,204,205,206,207,208,209,210,211,212,213,214,215,216,217,218,219
220,221,222,223,224,225,226,227,228,229,230,231,232,233,234,235,236,237,238,239
240,241,242,243,244,245,246,247,248,249,250,251,252,253,254,255,256,257,258,259
260,261,262,263,264,265,266,267,268,269,270,271,272,273,274,275,276,277,278,279
280,281,282,283,284,285,286,287,288,289,290,291,292,293,294,295,296,297,298,299
300,301,302,303,304,305,306,307,308,309,310,311,312,313,314,315,316,317,318,319
320,321,322,323,324,325,326,327,328,329,330,331,332,333,334,335,336,337,338,339
340,341,342,343,344,345,346,347,348,349,350,351,352,353,354,355,356,357,358,359
360,361,362,363,364,365,366,367,368,369,370,371,372,373,374,375,376,377,378,379
380,381,382,383,384,385,386,387,388,389,390,391,392,393,394,395,396,397,398,399
400,,
	bAS print size==1497
	RAM freeMemory() == 162

 
Interesting that the MISRA C coding standard doesn't allow the use of dynamic memory allocation. I believe that JPL/NASA does the same.
 