T_3.6 Strange conflict between sd card reader and uarts.

Status
Not open for further replies.

bkw

Member
Teensy 3.6
SdFat 1.0.7 - using SdFatSdioEX
Serial1-Serial6
Examples->SdFat->TeensySdioDemo

I hit a strange issue where:
* Only if using Serial1 - Serial6, not the built in usb Serial
* Only if using SdFatSdioEX, not SdFatSdio

After doing sd.begin() or file.open(), I can no longer send more than 64 bytes to SerialN.
On the 65th cumulative byte of any form of Serial1.print(), println(), write() etc., in any combination, the Teensy locks up.

I found that if I avoid writing 64 bytes until a file.write(), then after the file.write() the uart is fixed and I can once again write as much as I want to it.
I found that file.sync() also clears the problem.

Both file.write() and file.sync() require a file to be opened though, but the problem also happens after sd.begin() and no file has been opened yet.
I found that sd.cacheClear() also clears the problem, and doesn't need any file opened.

So in the end the best work-around I found is to either not use SdFatSdioEX (use SdFatSdio instead),
or issue a sd.cacheClear() immediately after any sd.begin() or file.open().

Except there is still something missing, because I hit this problem in code that started out as a copy of TeensySdioDemo.

To show the problem cleanly without a lot of other junk, I made a minimal new sketch from scratch, that adhered to all the rules I described, and it does NOT exhibit the problem!

But if I take a fresh copy of TeensySdioDemo and make only the minimum changes to it (global replace Serial to Serial1, and add some Serial1.println() so it exceeds 64 bytes in the problem spots, and that's IT, no other changes), the problem exists and is easily shown, and it's just as easily shown that adding sd.cacheClear() fixes it.

A little more detail here:
https://github.com/greiman/SdFat/issues/112

Attached are two sketches

The clean/minimal one which SHOULD be failing in its current state, according to all I just said, but is working fine.
(two sd.cacheClear() are commented out, and so the next stest() after sd.begin() should lock up the Teensy, but it isn't)

A copy of TeensySdioDemo with the minimum changes to show the problem and the work-around.
If you comment out either sdEx.cacheClear() in this one, it locks up in the next stest().

To see just the changed parts, search for "BKW" in this one. The only changes are:
* global replace Serial to Serial1 - that should be perfectly "legal"; it doesn't functionally change anything as long as you actually have something hooked to the serial port. (I am using a Schmartboard CMOS-RS232 shifter, powered from one of the Teensy's 3.3V pins, not Vin or VUSB. The serial connection is working perfectly outside of this specific conflict.)
* Insert some more Serial1.println() in various places to show that it's harmless some places and causes a lockup in other places.

Any ideas what the heck is going on?
 

Attachments

  • TeensySdioDemo_uart_conflict.zip
    2.6 KB · Views: 56
  • TeensySdFatEX_uart_bug.zip
    1.9 KB · Views: 55
I took the 'TeensySdFatEX_uart_bug.zip ' and ran that sketch - I didn't connect anything to Serial1 (except I have a loopback wire from prior testing)

I duplicated some prints to Serial, moved the 'Press Enter' prompt to Serial, and modified the .available and .read calls so it loops and repeats on Enter from USB Serial - and I don't see anything hanging.

Serial1 will dump bytes without anything connected - I see I have Serial1 Rx jumpered to Tx - so anything going out comes back in.

So I wrote this xSer() and call it between Serial1 print lines, and it still seems to be running 'otherwise' unchanged, so I don't see a problem:
Code:
void xSer() {
  while ( Serial1.available() ) {
    Serial.print( (char) Serial1.read() );
  }
}

In the case I'm not reading all the Serial1 output in time it will just go to the bitbucket and overflow.

I have a T_3.6 with an SD card installed - using TD 1.42 on IDE 1.8.5 "Using library SdFat at version 1.0.7 in folder: t:\tcode\libraries\SdFat "

I just pulled off the loopback wire and the edits to get enter from Serial have it working still:
Code:
  Serial.print("Press [Enter] to proceed: ");
  xSer();
  while (!Serial.available());
  c = Serial.read();
  SERIAL_PORT.println(c);
  if (c != '\r') return;
  SERIAL_PORT.println();

If you replace the "Press [Enter] …" lines with the ones above, remove the Serial1 wires, and just jumper Pin 0 to Pin 1, does it still hang?
 
TeensySdFatEX_uart_bug.zip is the one that does not exhibit the problem.

TeensySdioDemo_uart_conflict.zip exhibits the problem.

It's just that TeensySdioDemo also has a bunch of other junk that normally you would not want if you're trying to demonstrate a problem, or debug it. Too many variables.
So I stripped away everything except the minimum factors I described: uart serial port, SdFatSdioEX, file.write() just a single byte. We don't care about all the performance counters (unless they somehow turn out to be part of the problem). But after doing that, the end result doesn't have the original problem any more. So, it's a mystery. Like, does yield() somehow cause it just by existing in the file?

So, since I failed to create a proper minimal clean problem demo, the next best I can do to remove variables is take stock example code that is the same for everyone, and presumably fairly well vetted, and then change that as little as possible. If the entire program can't be simple, then at least my delta from a known program can be simple.

So you have to use TeensySdioDemo_uart_conflict.zip to see the problem. You have to choose "1" to see the problem, because choice 2 uses SdFatSdio, which doesn't have the problem.
 
Updated TeensySdioDemo_uart_conflict.zip

The one in the original post wasn't finished and doesn't even compile, sorry.
Initially I wanted to make it easier to switch serial ports, so I was going to put a #define everywhere in place of "Serial" or "Serial1" etc,
but then I decided that it was more important to avoid changing the original reference code as much as possible to avoid even the risk or appearance of introducing variables.
So I went back and made everything static "Serial1.println()" etc, and missed some.

So use this one.

As-shipped, everything works, because the work-around is in place and enabled.
To see the problem, just comment out either of the two sdEx.cacheClear(), and choose option 1 at run time.
 

Attachments

  • TeensySdioDemo_uart_conflict.zip
    2.7 KB · Views: 53
I just took the one with 'bug' in the title …

Pulled the other one down to look at … it is missing … #define SERIAL_PORT Serial1

I was anxious to try this to see if it was hard faulting the CPU and I'm working on a debug library to catch those. That and I was just looking at another logger issue with Serial1 input from a GPS.

I made similar edits to ignore Serial1 setup - just connected Rx to Tx and (generously) added the xSer() code to display what came in on Serial1, so that it is used and shown.

I did not catch a fault - but there was a SERIOUS delay after some part of this with option 2 'SdFatSdio':

Type '1' for SdFatSdioEX or '2' for SdFatSdio

2: 0123456789012345678901234567890123456789012345678901234567890123456789
2: Wrote more than 64 bytes to SERIAL_PORT.
2: If you can see this line, then SERIAL_PORT is still go4: 0123456789012345678901234567890123456789012345678901234567890123456789
4: Wrote more than 64 bytes to SERIAL_PORT.
4: If you can see this line, then SERIAL_PORT is still good.


size,write,read
bytes,KB/sec,KB/s261.92,1199.31
1024,596.75,2301.77
2048,error: read failed
SD errorCode: 0X31,0X100001

Somewhere around where the BOLD text starts it STOPPED for some time, at the point where I thought I should have seen a FAULT. Looking back, it then continued: after about 20 seconds the rest of the BOLD text was shown, with no fault registered, but it shows there was a READ error.

Indeed OPTION 1 completes in a timely fashion. Only Option 2 has this issue - and I ran that the other day IIRC with no problem as an unaltered sketch.

I just restarted the Teensy and Option 1 completed - Option 2 hung about 20 secs in the same place and then completed with:
_PORT is still good.


size,write,read
bytes,KB/sec,KB/sec
279.99,1225.56
1024,559.60,2304.94
2048,error: data check

Just loaded and ran the option 1 and 2 with no Serial1 I/o and as noted they both work.

There is some interference - a DELAY and then a read failure of some sort is detected - the code completes and is not faulting the processor.

As I was running I see an alternate was posted - but I can see something wrong in this non-PJRC library - where Serial1 usage ( even routed back to the same Teensy Serial1 not a foreign device ) - causes some interference in that library operation … it seems.
 
I just removed sketch yield() and it completed without delay or errors.

Putting the xSer(); in yield pushed the Serial1 to Serial - but ended up with an error:
2048,1178.42,error: read failed
SD errorCode: 0X31,0X100001

I replaced system yield with an empty and put the included code under foog():
Code:
// Replace "weak" system yield() function.
void yield() {}
void foog() {
  // Only count cardBusy time.
// ...

This then worked without read error - so it seems using Serial1 causes something in that included yield() to be trouble, it isn't anything essential in the PJRC yield() code.

On my system it completes with error after a delay in output.
The delay goes away and it completes without error using the yield below that transfers Serial1 to USB - so it doesn't seem to be the Serial1 processing:
Code:
void yield() {
  xSer();
}

I commented out the call to micros() and it wasn't that.

Commenting out the call to sdBusy() also didn't stop the read error showing.

Compiling FASTEST ( just seeing Paul's comment on 1.43b2 ) doesn't stop the read error.
 
Working more minimal program to show problem.

Lots of comments to explain everything but the actual code is much more stripped down than TeensySdioDemo, and DOES show the problem.

Now it looks like the culprit is maybe just the yield() function from TeensySdioDemo.
 

Attachments

  • Teensy_SdFatEX_yield_uart_bug.zip
    2.3 KB · Views: 53
This is probably a bit redundant for you defrag, we were working at the same time haha, looks like you already got in about as far as I have by now.
Option 2 is always very slow for me, but I read other people say it's normal so I didn't suspect it.
Option 2 is never pausing in the middle of a serial output like that for me, only during the first write test with 512 byte blocks, which is expected and not actually paused, just busy.
I don't know what you're hitting to get that.
 
greiman says it's the yield() from the demo. Just remove it. yield() is called from within the system's own Serial and Serial1-6 code, and it just happens that Serial doesn't hit the problem because it has a larger buffer and is a lot faster, so it rarely ever gets to the point where it would call yield(). I still don't see exactly what about the demo yield() causes any problem, nor why cacheClear() fixes it, but I guess we don't care. The official word is just delete yield(). It's in the demo but you don't want it in a normal program.
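
For anyone who wants to see why the trouble only starts at the 65th byte, here is a rough desktop model of the buffer-full behaviour described in this post. It is only a sketch and NOT the Teensy core code: ModelUart, kTxBufferSize, yieldHook and yieldCallsAfter are made-up names, the 64-byte size is assumed from the lockup threshold reported in this thread, and the "hardware" is simulated by freeing one slot per wait. The point it shows is that the writer never calls the yield hook until the transmit buffer is full.

```cpp
#include <cstddef>
#include <functional>

// Assumed buffer size, taken from the 64-byte threshold seen in this thread.
constexpr std::size_t kTxBufferSize = 64;

// Toy model of a buffered UART transmitter (hypothetical, not Teensy core code).
struct ModelUart {
  std::size_t queued = 0;           // bytes waiting in the transmit buffer
  std::size_t yieldCalls = 0;       // how many times write() had to wait
  std::function<void()> yieldHook;  // stands in for the sketch's yield()

  // Simulate the hardware sending up to n bytes out of the buffer.
  void drain(std::size_t n) { queued = (n > queued) ? 0 : queued - n; }

  // Queue one byte. If the buffer is full, call the yield hook while
  // waiting for a slot to free up.
  void write(char) {
    while (queued >= kTxBufferSize) {
      ++yieldCalls;
      if (yieldHook) yieldHook();
      drain(1);  // model only: one byte leaves the buffer during the wait
    }
    ++queued;
  }
};

// Write `count` bytes back to back (no draining in between) and report
// how many times the yield hook would have been called.
std::size_t yieldCallsAfter(std::size_t count) {
  ModelUart uart;
  for (std::size_t i = 0; i < count; ++i) uart.write('x');
  return uart.yieldCalls;
}
```

In this model yieldCallsAfter(64) is 0 and yieldCallsAfter(65) is 1, so a yield hook that misbehaves would first be reached on the 65th byte. That is consistent with the symptom reported here, but it says nothing about what the demo's yield() actually does wrong.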

And indeed I don't have any problem since doing that. I can hammer both Serial1 and the sdcard and do so while using SdFatSdioEX and it's all solid.
 
Indeed that yield code was just for stats and demonstration and doesn't do anything functional for the SD processing.

I do see yield() is embedded in Serial# some rare times in output code to buy time for remote receiver to pick up the chars.
 