Forum Rule: Always post complete source code & details to reproduce any issue!
Page 1 of 3 1 2 3 LastLast
Results 1 to 25 of 70

Thread: BUG in arm_dcache_delete

  1. #1
    Senior Member
    Join Date
    Apr 2014
    Location
    Germany
    Posts
    9,277

    BUG in arm_dcache_delete

    Code:
    // Delete data from the cache, without touching memory
    //
    // Normally arm_dcache_delete() is used before receiving data via
    // DMA or from bus-master peripherals which write to memory.  You
    // want to delete anything the cache may have stored, so your next
    // read is certain to access the physical memory.
    __attribute__((always_inline, unused))
    static inline void arm_dcache_delete(void *addr, uint32_t size)
    {
        uint32_t location = (uint32_t)addr & 0xFFFFFFE0;
        uint32_t end_addr = (uint32_t)addr + size;
        asm volatile("": : :"memory");
        asm("dsb");
        do {
            SCB_CACHE_DCIMVAC = location;
            location += 32;
        } while (location < end_addr);
        asm("dsb");
        asm("isb");
    }
    Had a crash.. and had to debug some code...
    The marked red line leads to using a start adress that may be lower than the adress requested.
    So, data got deleted, unintentionally, and my program crashed.

    + you should use the "&" for the end_addr, where it is missing..

    Please fix that.

    It's a killer for any mem that got allocated with malloc(), and kills the heap management.
    Last edited by Frank B; 09-02-2021 at 09:33 PM.

  2. #2
    Senior Member
    Join Date
    Apr 2014
    Location
    Germany
    Posts
    9,277
    Attention, same bug is in arm_dcache_flush_delete() and arm_dcache_flush().
    In arm_dcache_flush() it may not lead to crash, but it's wrong, too.

  3. #3
    Senior Member+ KurtE's Avatar
    Join Date
    Jan 2014
    Posts
    9,560
    @frank/@Paul - I am curious if there is some documentation (RM or??) on the SCB_CACHE_DCIMVAC
    I know from source code that it is: #define SCB_CACHE_DCIMVAC (*(volatile uint32_t *)0xE000EF5C)

    My guess is says to delete that 32 byte page from the cache without first writing it's contents out to memory...

    So again pardon this random rambling... But this time hopefully short

    But what I wonder is:
    a) If I set this memory location to a value whose bottom 5 bits is not 0... What does it do.. Still 32 bytes starting at that location, the rest of that logical page starting at that offset? error?
    b) If you use the flush/delete does your code work?
    c) If you are running into this issue, than I assume one can run into similar issue at the end... That is if the addr+len does not end at the end of some 32 byte boundary, the rest of that boundary is deleted as well?

    Again Interesting problem.

  4. #4
    Senior Member
    Join Date
    May 2015
    Location
    USA
    Posts
    1,070
    I believe that it needs 32 byte alignment. If so, the code is correct, but arm_dcache_delete() should only be called after a flush or with a 32 byte aligned address pointing to a buffer that is a multiple of 32 bytes long. Making the bug a lack of a warning comment.

    Wouldn't it be better to use the equivalent CMSIS-Core routines (which do document the alignment requirement)?

  5. #5
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    15,074
    Indeed with just the code as a reference seeing in he while();:
    Code:
     location += 32;
    Along with the start addr location manipulation suggests the code needs 32 byte alignment.

    So starting up to 31 bytes early and ending to include the end addr from size requested, plus perhaps 31 bytes, would be enforcing that and perhaps preventing a failure to function, if not a fault.

  6. #6
    Senior Member
    Join Date
    Apr 2014
    Location
    Germany
    Posts
    9,277
    Of course it works with aligned data.
    The problem is unaligend data.
    Malloc returns unaligend data.
    Yes you can do workarounds in your code.
    3rd party code does not have Teensy workarounds.

    So why not just fix the bug?
    Last edited by Frank B; 09-03-2021 at 05:14 AM.

  7. #7
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    15,074
    If it only works on 32 bytes - as @luni said the fix is to document the behavior.

    Caller must assure any data area of interest is 32 byte aligned. Allocate extra and self align as needed.

  8. #8
    Senior Member
    Join Date
    Apr 2014
    Location
    Germany
    Posts
    9,277
    and yes, a workaround means, that if you want to malloc a single byte (malloc (1)) you have to malloc 32
    Not your code only, 3rd party code, too.

    It must be a joke.
    "You can't use malloc() without manual alignment of the returned pointer afterwards, if you use the Teensy provided cache management".
    I'm not seeing this anywhere ...

  9. #9
    Senior Member
    Join Date
    Apr 2014
    Location
    Germany
    Posts
    9,277
    @all: I think you don't see the real problem. I thought it was easy....
    Ok.
    Invested more time and wrote a short demo for you:
    Code:
    void setup() {
     Serial.begin(8600);
     if (CrashReport) {
      Serial.println(CrashReport);
      CrashReport.clear();
     }
     void *p1 = malloc(100);
     void *p2 = malloc(100);
     arm_dcache_delete(p2, 100);
    
     int *p3 = malloc(100); //<- yes i know there is no nullptr-check. but trust me, teensy 4 has 300 bytes free. really!
     Serial.printf("%x %x %x", (intptr_t) p1, (intptr_t) p2, (intptr_t) p3);
     *p3 = 7;
    }
    
    void loop() {}
    Run this.


    Hey, you even make digitalWrite(4700, 1) work in a way that it does not crash. But for malloc, its ok..?

    Code:
    CrashReport:
      A problem occurred at (system time) 6:55:58
      Code was executing from address 0x100
      CFSR: 82
        (DACCVIOL) Data Access Violation
        (MMARVALID) Accessed Address: 0x0 (nullptr)
          Check code at 0x100 - very likely a bug!
          Run "addr2line -e mysketch.ino.elf 0x100" for filename & line number.
      Temperature inside the chip was 38.17 C
      Startup CPU clock speed is 600MHz
      Reboot was caused by auto reboot after fault or bad interrupt detected
    Perhaps we can just remove all the dumb checks that make Arduino slow. - i.e. from digitalWrite - it would be way faster without it!
    We just have to document that digitalWrite(4700,1) crashes.Genious!
    There are hundrets of such checks. A tremendous speedup.
    We should tell this others. All the checks for buffer overflows and nullpointers are not needed too! Best would be, to check nothing. We don't need faster PCs.

    And hey, it saves us 3 minutes not to edit the buggy lines in the Teensy core. A simple "+ 32" and a "&" in the next line. Think a bit longer about arm_dcache_flush_delete() and arm_dcache_flush() ;-)

    Back to serious: There is no reason not to fix it.
    Last edited by Frank B; 09-03-2021 at 06:44 AM. Reason: Typos

  10. #10
    Senior Member
    Join Date
    Apr 2014
    Location
    Germany
    Posts
    9,277
    + turn the while{} do .. while into a while {}.. ;-)

  11. #11
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    15,074
    @all: I think you don't see the real problem. I thought it was easy....
    No Frank - until the posted sketch it was not clear what was happening. Even then added prints needed to show where it is Crash'ing for sure.

    From reading it seemed what was wrong was the boundary extension was 'Destroying' memory contents unexpectedly - and that was causing the Cra.

    Code:
    ONE:: 20203068 202030d0 63
    TWO:: 20203068 202030d0 63
    CrashReport:
      A problem occurred at (system time) 23:49:51
      Code was executing from address 0x26FC
      CFSR: 82
    	(DACCVIOL) Data Access Violation
    	(MMARVALID) Accessed Address: 0x1690AB14
      Temperature inside the chip was 30.58 C
      Startup CPU clock speed is 600MHz
      Reboot was caused by auto reboot after fault or bad interrupt detected
    
    ONE:: 20203068 202030d0 63
    TWO:: 20203068 202030d0 63
    From this version:
    Code:
    void setup() {
      Serial.begin(8600);
      if (CrashReport) {
        Serial.println(CrashReport);
        CrashReport.clear();
      }
      void *p1 = malloc(100);
      void *p2 = malloc(100);
      int *p3 = 99;
      Serial.printf("ONE:: %x %x %x\n", (intptr_t) p1, (intptr_t) p2, (intptr_t) p3);
      delay(100);
      arm_dcache_delete(p2, 100);
      Serial.printf("TWO:: %x %x %x\n", (intptr_t) p1, (intptr_t) p2, (intptr_t) p3);
      delay(100);
    
      p3 = malloc(100); //<- yes i know there is no nullptr-check. but trust me, teensy 4 has 300 bytes free. really!
      Serial.printf("THREE:: %x %x %x\n", (intptr_t) p1, (intptr_t) p2, (intptr_t) p3);
      delay(100);
    
      Serial.printf("%x %x %x", (intptr_t) p1, (intptr_t) p2, (intptr_t) p3);
      *p3 = 7;
    }
    
    void loop() {}

  12. #12
    Senior Member
    Join Date
    Apr 2014
    Location
    Germany
    Posts
    1,623
    as @luni said the fix is to document the behavior.
    Disclaimer: If I'm not a sleepwalker somebody else must have said that :-)

  13. #13
    Senior Member
    Join Date
    Apr 2014
    Location
    Germany
    Posts
    9,277
    Quote Originally Posted by defragster View Post
    From this version:
    Code:
    void setup() {
      Serial.begin(8600);
      if (CrashReport) {
        Serial.println(CrashReport);
        CrashReport.clear();
      }
      void *p1 = malloc(100);
      void *p2 = malloc(100);
      int *p3 = 99;
    [...]
    }
    
    void loop() {}
    You don't want to do that.

  14. #14
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    15,074
    Quote Originally Posted by Frank B View Post
    You don't want to do that.
    Just setting the pointer to a known value? I didn't write anything there?

    Same result:
    Code:
    
    
    
    Code:
    ONE:: 20203068 202030d0 0
    TWO:: 20203068 202030d0 0
    CrashReport:
      A problem occurred at (system time) 0:3:4
      Code was executing from address 0x26FC
      CFSR: 82
    	(DACCVIOL) Data Access Violation
    	(MMARVALID) Accessed Address: 0x1790AB14
      Temperature inside the chip was 29.34 C
      Startup CPU clock speed is 600MHz
      Reboot was caused by auto reboot after fault or bad interrupt detected
    
    ONE:: 20203068 202030d0 0
    TWO:: 20203068 202030d0 0

  15. #15
    Senior Member
    Join Date
    Apr 2014
    Location
    Germany
    Posts
    9,277
    Hm. re-thinking about the other both, arm_dcache_flush_delete() and arm_dcache_flush(), i'd probably leave them as they are, as they do not destroy things and it's wanted that everything gets flushed down to RAM.
    Edit: oh, well....wait.. they may flush a cache line too less. right? (or am I'm thinking wrong..)

    @Tim: To exactly what address points p3 at this line? ;-)
    Edit: It crashes due to an other reason.
    Last edited by Frank B; 09-03-2021 at 07:28 AM. Reason: Can->may

  16. #16
    Senior Member
    Join Date
    Apr 2014
    Location
    Germany
    Posts
    9,277
    Tim, the problem is, in my example, that p3 is not valid.
    But there is no obvious reason for that.

    The heap is corrupted.
    Remove the cache delete and everything is fine.
    A miracle for most users.
    I'm pretty sure you can make it crash without ever writing anything to the adresses where the pointers point to.

  17. #17
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    15,074
    Quote Originally Posted by Frank B View Post
    Hm. re-thinking about the other both, arm_dcache_flush_delete() and arm_dcache_flush(), i'd probably leave them as they are, as they do not destroy things and it's wanted that everything gets flushed down to RAM.

    Yes, was confused how the write before delete cases caused any problem.

    @Tim: To exactly what address points p3 at this line? ;-)
    Edit: It crashes due to an other reason.
    I removed the assignment - that gave the pointer an initial value on declaration {int *p3 = 99;} - and the fault was the same. Faulted at the same code address. That code addr2line gives ??:?

    Assuming the problem is that stepping backwards for the cache boundary start is messing up the linked list for malloc() mgmt for *p2 and allocating for *p3 causes a crash trying to follow the list?

    No Crash with ::
    Code:
      char *p2 = (char *)malloc(100);
    // ...
      arm_dcache_delete((void *)((uint32_t)&p2[0x1e]&0xfffffe0), 100); // bad delete size and location - though it shows saving the bytes before *p2 lets it work
    If that is the case then malloc() just needs to give alloc *ptr's on 32 byte boundaries (for 1062's ??), or anyone smart enough to use arm_dcache_delete() needs to have a documentation note to enlighten them.

    @all : I saw a note about better addr2line from a newer version?

  18. #18
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    15,074
    @Frank - the crash is in Allocating for *p3 because the link list gets corrupted by the delete. It never gets past the printf("THREE::..." >> See p#14 output
    >> Without that HACK EDIT of the start ADDRESS

    Here is the full code for above:
    Code:
    void setup() {
      Serial.begin(8600);
      if (CrashReport) {
        Serial.println(CrashReport);
        CrashReport.clear();
      }
      void *p1 = malloc(100);
      char *p2 = (char *)malloc(100);
      int *p3;
      Serial.printf("ONE:: %x %x %x\n", (intptr_t) p1, (intptr_t) p2, (intptr_t) p3);
      delay(100);
      arm_dcache_delete((void *)((uint32_t)&p2[0x1e]&0xfffffe0), 100);
      Serial.printf("TWO:: %x %x %x\n", (intptr_t) p1, (intptr_t) p2, (intptr_t) p3);
      delay(100);
    
      p3 = malloc(100); //<- yes i know there is no nullptr-check. but trust me, teensy 4 has 300 bytes free. really!
      Serial.printf("THREE:: %x %x %x\n", (intptr_t) p1, (intptr_t) p2, (intptr_t) p3);
      delay(100);
    
      Serial.printf("%x %x %x", (intptr_t) p1, (intptr_t) p2, (intptr_t) p3);
      delay(100);
      *p3 = 7;
    }
    
    void loop() {}

  19. #19
    Senior Member
    Join Date
    Apr 2014
    Location
    Germany
    Posts
    9,277
    Quote Originally Posted by Frank B View Post
    I'm pretty sure you can make it crash without ever writing anything to the adresses where the pointers point to.
    Indeed :-)
    Code:
    void setup() {
     Serial.begin(8600);
     if (CrashReport) {
      Serial.println(CrashReport);
      CrashReport.clear();
     }
     void *p1 = malloc(100);
     void *p2 = malloc(100);
     Serial.printf("%x %x\n", (intptr_t) p1, (intptr_t) p2);
     arm_dcache_delete(p1, 100);
     Serial.printf("%x %x\n", (intptr_t) p1, (intptr_t) p2); //p2 is still valid. of course.
     if (p2) free(p2); //<- crash for no obvious reason. Note: This free() can be  be hundrets of lines later. Try to debug such things.
    }
    
    void loop() {}
    I think this example is better than my first one. And p2 is checked, it's perfectly valid.
    Last edited by Frank B; 09-03-2021 at 09:25 AM.

  20. #20
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    15,074
    As written *p2 was trashed as soon as the dcache_delete started to write - because it had to back up from the pointer to the prior 32B boundary.

    malloc() stores REF data just before the returned pointer. That ref data helps manage memory, and when trashed nothing works.

    After delete, try viewing at address of *p1 through *p2 allocation with :: github.com/KurtE/MemoryHexDump

    Zzzzz's

  21. #21
    Senior Member
    Join Date
    Apr 2014
    Location
    Germany
    Posts
    9,277
    Exactly the heap is corrupted.
    Try to debug that, without a debugger, when the free() happens 500 lines later. Probably in a library.

    Its a serious problem.
    It wasn't that easy to find the issue when I debugged my code with printf..
    Last edited by Frank B; 09-03-2021 at 08:37 AM.

  22. #22
    Senior Member
    Join Date
    Apr 2014
    Location
    Germany
    Posts
    9,277
    Now, sleep well, take a look again tomorrow, think of all kinds of checks.. null pointers.. digitalWrite..and tell me again that it is an documentation issue.
    Ok. You can declare any bug as doc. Issue..

  23. #23
    Senior Member
    Join Date
    Apr 2014
    Location
    Germany
    Posts
    9,277
    One last goodie:
    Code:
    void setup() {
     memcpy(0,0,0); // or memset, or...
     Serial.println("Bubu");
    }
    
    void loop() {}
    What happens.. and why?

  24. #24
    Senior Member
    Join Date
    May 2015
    Location
    USA
    Posts
    1,070
    No doubt that arm_dcache_delete() is under-documented and dangerous when misused with non 32 byte aligned buffers. Do you have a better version?

    Off the top of my head:

    if (alignment or size isn't right)
    arm_dcache_flush_delete()
    else
    ...

  25. #25
    Senior Member
    Join Date
    Apr 2014
    Location
    Germany
    Posts
    9,277
    Yes mentioned above.
    Start: add +32
    End: add the "and"
    Use a while loop instead of do..while.
    This may delete less than wanted - but that is OK.

    For the other functions: just make sure they flush enough cache lines.

    I could post new code for all three.
    But.. no.. I've lost interest to make PRs for some time. I'm just a user. I can just say please "fix it" :-)
    Will use the time to play guitar.
    Last edited by Frank B; 09-03-2021 at 01:51 PM.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •