
Thread: Is it possible to bitshift using ARM native bitshift operators?

  1. #1

    Is it possible to bitshift using ARM native bitshift operators?

    Looking through the user's guide for the Teensy 3.x processor, I found three interesting instructions that could possibly speed up bit-shift operations in time-critical code. The three ARM assembler instructions are:

    LSR = Logical Shift Right
    ASR = Arithmetic Shift Right
    LSL = Logical Shift Left

    excluding...
    RORS
    ROR
    RRXS
    RRX

    The three are quite interesting, but I am a little taken aback because all the examples reference uint32_t and int32_t only. What about uint8_t and int8_t, specifically with the "LSL" and "LSR" instructions? Do any experts in the group have experience with these instructions, and how should/could I make assembler code compatible with Arduino or C++? There's "__asm__", but how do I write a function that takes a value determining how far to shift, similar to 1<<3 and 1>>3?

    Thanks
    BLMinTenn

  2. #2
    Senior Member+ Frank B
    Join Date
    Apr 2014
    Location
    Germany NRW
    Posts
    5,565
    Normally the compiler does that automatically,
    and there are many more operations that can use shifts.
    Also, 8-bit is often slower than 32-bit,
    so if you want fast code, use 32-bit uint32_t, not uint8_t.
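
    To illustrate the 8-bit versus 32-bit point, here is a minimal sketch of how the ordinary shift operators behave on narrow types. The helper names (shl8, shl32, shr8, sar8) are hypothetical and chosen only for this example. The remark about an extra instruction reflects what GCC typically emits for Cortex-M and is worth confirming for your own compiler version.

```cpp
#include <stdint.h>

// Hypothetical helpers, named only for illustration.

// 8-bit operand: promoted to int, shifted, then truncated back to 8 bits.
// The truncation typically costs an extra instruction (e.g. uxtb).
constexpr uint8_t shl8(uint8_t value, unsigned amount)
{
    return static_cast<uint8_t>(value << amount);
}

// 32-bit operand: matches the register width, so no truncation is needed.
constexpr uint32_t shl32(uint32_t value, unsigned amount)
{
    return value << amount;
}

// Unsigned right shift fills with zeros (LSR on ARM).
constexpr uint8_t shr8(uint8_t value, unsigned amount)
{
    return static_cast<uint8_t>(value >> amount);
}

// Signed right shift: GCC replicates the sign bit (ASR on ARM).
constexpr int8_t sar8(int8_t value, unsigned amount)
{
    return static_cast<int8_t>(value >> amount);
}
```

    So uint8_t and int8_t work with << and >> as expected. The bits shifted past bit 7 are simply dropped when the result is converted back to 8 bits.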

  3. #3
    Senior Member
    Join Date
    Nov 2012
    Posts
    1,068
    When you use the << or >> operators in C/C++ code, the compiler will use those shift instructions anyway. For example, in some test code I used the statement "r = s << 3;" and the compiler performed the shift in one assembler instruction, "lsls r3, r3, #3".
    The Teensy 3.x boards use a Cortex-M4 processor, which has a barrel shifter. This means that all shift instructions execute in one clock cycle. I don't see that there's much room for optimization here.
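
    For completeness, if you did want to name the instruction yourself, a GCC extended-asm wrapper might look roughly like the sketch below. The function names (lsl_asm, lsl_c) are hypothetical, the asm line assumes a Thumb-2 target such as the Cortex-M4, and it has not been verified on hardware here. The plain operator version should compile to the same single shift instruction.

```cpp
#include <stdint.h>

// Hypothetical wrapper: logical shift left by a run-time amount.
static inline uint32_t lsl_asm(uint32_t value, uint32_t amount)
{
#if defined(__arm__)
    uint32_t result;
    // GCC extended asm: %0 = result, %1 = value, %2 = amount (all registers).
    __asm__ ("lsl %0, %1, %2" : "=r" (result) : "r" (value), "r" (amount));
    return result;
#else
    // Fallback so the sketch still builds on non-ARM hosts.
    return value << amount;
#endif
}

// Plain C++ equivalent that the compiler can inline and constant-fold.
static inline uint32_t lsl_c(uint32_t value, uint32_t amount)
{
    return value << amount;
}
```

    Note that the asm version hides the operation from the optimizer, so a call like lsl_asm(1, 3) can no longer be folded into a constant at compile time.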

    What time critical code do you have in mind?

    Pete

  4. #4
    Senior Member PaulStoffregen
    Join Date
    Nov 2012
    Posts
    19,922
    Last time I checked, the compiler does indeed detect this as a rotate and implement it with 1 instruction:

    Code:
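    #include <stdint.h>

    // Note: despite the name, this rotates num right by r bits (ROR on ARM).
    // r must be in 1..31; r == 0 would shift left by 32, which is undefined.
    __attribute__((always_inline))
    static inline uint32_t lrotate(uint32_t num, uint32_t r)
    {
            return (num >> r) | (num << (32 - r));
    }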
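
    The expression above shifts by 32 when r is 0, and it rotates to the right. A minimal sketch of a variant that masks the counts, plus its left-rotating counterpart, could look like this. The names rotr32 and rotl32 are hypothetical. GCC and Clang generally recognize this masked idiom as a rotate as well, but that is worth checking on your compiler version.

```cpp
#include <stdint.h>

// Hypothetical names. Masking keeps every shift count in 0..31,
// so r == 0 (or r >= 32) never shifts by the full register width.
constexpr uint32_t rotr32(uint32_t num, uint32_t r)
{
    return (num >> (r & 31u)) | (num << ((32u - r) & 31u));
}

// Left rotate; the Cortex-M4 only has ROR, so the compiler would
// be expected to express this as a right rotate by (32 - r).
constexpr uint32_t rotl32(uint32_t num, uint32_t r)
{
    return (num << (r & 31u)) | (num >> ((32u - r) & 31u));
}
```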

  5. #5
    Junior Member
    Join Date
    Apr 2014
    Location
    Seattle, WA
    Posts
    12
    You can quickly see what different compilers will do, using Godbolt compiler explorer.
    Just type in a non-inline function and ignore any function-call overhead, like the return at the end (bx lr).

    https://godbolt.org/z/vPLCD-

  6. #6
    Senior Member
    Join Date
    Apr 2014
    Location
    Germany
    Posts
    422
    Yes, this is a really useful tool. To see what modern compilers can do (and how tricky speed optimizations can be), try this:

    https://godbolt.org/z/Gi9vSE

    Good old Gauss would probably smile in his grave :-) https://betterexplained.com/articles...bers-1-to-100/
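
    The linked code isn't reproduced in this thread, so the following is only an assumed stand-in suggested by the Gauss reference: a loop that sums 1 to n, next to the closed form that an optimizing compiler may substitute for it. Both function names are hypothetical, and the two agree only while n * (n + 1) fits in 32 bits.

```cpp
#include <stdint.h>

// Assumed stand-in for the linked example: sum the integers 1..n in a loop.
uint32_t sum_loop(uint32_t n)
{
    uint32_t total = 0;
    for (uint32_t i = 1; i <= n; ++i) {
        total += i;
    }
    return total;
}

// Gauss's closed form, n * (n + 1) / 2, with no loop at all.
uint32_t sum_closed(uint32_t n)
{
    return n * (n + 1u) / 2u;
}
```

    Pasting both into Compiler Explorer and comparing the generated assembly at different optimization levels shows whether a given compiler removes the loop.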

  7. #7
    Senior Member+ Frank B
    Join Date
    Apr 2014
    Location
    Germany NRW
    Posts
    5,565
    The -O3 -ffast-math is not even needed; just -Os or even -O1 is enough.
