Help converting arm assembler for teensy 3.6

Status
Not open for further replies.

neutron7

Well-known member
I have been using this sine approximation by Jasper Vijn as a function, which works well. http://www.coranac.com/2009/07/sines/

Code:
//fast sine approximation by Jasper Vijn
/// A sine approximation via a third-order approx.
/// @param x    Angle (with 2^15 units/circle)
/// @return     Sine value (Q12)
static inline int32_t SIN3(int32_t x) __attribute__((always_inline, unused));
static inline int32_t SIN3(int32_t x)
{
    // S(x) = x * ( (3<<p) - (x*x>>r) ) >> s
    // n : Q-pos for quarter circle             13
    // A : Q-pos for output                     12
    // p : Q-pos for parentheses intermediate   15
    // r = 2n-p                                 11
    // s = n+p+1-A                              17

    static const int qN = 13, qA= 12, qP= 15, qR= 2*qN-qP, qS= qN+qP+1-qA;

    x= x<<(30-qN);          // shift to full s32 range (Q13->Q30)

    if( (x^(x<<1)) < 0)     // test for quadrant 1 or 2
        x= (1<<31) - x;

    x= x>>(30-qN);

    return x * ( (3<<qP) - (x*x>>qR) ) >> qS;
}
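
The shift constants in the comments can be checked by hand. This is only a sketch of the arithmetic. It assumes the underlying third-order approximation is S_3(z) = z(3 - z^2)/2 with z = 1 at a quarter circle, a form inferred from the integer expression in the code rather than quoted from the article:

```latex
\begin{aligned}
z &= \frac{x}{2^{n}}, \qquad S_3(z) = \tfrac{1}{2}\, z \,(3 - z^{2}) \\
2^{A}\, S_3(z) &= x \cdot 2^{A-1-n} \left( 3 - \frac{x^{2}}{2^{2n}} \right)
  = \frac{x \left( 3 \cdot 2^{p} - x^{2} / 2^{\,2n-p} \right)}{2^{\,n+p+1-A}} \\
r &= 2n - p = 26 - 15 = 11, \qquad s = n + p + 1 - A = 13 + 15 + 1 - 12 = 17
\end{aligned}
```

With n = 13, a quarter circle is x = 8192, and the expression evaluates to 8192 * (98304 - 32768) >> 17 = 4096, which is 1.0 in Q12.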

I would like to use the assembler version, since it is supposed to be even faster than a lookup table with linear interpolation.
I have no idea how to translate it into code that can be used in the Arduino IDE. I do use a couple of other assembler functions that I adapted from the PJRC audio library, but they look nothing like this!

Code:
@ ARM assembly version, using n=13, p=15, A=12

@ A sine approximation via a third-order approx.
@ @param r0   Angle (with 2^15 units/circle)
@ @return     Sine value (Q12)
    .arm
    .align
    .global isin_S3a
isin_S3a:
    mov     r0, r0, lsl #(30-13)
    teq     r0, r0, lsl #1
    rsbmi   r0, r0, #1<<31
    mov     r0, r0, asr #(30-13)
    mul     r1, r0, r0
    mov     r1, r1, asr #11
    rsb     r1, r1, #3<<15
    mul     r0, r1, r0
    mov     r0, r0, asr #17
    bx      lr

Cheers to anyone who can help me learn how to translate it!
 
The C version is fine. GCC generates code almost identical to the assembly version (it uses 'eors' instead of 'teq').

The C version can be inlined, so it's potentially faster than the assembly version.
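
To make the correspondence concrete, here is a minimal sketch that restates the assembly listing in C, one statement per instruction. The name `isin_s3_steps` is hypothetical, and the code assumes GCC's behaviour for converting out-of-range unsigned values to `int32_t` and for right-shifting negative values (arithmetic shift). Note also that the listing is written for ARM mode (`.arm`), while the Cortex-M4 on the Teensy 3.6 executes Thumb-2 only, so it would need rewriting before it could be assembled for that board.

```c
#include <stdint.h>

/* Hypothetical helper: mirrors isin_S3a step by step.
 * Input:  angle with 2^15 units per circle.
 * Output: sine value in Q12 (4096 == 1.0). */
static inline int32_t isin_s3_steps(int32_t angle)
{
    /* mov r0, r0, lsl #(30-13)
     * Done on an unsigned value so the shift cannot overflow a signed int. */
    uint32_t u = (uint32_t)angle << (30 - 13);

    /* teq r0, r0, lsl #1   : sign bit of (u ^ (u << 1)) is set in quadrant 1 or 2
     * rsbmi r0, r0, #1<<31 : if so, mirror the angle: u = (1 << 31) - u */
    if ((u ^ (u << 1)) & 0x80000000u)
        u = 0x80000000u - u;

    /* mov r0, r0, asr #(30-13)
     * Assumes two's-complement conversion and arithmetic right shift (GCC). */
    int32_t x = (int32_t)u >> (30 - 13);

    /* mul r1, r0, r0 ; mov r1, r1, asr #11 */
    int32_t t = (x * x) >> 11;

    /* rsb r1, r1, #3<<15 */
    t = (3 << 15) - t;

    /* mul r0, r1, r0 ; mov r0, r0, asr #17 */
    return (x * t) >> 17;
}
```

Calling `isin_s3_steps(8192)` (a quarter circle) should give 4096, and `isin_s3_steps(24576)` (three quarters) should give -4096. Because the function is `static inline`, the compiler is free to fold it into the caller, which is the advantage over a separately assembled routine mentioned above.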
 
Thanks, I will leave it as is for now then.
 