Gaussian Distributed Random Numbers for Teensy 4.1

hexdro

New member
Hey all!

I am currently implementing a particle filter on Teensy 4.1 and am looking for the best way to generate Gaussian-distributed random numbers.

Any advice would be appreciated.

Thanks a lot!
 
I would use the 50 high bits of Xorshift64* and the Marsaglia polar method.

Xorshift64* is a very fast pseudorandom number generator, and when you only use the 32 high bits, the results pass all tests in the BigCrush randomness test suite. Here, I'm using 50 high bits, so the results might not pass the MatrixRank test, but that is not a big deal: neither does Mersenne Twister (MT19937), which is the most commonly used PRNG in simulations.

Code:
#include <stdint.h>
#include <math.h>

typedef struct {
    float   x;
    float   y;
} vec2f;

/* The Xorshift64* state.  Must be initialized to a non-zero value! */
uint64_t  prng_state;

vec2f  prng_normal(void)
{
    vec2f     v;
    float     s;
    uint64_t  x;

    do {
        /* Generate next 64-bit number */
        x = prng_state;
        x ^= x >> 12;
        x ^= x << 25;
        x ^= x >> 27;
        prng_state = x;
        x *= UINT64_C(2685821657736338717);

        /* Extract 25 highest bits, and construct a float, -1.0f .. +1.0f from it. */
        v.x = ((float)(x >> 39) - 16777216.0f) / 16777216.0f;

        /* Extract bits 15..39, and construct another float. */
        v.y = ((float)((x >> 14) & 0x1FFFFFF) - 16777216.0f) / 16777216.0f;

        /* The sum of their squarest must be greater than zero but smaller than one. */
        s = v.x*v.x + v.y*v.y;
    } while (s >= 1.0f || s == 0.0f);

    /* Scale by the exponential random variable, divided by the square root of s. */
    s = sqrtf(-2.0f * logf(s) / s);
    v.x *= s;
    v.y *= s;

    return v;
}
In setup(), you need to initialize prng_state to a nonzero unsigned 64-bit value; the Teensy 4.1 hardware random number generator is optimal for this; just make sure you use a loop to set the prng_state from the hardware random number generator in case it happens to produce a zero.

Then, each call to prng_normal() will return a pair of floats in a vec2f structure. Each of them is independent, and normally distributed with mean 0 and variance and standard deviation 1 (µ=0, ρ=1). It's just that the Marsaglia polar method produces them in pairs.

The assembly this generates for Cortex-M4 (at godbolt.org) is quite compact; and in addition to that, each pair produced only requires a single sqrtf() and a single logf() call. So, I do believe this is suitable for use on Teensy 4.1.

I ran the exact same code on x86-64 (Linux) for a billion points, collecting the ones between -5.0 and +5.0 into 1001 bins (0.01 bin width) to get a histogram of the generated distributions. Both histograms, and the combined histogram, all matched the expected exp(-0.5*x*x)/sqrt(2*pi) distribution. In other words, I claim this is good enough for any use, including scientific simulations, where precision is limited to that of the float type anyway.
 
Back
Top