You are using an out of date browser. It may not display this or other websites correctly.

You should upgrade or use an alternative browser.

You should upgrade or use an alternative browser.

- Thread starter hexdro
- Start date

Xorshift64* is a very fast pseudorandom number generator, and when you only use the 32 high bits, the results pass all tests in the BigCrush randomness test suite. Here, I'm using 50 high bits, so the results might not pass the MatrixRank test, but that is not a big deal: neither does Mersenne Twister (MT19937), which is the most commonly used PRNG in simulations.

Code:

```
#include <stdint.h>
#include <math.h>
typedef struct {
float x;
float y;
} vec2f;
/* The Xorshift64* state. Must be initialized to a non-zero value! */
uint64_t prng_state;
vec2f prng_normal(void)
{
vec2f v;
float s;
uint64_t x;
do {
/* Generate next 64-bit number */
x = prng_state;
x ^= x >> 12;
x ^= x << 25;
x ^= x >> 27;
prng_state = x;
x *= UINT64_C(2685821657736338717);
/* Extract 25 highest bits, and construct a float, -1.0f .. +1.0f from it. */
v.x = ((float)(x >> 39) - 16777216.0f) / 16777216.0f;
/* Extract bits 15..39, and construct another float. */
v.y = ((float)((x >> 14) & 0x1FFFFFF) - 16777216.0f) / 16777216.0f;
/* The sum of their squarest must be greater than zero but smaller than one. */
s = v.x*v.x + v.y*v.y;
} while (s >= 1.0f || s == 0.0f);
/* Scale by the exponential random variable, divided by the square root of s. */
s = sqrtf(-2.0f * logf(s) / s);
v.x *= s;
v.y *= s;
return v;
}
```

Then, each call to prng_normal() will return a pair of floats in a vec2f structure. Each of them is independent, and normally distributed with mean 0 and variance and standard deviation 1 (µ=0, ρ=1). It's just that the Marsaglia polar method produces them in pairs.

The assembly this generates for Cortex-M4 (at godbolt.org) is quite compact; and in addition to that, each pair produced only requires a single sqrtf() and a single logf() call. So, I do believe this is suitable for use on Teensy 4.1.

I ran the exact same code on x86-64 (Linux) for a billion points, collecting the ones between -5.0 and +5.0 into 1001 bins (0.01 bin width) to get a histogram of the generated distributions. Both histograms, and the combined histogram, all matched the expected exp(-0.5*x*x)/sqrt(2*pi) distribution. In other words, I claim this is good enough for any use, including scientific simulations, where precision is limited to that of the float type anyway.