// Eight alternating Xoshiro256+ states benefitting from SIMD.
// Code from: http://prng.di.unimi.it/xoshiro256plus.c
// Speed comparison: http://prng.di.unimi.it/#speed
// where it is presented as the fastest generator in the whole benchmark.
// Note: it fails PractRand BRank at 512 MiB.
// The original code notes that its lowest three bits fail linearity tests,
// but claims that the generator is faster that way.
// I kept it because it lets us compare the SIMD version (which is the fastest).
typedef struct prng_state prng_state;
// Writes a 64-bit little-endian integer to dst.
static inline void
// buf's size must be a multiple of 8 bytes.
static inline void
// The original code has this to say:
//
// > The state must be seeded so that it is not everywhere zero. If you have
// > a 64-bit seed, we suggest to seed a splitmix64 generator and use its
// > output to fill s.
//
// We force at least one bit of the state to be set.
// That seems fair, since SHISHUA can handle any seed, including the zero seed
// and the seed that minimizes the number of bits set in the state after
// initialization. Ignoring bad splitmix64 gammas would hide severe seeding
// faults.
void