Hopefully I didn't screw anything up; it's been a while since I've done a release. There are a couple of new PRNGs in there: very small, very fast, very lightweight ones, but they require fast multiplication. mrsf64 & mrsf32 are so small you can keep their state in just 2 registers. mrc64 & mrc32 & mrc16 are a little bigger and a little slower, but slightly higher quality. Notably, I did not include new random access PRNGs like I had intended to. I did incorporate code for random access to LFSRs / GFSRs...
v0.96 is up
rarns64::raw64() bug
Different results when 1 byte is skipped
Big bug in PractRand 0.94 and 0.95?
On further thought about that naming scheme, I may rename them to randi32, randi32_fast, randi64, and, if I add it, randi64_fast. I think I avoided those kinds of names to maximize the difference relative to raw8/raw16/raw32/raw64, but at this point that doesn't matter anymore. To be clear, randli_fast / randi64_fast could be added; I'd just have to #ifdef different codepaths for different compilers, and in rare cases (mostly 32-bit compilers) randli_fast, aka randi64_fast, would end up super slow...
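For anyone wondering what "#ifdef different codepaths for different compilers" might look like, here is a rough sketch. It is not the library's code. It assumes (the post does not say so) that a 64-bit "_fast" variant would scale a raw 64-bit output into a range by taking the upper half of a 64x64 -> 128 bit product. The names mulhi64 and bounded64_fast_sketch are made up for illustration. The last branch is the portable fallback, which is the path likely to be slow on 32-bit compilers.

```cpp
#include <cstdint>
#if defined(_MSC_VER) && defined(_M_X64)
#include <intrin.h>  // _umul128 (MSVC, x64 only)
#endif

// Hypothetical helper: upper 64 bits of the 128-bit product a * b.
static inline std::uint64_t mulhi64(std::uint64_t a, std::uint64_t b) {
#if defined(__SIZEOF_INT128__)
    // GCC / Clang on 64-bit targets: use the built-in 128-bit integer type.
    return static_cast<std::uint64_t>((static_cast<unsigned __int128>(a) * b) >> 64);
#elif defined(_MSC_VER) && defined(_M_X64)
    // MSVC on x64: the intrinsic returns the low half and stores the high half.
    std::uint64_t hi;
    _umul128(a, b, &hi);
    return hi;
#else
    // Portable fallback: four 32x32 -> 64 partial products plus carry handling.
    const std::uint64_t a_lo = a & 0xFFFFFFFFu, a_hi = a >> 32;
    const std::uint64_t b_lo = b & 0xFFFFFFFFu, b_hi = b >> 32;
    const std::uint64_t ll = a_lo * b_lo;
    const std::uint64_t lh = a_lo * b_hi;
    const std::uint64_t hl = a_hi * b_lo;
    const std::uint64_t hh = a_hi * b_hi;
    // Sum everything that lands in bits 32..63; its carry feeds the high half.
    const std::uint64_t mid = (ll >> 32) + (lh & 0xFFFFFFFFu) + (hl & 0xFFFFFFFFu);
    return hh + (lh >> 32) + (hl >> 32) + (mid >> 32);
#endif
}

// Hypothetical "fast" bounded integer: maps a raw 64-bit value into [0, max).
// Slightly non-uniform unless max divides 2^64, which is the usual trade-off
// for skipping a rejection loop.
static inline std::uint64_t bounded64_fast_sketch(std::uint64_t raw, std::uint64_t max) {
    return mulhi64(raw, max);
}
```

Usage would be something like bounded64_fast_sketch(rng.raw64(), 10), assuming raw64() returns a uniformly distributed 64-bit value.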
I think I heard that most PC processors always calculate the upper mult, even when not asked... not sure for what % that is true, but it does seem to be quite fast on most CPUs. Sounds unlikely to me. All of this is limited by my understanding, but this is how I understand the general picture. Integer wide multiplication almost always involves a different opcode than regular low multiplication. That makes it easy to save power by not calculating the upper half for low integer multiplication, which is...
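To make the low vs. wide distinction concrete at the source level, here is a small sketch using 32-bit operands. The function and struct names are made up for illustration. Which instructions actually get emitted depends on the compiler and target, so treat this as a picture of the two operations rather than a statement about any particular CPU.

```cpp
#include <cstdint>

// Low multiplication: 32 x 32 -> 32. The upper half of the product is never
// requested; the result simply wraps modulo 2^32.
static inline std::uint32_t mul_low32(std::uint32_t a, std::uint32_t b) {
    return a * b;
}

// Illustrative container for both halves of a wide product.
struct Wide32 {
    std::uint32_t lo;  // same bits that mul_low32 returns
    std::uint32_t hi;  // the "upper mult"
};

// Wide multiplication: 32 x 32 -> 64, split into its two halves.
static inline Wide32 mul_wide32(std::uint32_t a, std::uint32_t b) {
    const std::uint64_t p = static_cast<std::uint64_t>(a) * b;
    return { static_cast<std::uint32_t>(p), static_cast<std::uint32_t>(p >> 32) };
}
```

The low half is identical either way, so the only open question is whether the hardware spends time and power on the upper half when nobody asked for it.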