- 2025-03-29

I've had a little time and energy in the last few months to work on PractRand. One of the results:

In trying to have a really broad range of PRNGs tested with PractRand and other test suites for comparison, I noticed that PR0.95 was badly underperforming on non-power-of-2 combined LCGs - they were failing TestU01 SmallCrush in a few seconds while passing multiple multiple days of PractRand testing with clean bills of health. This was due to the heavy use of birthday tests in TestU01.

So I've gone back to my birthday test code and cleaned it up and refined it in to BirthdaySystematic2, which I'm actually happy enough with stick it in the core test set, though as one of the low-priority tests (along with BRank, mod3n, and TMFn). Compared to other birthday tests it offers a few advantages:

  • it can skip over adjustable portions so that I can adjust how much testing gets done per megabyte of data, important for calibrating its priority in PractRand. In the core test it's probably going in set to only test 1 out of every 64 kilobytes, that's little enough it doesn't impact performance significantly.

  • the size of integers it operates on is adjusted intelligently at testing time, so it's always acts as something at least vaguely close to an optimal configuration for the amount of data it has.

  • once it maxes out what it can do as a straight birthday test on a reasonable amount of memory (currently set to 64 MB), it starts filtering its input in an attempt to act like a birthday test on bigger values in a larger sortbuffer. This filtering seems effective - if I give it a 67 bit np2clcg, with the filtering enabled it finds bias after 2 terabytes (at full strength, not the throttled version going in to the core tests), whereas if I just have it keep combining 64 bit birthday tests it still hasn't found major bias yet despite 8 times as much data tested.

  • p-values seem pretty decent even without going through the calibration system. It makes an effort to avoid excessively quantized p-values on all but the shortest test runs, calculates p-values accurately for small tests, and on larger tests intelligently switches between normal-approximations for common p-values and conservative asymptotic bounds for extreme cases.

Of course, it's still just a birthday test - I don't think it's capable of anything useful on non-linear PRNGs (I think it's basically detecting pieces of linear lattice structures), and even on purely linear PRNGs it's often outperformed by other tests - the only ones it really shines against were already avoided for being slow. Still, it significantly cuts down on the number of linear PRNGs that were managing to perform significantly better than they should against PractRand.