Issues with new buffered stdin
statistical tests & psuedo- random number generators (RNGs, PRNGs)
Status: Beta
Brought to you by:
orz
Hi,
the buffered stdin needs some minor fixes. For example, when supplying 129 MiB of data, RNG_test is crashing:
RNG_output jsf64 135266304 | RNG_test stdin
RNG = RNG_stdin, PractRand version 0.92, seed = 0x18ab74f8
test set = normal, folding = standard(unknown format)
rng=RNG_stdin, seed=0x18ab74f8
length= 128 megabytes (2^27 bytes), time= 2.3 seconds
Test Name Raw Processed Evaluation
[Low1/8]FPF-14+6/16:cross R= -2.1 p =1-1.1e-3 unusual
...and 150 test result(s) without anomalies
Segmentation fault (core dumped)
Looking into RNG_from_name.h I see several issues:
Word read() { Word rv = *(pos++); if (pos == end) refill(); return rv; }
Thanks!
Jirka
It's an error that the ended flag wasn't checked properly, and likewise that it wasn't declaring an error when fread returns zero. I'll fix that for 0.93. For reference, the way it was intended to go (and thus the fix I'm using now that you've pointed it out that I didn't finish that code):
std::size_t n = std::fread(&buffer[0], sizeof(Word), BUFF_SIZE, stdin);
if (n < BUFF_SIZE) ended = true;
gets changed to this:
if (ended) read_failed();
std::size_t n = std::fread(&buffer[0], sizeof(Word), BUFF_SIZE, stdin);
if (!n) read_failed();
if (n < BUFF_SIZE) ended = true;
The other stuff...
This doesn't make sense. fread can't return any number of bytes - it can only return multiples of the unit specified in its parameter. If some uses RNG_stdin, and it buffers up a single byte, and they then call RNG_stdin::raw32(), it will call _stdin_reader<uint8>::read() 4 times, and the 2nd one should call read_failed() (though admittedly, it's not being called in 0.92...), aborting the program. That is the desired behavior. </uint8>
The only way that can happen is if fread returned no data... which should have (but didn't in .92) already called read_failed() at that point.
What error do you see occurring other than an EOF? I'm having a hard time picturing an error reading from stdin that I would care enough about to want more information.
Process the remaining data? RNG_stdin* is already returning the remaining data until the last integer it can read from input. If you mean the tests... they can't conceive of data in units smaller than 1 kilobyte, and I see no reason to change that. If RNG_test were to pay more attention to what goes on inside of RNG_stdin it could generate an interim result after the last full kilobyte of input and display that... but I don't really see the point. If the user has a special file size they like to handle, they can specify that with a command line option and get an interim result at the end. I don't see any reason to make RNG_test care about file input to make it automatically generate interim results at the maximum possible point for limited data scenarios.
Hi,
I agree with your fix. Some my points were silly, I should not post any messages so late in the night. Sorry about that.
I have implemented your fix with two minor modifications. Please consider adding those to the 0.93 release.
I really like when application informs me why it has ended. It especially holds for long running programs like RNG_test which can run for days and weeks. You really want these programs to be verbose about the reasons why they have ended.
Fixed RNG_from_name.h is included.Please consider it for 0.93 release.
Thanks
Jirka
Thanks. I'll stick it in, with minor modifications:
changed identitation to 1 tab per indentation level, to match PractRand code
initialized blocks_read to zero on construction
removed line feed from end of perror string
changed "B" to "bytes" in short length message
I'm assuming the only changes were to _stdin_reader.
Thanks including it!
And yes, I have only modified _stdin_reader class.