SDCC currently uses the 32- LCG given as an example in the C standard as implementation for rand().
However, such a PRNG is not the state of the art. There are generators that give output with better statistical properties. Another problem is that is requires a (for our targets costly) 32-bit multiplication.
We should look into alternatives, e.g. xorshifts generators.
Philipp
Philip you may want to have a look at this:
http://z88dk.cvs.sourceforge.net/viewvc/z88dk/z88dk/libsrc/_DEVELOPMENT/stdlib/z80/random/
We're using a Marsaglia xor generator (asm_random_uniform_xor_32.asm) for rand() which is very fast on the z80 and will be fast on any small uP. A link to Marsaglia's paper is included in the program source. This rand() is likely hundreds of times faster than sdcc's current LCG generator.
The random number generator is kept separate from rand() so that library code can generate random numbers without impacting on the program's seed. The two other random number generators there are mainly for people writing games as they return 8-bit random numbers and since this is not related to the C standard this may not be applicable to sdcc but the CMWC is a surprisingly compact, fast and very high quality generator.
Here are two suggestions implemented in C:
The first one uses the lower bits of a generator from Marsaglia's paper, the second one uses a cheaper generator of similar style. The first has a period of 2^32-1, while the second has just 2^16-1. The second one is faster. Code size for the mcs51, ds390, hc08 and s08 ports is smaller for the second one, while code size is about the same for both in the stm8, z80, z180, r2k, r3ka, tlcs90 ports.
Both generators are much faster than the current one, and don't introduce a dependency on long multiplication.
Philipp
In #9838, I implemented an xorshift 32 with a = 10, b = 9, c = 25. rand() returns the lower 15 bits.
Philipp
Hi Philip,
the article lists a few possible shift arguments. Among them are some which include byte-wise shifts (by 8, 16 or 24) which should be good for compact code.
Actually I had high hopes for the one with a shift of 24 (because that leaves only one byte to be handled for that shift), but actually it worked out worse than the 10 9 25 version you have chosen:)
So I tried some others and did a crude check for the code size covering some ports:
This (for current SDCC code generation) would suggest 8 7 23 as sweet spot.
Thanks, that was fun (distracting me from Lexica on Android:)
sorry, meant to say 8 9 23. (8 7 23 actually is worse)
Thanks for the data. I only looked into code size for z80 (where 8 9 23 is also better) and stm8 (where it is worse).
My goal was to allow generation and writing of efficient code.
Philipp
Are you saying that a shift by a literal 7 is not implemented as a shift by 1 in the opposite direction?
Depends.
A shift by 7 could be done as one rotate followed by one and with mask 0x01. But sometimes there are problems with rotate and and:
In the backends (stm8 and z80-related) there are two different code generation functions for each shift direction:
genLeftShiftLiteral / genRightShiftLiteral: Generates efficent code using everything the processor has to offer (shift, roate, swap, mult, and-masking, etc). Handles only shifts with 8 or 16 bit results by literals.
genLeftShift / genRightShift: Generates correct code using mostly shift instructions and moving byte around. Handles shifty by unknown amounts and shifts with 32-bit or wider results.
Philipp
Edit: hl has left shift and rotate only (via add hl, hl and adc hl, hl), but not right shift and rotate.
Last edit: Philipp Klaus Krause 2017-02-11