From: Christophe Rhodes <csr21@ca...>  2012-07-09 17:21:26

Jeffrey Cunningham <jeffrey@...> writes:

> Isn't an "extra density of floats around 0" by definition a
> non-uniform distribution?

Not in the sense that I mean. Or possibly "yes", but welcome to the world of floating point: because floating-point numbers are a discrete set, not a continuum, there is no such thing as a continuous uniform distribution using floats, only particular approximations.

Briefly, a floating-point number is represented by sign, mantissa and exponent. The sign is always +1 or −1. What remains is the mantissa, which is a representation of the bits after the decimal point in the binary number 1.xxxxxxxxxxxxxx..., and the exponent, which indicates the power to which 2 must be raised to get you the number you want. (I am eliding lots of details here; there are references where you can read them.) So for example, 0.5 is represented as the sign/mantissa/exponent triple (+1, 0, −1); 0.75 is (+1, 10000000..._2, −1); 0.875 is (+1, 11000000..._2, −1); and so on.

The point here is that the density of representable floating-point numbers /changes/ as the exponent changes: there are as many machine floats between 0.25 and 0.5 as there are between 0.5 and 1, and this isn't some kind of transfinite sleight of hand; this is a finite, countable set. So, near zero, there are more possible floats than there are near 1. Obviously, the RNG compensates for that, by having the possible floats near zero be generated with lower probability than those near one. But the point is that there is more than one way of performing that compensation: the appropriate probability mass could be distributed evenly over all possible floats within an evenly-spaced region, or a single representative in a region could be selected as an archetype (and if so, which representative?).
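To see the "same count per binade" claim concretely, here is a small Python sketch (not from the original mail) that counts representable IEEE-754 doubles in two adjacent binades by reinterpreting their bit patterns as integers:

```python
import struct

def bits(x: float) -> int:
    """Reinterpret a double's bytes as a 64-bit unsigned integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]

# Consecutive positive IEEE-754 doubles have consecutive bit patterns,
# so subtracting patterns counts the representable floats in a range.
in_quarter_half = bits(0.5) - bits(0.25)   # floats in [0.25, 0.5)
in_half_one     = bits(1.0) - bits(0.5)    # floats in [0.5, 1.0)

print(in_quarter_half == in_half_one)  # True: same count in each binade
print(in_half_one)                     # 2**52 per binade, for doubles
```

The same trick extended down toward zero shows the density growing: [0.25, 0.5) and [0.5, 1.0) each hold 2**52 doubles even though the second interval is twice as wide.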
SBCL's strategy for generating numbers between 0 and 1 isn't so utterly stupid as you seem to think; it makes one particular choice, by selecting floats between 1 and 2 (a region which does have a uniform density of representable floats), and then subtracting 1. This has the effect of emphasizing the probability of generating 0 compared with the floating-point numbers which are in fact representable in the region of 0, but that effect has potential virtues too (such as preserving the basic symmetry between the region near 0 and the region near 1 in the conceptual uniform distribution).

> I would think this is highly undesirable behavior in a uniform RNG and
> should be corrected.

Tell you what: please specify unambiguously, paying reference to the hardware representations, the behaviour of the RNG when given a single-float 1.0 argument, and justify why the specified behaviour is better than all other behaviours in all circumstances.

Cheers,

Christophe
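The "[1, 2) minus 1" strategy described above can be sketched at the bit level in Python (this is an illustration of the technique, not SBCL's actual implementation; the function name and use of `random.Random` are mine):

```python
import random
import struct

def uniform_via_1_2(rng: random.Random) -> float:
    """Illustrative sketch: fill the 52 mantissa bits of a double whose
    exponent pins it into [1.0, 2.0), then subtract 1.0."""
    mantissa = rng.getrandbits(52)
    # Sign bit 0 and biased exponent 1023 encode 1.mantissa * 2^0.
    pattern = (1023 << 52) | mantissa
    x = struct.unpack("<d", struct.pack("<Q", pattern))[0]  # in [1.0, 2.0)
    return x - 1.0   # exact: no rounding occurs in this subtraction

rng = random.Random(42)
samples = [uniform_via_1_2(rng) for _ in range(1000)]
assert all(0.0 <= s < 1.0 for s in samples)
# Every result is an exact multiple of 2**-52: the spacing is uniform,
# so the many extra floats representable close to 0 are never produced.
assert all(s * 2**52 == int(s * 2**52) for s in samples)
```

Note the trade-off Christophe describes: 0 itself is generated with the same probability 2⁻⁵² as every other grid point, which is far higher than it would receive if probability mass were spread over all representable floats near zero.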