On Sat, Apr 23, 2011 at 10:43 AM, Nghia Ho <nghiaho12@...> wrote:
> Hi all,
> I came across an old post in 2006 about RAND_MAX being 16bit only. I did a
> quick printf("%d\n", RAND_MAX) and I get 32767. I'm using the mingw that
> came with the latest CodeBlock for Windows as of writing, gcc version 4.4.1
I see the same value, 32767 (= 2^15 - 1), for RAND_MAX on a later (4.5.2)
mingw version of g++. (On a Linux version of g++ 4.4.5 I get a RAND_MAX of
2^31 - 1.)
A couple of comments:
First, for the library maintainers to provide an updated version of rand
that increases RAND_MAX from 2^15 - 1 to, say, 2^31 - 1, they can't,
for example, just change various 16-bit integers that appear in the
algorithm to 32-bit integers -- they would have to develop a whole new
algorithm.
Second, most implementations of rand() are not great anyway (some
in the past have been notoriously bad -- I'm not sure how the Windows
version ranks), so for work where the quality of results matters, you
don't want to use rand().
> What's the story there? It's giving me a lot of problems when using
> random_shuffle() for large data because it doesn't shuffle properly. I
> thought I was going nuts! I wrote a test program to verify the problem:
Well, you've found your problem: RAND_MAX is only 2^15 - 1. (And
you get the expected result on Linux, because there RAND_MAX is
2^31 - 1.)
> The program generates 100,000 data points and assigns values from 0 to
> 99,999. It then sub-samples 10% of this data, which we expect to have a
> uniform distribution if we plot a histogram of 10 bins.
> On Windows 7 I get:
> Random shuffle
> 0 - 360
Ouch. RAND_MAX is too small on Windows.
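To see why a 15-bit RAND_MAX hurts at this size, it helps to ask which
positions can be picked as swap partners at all. The sketch below assumes
that the library's random_shuffle picks each swap partner as rand() % n (a
common implementation, but an assumption on my part, not something I have
checked against the mingw sources). largest_reachable_index is a
hypothetical helper written for this illustration, not a library function.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: the largest index that "r % n" can ever produce
// when r is drawn from [0, rand_max] and the range holds n elements.
// Assumption: the shuffle picks its swap partners as rand() % n.
std::size_t largest_reachable_index(std::size_t n, std::size_t rand_max)
{
    assert(n > 0);
    // If n - 1 <= rand_max, every index up to n - 1 can occur;
    // otherwise r < n always holds, so r % n == r and rand_max is the cap.
    return (n - 1 < rand_max) ? n - 1 : rand_max;
}
```

With n = 100000, a RAND_MAX of 32767 gives 32767, so under that assumption
positions 32768 to 99999 are never chosen as swap partners. A RAND_MAX of
2^31 - 1 gives 99999, so every position is reachable.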
> Not a uniform distribution at all! But on Linux I get:
> Random shuffle
> 0 - 1033
Ahh... RAND_MAX is better on Linux.
> What we expect.
> So the question is, how do I get more than 16bit from rand() ? This seems
> like a serious flaw.
Well, in the example you gave, you're using C++, so follow Greg's
suggestion of using a better, more modern random number generator
from a C++ library. Greg suggested Boost, which is great, but if you're
not already using it, it's a little bit inconvenient because you have to deal
with about 100 MB of Boost bloat just to get a simple function like random
number generation.
It might be more convenient to use <tr1/random>, which almost certainly
came from Boost anyway (and is supported by my copy of mingw g++).
Or, you can turn on -std=c++0x, and use <random> from the new standard
(which is almost certainly the same as boost::random and tr1::random).
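As a minimal sketch of the <random> route, assuming a compiler that
accepts -std=c++0x or later: the function name sample_indices, its
parameters, and the choice of std::mt19937 are mine, picked for
illustration. The point is only that the range of the result is set by
the distribution, not by RAND_MAX.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Draw `count` indices uniformly from the closed interval [0, n - 1]
// using the Mersenne Twister engine from <random>.
// Name and parameters are illustrative, not from any library.
std::vector<int> sample_indices(int n, int count, unsigned seed)
{
    assert(n > 0 && count >= 0);
    std::mt19937 gen(seed);                            // 32-bit engine, independent of RAND_MAX
    std::uniform_int_distribution<int> dist(0, n - 1); // both bounds are inclusive
    std::vector<int> out;
    out.reserve(static_cast<std::size_t>(count));
    for (int i = 0; i < count; ++i)
        out.push_back(dist(gen));
    return out;
}
```

For example, sample_indices(100000, 1000, 42u) returns 1000 values in
[0, 99999], which a single call to a 15-bit rand() cannot cover.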
If the use case you actually care about is indeed random_shuffle, note that
random_shuffle takes a random number generator as an optional third
argument (a function object that, called with n, returns a value in
[0, n)), so you can plug in a better, longer-period generator with a
wider output range to get better results. You can either roll your own
(not recommended, except as an educational exercise) or build one on top
of <random>.
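Here is a sketch of the shuffle with a <random> engine plugged in,
together with the 10-bin histogram test from the original post. It uses
std::shuffle, the C++0x/C++11 counterpart that takes the engine object
directly, rather than the three-argument random_shuffle; that swap, the
function names, and the choice of std::mt19937 are my assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Fill a vector with 0 .. n-1 and shuffle it with a Mersenne Twister engine.
std::vector<int> shuffled_range(int n, unsigned seed)
{
    assert(n >= 0);
    std::vector<int> data(static_cast<std::size_t>(n));
    std::iota(data.begin(), data.end(), 0);       // 0, 1, ..., n-1
    std::mt19937 gen(seed);
    std::shuffle(data.begin(), data.end(), gen);  // engine passed directly
    return data;
}

// Histogram (10 equal-width bins over [0, n)) of the first 10% of `data`,
// mirroring the sub-sampling test described in the original post.
// Assumes data holds the values 0 .. n-1 in some order.
std::vector<int> histogram_of_first_tenth(const std::vector<int>& data)
{
    const std::size_t n = data.size();
    std::vector<int> bins(10, 0);
    for (std::size_t i = 0; i < n / 10; ++i) {
        const std::size_t b = static_cast<std::size_t>(data[i]) * 10 / n;
        ++bins[b];
    }
    return bins;
}
```

For 100,000 elements each bin should come out near 1000, as in the Linux
run quoted above, regardless of what RAND_MAX is on the platform.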
If you really need to use rand(), then you could build your own 32-bit
generator to plug into random_shuffle by combining two
values from rand(), but this is not recommended except as an
expedient, as it further degrades the quality of the random numbers.
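For completeness, a sketch of that expedient. It combines 15 bits from
each of two rand() calls into a 30-bit value (not a full 32 bits, since
15 bits per call is all a RAND_MAX of 32767 provides). The names rand30
and Rand30Index are hypothetical, and the result inherits whatever
weaknesses rand() has, plus a small modulo bias.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Combine the low 15 bits of two rand() calls into one 30-bit value.
// The two calls are made in separate statements so their order is fixed.
unsigned long rand30()
{
    const unsigned long hi = static_cast<unsigned long>(std::rand()) & 0x7fffUL;
    const unsigned long lo = static_cast<unsigned long>(std::rand()) & 0x7fffUL;
    return (hi << 15) | lo;
}

// Function object in the shape the three-argument random_shuffle expects:
// called with n, it returns an index in [0, n).
// Note: "% n" introduces a slight bias when n does not divide 2^30.
struct Rand30Index {
    std::ptrdiff_t operator()(std::ptrdiff_t n) const
    {
        assert(n > 0);
        return static_cast<std::ptrdiff_t>(rand30() % static_cast<unsigned long>(n));
    }
};
```

With a compiler that still provides random_shuffle, a named object can
be passed as the third argument, e.g. Rand30Index gen; followed by
std::random_shuffle(v.begin(), v.end(), gen);.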
Pseudo-random numbers are a little bit subtle. If you need high-quality
results, you need to use a high-quality random number generator, best
professionally written (such as those in <random>), and know a little bit
about what you are doing.
If you want to tell us a little more about your actual use case, we can
probably give you some pointers about safe ways to proceed.