## sbcl-devel

 [Sbcl-devel] random number generation in a tight loop From: Justin Grant - 2009-03-07 14:53:11 ```This prints the same random number 100 times in the tight loop : (dotimes (n 100) (print (random 100 (make-random-state t)))) It seems like (make-random-state t) is no more granular than one second. This will print 100 different numbers but it takes 100 seconds ! : (dotimes (n 100) (print (random 100 (make-random-state t))) (sleep 1)) btw - random number generation in a tight loop works as it should with clozure and allegro. -Justin p.s. - Also, why does the sleep function only accept seconds and not milliseconds ? Am I completely missing something here ? ```
 Re: [Sbcl-devel] random number generation in a tight loop From: Stas Boukarev - 2009-03-07 15:26:24 ```Justin Grant writes: > p.s. - Also, why does the sleep function only accept seconds and not > milliseconds ? Am I completely missing something here ? > (sleep 0.001) -- With Best Regards, Stas. ```
 Re: [Sbcl-devel] random number generation in a tight loop From: Attila Lendvai - 2009-03-07 15:30:38 ```> p.s. - Also, why does the sleep function only accept seconds and not > milliseconds ? Am I completely missing something here ? you probably didn't unlearn some wrong ideas yet... :) the unit of time measurement is seconds. (sleep 0.1) -- attila ```
 Re: [Sbcl-devel] random number generation in a tight loop From: Justin Grant - 2009-03-07 15:35:08 ```Thanks for clarifying (sleep ...) ;-) Back to the original question : The algorithm behind (make-random-state ...) seems tied to the current second. Why ? Can it be changed on sbcl ? How do other folks do fast random number generation on sbcl ? -Justin Zach Beane wrote: > On Sat, Mar 07, 2009 at 06:52:59AM -0800, Justin Grant wrote: >> p.s. - Also, why does the sleep function only accept seconds and not >> milliseconds ? Am I completely missing something here ? > > (sleep 0.001) > > Zach > ```
 Re: [Sbcl-devel] random number generation in a tight loop From: Gary King - 2009-03-07 15:39:46 ```cl-variates might meet your needs (it's cross platform but could use tuning for speed, etc.). HTH, On Mar 7, 2009, at 10:34 AM, Justin Grant wrote: > Thanks for clarifying (sleep ...) ;-) > > Back to the original question : > The algorithm behind (make-random-state ...) seems tied to the current > second. Why ? Can it be changed on sbcl ? > How do other folks do fast random number generation on sbcl ? > > -Justin > > > Zach Beane wrote: >> On Sat, Mar 07, 2009 at 06:52:59AM -0800, Justin Grant wrote: >>> p.s. - Also, why does the sleep function only accept seconds and not >>> milliseconds ? Am I completely missing something here ? >> >> (sleep 0.001) >> >> Zach >> > > ------------------------------------------------------------------------------ > Open Source Business Conference (OSBC), March 24-25, 2009, San > Francisco, CA > -OSBC tackles the biggest issue in open source: Open Sourcing the > Enterprise > -Strategies to boost innovation and cut costs with open source > participation > -Receive a \$600 discount off the registration fee with the source > code: SFAD > http://p.sf.net/sfu/XcvMzF8H > _______________________________________________ > Sbcl-devel mailing list > Sbcl-devel@... > https://lists.sourceforge.net/lists/listinfo/sbcl-devel -- Gary Warren King, metabang.com Cell: (413) 559 8738 Fax: (206) 338-4052 gwkkwg on Skype * garethsan on AIM * gwking on twitter ```
 Re: [Sbcl-devel] random number generation in a tight loop From: Christophe Rhodes - 2009-03-07 15:42:11 ```Justin Grant writes: > This prints the same random number 100 times in the tight loop : > (dotimes (n 100) (print (random 100 (make-random-state t)))) > > It seems like (make-random-state t) is no more granular than one second. Under what circumstances do you want to make many, many distinct random state objects within a very short timespan of each other? > This will print 100 different numbers but it takes 100 seconds ! : > (dotimes (n 100) (print (random 100 (make-random-state t))) (sleep 1)) > > btw - random number generation in a tight loop works as it should with > clozure and allegro. I would express random number generation in a tight loop as (let ((state (make-random-state t))) ; optional (dotimes (n 100) (print (random 100 state)))) rather than anything which generates random states in the middle of the loop. If this isn't adequate, perhaps because you have specific needs for your random numbers (say that they be guaranteed generated from some source of entropy, rather than from a PRNG) then you should say so. > p.s. - Also, why does the sleep function only accept seconds and not > milliseconds ? Am I completely missing something here ? That the seconds argument to sleep is a non-negative REAL not an INTEGER, or more generally, the documentation for the SLEEP function? Try (describe 'sleep) at the REPL. Best, Christophe ```
 Re: [Sbcl-devel] random number generation in a tight loop From: Justin Grant - 2009-03-07 15:53:36 ```Thanks for the advice. I'm curious, with sbcl, why does this do what's expected : (let ((state (make-random-state t))) ; optional (dotimes (n 100) (print (random 100 state)))) but this does not (unless the (sleep 1.0) call is included) ? : (dotimes (n 100) (print (random 100 (make-random-state t)))) (this works on clozure and allegro just fine) -Justin Christophe Rhodes wrote: > Justin Grant writes: > >> This prints the same random number 100 times in the tight loop : >> (dotimes (n 100) (print (random 100 (make-random-state t)))) >> >> It seems like (make-random-state t) is no more granular than one second. > > Under what circumstances do you want to make many, many distinct > random state objects within a very short timespan of each other? > >> This will print 100 different numbers but it takes 100 seconds ! : >> (dotimes (n 100) (print (random 100 (make-random-state t))) (sleep 1)) >> >> btw - random number generation in a tight loop works as it should with >> clozure and allegro. > > I would express random number generation in a tight loop as > (let ((state (make-random-state t))) ; optional > (dotimes (n 100) > (print (random 100 state)))) > rather than anything which generates random states in the middle of > the loop. If this isn't adequate, perhaps because you have specific > needs for your random numbers (say that they be guaranteed generated > from some source of entropy, rather than from a PRNG) then you should > say so. > >> p.s. - Also, why does the sleep function only accept seconds and not >> milliseconds ? Am I completely missing something here ? > > That the seconds argument to sleep is a non-negative REAL not an > INTEGER, or more generally, the documentation for the SLEEP function? > Try (describe 'sleep) at the REPL. > > Best, > > Christophe > ```
 Re: [Sbcl-devel] random number generation in a tight loop From: Christophe Rhodes - 2009-03-07 16:51:23 ```Justin Grant writes: > I'm curious, with sbcl, why does this do what's expected : > > (let ((state (make-random-state t))) ; optional > (dotimes (n 100) > (print (random 100 state)))) > > but this does not (unless the (sleep 1.0) call is included) ? : > > (dotimes (n 100) > (print (random 100 (make-random-state t)))) Because a RANDOM-STATE object represents a certain state of a pseudo-random number generator; the usual method of interacting with such a generator is to use it to draw multiple random numbers, one after the other. A good pseudo-random number generator will have sufficient internal state that it is difficult to predict the next number from the generator given the previous numbers (up to some limit: for example, the Mersenne Twister generator used in SBCL has period 2^19937 or so -- which should be enough for now). The RANDOM function uses the state in a RANDOM-STATE object to produce such a pseudo-random number, altering the state of RANDOM-STATE in the process, so that the next call to RANDOM will produce a different number. MAKE-RANDOM-STATE with a T argument generates a new random state which has been "randomly initialized by some means" (quoting from the CLHS page for MAKE-RANDOM-STATE). In SBCL, the 32-bit seed used to initialize the Mersenne Twister's random state in this case is the low 32 bits of the universal time. So if you create 100 new random states at exactly the same time (to within one second) you will generate 100 identical states; if you then draw one number from each of those states, you will draw the same number 100 times. > (this works on clozure and allegro just fine) That depends on your definition of "just fine". I ask you again: why are you generating 100 new random-state objects within such a short timeframe? Cheers, Christophe ```
 Re: [Sbcl-devel] random number generation in a tight loop From: Justin Grant - 2009-03-08 00:14:27 ```> In SBCL, the 32-bit seed used to > initialize the Mersenne Twister's random state in this case is the low > 32 bits of the universal time. So if you create 100 new random states > at exactly the same time (to within one second) you will generate 100 > identical states; if you then draw one number from each of those > states, you will draw the same number 100 times. > Ah thanks ! I get it. > why are you generating 100 new random-state objects within such a short > timeframe? > I'm simulating lots of dice being rolled. The results are being used for a solving algorithm I'm implementing for a game. -Justin ```
 Re: [Sbcl-devel] random number generation in a tight loop From: Sidney Markowitz - 2009-03-08 04:10:47 ```Justin Grant wrote, On 8/3/09 1:14 PM: > I'm simulating lots of dice being rolled. The results are being used for > a solving algorithm I'm implementing for a game. The Mersenne Twister is supposed to produce a stream of uncorrelated numbers. For any one simulation you should be able to create just one random-state object for the simulation and use it for all the dice, and different dice will not be correlated. If you want to be able to reproduce a run exactly you can use print to output the value of the random-state object and read to recreate it. ```
 Re: [Sbcl-devel] random number generation in a tight loop From: Sidney Markowitz - 2009-03-08 01:16:46 ```Christophe Rhodes wrote, On 8/3/09 5:51 AM: > That depends on your definition of "just fine". I ask you again: why > are you generating 100 new random-state objects within such a short > timeframe? I agree that Justin's test case makes little sense, but there are use cases for calling (make-random-state t) more than once within a second. If you combine two simulation programs into one, you might be running two independent random streams that are initialized within one second of each other. Or you might run multiple instances of the same simulation program on a cluster of machines. The latter one in particular seems like a reasonable use case in which the current implementation will cause problems. The situation is made worse by the fact that there is no portable way defined to initialize a random-state with your own seed that you can ensure is unique. So how would one run simulations in parallel on a cluster? Admittedly, the Common Lisp spec does not explicitly say that you get a different value every time you call (make-random-state t), but the hyperspec does say "that has been randomly initialized by some means". It is surprising behavior if things which have been independently randomly initialized consistently have the same value. I'm not sure what the right thing to do here is. Given that implementations do seed from (get-universal-time) it would not be portable for anyone to count on different behavior. On the other hand, there doesn't seem to be any portable way to make sure that you are initializing with a properly random seed. cmucl reads from /dev/urandom to initialize the seed in (make-random-state t) with an ignore-errors to fall back to (get-universal-time) on systems that don't have a working /dev/urandom. That seems to be a simple way to avoid surprising behavior on most platforms without being worse than the current implementation. -- Sidney Markowitz http://sidney.com/ ```
 Re: [Sbcl-devel] random number generation in a tight loop From: Rupert Swarbrick - 2009-03-08 11:27:36 Attachments: Message as HTML     Message as HTML
 Re: [Sbcl-devel] random number generation in a tight loop From: Gábor Melis - 2009-03-09 15:10:26 ```On Domingo 08 Marzo 2009, Sidney Markowitz wrote: > Christophe Rhodes wrote, On 8/3/09 5:51 AM: > > That depends on your definition of "just fine". I ask you again: > > why are you generating 100 new random-state objects within such a > > short timeframe? > > I agree that Justin's test case makes little sense, but there are use > cases for calling (make-random-state t) more than once within a > second. If you combine two simulation programs into one, you might be > running two independent random streams that are initialized within > one second of each other. Or you might run multiple instances of the > same simulation program on a cluster of machines. > > The latter one in particular seems like a reasonable use case in > which the current implementation will cause problems. The situation > is made worse by the fact that there is no portable way defined to > initialize a random-state with your own seed that you can ensure is > unique. So how would one run simulations in parallel on a cluster? > > Admittedly, the Common Lisp spec does not explicitly say that you get > a different value every time you call (make-random-state t), but the > hyperspec does say "that has been randomly initialized by some > means". It is surprising behavior if things which have been > independently randomly initialized consistently have the same value. > > I'm not sure what the right thing to do here is. Given that > implementations do seed from (get-universal-time) it would not be > portable for anyone to count on different behavior. On the other > hand, there doesn't seem to be any portable way to make sure that you > are initializing with a properly random seed. > > cmucl reads from /dev/urandom to initialize the seed in > (make-random-state t) with an ignore-errors to fall back to > (get-universal-time) on systems that don't have a working > /dev/urandom. > > That seems to be a simple way to avoid surprising behavior on most > platforms without being worse than the current implementation. > > -- Sidney Markowitz > http://sidney.com/ I agree. diff --git a/src/code/target-random.lisp b/src/code/target-random.lisp index f0b740f..596d109 100644 --- a/src/code/target-random.lisp +++ b/src/code/target-random.lisp @@ -67,6 +67,29 @@ (setf *random-state* (%make-random-state)) (/show0 "returning from !RANDOM-COLD-INIT")) +;;; GENERATE-SEED +;;; +;;; Generate a random seed that can be used for seeding the generator. +;;; If /dev/urandom is available, it is used to generate random data +;;; as the seed. Otherwise, the current time is used as the seed. +;;; +;;; The /dev/urandom device exists on Linux, FreeBSD, Solaris 8 and +;;; later. (You need to have patch 112438-01 for Solaris 8 to get that +;;; device, though). It returns pseudorandom data with entropy that +;;; the kernel collects from the environment. Unlike the related +;;; /dev/random, this device does not block when the entropy pool has +;;; been depleted. +(defun generate-seed () + ;; On some systems (as reported by Ole Rohne on cmucl-imp), + ;; /dev/urandom isn't what we think it is, so if it doesn't work, + ;; silently generate the seed from the current time. + (or (ignore-errors + (with-open-file (rand "/dev/urandom" + :direction :input + :element-type '(unsigned-byte 32)) + (read-byte rand))) + (logand (get-universal-time) #xffffffff))) + (defun make-random-state (&optional state) #!+sb-doc "Make a random state object. If STATE is not supplied, return a copy @@ -94,9 +117,7 @@ (copy-random-state state)) ((member t) (/show0 "T clause") - (%make-random-state :state (init-random-state - (logand (get-universal-time) - #xffffffff))))))) + (%make-random-state :state (init-random-state (generate-seed))))))) ;;;; random entries ```
 Re: [Sbcl-devel] random number generation in a tight loop From: Christophe Rhodes - 2009-03-09 15:55:18 ```Gábor Melis writes: > On Domingo 08 Marzo 2009, Sidney Markowitz wrote: >> cmucl reads from /dev/urandom to initialize the seed in >> (make-random-state t) with an ignore-errors to fall back to >> (get-universal-time) on systems that don't have a working >> /dev/urandom. >> >> That seems to be a simple way to avoid surprising behavior on most >> platforms without being worse than the current implementation. > > I agree. > > [patch] If we're actively aiming to make things better, I'd suggest reading more than one 32-bit value from the entropy pool to improve the number of possible seed values. Otherwise we're only a short e-mail conversation away from someone complaining that when they generate 65536 random states, two of them give the same random number stream... Cheers, Christophe ```
 Re: [Sbcl-devel] random number generation in a tight loop From: Paul Khuong - 2009-03-09 16:09:01 ```On 9-Mar-09, at 11:55 AM, Christophe Rhodes wrote: > If we're actively aiming to make things better, I'd suggest reading > more than one 32-bit value from the entropy pool to improve the number > of possible seed values. Otherwise we're only a short e-mail > conversation away from someone complaining that when they generate > 65536 random states, two of them give the same random number stream... But then again, if you need that many random states, and are really worried about similarity or correlation, you probably should be using a PRNG that offers streams/substreams. Even if your randomly-generated states don't collide, you still have to pray that they will be far enough in the sequence of states, which is just as lossy. For simulation purposes the interface CL (and most other standard libraries) offers is obviously inadequate and shouldn't/couldn't be used for anything but the most basic tasks. Paul Khuong ```
 Re: [Sbcl-devel] random number generation in a tight loop From: Sidney Markowitz - 2009-03-09 20:16:23 ```Paul Khuong wrote, On 10/3/09 5:08 AM: > For simulation purposes the interface CL (and most other standard > libraries) offers is obviously inadequate and shouldn't/couldn't be > used for anything but the most basic tasks. Clearly if I were using CL for simulations in serious academic research I would have to make sure that the prng I used was up to the task, which might mean having more than 32 bits of initial state if I need to run more than tens of thousands of independent identical simulations. That means that if I want my code to be platform independent, I would have to write a library for research-grade simulation and it would have some sort of random number generator that did not depend on the underlying Lisp's random function. I haven't used BioLisp (a.k.a. Biolingua) yet, but a quick grep through their source seems to indicate that they do not define their own research-grade prng, which I find quite surprising. But why not make it easier for people to use sbcl in such a library, as well as take care of more people who have somewhat less demanding requirements? CMUCL has two internal functions, one to initialize the seed from a 32 bit integer, and one to initialize it from a vector of integers. The latter allows the Mersenne Twister to be initialized with the full rage of initial states that the algorithm allows, something in the thousands of bits, I believe. Since sbcl already uses a high quality statistical prng for random, it would not take much to do the same thing as CMUCL to get a better quality initialization. I think that's all it would take to make sbcl's prng up to research simulation quality, and is a pretty simple change. Some references: The MT19937 reference implemenation in C: http://cybertiggyr.com/gene/jmt/mt19937int.c The CMUCL source file that defines vec-init-random-state http://common-lisp.net/cgi-bin/viewcvs.cgi/src/code/rand-mt19937.lisp?rev=1.20&root=cmucl&view=markup ```
 Re: [Sbcl-devel] random number generation in a tight loop From: Sidney Markowitz - 2009-03-09 22:39:09 ```Sidney Markowitz wrote, On 10/3/09 9:15 AM: > Some references: > > The MT19937 reference implemenation in C: > http://cybertiggyr.com/gene/jmt/mt19937int.c > > The CMUCL source file that defines vec-init-random-state > > http://common-lisp.net/cgi-bin/viewcvs.cgi/src/code/rand-mt19937.lisp?rev=1.20&root=cmucl&view=markup The link I gave to the reference implementation is way out of date, missing improvements that were introduced later, including seeding from a vector which is why I was pointing to it in the first place. The updated refefence (from 2002), which is the C code that CMUCL's implementation is based on, is at http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/MT2002/CODES/mt19937ar.c See the function init_by_array there. BTW, there is a newer version of the algorithm, SFMT19937 published in 2007, that is twice as fast in the non-SIMD C versions and with some other better characteristics http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/SFMT/ -- Sidney Markowitz http://sidney.com ```
 Re: [Sbcl-devel] random number generation in a tight loop From: Christophe Rhodes - 2009-03-09 20:44:40 ```Paul Khuong writes: > On 9-Mar-09, at 11:55 AM, Christophe Rhodes wrote: >> If we're actively aiming to make things better, I'd suggest reading >> more than one 32-bit value from the entropy pool to improve the number >> of possible seed values. Otherwise we're only a short e-mail >> conversation away from someone complaining that when they generate >> 65536 random states, two of them give the same random number stream... > > But then again, if you need that many random states, and are really > worried about similarity or correlation, you probably should be using > a PRNG that offers streams/substreams. Even if your randomly-generated > states don't collide, you still have to pray that they will be far > enough in the sequence of states, which is just as lossy. If I have two arbitrary random states of the MT, drawn from a uniform distribution over all the possible MT states. p(state dinstance < x) ~ x * 2^{-19936} so if I have one (privileged) random state and N others, p(min(state distance) < x) ~ (1 - x * 2^{-19936})^N for x ~ 2^32, N ~ anything really, do we not still have a vanishingly small probability that any of the states is anywhere near any of the others? Formally things don't start getting bad until Nx is approaching 2^{19900}, even if you are awake enough to do the calculation of the order statistics properly and not simply choose a privileged random state? > For simulation purposes the interface CL (and most other standard > libraries) offers is obviously inadequate and shouldn't/couldn't be > used for anything but the most basic tasks. So shouldn't we be providing an extension for seeding a good simulation-adequate PRNG? And, while we're at it, a good cryptographic PRNG (which MT is not...)? Best, Christophe ```
 Re: [Sbcl-devel] random number generation in a tight loop From: Sidney Markowitz - 2009-03-09 21:07:17 ```Christophe Rhodes wrote, On 10/3/09 9:44 AM: > for x ~ 2^32, N ~ anything really, do we not still have a vanishingly > small probability that any of the states is anywhere near any of the > others? I think that the problem occurs when I want to run some tens or hundreds of thousands or more independent simulations to generate statistics on their results. That's not too hard to do on a big cluster. It means that if I run the simulation more that 2^32 times I still only have 2^32 samples. It doesn't matter that each of the 2^32 states is far from the others when you need more states. I agree that an extension that is defined to be a good way to seed a simulation quality prng, as well as cryptographic extensions, would be useful, but it seems like a very small change to make-random-state would make the implementation of CL random much more useful for many research simulation purposes even without such extensions. -- Sidney Markowitz http://sidney.com ```
 Re: [Sbcl-devel] random number generation in a tight loop From: Christophe Rhodes - 2009-03-10 07:39:14 ```Sidney Markowitz writes: > Christophe Rhodes wrote, On 10/3/09 9:44 AM: >> for x ~ 2^32, N ~ anything really, do we not still have a vanishingly >> small probability that any of the states is anywhere near any of the >> others? > > I think that the problem occurs when I want to run some tens or hundreds > of thousands or more independent simulations to generate statistics on > their results. I'm agreeing with you, but I don't think you read my message carefully enough: the fact that I assume that I can distribute states randomly over the entire state space of the MT necessarily means that I need to be able to seed with ~19937 bits of state. In the light of that, do you agree with the analysis I posted, or is Paul's concern with respect to MT not being a streaming PRNG warranted? Cheers, Christophe ```
 Re: [Sbcl-devel] random number generation in a tight loop From: Sidney Markowitz - 2009-03-10 11:51:16 ```Christophe Rhodes wrote, On 10/3/09 8:38 PM: > In the light of that, do you agree with the analysis I posted, or is > Paul's concern with respect to MT not being a streaming PRNG > warranted? Oh, yes, I didn't read what you said carefully enough, but also I was being a bit naive about Paul's concern. I don't understand why it is not considered good enough to select randomly from amongst MT's 2^19937 states, but since statisticians have invented SPRNGs to get streams and substreams with provable properties, they must have some reason for not accepting the 2^19937 bit state space of MT as being a way to get streams. I would like to see an explanation as to why MT is not good enough. I still think that as long as sbcl is uaing MT for random, it might as well do what CMUCL does and initialize it with full 19937 bit state vectors. But I also agree with your response to Paul's concern that we should have extensions that provide a good simulation-grade SPRNG as well as a cryptographic PRNG. I'm very surprised that Biolisp (a.k.a. Biolingua) does not already provide the former. -- Sidney Markowitz http://sidney.com ```