From: Todd M. <jm...@st...> - 2002-07-23 18:55:27
|
Eric Maryniak wrote: >Dear crunchers, > >According to the _Numpy_ manual for RandomArray.seed(x=0, y=0) >(with /my/ emphasis): > > The seed() function takes two integers and sets the two seeds > of the random number generator to those values. If the default > values of 0 are used for /both/ x and y, then a seed is generated > from the current time, providing a /pseudo-random/ seed. > >Note: in numarray, the RandomArray2 package is provided but it's >description is not (yet) included in the numarray manual. > >I have some questions about this: > >1. The implementation of seed(), which is, by the way, identical > both in Numeric's RandomArray.py and numarray's RandomArray2.py > seems to contradict it's usage description: > The 2 in RandomArray2 is there to support side-by-side testing with Numeric, not to imply something new and improved. The point of providing RandomArray2 is to provide a migration path for current Numeric users. To that end, RandomArray2 should be functionally identical to RandomArray. That should not, however, discourage you from writing a new and improved random number package for numarray. > > >---cut--- >def seed(x=0,y=0): > """seed(x, y), set the seed using the integers x, y; > Set a random one from clock if y == 0 > """ > if type (x) != IntType or type (y) != IntType : > raise ArgumentError, "seed requires integer arguments." > if y == 0: > import time > t = time.time() > ndigits = int(math.log10(t)) > base = 10**(ndigits/2) > x = int(t/base) > y = 1 + int(t%base) > ranlib.set_seeds(x,y) >---cut--- > > Shouldn't the second 'if' be: > > if x == 0 and y == 0: > > With the current implementation: > > - 'seed(3)' will actually use the clock for seeding > - it is impossible to specify 0's (0,0) as seed: it might be > better to use None as default values? > >2. With the current time.time() based default seeding, I wonder > if you can call that, from a mathematical point of view, > pseudo-random: > >---cut--- >$ python >Python 2.2.1 (#1, Jun 25 2002, 20:45:02) >[GCC 2.95.3 20010315 (SuSE)] on linux2 >Type "help", "copyright", "credits" or "license" for more information. > >>>>from numarray import * >>>>from RandomArray2 import * >>>>import time >>>>numarray.__version__ >>>> >'0.3.5' > >>>>for i in range(5): >>>> >... time.time() >... RandomArray2.seed() >... RandomArray2.get_seed() >... time.sleep(1) >... print >... >1027434978.406238 >(102743, 4979) > >1027434979.400319 >(102743, 4980) > >1027434980.400316 >(102743, 4981) > >1027434981.40031 >(102743, 4982) > >1027434982.400308 >(102743, 4983) >---cut--- > > It is incremental, and if you use default seeding within > one (1) second, you get the same seed: > >---cut--- > >>>>for i in range(5): >>>> >... time.time() >... RandomArray2.seed() >... RandomArray2.get_seed() >... time.sleep(0.1) >... print >... >1027436537.066677 >(102743, 6538) > >1027436537.160303 >(102743, 6538) > >1027436537.260363 >(102743, 6538) > >1027436537.360299 >(102743, 6538) > >1027436537.460363 >(102743, 6538) >---cut--- > >3. I wonder what the design philosophy is behind the decision > to use 'mathematically suspect' seeding as default behavior. > Using time for a seed is fairly common. Since it's an implementation detail, I doubt anyone would object if you can suggest a better default seed. > > Apart from the fact that it can hardly be called 'random', I also > have the following problems with it: > > - The RandomArray2 module initializes with 'seed()' itself, too. > Reload()'s of RandomArray2, which might occur outside the > control of the user, will thus override explicit user's seeding. > Or am I seeing ghosts here? > Overriding a user's explicit seed as a result of a reload sounds correct to me. All of the module's top level statements are re-executed during a reload. > > - When doing repeated run's of one's neural net simulations that > each take less than a second, one will get identical streams of > random numbers, despite seed()'ing each time. > Not quite what you would expect or want. > This is easy enough to work around: don't seed or re-seed. If you then need to make multiple simulation runs, make a separate module and call your simulation like: import simulation RandomArray2.seed(something_deterministic, something_else_deterministic) for i in range(number_of_runs): simulation.main() > > - From a purist software engineering point of view, I don't think > automagical default behavior is desirable: one wants programs to > be deterministic and produce reproducible behavior/output. > I don't know. I think by default, random numbers *should be* random. > > If you use default seed()'ing now and re-run your program/model > later with identical parameters, you will get different output. > When you care about this, you need to set the seed to something deterministic. > > In Eiffel, object attributes are always initialized, and you will > almost never have irreproducible runs. I found that this is a good > thing for reproducing ones bugs, too ;-) > This sounds like a good design principle, but I don't see anything in RandomArray2 which is keeping you from doing this now. > >To summarize, my recommendation would be to use None default arguments >and use, when no user arguments are supplied, a hard (built-in) seed >tuple, like (1,1) or whatever. > Unless there is a general outcry from the rest of the community, I think the (existing) numarray extensions (RandomArray2, LinearAlgebra2, FFT2) should try to stay functionally identical with Numeric. > >Sometimes a paper on a random number generator suggests seeds (like 4357 >for the MersenneTwister), but of course, a good random number generator >should behave well independently of the initial seed/seed-tuple. >I may be completely mistaken here (I'm not an expert on random number >theory), but the random number generators (Ahrens, et. al) seem 'old'? >After some studying, we decided to use the Mersenne Twister: > An array enabled version might make a good add-on package for numarray. > > > http://www-personal.engin.umich.edu/~wagnerr/MersenneTwister.html > http://www.math.keio.ac.jp/~matumoto/emt.html > >PDF article: > > http://www.math.keio.ac.jp/~nisimura/random/doc/mt.pdf > > M. Matsumoto and T. Nishimura, > "Mersenne Twister: A 623-dimensionally equidistributed uniform > pseudorandom number generator", > ACM Trans. on Modeling and Computer Simulation Vol. 8, No. 1, > January pp.3-30 1998 > >There are some Python wrappers and it has good performance as well. > >Bye-bye, > >Eric > Bye, Todd |