Dear people at Schematics,
Mike Sperber forwarded me a short discussion related to using
pseudo random values with given probability distribution.
In other words, given probabilities p[0], .., p[n-1] summing to one,
construct a pseudo-random process for values x in {0..n-1}
such that Pr{x = k} = p[k].
I would like to bring to your attention that there is a method to
draw the next pseudo-random value in O(1), after having
prepared a datastructure of size O(n). This method is known
as the "method of aliases" by A. J. Walker (Electronic Letters 1974).
It will be much faster than the code currently used in Schemathics
which uses time O(n) for drawing the next random value.
For convenience, I just wrote some code for this :-)
If there would be a good implementation for priority queues
around, the preparation phase could be improved...
Cheers,
Sebastian.
; Random Source for Discrete Distribution
; =======================================
;
; Sebastian.Egner@..., 2-May-2003
; in R5RS-Scheme using
; SRFI-23 (error),
; SRFI-27 (random sources),
; SRFI-42 (eager comprehensions)
; load this into Scheme48 0.57 with SRFI-27:
; ,config ,load srfi-27.scm
; ,open srfi-27 srfi-23
; ,load ec.scm ; from srfi.schemers.org, SRFI-42
; (random-source-make-discretes s w)
; given a source s of random bits in the sense of SRFI-27
; and a vector w of n >= 1 non-negative real numbers,
; this procedure constructs and returns a procedure rand
; such that (rand) returns the next integer x in {0..n-1}
; from a pseudo random sequence with
;
; Pr{x = k} = w[k] / Sum(w[i] : i)
;
; for all k in {0..n-1}.
(define (random-source-make-discretes s w)
(define (check-weights w)
(if (not (and (vector? w)
(>= (vector-length w) 1)
(forall?-ec (:vector wi w)
(and (number? wi) (not (negative? wi))))))
(error "bad vector of weights" w)))
(define (make-cutoff-alias w)
(define eps (expt 10 -6))
(define n (vector-length w))
(define cutoff (make-vector n 1))
(define alias (make-vector n 0))
(define avg-w (/ (sum-ec (:vector wi w) wi) n))
(define b (vector-of-length-ec n (:vector wi w) (- wi avg-w)))
(let loop ((sum-abs-b (sum-ec (:vector bi b) (abs bi))))
(if (< sum-abs-b eps)
(list cutoff alias)
(let ((imin 0) (bmin (vector-ref b 0))
(imax 0) (bmax (vector-ref b 0)))
(do-ec (:vector bi (index i) b)
(begin
(if (< bi bmin)
(begin (set! imin i) (set! bmin bi)))
(if (> bi bmax)
(begin (set! imax i) (set! bmax bi)))))
(vector-set! cutoff imin (+ 1 (/ bmin avg-w)))
(vector-set! alias imin imax)
(vector-set! b imin 0)
(vector-set! b imax (+ bmin bmax))
(loop (+ sum-abs-b (- (abs bmin)) (- (abs bmax))
(+ (abs (vector-ref b imax)))))))))
(check-weights w)
(let* ((c-a (make-cutoff-alias w))
(cutoff (car c-a))
(alias (cadr c-a))
(randi (random-source-make-integers s))
(randu (random-source-make-reals s))
(n (vector-length cutoff)))
(lambda ()
(let ((i (randi n)))
(if (< (randu) (vector-ref cutoff i))
i
(vector-ref alias i))))))
; example
;(define r1
; (random-source-make-discretes
; default-random-source
; (vector-ec (: i 0 29) (* 0.1 (expt 0.9 i))))) ; Geo(0.1)
;
; now (r1) -> x in {0..29} with Pr{x = i} = w[i]/Sum(w[j] : j)
; where w[j] = 0.1 0.9^j.
;
;(define (hist rand n iters) ; gather a histogram
; (let ((h (make-vector n 0)))
; (do-ec (:range i iters)
; (:let x (rand))
; (vector-set! h x (+ (vector-ref h x) 1)))
; h))
;
; ,time (hist r1 30 1000000)
; -> a vector approximating the geometric distribution
----
Dr. Sebastian Egner
Senior Scientist
Philips Research Laboratories
Prof. Holstlaan 4 (postbox WY63, kamer WY 6-55)
5656 AA Eindhoven
The Netherlands
tel: +31 40 27-42069 *** new telephone number
fax: +31 40 27-42566 *** new fax number
email: sebastian.egner@... |