|
From: Richard M K. <kr...@pr...> - 2022-09-07 21:04:11
|
Hi all,
I think I'd like a dx make-list for smallish list lengths (well,
actually dx copy-seq, subseq, reverse, append, mapcar... but I'll take a
stack allocated list full of nil; I can implemlement dx versions of all
of the rest with that).
By inspection, this seems to work:
(defvar *max-dx-list-length* 50) ;; reasonable?
(defmacro with-list ((var length) &body body
&aux (recurse (gensym)) (len (gensym)))
`(labels ((,recurse (n acc fun)
(if (zerop n)
(funcall fun acc)
(let ((acc (cons nil acc)))
(declare (dynamic-extent acc))
(,recurse (1- n) acc fun)))))
(declare (inline ,recurse)
(dynamic-extent (function ,recurse)))
(let ((,len ,length))
(if (< -1 ,len *max-dx-list-length*)
(,recurse ,len nil (lambda (,var) ,@body))
(let ((,var (make-list ,len)))
,@body)))))
A couple of timing tests are below, using the default optimization
settings in a fresh SBCL process. Looks like this avoids consing under
those conditions, and may offer a speedup besides. It also seems that
the inline declaration is important to minimize both CPU usage and
consing. (Not demonstrated below.)
Questions:
0. Because this recursion is always bounded and smallish, I think I
don't care about tail call elimination as a way to avoid running out of
stack. Should I care about TCE for other reasons though?
1. IDK whether or how TCE & DX declarations might interact under
different optimization settings. I'm not sure I care too much, but what
I definitely don't want is an interaction that'll corrupt the Lisp image
(when I'm obeying DX discipline, of course). I can't figure any of this
out by looking around in the source, and the manual doesn't really
clarify. Anybody know if these can interact in a way that could be
dangerous?
2. If the answer is "you must use a higher-than-default debug
optimization that inhibits TCE in order for that sort of trickery to be
reliable", I think that's fine. Evaluating the timing measurements with
(debug 3) optimization seems to slow things down a bit, but I'm okay
with that (demonstration below).
3. Or if the answer is "don't do any of this; it might work but isn't
supported", that'd be good to know, too.
Thanks in advance for any insights.
Regards,
Richard
;; Baseline, using make-list. The GC call is just to minimize noise.
* (progn
(sb-ext:gc :full t)
(time
(let ((sum 0))
(dotimes (i 1000 sum)
(dotimes (j 50)
(let ((list (make-list j)))
(setq sum (+ sum (length list)))))))))
Evaluation took:
0.011 seconds of real time
0.011280 seconds of total run time (0.006578 user, 0.004702 system)
100.00% CPU
29,231,880 processor cycles
19,604,736 bytes consed
1225000
;; Using the WITH-LIST macro:
* (progn
(sb-ext:gc :full t)
(time
(let ((sum 0))
(dotimes (i 1000 sum)
(dotimes (j 50)
(with-list (list j)
(setq sum (+ sum (length list)))))))))
Evaluation took:
0.002 seconds of real time
0.002374 seconds of total run time (0.002374 user, 0.000000 system)
100.00% CPU
6,151,518 processor cycles
0 bytes consed
1225000
;; Try it again with high DEBUG optimization
* (declaim (optimize (debug 3)))
NIL
;; Apparently identical to the measurements under the default
;; optimization settings.
* (progn
(sb-ext:gc :full t)
(time
(let ((sum 0))
(dotimes (i 1000 sum)
(dotimes (j 50)
(let ((list (make-list j)))
(setq sum (+ sum (length list)))))))))
Evaluation took:
0.011 seconds of real time
0.011380 seconds of total run time (0.006574 user, 0.004806 system)
100.00% CPU
29,493,228 processor cycles
19,604,736 bytes consed
1225000
;; Apparently more CPU is consumed in this case, but still no
;; consing; I'll take it.
* (progn
(sb-ext:gc :full t)
(time
(let ((sum 0))
(dotimes (i 1000 sum)
(dotimes (j 50)
(with-list (list j)
(setq sum (+ sum (length list)))))))))
Evaluation took:
0.003 seconds of real time
0.003615 seconds of total run time (0.003615 user, 0.000000 system)
133.33% CPU
9,368,840 processor cycles
0 bytes consed
1225000
|