I looked at the BITVECTOR benchmark in cl-bench to see whether it
could easily be optimized. Unfortunately, about 90% of the time
seems to be spent on cache misses(?), because the result array of the
BIT-AND operation does not have dynamic extent.
Introducing the dummy variable in the code below makes it roughly 10x
faster.
The code from cl-bench, with the dummy variable added (the original
BIT-AND call is kept but disabled with #+nil):
(locally
  (declare (optimize (speed 3) (safety 0) (debug 0)))
  (defun bench-bitvectors (&optional (size 1000000) (runs 700))
    (declare (fixnum size))
    (let ((zeros (make-array size :element-type 'bit :initial-element 0))
          (ones (make-array size :element-type 'bit :initial-element 1))
          (xors (make-array size :element-type 'bit)))
      (dotimes (runs runs)
        (bit-xor zeros ones xors)
        (bit-nand zeros ones xors)
        (let ((dummy (make-array size :element-type 'bit)))
          (declare (dynamic-extent dummy))
          (bit-and zeros xors dummy))
        #+nil (bit-and zeros xors)))
    (values)))
Now, since BIT-AND is in the COMMON-LISP package and thus cannot be
redefined, would it be possible for the compiler to detect that the
result of BIT-AND is not used and to allocate the result array with
dynamic extent?
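Until something like that exists, the same trick can be packaged at user
level. The following is only a sketch: WITH-SCRATCH-BIT-VECTOR and
BIT-AND/DISCARD are hypothetical names of mine, not part of cl-bench or of
any implementation, and the DYNAMIC-EXTENT declaration is merely a request
that an implementation is free to ignore.

```lisp
;; Hypothetical helpers, shown only to illustrate the workaround above.

(defmacro with-scratch-bit-vector ((var size) &body body)
  "Bind VAR to a fresh bit vector of SIZE elements for the duration of BODY.
The DYNAMIC-EXTENT declaration asks for stack allocation; whether that
actually happens depends on the implementation."
  `(let ((,var (make-array ,size :element-type 'bit)))
     (declare (dynamic-extent ,var))
     ,@body))

(defun bit-and/discard (a b)
  "Compute (BIT-AND A B) into a scratch vector and throw the result away.
A and B must be bit vectors of the same length."
  (with-scratch-bit-vector (scratch (length a))
    ;; Passing SCRATCH as the third argument keeps BIT-AND from
    ;; allocating a fresh result array of its own.
    (bit-and a b scratch)
    (values)))
```

With these, the inner LET in the benchmark could be written as
(bit-and/discard zeros xors). The scratch vector must not escape the body
of the macro, since its storage may be gone once the body returns.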
astor