Work at SourceForge, help us to make it a better place! We have an immediate need for a Support Technician in our San Francisco or Denver office.
Close
From: Nikodemus Siivola <nikodemus@ra...>  20090502 15:55:57

Here's the testcase: (defun t1 (x i) (declare (type (vector *) x) (fixnum i)) (declare (optimize speed)) (values (aref x i) (setf (aref x i) i))) (defun t2 (x i) (declare (type (vector fixnum) x) (fixnum i)) (declare (optimize speed)) (values (aref x i) (setf (aref x i) i))) (defun t3 (x i) (declare (type (vector fixnum 50000) x) (fixnum i)) (declare (optimize speed)) (values (aref x i) (setf (aref x i) i))) (defun t4 (x i) (declare (type (vector * 50000) x) (fixnum i)) (declare (optimize speed)) (values (aref x i) (setf (aref x i) i))) (defvar *vector* (makearray 50000 :initialelement 1 :elementtype 'fixnum)) (defun test (fun) (declare (function fun)) (let ((v *vector*)) (dotimes (i 50000) (funcall fun v i)))) T1 clearly outperforms all others, which does not fit my expectations at least. This is due to a couple of misbehaving transforms: A. Transforms on HAIRYDATAVECTOR(REFSET)/CHECKBOUNDS open code the bounds check generating a call to ARRAYDIMENSION. When the array dimensions are known at compile time this is allows eliding the bounds check, or making it more efficient. It also allows the transform B below to fire. B. Transforms on HAIRYDATAVECTOR(REFSET) generate a full call to %DATAVECTORANDINDEX, and eliminate runtime dispatch on array widetag. It seems that dispatch on the widetag is more efficient! If I change the transforms to open code bound checking only if dimensions are known at compile time, and inline the ARRAYHEADERP check, things become less horrible  but T1 remains the fastest. Has something bitrotted since the transformations were written, or what gives? Cheers,  Nikodemus 
From: Christophe Rhodes <csr21@ca...>  20090502 16:21:24

Nikodemus Siivola <nikodemus@...> writes: > Has something bitrotted since the transformations were written, or > what gives? I think it's fair to say that the transforms are mostly written with the idea to optimize for simplearrays. So: * is your intuition better served if instead of (VECTOR A B) you use (SIMPLEARRAY A (B)) in your declarations? * what happens to the timings if you pass in a nonsimple VECTOR to your benchmark functions? Best, Christophe 
From: Nikodemus Siivola <nikodemus@ra...>  20090503 05:38:32

2009/5/2 Christophe Rhodes <csr21@...>: > Nikodemus Siivola <nikodemus@...> writes: > >> Has something bitrotted since the transformations were written, or >> what gives? > > I think it's fair to say that the transforms are mostly written with > the idea to optimize for simplearrays. So: > > * is your intuition better served if instead of (VECTOR A B) you use > (SIMPLEARRAY A (B)) in your declarations? To fair a degree, yes. Though eg. (defun t1s (x i) (declare (type (simplearray *) x) (vi i) (optimize speed)) (values (aref x i) (setf (aref x i) i))) outperforms (defun t4s (x i) (declare (type (simplearray * (50000)) x) (vi i) (optimize speed)) (values (aref x i) (setf (aref x i) i))) as the latter makes full calls to ARRAYRANK and ARRAYDIMENSION to typecheck the array. > * what happens to the timings if you pass in a nonsimple VECTOR to > your benchmark functions? The difference is, if possible, even more pronounced in favor of T1 than before. Cheers,  Nikodemus 