From: Knut F. <Knu...@gm...> - 2008-12-29 13:40:48
Attachments:
bounds-p-tran.lisp
|
Hi, I was a bit curious after reading the claim somewhere that SBCL-compiled code could compete with gcc in terms of speed; and after playing around with it a bit I must say that I'm impressed. Congratulations. :-) However, here's a case I found to be handled sub-optimally: (defun test (i j) (declare (fixnum i j)) (let ((a (make-array '(10 10)))) (declare (type (simple-array * (10 10)) a)) (dotimes (n 10000000) (unless (array-in-bounds-p a i j) ; need to do something with the result, so that the compiler ; doesn't optimize the call away altogether (return))))) I.e., using array-in-bounds-p in a tight loop on an array of known dimensions. Disassembling #'test shows that it contains a dynamic call to #'array-in- bounds-p, that I think can and should be optimized away in this case. After browsing through the SBCL source a bit, I came up with an automatic translation rule (see attached file bounds-p-tran.lisp). I successfully tested this with SBCL 1.0.22 and 1.0.23, measuring roughly a 56-76x (sic) speed-up of the above test case (which is then compiled with no dynamic function calls left at all). Given my limited knowledge of Lisp and SBCL internals, it could do with some review, but I think it may be eligible for inclusion into src/compiler/array-tran.lisp (or possibly something like contrib/compiler- extras.lisp if there are issues). Knut |
From: Erik H. <eh...@gm...> - 2008-12-29 13:51:13
|
On Mon, Dec 29, 2008 at 2:40 PM, Knut Franke <Knu...@gm...> wrote: > Hi, > > I was a bit curious after reading the claim somewhere that SBCL-compiled code > could compete with gcc in terms of speed; and after playing around with it a > bit I must say that I'm impressed. Congratulations. :-) > > However, here's a case I found to be handled sub-optimally: > > (defun test (i j) > (declare (fixnum i j)) > (let ((a (make-array '(10 10)))) > (declare (type (simple-array * (10 10)) a)) > (dotimes (n 10000000) > (unless (array-in-bounds-p a i j) > ; need to do something with the result, so that the compiler > ; doesn't optimize the call away altogether > (return))))) > > I.e., using array-in-bounds-p in a tight loop on an array of known dimensions. > Disassembling #'test shows that it contains a dynamic call to #'array-in- > bounds-p, that I think can and should be optimized away in this case. I don't see any declarations w.r.t. speed, safety, etc. Does a declaration for maximum speed (and minimum of the others) not achieve the same? > After browsing through the SBCL source a bit, I came up with an automatic > translation rule (see attached file bounds-p-tran.lisp). I successfully tested > this with SBCL 1.0.22 and 1.0.23, measuring roughly a 56-76x (sic) speed-up of > the above test case (which is then compiled with no dynamic function calls > left at all). Given my limited knowledge of Lisp and SBCL internals, it could > do with some review, but I think it may be eligible for inclusion into > src/compiler/array-tran.lisp (or possibly something like contrib/compiler- > extras.lisp if there are issues). Bye, Erik. |
From: Paul K. <pk...@gm...> - 2008-12-29 14:59:30
|
On 29-Dec-08, at 8:40 AM, Knut Franke wrote: > I.e., using array-in-bounds-p in a tight loop on an array of known > dimensions. > Disassembling #'test shows that it contains a dynamic call to > #'array-in- > bounds-p, that I think can and should be optimized away in this case. ARRAY-IN-BOUNDS-P is only used by SBCL in an out of line PCL function. The reason ARRAY-IN-BOUNDS-P isn't optimized at all is simply that it's not used (internally) by anything that's expected to be efficient (e.g. inline array access), and it seems no one yet has complained about its performance. The transformation probably makes sense for arrays of small rank, if only to save space, but is there a practical reason you want ARRAY-IN-BOUNDS-P to be fast? There might be a better way. Paul Khuong |
From: Knut F. <Knu...@gm...> - 2008-12-29 15:51:53
|
> ARRAY-IN-BOUNDS-P is only used by SBCL in an out of line PCL > function. The reason ARRAY-IN-BOUNDS-P isn't optimized at all is > simply that it's not used (internally) by anything that's expected to > be efficient (e.g. inline array access), and it seems no one yet has > complained about its performance. The transformation probably makes > sense for arrays of small rank, if only to save space, but is there a > practical reason you want ARRAY-IN-BOUNDS-P to be fast? There might be > a better way. Not an _entirely_ practical reason. I was mainly curious about how SBCL compares to GCC, and whether CL might be an option for future projects involving extensive calculations and backtracking searches; so I wrote a simple netwalk-puzzle-solving algorithm in C++ and Lisp. This involves storing both the puzzle and candidate solutions in rank 2 arrays of known sizes and doing lots of bounds checks on them. Possibly the algorithm and/or data representation could be improved; and certainly all occurences of array-in- bounds-p could be substituted with (and (< -1 i size) (< -1 j size)) by hand, but even if you (defconstant size ...) I feel that this is less readable and harder to maintain (assuming something similar crops up in production code) than a declaration plus compiler optimization. I appreciate this is partially a matter of personal preference, though; and in any case not a huge issue. Anyway, I've achieved what I wanted (having run GCC- and SBCL-generated code at comparable speeds and found out how to tweak SBCL's compiler if need be), and if I ever happen to need an efficient array-in-bounds-p for real calculations, I can just include the transformation in my .sbclrc (which is way cool btw - how many compilers can you hack so easily? ;-). I just thought that this idiom might be common enough for other users to profit from including the transformation (possibly adding some dependency on optimization settings and/or array rank). Knut |