From: Nathan F. <fr...@cs...> - 2005-10-12 03:57:12
|
While writing and optimizing a program a year ago, I noticed a suboptimality in the code SBCL generates. Consider the following function: (defun gene-w (gene age) (declare (type (simple-array fixnum (50)) gene)) (declare (type (integer 0 3) age)) (aref gene (+ age 20))) This function compiles to essentially the following PPC assembly: ; 40A9AD8C: 39590050 ADDI $FDEFN,$A1,80 ; 90: 386A0001 ADDI $NL0,$FDEFN,1 ; 94: 7F18182E LWZX $A0,$A0,$NL0 Which is all well and good. Except that the two ADDIs could really be folded into a single ADDI, thus eliminating an instruction. This requires recognizing the pattern (AREF <thing> (+ <index> <constant)) and ensuring that <constant> will fit into the immediate field of the ADDI after the adjustment: (- (* VECTOR-DATA-OFFSET N-WORD-BYTES) OTHER-POINTER-LOWTAG) is made. The attached patch attempts to implement just such an optimization for the PPC; doing the same for other architectures should be relatively straightforward. At a high level, the patch adds a new support routine to SB-VM that implements the check for fitting into the immediate field of an ADDI. (For the x86, this would be the displacement field of an effective address.) A transform is added for DATA-VECTOR-REF which checks to see if the index is off the form (+ <foo> <constant>) and changes the call into (DATA-VECTOR-REF-WITH-OFFSET ...). New VOPs are then introduced to implement this new function. The attached patch is incomplete: VOPs for DATA-VECTOR-REF-WITH-OFFSET have not been implement for float or complex arrays, nor for arrays with signed elements smaller than a word. The corresponding transformation for DATA-VECTOR-SET has not yet been implemented, although the low-level VOP support is present. But the basic idea works well enough; the above example compiles to: ; 100131AC: 38790051 ADDI $NL0,$A1,81 ; B0: 7F18182E LWZX $A0,$A0,$NL0 Which is what we really wanted from our compiler anyway. The patch could probably be enhanced in several ways: 1) Complete support for *all* specialized arrays, natch; 2) Proper use of :INFO in the D-V-R-W-O VOPs to avoid lying to the VOP machinery about how the EXTRA-OFFSET argument is passed in. This change would also enable the VOPs to more precisely declare the types of EXTRA-OFFSET, which is a win; 3) (ambitious) Totally replace DATA-VECTOR-REF with D-V-R-W-O, since the former is just the latter with an offset of 0. I thought about doing this before I started, but then balked because of possible difficulties with constant folding (thinking that the real function D-V-R-W-O would have to be defined somewhere, which would mean that the 'offset' parameter would not be a constant...). But after implementing D-V-R-W-O, I see that the "real" function for D-V-R does not exist. So maybe this would be feasible after all; 4) Repeated application of the transformation: e.g. recognizing that (AREF A (+ (+ X 5) 6)) should not be (D-V-R-W-O A (+ X 5) 6) but instead (D-V-R-W-O A X 11). And so on in other ways, too. So, do people feel that this is a reasonable thing to include in the code base? This seems like a lot of code to add to each of the backends, although the impact would be mitigated somewhat if improvement 3) above was realized. Comments and questions welcome. -- Nathan | From Man's effeminate slackness it begins. --Paradise Lost The last good thing written in C was Franz Schubert's Symphony Number 9. --Erwin Dieterich |