This simple, though quite possibly buggy, patch builds on Juho's recent improvements to allocation on x86-64.  It usually emits 6 fewer bytes than the current code (a single LEA instead of a MOV followed by an ADD).  My sbcl.core ends up 0.1% slimmer!
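
To see roughly where the savings come from in the common constant-size case, here is a sketch of the two sequences (register names are only for illustration, and the byte counts assume the assembler picks the 10-byte MOV r64,imm64 form and a 32-bit displacement for the LEA; a small size could shrink either side):

    ;; before: ~13 bytes
    (inst mov alloc-tn size)          ; mov rax, imm64          (10 bytes)
    (inst add alloc-tn temp-reg-tn)   ; add rax, temp            (3 bytes)

    ;; after: ~7 bytes
    (inst lea alloc-tn
          (make-ea :qword :base temp-reg-tn :disp size))
                                      ; lea rax, [temp + disp32] (7 bytes)

When SIZE is already in a register, the patch instead folds it in as the index of the LEA, and it keeps the plain ADD when SIZE and ALLOC-TN share a location, so that path should never get longer.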

astor

Index: macros.lisp
===================================================================
RCS file: /cvsroot/sbcl/sbcl/src/compiler/x86-64/macros.lisp,v
retrieving revision 1.12
diff -u -r1.12 macros.lisp
--- macros.lisp 3 Nov 2005 12:41:07 -0000       1.12
+++ macros.lisp 3 Nov 2005 20:05:13 -0000
@@ -169,9 +169,13 @@
            (allocation-tramp alloc-tn size))
           (t
            (inst mov temp-reg-tn free-pointer)
-           (unless (and (tn-p size) (location= alloc-tn size))
-             (inst mov alloc-tn size))
-           (inst add alloc-tn temp-reg-tn)
+           (if (tn-p size)
+               (if (location= alloc-tn size)
+                   (inst add alloc-tn temp-reg-tn)
+                   (inst lea alloc-tn
+                         (make-ea :qword :base temp-reg-tn :index size)))
+               (inst lea alloc-tn
+                     (make-ea :qword :base temp-reg-tn :disp size)))
            (inst cmp end-addr alloc-tn)
            (inst jmp :be NOT-INLINE)
            (inst mov free-pointer alloc-tn)