Thread: [Sbcl-devel] question about compiler internals

Hi!

First of all, sorry for my weird English ;)

As far as I can see, SBCL isn't able to emit reg/mem operations like
`add rax, [rsi + 123]' (I'm assuming the x86-64 architecture). Instead,
the compiler emits two instructions: `mov temp, [rsi + 123]; add rax, temp'.

Here is an example:

(defun sum (a b)
  (declare (type fixnum a)
           (type (array fixnum) b)
	   (optimize (speed 3) (safety 0) (debug 0)))
  (+ a (aref b 1)))
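
For anyone who wants to reproduce the listings, here is a minimal sketch of one way to get them. The function is repeated so the snippet stands alone. DISASSEMBLE is standard Common Lisp; the :TRACE-FILE argument to COMPILE-FILE is my assumption about how to get the VOP listing, and "sum.lisp" is a hypothetical file name, so that call is left commented out.

```lisp
;; Same function as in the post, repeated so the snippet is self-contained.
(defun sum (a b)
  (declare (type fixnum a)
           (type (array fixnum) b)
           (optimize (speed 3) (safety 0) (debug 0)))
  (+ a (aref b 1)))

;; Print the machine code generated for SUM (standard CL function).
(disassemble 'sum)

;; Assumption: SBCL's COMPILE-FILE accepts :TRACE-FILE and writes the
;; VOP-level listing to a trace file.  "sum.lisp" is a hypothetical file
;; that would contain the DEFUN above.
;; (compile-file "sum.lisp" :trace-file t)
```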

SBCL produces the following VOPs:

    7: DATA-VECTOR-REF-WITH-OFFSET/SIMPLE-ARRAY-SIGNED-BYTE-61 t35[RDX]
                                                               t36[RDI]
                                                               {0}
                                                               => t37[RAX]
    8: MOVE-TO-WORD/FIXNUM t37[RAX] => t38[RCX]
    9: FAST-+/SIGNED=>SIGNED A!14[S2]>t39[RAX] t38[RCX] => t40[RAX]

and the following code:

...
;       79:       488B443A01       MOV RAX, [RDX+RDI+1]
;       7E:       488BC8           MOV RCX, RAX
;       81:       48C1F903         SAR RCX, 3
;       85:       488B45E8         MOV RAX, [RBP-24]
;       89:       4801C8           ADD RAX, RCX

The optimal code would be `mov rax, ...; add rax, [rdx+rdi+1]'.
Well, this example is not ideal, because there is a type coercion
(the `SAR RCX, 3') before the addition...
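
To illustrate what that coercion does, here is a small sketch in plain Lisp arithmetic. It assumes a fixnum is stored as the integer shifted left by 3 bits with zero tag bits, which is what the shift by 3 in the listing suggests. TAG, UNTAG and TAGGED-ADD are hypothetical helpers, not SBCL internals, and overflow is ignored because the integers here are unbounded.

```lisp
;; Hypothetical model of fixnum tagging: 3 zero tag bits in the low end.
(defconstant +tag-bits+ 3)

;; Box an integer into its tagged representation.
(defun tag (n) (ash n +tag-bits+))

;; Strip the tag again; this is the job SAR RCX, 3 does in the listing.
(defun untag (w) (ash w (- +tag-bits+)))

;; Adding two tagged values yields the tagged sum, because
;; (a << 3) + (b << 3) = (a + b) << 3 when overflow is ignored.
(defun tagged-add (x y) (+ x y))

(untag (tagged-add (tag 5) (tag 7)))   ; => 12
```

Whether the compiler can actually skip the untagging step depends on the result type it has to produce, so this only shows the arithmetic identity.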

The same holds for floating-point calculations: SBCL always loads
both values into xmm registers and then performs the operation on
them.

In many cases a reg/mem operation would be faster than reg/reg. It
seems SBCL is able to generate the proper machine instructions (at
least the SSE-related emitters can), but the arithmetic/logical VOPs
are not ready to take arguments of storage classes other than *-reg.
Modifying these VOPs is not a big deal, but I've run into another problem:

SBCL coerces, for example, double-stack to double-reg with the help of
the appropriate define-move-op (define-move-fun?), which is a simple
`movsd' in that case. AFAIU, only such define-move-op'ed VOPs can
calculate effective addresses at compile time.

For example, double-stack to double-reg coercion is done with this vop:

(define-move-fun (load-double 2) (vop x y)
  ((double-stack) (double-reg))
  (inst movsd y (ea-for-df-stack x)))

It reduces to a single machine instruction `movsd y, xxx'. The effective
address of the x value (stored on the stack) is calculated at compile
time.

When I replace the standard double-float subtraction VOP with my own
and try to use the same `ea-for-df-stack' in it, the compiler emits a
runtime double-stack to double-reg coercion and thus kills
performance.

I'm totally stuck here and am asking for help. Is it possible to do
reg/mem operations in SBCL at all?

Many thanks!
--
wbr, Vitaly