From: Borja F. <bor...@gm...> - 2010-12-05 18:40:11
I'm currently discussing this issue on the llvmdev mailing list because there
seems to be no easy way to work with register pairs. You can follow it in
the thread titled "Register Pairing". Basically, this is the code I'm
testing; it shows how some regs aren't paired correctly, which misses movw
insertion opportunities:
typedef short t;
extern t mcos(t a);
extern t mdiv(t a, t b);
t foo(t a, t b)
{
    short p1 = mcos(b);
    short p2 = mcos(a);
    return mdiv(p1&p2, p1^p2);
}
This C code produces the following (for purists: ignore the fact that it keeps
values in scratch regs across calls; handling that is unimplemented):
; a <-- r25:r24, b <-- r23:r22
mov r18, r24
mov r19, r25 <-- can be combined into a movw r19:r18, r25:r24
mov r25, r23
mov r24, r22 <-- can be combined into a movw r25:r24, r23:r22
call mcos
; here we have the case I was explaining: the pairs don't match because
; they're the other way round. The function result is in r25:r24, but it's
; storing the hi part in r20 instead of r21, so we can't insert a movw
mov r20, r25
mov r21, r24 <-- should be mov r21, r25; mov r20, r24 to be able to insert a movw
mov r25, r19
mov r24, r18 <-- can be combined into a movw r25:r24, r19:r18
call mcos
; same problem as above: again it's moving the hi part in r25 into r18
; instead of r19, so we've lost another movw here
mov r18, r25 <-- should be mov r19, r25
and r18, r20
mov r19, r24 <-- should be mov r18, r24
and r19, r21
mov r23, r25 <-- this *
eor r23, r20
mov r22, r24 <-- * and this could be combined into movw r23:r22, r25:r24
eor r22, r21
mov r25, r18
mov r24, r19 <-- because the result of the second call to mcos is kept in r18:r19
; (the other way round), we've lost another movw
call mdiv
ret
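To make the pairing condition explicit, here is a minimal sketch in C. It is not code from the backend; the struct mov8 type and the can_merge_to_movw name are made up for illustration. It only encodes the movw operand rule: movw copies Rr+1:Rr into Rd+1:Rd, and both low registers must be even.

```c
#include <stdbool.h>

/* One 8-bit register copy: mov dst, src (AVR register numbers 0..31). */
struct mov8 { int dst; int src; };

/* Hypothetical helper: true when two 8-bit moves can be replaced by one movw.
 * movw copies Rr+1:Rr into Rd+1:Rd, and both low registers must be even. */
static bool can_merge_to_movw(struct mov8 a, struct mov8 b)
{
    /* Work out which move writes the low byte of the destination pair. */
    struct mov8 lo = (a.dst < b.dst) ? a : b;
    struct mov8 hi = (a.dst < b.dst) ? b : a;

    return lo.dst % 2 == 0 && lo.src % 2 == 0 &&
           hi.dst == lo.dst + 1 && hi.src == lo.src + 1;
}
```

With the listing above, mov r18, r24 / mov r19, r25 passes the check, while mov r20, r25 / mov r21, r24 fails because the low destination register would be fed from the odd (hi) source register.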
John, if you have any suggestions, they're welcome. I've also asked how to
combine two 8-bit instructions into a 16-bit one, mainly for movw and
adiw/sbiw. I wrote a function pass that searches for two moves in a row and
combines them into a movw, but if other instructions get in between the
moves, as happens in the previous example (note those ands and xors),
then the pair gets missed. GCC has an easy way of handling the register
pairing issue. Lang, on the mailing list, suggested using his register
allocator, which is able to work with constraints like the ones we have;
look in the source for PBQP.
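One way the "moves separated by other instructions" case could be handled is to look past unrelated instructions and check that the second move can safely be hoisted up next to the first. The following is only a rough sketch under a simplified instruction model, not the actual function pass and not LLVM's MachineInstr API; struct insn, enum kind, is_movw_pair and find_movw_partner are all hypothetical names.

```c
#include <stdbool.h>

enum kind { MOV, ALU, BARRIER };   /* BARRIER: call, branch, label, ... */

/* Simplified instruction model (an assumption, not LLVM's MachineInstr):
 * MOV writes dst and reads src; a two-operand ALU op such as and/eor
 * reads dst and src and writes dst. */
struct insn { enum kind kind; int dst; int src; };

/* True when the moves a and b form an aligned pair that movw can copy. */
static bool is_movw_pair(struct insn a, struct insn b)
{
    struct insn lo = (a.dst < b.dst) ? a : b;
    struct insn hi = (a.dst < b.dst) ? b : a;
    return lo.dst % 2 == 0 && lo.src % 2 == 0 &&
           hi.dst == lo.dst + 1 && hi.src == lo.src + 1;
}

/* Look past unrelated instructions for a move that pairs with code[i].
 * Returns the partner's index, or -1 if there is none. The partner is
 * conceptually hoisted up to position i, so nothing in between may read
 * or write its destination, or write its source. */
static int find_movw_partner(const struct insn *code, int n, int i)
{
    if (code[i].kind != MOV)
        return -1;
    for (int j = i + 1; j < n; j++) {
        struct insn cur = code[j];
        if (cur.kind == BARRIER)
            return -1;              /* never move across calls or branches */
        if (cur.kind == MOV && is_movw_pair(code[i], cur)) {
            /* Check every instruction between i and j for conflicts. */
            for (int k = i + 1; k < j; k++) {
                struct insn mid = code[k];
                bool touches_dst = mid.dst == cur.dst || mid.src == cur.dst;
                bool writes_src = mid.dst == cur.src;
                if (touches_dst || writes_src)
                    return -1;
            }
            return j;
        }
    }
    return -1;
}
```

For the mov r23, r25 / eor r23, r20 / mov r22, r24 / eor r22, r21 sequence in the listing, the search starting at the first mov finds the second one, since the eor in between touches neither r22 nor r24. It still cannot help where the allocator picked the halves the wrong way round; that part has to be solved at register allocation time.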