From: Borja F. <bor...@gm...> - 2010-12-05 18:40:11
I'm currently discussing this issue on the llvmdev mailing list, because there seems to be no easy way to work with register pairs. You can find it in the thread titled "Register Pairing". Basically, this is the code I'm testing; it shows how some regs aren't paired correctly, which misses opportunities to insert a movw:

    typedef short t;
    extern t mcos(t a);
    extern t mdiv(t a, t b);

    t foo(t a, t b)
    {
        short p1 = mcos(b);
        short p2 = mcos(a);
        return mdiv(p1&p2, p1^p2);
    }

This C code produces the following (for purists: ignore the fact that it's using scratch regs between calls, this is unimplemented):

    ; a <- r25:r24   b <- r23:r22
    mov r18, r24
    mov r19, r25    <-- can be combined into a movw r19:r18, r25:r24
    mov r25, r23
    mov r24, r22    <-- can be combined into a movw r25:r24, r23:r22
    call mcos
    ; here we have the case I was explaining: the pairs don't match because
    ; they're the other way round. The function result is in r25:r24,
    ; but it's storing the hi part in r20 instead of r21, so we can't insert a movw
    mov r20, r25
    mov r21, r24    <-- should be mov r21, r25; mov r20, r24 to be able to insert a movw
    mov r25, r19
    mov r24, r18    <-- can be combined into a movw r25:r24, r19:r18
    call mcos
    ; same problem as above: again it's moving the hi part in r25 into r18
    ; instead of r19, so we've lost another movw here
    mov r18, r25    <-- should be mov r19, r25
    and r18, r20
    mov r19, r24    <-- should be mov r18, r24
    and r19, r21
    mov r23, r25    <-- this *
    eor r23, r20
    mov r22, r24    <-- * and this could be combined into movw r23:r22, r25:r24
    eor r22, r21
    mov r25, r18
    mov r24, r19    <-- because the result returned by the second call to mcos is
                    ;   stored in r18:r19 (other way round), we've lost another movw
    call mdiv
    ret

John, if you have any suggestions they're welcome. I've also asked how to combine two 8-bit instructions into a 16-bit one, mainly for movw and adiw/sbiw.
I wrote a function pass that searches for two moves in a row and combines them into a movw, but if other instructions get in between the moves, as happens in the previous example (note the ands and eors), the pair gets missed. GCC has an easy way of handling the register pairing issue. Lang, on the mailing list, suggested using his register allocator, which is able to work with constraints like the ones we have; look in the source for PBQP.
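To make the "instructions in between" case concrete, here is a minimal sketch of how such a combining pass could look past intervening instructions. It is not the actual pass and does not use LLVM's API: the `Insn` and `Opcode` types and the `reads`, `writes` and `combine_movw` helpers are all hypothetical, and the instruction model only covers the opcodes seen in the listing. The idea is to hoist the second mov up to the first one, which is only safe if nothing in between touches the second mov's destination or overwrites its source. Calls and returns are treated as barriers.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified instruction model (not LLVM's MachineInstr). */
typedef enum { OP_MOV, OP_MOVW, OP_AND, OP_EOR, OP_CALL, OP_RET, OP_DEAD } Opcode;

typedef struct {
    Opcode op;
    int dst;   /* destination register number (low reg of the pair for MOVW) */
    int src;   /* source register number (low reg of the pair for MOVW) */
} Insn;

/* True if the instruction may read register r. */
static bool reads(const Insn *in, int r) {
    switch (in->op) {
    case OP_MOV:  return in->src == r;
    case OP_MOVW: return in->src == r || in->src + 1 == r;
    case OP_AND:
    case OP_EOR:  return in->dst == r || in->src == r;
    case OP_CALL:
    case OP_RET:  return true;   /* conservative: treat as a barrier */
    default:      return false;  /* OP_DEAD */
    }
}

/* True if the instruction may write register r. */
static bool writes(const Insn *in, int r) {
    switch (in->op) {
    case OP_MOV:
    case OP_AND:
    case OP_EOR:  return in->dst == r;
    case OP_MOVW: return in->dst == r || in->dst + 1 == r;
    case OP_CALL:
    case OP_RET:  return true;   /* conservative: treat as a barrier */
    default:      return false;  /* OP_DEAD */
    }
}

/* Combine pairs of 8-bit movs into movw, even when other instructions sit
 * between them. Returns the new instruction count. */
size_t combine_movw(Insn *code, size_t n) {
    for (size_t i = 0; i < n; i++) {
        if (code[i].op != OP_MOV) continue;
        int d = code[i].dst, s = code[i].src;
        /* lo must go to lo and hi to hi; crossed halves cannot form a movw */
        if ((d & 1) != (s & 1)) continue;
        int pd = d ^ 1, ps = s ^ 1;   /* registers of the partner mov */
        for (size_t j = i + 1; j < n; j++) {
            const Insn *x = &code[j];
            if (x->op == OP_MOV && x->dst == pd && x->src == ps) {
                /* Hoist the partner up to position i as a single movw. */
                code[i].op = OP_MOVW;
                code[i].dst = d & ~1;
                code[i].src = s & ~1;
                code[j].op = OP_DEAD;
                break;
            }
            /* Hoisting is unsafe past anything that touches the partner's
             * destination or overwrites its source. */
            if (reads(x, pd) || writes(x, pd) || writes(x, ps)) break;
        }
    }
    /* Compact the array, dropping the movs that were folded away. */
    size_t out = 0;
    for (size_t i = 0; i < n; i++)
        if (code[i].op != OP_DEAD) code[out++] = code[i];
    return out;
}
```

With this model, the sequence marked with * in the listing (mov r23, r25; eor r23, r20; mov r22, r24; eor r22, r21) is combined, because the eor in between only touches r23 and r20. The crossed pairs (for example mov r20, r25; mov r21, r24) are still left alone, so this only recovers the movs separated by other instructions; the pairing itself would still have to be fixed in register allocation. Sinking the first mov down to the second is the symmetric case and is left out here.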