From: Thiemo S. <th...@ne...> - 2008-11-20 11:20:36
Paul Khuong wrote:
> On 19-Nov-08, at 4:50 PM, Vitaly Mayatskikh wrote:
>
> >> On Wed, Nov 19, 2008 at 3:27 PM, Vitaly Mayatskikh
> >> <v.m...@gm...> wrote:
> >>> With double-reg in scs SBCL always chooses double-reg :( I wanted to
> >>> enforce it to use objects directly. And a pair of `move to double-reg'
> >>> + `op with double-reg' should have different cost comparing to `op
> >>> with double-stack/descriptor-reg', I think.
> >
> > ; disassembly for RAY-SPHERE
> > ; 028588E3: F20F104E01  MOVSD XMM1, [RSI+1]   ; no-arg-parsing entry point
> > ;      8E8: F20F105609  MOVSD XMM2, [RSI+9]
> > ;      8ED: F20F105E11  MOVSD XMM3, [RSI+17]
> > ;      8F2: F20F106201  MOVSD XMM4, [RDX+1]
> > ;      8F7: F20F106A09  MOVSD XMM5, [RDX+9]
> > ;      8FC: F20F107211  MOVSD XMM6, [RDX+17]
> > ;      901: F20F5CCC    SUBSD XMM1, XMM4
> > ;      905: F20F5CD5    SUBSD XMM2, XMM5
> > ;      909: F20F5CDE    SUBSD XMM3, XMM6
> > ;      90D: F20F106701  MOVSD XMM4, [RDI+1]
> > ;      912: F20F106F09  MOVSD XMM5, [RDI+9]
> > ;      917: F20F107711  MOVSD XMM6, [RDI+17]
> >
> > but I want to see here such code:
> >
> > ; disassembly for RAY-SPHERE
> > ; 028588E3: F20F104E01  MOVSD XMM1, [RSI+1]   ; no-arg-parsing entry point
> > ;      8E8: F20F105609  MOVSD XMM2, [RSI+9]
> > ;      8ED: F20F105E11  MOVSD XMM3, [RSI+17]
> > ;      901: F20F5CCC    SUBSD XMM1, [RDX+1]
> > ;      905: F20F5CD5    SUBSD XMM2, [RDX+9]
> > ;      909: F20F5CDE    SUBSD XMM3, [RDX+17]
>
> While I agree that the latter is more esthetic, have you actually made
> sure that there is any difference in performance between the two
> versions? In the general case, it would be useful to save a couple
> registers by loading from memory in the arithmetic instructions.

It also reduces Icache pressure a bit. A simple testcase is unlikely
to account for this.

Thiemo
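
As a rough illustration of the code-size point (not a benchmark, and not anything SBCL itself emits), the sketch below adds up the instruction bytes of the two sequences. The first list copies the encodings of the first nine instructions from the quoted disassembly. The second list is an assumption: the address and byte columns in the "wanted" listing look hand-edited, so the three SUBSD encodings here are worked out by hand for a memory operand with an 8-bit displacement (F2 0F 5C, ModRM, disp8). The helper name `code_size` is hypothetical.

```python
def code_size(encodings):
    """Return the total size in bytes of a list of hex-encoded instructions."""
    return sum(len(bytes.fromhex(e)) for e in encodings)

# Load both vectors into registers, then subtract register from register.
# Encodings copied from the quoted disassembly.
current = [
    "F20F104E01",  # MOVSD XMM1, [RSI+1]
    "F20F105609",  # MOVSD XMM2, [RSI+9]
    "F20F105E11",  # MOVSD XMM3, [RSI+17]
    "F20F106201",  # MOVSD XMM4, [RDX+1]
    "F20F106A09",  # MOVSD XMM5, [RDX+9]
    "F20F107211",  # MOVSD XMM6, [RDX+17]
    "F20F5CCC",    # SUBSD XMM1, XMM4
    "F20F5CD5",    # SUBSD XMM2, XMM5
    "F20F5CDE",    # SUBSD XMM3, XMM6
]

# Load one vector, then subtract straight from memory.
# Assumption: hand-derived SUBSD encodings, not taken from the post.
wanted = [
    "F20F104E01",  # MOVSD XMM1, [RSI+1]
    "F20F105609",  # MOVSD XMM2, [RSI+9]
    "F20F105E11",  # MOVSD XMM3, [RSI+17]
    "F20F5C4A01",  # SUBSD XMM1, [RDX+1]
    "F20F5C5209",  # SUBSD XMM2, [RDX+9]
    "F20F5C5A11",  # SUBSD XMM3, [RDX+17]
]

print(code_size(current))  # 42
print(code_size(wanted))   # 30
```

Under these assumptions the memory-operand form is 12 bytes shorter for this fragment and leaves XMM4 to XMM6 free. Whether that shows up in run time depends on the surrounding code, which is why a small testcase may not reveal it.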