> I think it's the sbbl %reg,%reg idiom. I'll add it to
> my list of nasties (I've never seen this particular nasty before).
>
> It might be possible to make the x86 insn decoder aware of this idiom
> and treat it like "movl $0, %reg; sbbl %reg,%reg", which would fix
> this. It already specially understands "xor %reg,%reg".

There are several other analogous cases. They occur often in low-level
graphics programming, or in anything that makes intensive use of the
Carry bit, which is intimately connected with "unsigned less than".

    sub %reg,%reg            # faster [!] than xor on 8086 [hence, a habit]
    or $-1,%reg              # smallest way to load 0xffffffff
    shr %reg1                # Carry = bottom_bit
    adc %reg2,%reg2          # bottom_bit = Carry
    and %reg,%reg            # equivalent to "test %reg,%reg"
    or %reg,%reg             # equivalent to "test %reg,%reg"
    testl %reg,%reg; js ...  # depends on the (1<<31) bit only
    addl %reg,%reg; jc ...   # depends on the (1<<31) bit only
    addl %reg,%reg; js ...   # depends on the (1<<30) bit only
    alu ...,%reg; jpe ...    # depends only on low 8 bits
    and $0,mem               # forces datacache load on PentiumPlain
                             # "mov $0,mem" bypasses cache if a miss on PentiumPlain
    or $~0,mem               # forces datacache load on PentiumPlain
                             # "mov $~0,mem" bypasses cache if a miss on PentiumPlain
    and mem,%reg             # the 0 bits in mem initialize
                             # the corresponding bits in %reg
    or mem,%reg              # the 1 bits in mem initialize
                             # the corresponding bits in %reg

[decimal arithmetic instructions are also peculiar]

-- 
John Reiser, jreiser@BitWagon.com
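
P.S. For anyone who wants to convince themselves of the data dependencies
claimed above, here is a rough C model of three of the cases. It is only a
sketch: the function names (sbb32, sbb_same, add_same_carry, add_same_sign)
are made up for illustration and do not come from any decoder, and it
assumes 32-bit operands with wraparound arithmetic as on x86.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of 32-bit "sbb src,dst": dst = dst - src - CF. */
static uint32_t sbb32(uint32_t dst, uint32_t src, unsigned carry_in)
{
    assert(carry_in <= 1u);          /* CF is a single bit */
    return dst - src - carry_in;     /* unsigned arithmetic wraps mod 2^32 */
}

/* "sbb %reg,%reg": the old value cancels out, so the result is
 * 0 when CF = 0 and 0xffffffff when CF = 1, whatever reg held. */
static uint32_t sbb_same(uint32_t reg, unsigned carry_in)
{
    return sbb32(reg, reg, carry_in);
}

/* Carry out of "add %reg,%reg" is bit 31 of the old value. */
static unsigned add_same_carry(uint32_t reg)
{
    return (reg >> 31) & 1u;
}

/* Sign flag after "add %reg,%reg" is bit 31 of the doubled value,
 * which is bit 30 of the old value. */
static unsigned add_same_sign(uint32_t reg)
{
    return ((uint32_t)(reg + reg) >> 31) & 1u;
}
```

The point of sbb_same is that its result is the same for every value of
reg, which is why the idiom can be treated as if the register had been
zeroed first.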