From: Nicholas N. <nj...@ca...> - 2002-11-22 12:37:23
On Fri, 22 Nov 2002, Julian Seward wrote:
> [S2: suggestion of smarter code for Jz et al]
>
> Another thing which would surely help is to generate smarter code for
> the Jz / Jnz UINSTRS. Observation to be made here is that at least for
> the Zero flag, we might as well directly test the bit in %EFLAGS in
> memory -- in fact this would be a lot cheaper than hauling %EFLAGS into
> %eflags: [note; this illustration is without t-chaining, but is equally
> valid in t-chaining's presence]
>
> Jzo $addr
> -->
> testl $(1<<N), 32(%ebp) -- 32(%ebp) is %EFLAGS
> -- and N is the offset of the Z flag in
> -- %EFLAGS
> jz-8 %eip+6
> movl $addr, %eax
> ret
> %eip+6:
>
> in contrast to the current scheme
>
> pushl 32(%ebp)
> popfl
> jnz-8 %eip+6
> movl $addr, %eax
> ret
> %eip+6:
>
> In fact the new scheme not only avoids the horrible popfl; it also
> saves an insn!
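
For anyone following along, the single-bit test in the proposal can be sketched in C. This is only an illustration under assumptions: it takes the standard x86 EFLAGS layout, where the Zero flag is bit 6 (so N = 6), and the names `EFLAGS_ZF_MASK` and `jz_taken` are made up here rather than taken from the Valgrind source. The argument stands in for the simulated %EFLAGS word at 32(%ebp).

```c
#include <stdint.h>

/* Assumption: standard x86 EFLAGS layout, Zero flag at bit 6 (N = 6). */
#define EFLAGS_ZF_BIT  6u
#define EFLAGS_ZF_MASK (1u << EFLAGS_ZF_BIT)

/* Hypothetical helper modelling "testl $(1<<N), 32(%ebp)".
 * saved_eflags stands in for the simulated %EFLAGS word in memory.
 * Returns 1 when the simulated Z flag is set, i.e. when a Jz UInstr
 * should take its branch, and 0 otherwise. */
static int jz_taken(uint32_t saved_eflags)
{
    return (saved_eflags & EFLAGS_ZF_MASK) != 0;
}
```

The real flags register is never loaded here, which is the point of the proposal: the decision comes from one AND against the saved word.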

Nice idea. But doesn't it clash with the lazy EFLAGS idea? Viz.:

  37: ANDL %eax, %ebx (-wOSZACP)
        pushl 32(%ebp) ; popfl
        andl %eax, %ebx
        pushfl ; popl 32(%ebp)            ***
  39: Jzo $0x402047AC (-rOSZACP)
        pushl 32(%ebp) ; popfl            ***
        jnz-8 %eip+6
        movl $0x402047AC, %eax
        ret

If Jzo doesn't have the "pushl 32(%ebp) ; popfl", then you can't remove the
"pushfl ; popl 32(%ebp)" from ANDL, because Jzo would then read the flags
from memory. So only 2 instructions get cut instead of the 4 marked ***.

Another thing: will the ANDL's "pushfl ; popl 32(%ebp)" always be
necessary? I.e., do the condition codes have to be saved in %EFLAGS before
the end of the basic block? If so, then lazy EFLAGS updating can only save
two instructions, not four, in which case the new Jz/Jnz idea seems better
because it saves another instruction anyway...

> Some complicated tests (not-below, etc) would still have to be done the
> old way, but we could cover the tests associated with single flags
> (zero/nonzero, sign/nonsign, carry/nocarry) and I think that would
> catch most _uses_ of the condition codes.
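
To make the single-flag versus compound distinction concrete, here is a rough C sketch. It assumes the standard x86 EFLAGS bit positions (C = bit 0, Z = bit 6, S = bit 7, O = bit 11) and the standard Jcc condition definitions. The `FLAG_*` macros and `cond_*` helpers are hypothetical names for illustration, not anything from the Valgrind code.

```c
#include <stdint.h>

/* Assumption: standard x86 EFLAGS bit positions. */
#define FLAG_C (1u << 0)   /* carry    */
#define FLAG_Z (1u << 6)   /* zero     */
#define FLAG_S (1u << 7)   /* sign     */
#define FLAG_O (1u << 11)  /* overflow */

/* Single-flag conditions: one mask test on the saved flags word is
 * enough, so a direct testl against memory could cover them. */
static int cond_z(uint32_t f) { return (f & FLAG_Z) != 0; }  /* zero  */
static int cond_s(uint32_t f) { return (f & FLAG_S) != 0; }  /* sign  */
static int cond_c(uint32_t f) { return (f & FLAG_C) != 0; }  /* carry */

/* Compound conditions: these combine several flags, so a single-bit
 * test does not decide them. */
static int cond_le(uint32_t f)   /* less-or-equal: Z set, or S != O */
{
    int o = (f & FLAG_O) != 0;
    return cond_z(f) || (cond_s(f) != o);
}

static int cond_nbe(uint32_t f)  /* not-below-or-equal: C and Z both clear */
{
    return !cond_c(f) && !cond_z(f);
}
```

The negated single-flag forms (nonzero, nonsign, nocarry) are the same mask test with the branch sense flipped.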

FYI, from "valgrind --skin=none --trace-codegen=00001 true":

  all jumps   1908

  JMPo         916
  Jzo          407
  Jnzo         150
  JMPo-c       177
  JMPo-r       121
  Jle           20
  Jnbo          25
  Jbo           13
  Jnbeo         12
  Jns           16
  Jnleo          9
  JMPo-sys       7

  sum of those listed: 1873 (35 not mentioned)

Of the 726 Jccs, 77% are Jz/Jnz.

N