From: Jeremy F. <je...@go...> - 2002-11-21 01:34:09
On Wed, 2002-11-20 at 16:15, Julian Seward wrote:
> This is the third of two messages about t-chaining. Get a stiff
> whisky before reading further; you'll need it.
>
> Using the attached prog, it's easy to show that on my P3, each
> (pushf ; popf) pair takes 22 cycles. I assume that's 11 each,
> although I haven't verified that. Considering that the P6 pipe is alleged
> to be about 10 stages, I strongly suspect they both cause a pipe
> flush. Ie, each is as bad as a mispredicted jump.
Yes, the P3 optimisation guide says that pushf/popf are complex
instructions (ie, microcoded). They don't stall the pipeline per se,
but they limit the decode rate and serialise stuff which needn't be
serialised.
Given gems such as:

    pushl 32(%ebp)
    popfl
    subl $0x3E7, %eax
    pushfl
    popl 32(%ebp)
    pushl 32(%ebp)
    popfl
    jnle-8 %eip+13

what was the problem with lazy save/restore of the flags again?
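
To make the question concrete, here is a rough sketch of what lazy flag handling could look like as a pass over one basic block. It is not Valgrind's actual code: the op names ("restore", "save", "alu", "jcc", "clobber") and the function lazy_flags are made up for illustration. "restore" stands for the pushl 32(%ebp) / popfl pair, "save" for pushfl / popl 32(%ebp), and "clobber" for anything assumed to trash the real %eflags. The sketch also assumes the next block expects the flags to be in memory, so a deferred save is flushed before the block-ending jump.

```python
def lazy_flags(ops):
    """Drop redundant flag restores and defer saves within one basic block.

    Hypothetical op names (assumptions, not real UCode):
      "restore"  load simulated flags from memory into %eflags
      "save"     store %eflags back to the memory slot
      "alu"      instruction that sets the flags
      "jcc"      conditional jump that reads the flags and ends the block
      "clobber"  anything that destroys the real %eflags
    """
    out = []
    in_cpu = False  # does the real %eflags currently hold the simulated flags?
    dirty = False   # is the memory copy stale (a save has been deferred)?
    for op in ops:
        if op == "restore":
            # Only reload if the CPU copy is not already valid.
            if not in_cpu:
                out.append(op)
                in_cpu = True
        elif op == "save":
            # Defer the store; remember that memory is now out of date.
            dirty = True
            in_cpu = True
        elif op in ("jcc", "clobber"):
            # Flush a pending save before leaving the block or losing %eflags.
            if dirty:
                out.append("save")
                dirty = False
            out.append(op)
            if op == "clobber":
                in_cpu = False
        else:
            out.append(op)
    if dirty:
        out.append("save")
    return out


naive = ["restore", "alu", "save", "restore", "jcc"]
print(lazy_flags(naive))
```

On the sequence above this drops the second restore, so the block does one popfl instead of two. It does not notice that the first restore is dead as well (the subl overwrites the flags anyway); that would need a liveness pass on top.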
Still, is this really enough to slow down the whole block? What about
AGI stalls?
J