Re: [Squeak-VMdev] Version 4 changes

On 08 Apr 2004, at 00:54, Ned Konz wrote:

> On a related note, does it seem wasteful to anyone but me that we do the
> following in primBytecodeAdd:

Probably not. ;)

You introduce an additional branch into the critical path, by checking 
both operands for overflow instead of checking just the result.

> Seems like we could save the shifts in most cases by looking at the top two
> bits of the receiver and argument; if the sign bits are different or the high
> bits (B30) are both the same as the sign bits we aren't going to get any
> overflow.

You end up with exactly the same number of instructions anyway.

Current version:

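         lwz     r3,0xfffc(r27)      # receiver oop, at sp-4
         lwz     r4,0(r27)           # argument oop, at sp
         and     r28,r3,r4           # combine the tag bits
         andi.   r9,r28,0x1          # low bit set only if both are tagged
         beq     <fail>              # not both SmallIntegers
         srawi   r5,r3,1             # untag receiver
         srawi   r0,r4,1             # untag argument
         add     r4,r5,r0            # untagged sum
         rlwinm  r2,r4,1,0,30        # sum << 1, leaving room for the tag
         xor.    r9,r4,r2            # sign flips if the sum needs > 31 bits
         blt     <fail>              # overflow
         ori     r6,r2,0x1           # set the SmallInteger tag
         stwu    r6,0xfffc(r27)      # pop one slot, store result on top
         <dispatch>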
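
For readers who don't speak PowerPC, the listing above corresponds roughly to the following C. This is only a sketch: the function name and the bool-return convention are made up for illustration and are not the VM's actual source. It assumes 32-bit oops with a low tag bit of 1 for SmallIntegers, and that `>>` on a negative signed value is an arithmetic shift (true of mainstream compilers, but implementation-defined in C).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name; mirrors the "check the result" strategy.
 * Returns false where the assembly would branch to <fail>. */
static bool add_check_result(int32_t rcvr, int32_t arg, int32_t *out)
{
    /* Both oops must carry the SmallInteger tag (low bit set). */
    if (((rcvr & arg) & 1) == 0)
        return false;

    /* Untag and add; two 31-bit values cannot overflow 32 bits. */
    int32_t sum = (rcvr >> 1) + (arg >> 1);

    /* Shift left by one to make room for the tag. */
    int32_t shifted = (int32_t)((uint32_t)sum << 1);

    /* If the shift changed the sign, the sum does not fit in 31 bits. */
    if ((sum ^ shifted) < 0)
        return false;

    *out = shifted | 1; /* retag */
    return true;
}
```

There is one tag test and one overflow test, and the overflow test looks only at the result.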

Nedified version:

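         lwz     r3,0xfffc(r27)      # receiver oop, at sp-4
         lwz     r4,0(r27)           # argument oop, at sp
         xor.    r0,r3,r4            # compare the operands' sign bits
         blt     <fail>              # taken when the signs differ
         rlwinm  r5,r3,1,0,30        # receiver << 1
         xor.    r2,r5,r3            # compare receiver's top two bits
         blt     <fail>              # taken when they differ
         rlwinm  r6,r4,1,0,30        # argument << 1
         xor.    r2,r6,r4            # compare argument's top two bits
         blt     <fail>              # taken when they differ
         add     r7,r3,r4            # add tagged values; tags sum to 2
         addi    r3,r7,0xffff        # subtract 1 to restore the tag
         stwu    r3,0xfffc(r27)      # pop one slot, store result on top
         <dispatch>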
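
The operand pre-check can be sketched in C as well. This version is written from Ned's description rather than transcribed from the listing: in particular, treating "signs differ" as an immediate success and keeping the tag test are assumptions on my part, and both function names are hypothetical. The same 32-bit tagged-oop layout is assumed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when bit 31 and bit 30 of x are equal. */
static bool top_two_bits_equal(int32_t x)
{
    return (x ^ (int32_t)((uint32_t)x << 1)) >= 0;
}

/* Hypothetical name; mirrors the "check the operands" strategy.
 * Returns false when overflow cannot be ruled out up front. */
static bool add_check_operands(int32_t rcvr, int32_t arg, int32_t *out)
{
    /* Both oops must carry the SmallInteger tag (low bit set). */
    if (((rcvr & arg) & 1) == 0)
        return false;

    /* Operands of opposite sign can never overflow. */
    bool signs_differ = (rcvr ^ arg) < 0;

    /* Same sign: safe only if both magnitudes leave one bit spare. */
    if (!signs_differ &&
        !(top_two_bits_equal(rcvr) && top_two_bits_equal(arg)))
        return false;

    /* Add the tagged values directly; the two tags sum to 2,
     * so subtract 1 to leave a single tag bit. */
    *out = (int32_t)((uint32_t)rcvr + (uint32_t)arg - 1u);
    return true;
}
```

The test is conservative: some same-sign pairs whose sum would still fit are rejected and have to take the slow path. The shifts are gone, but they are paid for with extra compares and branches.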

While it probably won't impact speed on a decent implementation of the 
CPU (the additional branch will be predicted correctly) it won't 
increase speed either (you haven't reduced the overall number of data 
hazards the pipeline has to deal with).

Cheers,
Ian