From: Jeremy F. <je...@go...> - 2002-12-11 01:51:11
On Tue, 2002-12-10 at 17:31, Julian Seward wrote:
> Hey?  That seems like too many instructions to me.  The idea is that the
> cache entries are arranged so as to cause lookup failures on misalignment,
> so that the testl and jnz are not needed.

Yep, you're right.

> This is not so good (trashes a second reg), so perhaps your code is better
> here.  OTOH, providing enough spare regs exist, all reasonable machines
> have 2 ALUs capable of doing the andls in parallel, so the sequence
> should be fast.

I made the ACCESS UInstr take two args: the address and the rounded
address, so that I didn't have to scrounge for a pair of temps.  It would
help if AND accepted a Lit32 argument, though.

>    movl %vv, %temp
>    movl %vv, %temp2
>    andl $MASK, %temp       -- cache index, as before
>    andl $(~2), %temp2      -- dump bit 1 of address (~2 == 111...11101b)
>    cmpl cache(%temp), %temp2
>    jz   done
> slow:
>
> The andl $(~2) is the subtlety.  For the lowest two bits it gives the
> mapping
>    00 -> 00, 01 -> 01, 10 -> 00, 11 -> 01
> So if the address was 2-aligned (00, 10) it produces 00, which can
> potentially match the cache[] entry.

Nice.

    J