From: Nicholas N. <nj...@ca...> - 2002-11-19 10:57:36
On 18 Nov 2002, Jeremy Fitzhardinge wrote:
> > 3. (nobody mentioned this, but I think it is very significant): the inability
> > of the code generator to consider groups of UInstrs together when
> > generating code.
>
> I'm not sure its all that important. It would reduce the actual
> instruction count and register pressure, but I'm not sure it would do
> all that much for modern CPUs (the Via C3 being the exception, and it
> does need all the help it can get).
I think it is important, i.e. the fact that Valgrind virtualises all the
registers and then loses track of where they were in the UCode.
Look at DynamoRIO's translation of this small basic block:
add %eax, %ecx
cmp $4, %eax
jle $0x40106f
becomes
frag7: add %eax, %ecx
cmp $4, %eax
jle $0x40106f
stub1: mov %eax, eax-slot # spill %eax
mov &dstub1, %eax # store ptr to stub table
jmp context_switch
stub2: mov %eax, eax-slot # spill %eax
mov &dstub2, %eax # store ptr to stub table
jmp context_switch
Apart from control flow instructions, the resulting code (before
instrumentation and/or trace-level optimisation) is basically the same as
the original code.
Valgrind will turn it into UCode something like this:
GETL %ECX, t14
ADDL %EAX, t14 (-wOSZACP)
PUTL t14, %ECX
INCEIPo $2
GETL %EAX, t10
SUBL $0x4, t10 (-wOSZACP)
INCEIPo $3
Jleo $0x40106f (-rOSZACP)
JMPo $0x4013F9B4
There will be some optimisations, but then each UCode instr will become
one or more x86 instructions. The example isn't so convincing, but repeat
the exercise on a basic block with 10 or 20 instructions and you'll notice
the difference more.
I feel that this expansion must be costing a lot of performance. I don't think it's a
coincidence that the Null skin increases code size by a factor of about 5
and also slows programs by a similar factor, and that Memcheck increases
code size by a factor of about 12--13 and slows programs down by a similar
factor.
> An interesting mechanism to create would be one which measures the
> (dynamic) frequency of adjacent uinstr pairs, to search for common
> sequences (pairs, to start with) which would be worth putting special
> code-generation effort into. Of course the results would depend on what
> instrumentation had been added by the skin.
I just did that... a patch for vg_from_ucode is attached, as are the
(sorted) results for a quick spin of konqueror with --skin=none...
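For readers without the attachment, the pair-counting idea can be sketched roughly as follows in C. This is not the attached patch: the opcode enum, `count_pairs`, `sorted_pairs` and `print_pairs` are hypothetical names made up for illustration, and the real UCode opcode set is much larger. A weight of 1 gives static counts (one per translated block); passing a block's execution count as the weight would approximate the dynamic frequency.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical, reduced opcode set; not Valgrind's real UCode enum. */
enum { OP_GET, OP_PUT, OP_ADD, OP_SUB, OP_INCEIP, OP_JMP, N_OPS };

static const char *op_name[N_OPS] = {
    "GET", "PUT", "ADD", "SUB", "INCEIP", "JMP"
};

/* pair_count[a][b]: weighted number of times opcode b directly followed a. */
static unsigned long pair_count[N_OPS][N_OPS];

/* Record every adjacent opcode pair in one basic block of n opcodes.
 * weight is 1 for static counts, or the block's execution count for
 * an approximation of dynamic frequency. */
static void count_pairs(const int *ops, int n, unsigned long weight)
{
    for (int i = 0; i + 1 < n; i++) {
        assert(ops[i] >= 0 && ops[i] < N_OPS);
        assert(ops[i + 1] >= 0 && ops[i + 1] < N_OPS);
        pair_count[ops[i]][ops[i + 1]] += weight;
    }
}

struct pair { int a, b; unsigned long n; };

/* qsort comparator: larger counts first. */
static int by_count_desc(const void *x, const void *y)
{
    const struct pair *p = x, *q = y;
    return (p->n < q->n) - (p->n > q->n);
}

/* Copy all non-zero pairs into out (room for N_OPS * N_OPS entries),
 * sorted most frequent first. Returns the number of entries written. */
static int sorted_pairs(struct pair *out)
{
    int k = 0;
    for (int a = 0; a < N_OPS; a++)
        for (int b = 0; b < N_OPS; b++)
            if (pair_count[a][b] > 0) {
                out[k].a = a;
                out[k].b = b;
                out[k].n = pair_count[a][b];
                k++;
            }
    qsort(out, (size_t)k, sizeof out[0], by_count_desc);
    return k;
}

/* Print the sorted pair table, one pair per line. */
static void print_pairs(void)
{
    struct pair out[N_OPS * N_OPS];
    int k = sorted_pairs(out);
    for (int i = 0; i < k; i++)
        printf("%10lu  %s, %s\n", out[i].n,
               op_name[out[i].a], op_name[out[i].b]);
}
```

Calling `count_pairs` once per basic block and `print_pairs` at exit would give a sorted list of candidate pairs. As noted above, the ranking will shift with whatever instrumentation the skin adds.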
N