From: Nicholas N. <nj...@cs...> - 2008-03-25 08:33:40
On Mon, 24 Mar 2008, Nuno Lopes wrote:

> Today I've written a simple peephole optimizer that works after the
> instruction selection and before register allocation. So it takes an array
> of instructions and outputs another array of instructions. It's also
> independent of the platform. The patch is available at:
> http://web.ist.utl.pt/nuno.lopes/valgrind_vex_peephole_optimizations.txt
> Currently it only removes redundant MOVs between virtual registers that can
> be propagated forward. Imagine this:
>
>  9 movl %vr16,%vr85 ; %vr16 isn't referenced below this line
> 10 subl $0x4,%vr85
>
> (maps %vr85 to %vr16)
>
> the movl is removed and translated into:
>  9 subl $0x4,%vr16

Surely, whether %vr16 is referenced again later isn't important; but whether
%vr85 is referenced again later is important?

> With only this simple transformation, I was able to reduce the number of
> instructions of a simple block (instrumented with memcheck) from 120 to 106,
> which is quite good. The number of register spills is also hugely reduced
> (on an x86 host)! (memcheck creates some unnecessary moves because of the
> dirty handler arguments).

The 120-to-106 reduction is on the final generated code? Huh. The register
allocator can get rid of many register-to-register moves (as the PLDI paper
explains), but perhaps it misses some. Or many. Perhaps you could send an
example block, showing before and after?

> The caveats? Well it segfaults *after* compiling all the blocks, which is
> weird.. I get the following error:
>
> ==28498== Invalid read of size 1
> ==28498==    at 0x4015508: (within /lib/ld-2.6.1.so)
> ==28498==    by 0x4013CB5: (within /lib/ld-2.6.1.so)
> ==28498==    by 0x400134E: (within /lib/ld-2.6.1.so)
> ==28498==    by 0x40009A6: (within /lib/ld-2.6.1.so)
> ==28498==  Address 0x10090188 is not stack'd, malloc'd or (recently) free'd
> ==28498==
> ==28498== Process terminating with default action of signal 11 (SIGSEGV)
> ==28498==  Access not within mapped region at address 0x10090188
> ==28498==    at 0x4015508: (within /lib/ld-2.6.1.so)
> ==28498==    by 0x4013CB5: (within /lib/ld-2.6.1.so)
> ==28498==    by 0x400134E: (within /lib/ld-2.6.1.so)
> ==28498==    by 0x40009A6: (within /lib/ld-2.6.1.so)
>
> Do you know what might be causing the problem?

I can't think of anything, other than that the transformation has a bug in it.

> Also, after this is working correctly (i.e. fixing this segfault and running
> the tests), do you think this could be incorporated in valgrind/VEX?

Quite possibly, although Julian has the final say. It would depend on
whether it actually improves performance -- it's always hard to predict
whether code optimisations have a genuine effect.

Nick
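
To make the condition under discussion concrete, here is a minimal Python sketch of this kind of move forwarding on a single straight-line block. It is not the code from the patch and not VEX's API: the `Insn` type and the helpers `is_vreg`, `referenced`, `rename` and `forward_moves` are hypothetical names made up for illustration. It assumes every operand is a plain register or an immediate (no memory operands), and that virtual registers are not live past the end of the block.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Insn:
    """Hypothetical two-operand instruction, `op src,dst` in AT&T order."""
    op: str
    src: str
    dst: str


def is_vreg(operand):
    # Assumption: virtual registers are spelled %vrN, as in the example above.
    return operand.startswith("%vr")


def referenced(reg, insns):
    # True if reg appears as a source or destination in any of insns.
    return any(reg in (insn.src, insn.dst) for insn in insns)


def rename(insn, old, new):
    # Return a copy of insn with every operand equal to `old` replaced by `new`.
    return replace(
        insn,
        src=new if insn.src == old else insn.src,
        dst=new if insn.dst == old else insn.dst,
    )


def forward_moves(block):
    """Drop `movl %vrA,%vrB` when %vrA is not referenced later in the block,
    and rename %vrB to %vrA in every following instruction."""
    insns = list(block)
    i = 0
    while i < len(insns):
        mov, rest = insns[i], insns[i + 1:]
        if (mov.op == "movl" and is_vreg(mov.src) and is_vreg(mov.dst)
                and mov.src != mov.dst
                and not referenced(mov.src, rest)):
            # Remove the move and rewrite the tail; stay at index i so the
            # instruction that slid into this slot is examined too.
            insns[i:] = [rename(x, mov.dst, mov.src) for x in rest]
        else:
            i += 1
    return insns


block = [
    Insn("movl", "%vr16", "%vr85"),
    Insn("subl", "$0x4", "%vr85"),
]
print(forward_moves(block))
# prints [Insn(op='subl', src='$0x4', dst='%vr16')]
```

In this sketch both registers matter. The move is kept if %vr16 is referenced later, because the rewritten `subl` would otherwise clobber a value that is still needed. Every later reference to %vr85 has to be renamed to %vr16, because %vr85 is never written once the move is gone.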