From: Paul M. <pa...@sa...> - 2004-03-09 12:01:50
Well, I found out why valgrind was running sooo sloooowly on PPC. It was spending all of its time (well, 430 out of 520 seconds on one test) in VG_(invalidate_translations).

Some background: the PPC architecture doesn't require the i-cache to snoop anything, and in all current implementations it doesn't. Instead, software has to use the cache management instructions when it changes some instructions in memory and then wants to execute them. Notably there is the dcbst (data cache block store) instruction, which makes sure that a given cache line is written back to memory if it is modified, and the icbi (instruction cache block invalidate) instruction, which invalidates a cache line if it is present in the icache. These instructions can be used in userland, and are used by glibc for example when changing a PLT entry to branch directly to the desired procedure.

In valgrind, I made the icbi instruction emulation call VG_(invalidate_translations) to get rid of any translation of any instructions in that cache line. But that routine scans the whole of vg_tt and the whole of every sector in the TC, and it is stunningly slow (and it also completely trashes the L2 cache for good measure :).

I made two changes: I arranged for all the blocks that chain to a given BB to be linked together in a linked list, and I modified the icbi emulation to scan only the part of the TT which could correspond to the cache line being invalidated.

The first change means that when deleting a block, I can unchain all the BBs that were chained to it without scanning the whole TC. (This also helps when discarding a sector of TC.)

The second change means that we don't scan the whole 2.4MB of the TT, only about 256 bytes of it (plus a bit more, depending on how full the TT is; I keep scanning until I find an empty entry).

With this change, the time to start up mozilla and quit is reduced from 520 seconds to 60 seconds. Openoffice also starts up much much more quickly.
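
For readers who want to see the idea behind the second change, here is a minimal sketch in C. It is not the actual valgrind code. It assumes the TT is an open-addressed hash table with linear probing, hashed as the original address modulo the table size, so that the addresses in one cache line map to a run of consecutive slots. The names TTEntry, tt_insert, tt_lookup and tt_invalidate_line, the table size and the 32-byte line size are all made up for illustration, and the sketch only looks at the start address of each block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TT_SIZE     1021u  /* hypothetical table size, not valgrind's */
#define CACHE_LINE  32u    /* assumed i-cache line size in bytes */

typedef enum { TT_EMPTY = 0, TT_USED, TT_DELETED } TTState;

typedef struct {
    uint32_t orig_addr;    /* guest address where the translated block starts */
    TTState  state;
} TTEntry;

static TTEntry tt[TT_SIZE];  /* zero-initialised, so every slot starts TT_EMPTY */

/* Consecutive addresses hash to consecutive slots (modulo TT_SIZE). */
static unsigned tt_hash(uint32_t addr) { return addr % TT_SIZE; }

/* Insert with linear probing; returns the slot used, or -1 if the table is full. */
static int tt_insert(uint32_t addr)
{
    unsigned h = tt_hash(addr);
    for (unsigned n = 0; n < TT_SIZE; n++) {
        unsigned i = (h + n) % TT_SIZE;
        if (tt[i].state != TT_USED) {       /* empty or deleted slot is reusable */
            tt[i].orig_addr = addr;
            tt[i].state = TT_USED;
            return (int)i;
        }
    }
    return -1;
}

/* Look up an address; deleted slots do not end the probe, empty slots do. */
static int tt_lookup(uint32_t addr)
{
    unsigned h = tt_hash(addr);
    for (unsigned n = 0; n < TT_SIZE; n++) {
        unsigned i = (h + n) % TT_SIZE;
        if (tt[i].state == TT_EMPTY)
            return -1;
        if (tt[i].state == TT_USED && tt[i].orig_addr == addr)
            return (int)i;
    }
    return -1;
}

/* Discard every entry whose start address lies in the cache line containing
 * addr. Only the CACHE_LINE slots the line can hash to are scanned, plus any
 * following slots up to the first empty one (entries may have been pushed
 * there by collisions). Returns the number of entries discarded and stores
 * the number of slots examined in *visited. */
static unsigned tt_invalidate_line(uint32_t addr, unsigned *visited)
{
    uint32_t line = addr & ~(uint32_t)(CACHE_LINE - 1);
    unsigned h = tt_hash(line);
    unsigned killed = 0, n;

    for (n = 0; n < TT_SIZE; n++) {
        unsigned i = (h + n) % TT_SIZE;
        if (n >= CACHE_LINE && tt[i].state == TT_EMPTY)
            break;                          /* past the line and the probe chain */
        /* unsigned subtraction: true only for line <= orig_addr < line + 32 */
        if (tt[i].state == TT_USED && tt[i].orig_addr - line < CACHE_LINE) {
            tt[i].state = TT_DELETED;       /* tombstone keeps probe chains intact */
            killed++;
        }
    }
    assert(visited != NULL);
    *visited = n;
    return killed;
}
```

In a sparsely filled table the scan touches little more than CACHE_LINE slots instead of all TT_SIZE, which is the effect described above. A real implementation would also have to deal with blocks that start before the cache line but extend into it, and with unchaining the discarded blocks.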

The latest tarball and patch are in http://ozlabs.org/~paulus/ as usual.

Paul.