From: Nicholas N. <nj...@cs...> - 2005-10-19 14:39:59
On Wed, 19 Oct 2005, Julian Seward wrote:

> Check this. I was looking at the cg profiles for a self hosted run
> on ppc32 and I noticed that of about 5.2M L2 write misses, it billed
> 3.8M of them to just one function: the 10-line fn invalidateFastCache
> in m_transtab.c.
>
> So I halved the size of the fast cache, and got a good 8% speedup for
> nulgrind on ppc32. Not bad! I wonder if it carries over to x86/amd64
> though -- most of the invalidations are due to simulating icbis.
>
>
> Numbers for start/quit Qt designer on Mac Mini (1250 MHz, 512K L2 iirc)
>
> With #define VG_TT_FAST_BITS 15 (halved size):
>
> real 1m15.269s
> user 0m59.611s
> sys  0m0.808s
>
>
> With #define VG_TT_FAST_BITS 16 (default):
>
> real 1m17.949s
> user 1m5.138s
> sys  0m0.852s

The user time is down 8%, but the real time is only down about 3%...
what does this mean? Are these times consistent over multiple runs?

N
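
For reference, the two percentages can be reproduced from the posted timings. This is only a minimal Python sketch of the arithmetic; the helper names (to_seconds, percent_reduction) are made up for illustration and are not part of Valgrind.

```python
import re


def to_seconds(stamp: str) -> float:
    """Convert a `time`-style stamp such as '1m15.269s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognised time stamp: {stamp!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


def percent_reduction(before: str, after: str) -> float:
    """Percentage by which `after` is smaller than `before`."""
    b = to_seconds(before)
    a = to_seconds(after)
    return 100.0 * (b - a) / b


# Timings copied from the quoted message:
# VG_TT_FAST_BITS 16 (default) is "before", 15 (halved) is "after".
user_drop = percent_reduction("1m5.138s", "0m59.611s")
real_drop = percent_reduction("1m17.949s", "1m15.269s")

print(f"user time reduction: {user_drop:.1f}%")
print(f"real time reduction: {real_drop:.1f}%")
```

Since real also counts wall-clock time not spent on the CPU, it can move less than user does; a single pair of runs cannot say whether the difference is noise.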