From: Mehmet B. <mb...@gm...> - 2007-11-20 20:26:05
Thanks a lot for the reply, Josef!

You are right, a cache line size of 16384 doesn't make sense... I wasn't thinking, and just passed along what papi_mem() reported to me. I checked the specs of the CPU (Opteron 240) and learned that both the L1 and L2 cache lines are just 64 bytes, which sounds more likely.

Valgrind incorrectly assumes that the L2 cache is 8-way associative, when it should be 16-way; that's why I don't use its default settings. I don't have cpuid as a command, maybe that's what confuses Valgrind...

This time, Valgrind overestimates the number of cache misses, even with the corrected settings:

PAPI     :    99,653 L2 misses
Valgrind : 1,609,350 L2 misses !!!

I am really out of ideas. I would very much like to use Valgrind to explain my results where PAPI can't provide enough details...

Thanks a lot,
-Memo

On Nov 20, 2007 5:10 AM, Josef Weidendorfer <Jos...@gm...> wrote:
> On Monday 19 November 2007, Mehmet Belgin wrote:
> > I run Valgrind using:
> > valgrind --tool=cachegrind --I1=65536,2,64 --D1=65536,2,64
> > --L2=1048576,16,16384 ./mycode
>
> Are you sure you want to simulate a L2 cache linesize of 16384?
>
> In general, cachegrind is using the parameters of your CPU (detected
> by cpuid), so if you just want to simulate the cache of your CPU,
> there is no requirement to specify the parameters explicitly.
>
> Josef
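
As a sanity check on the cache parameters discussed in this thread, here is a minimal sketch of the arithmetic behind a cachegrind-style size,associativity,line-size triple. The helper name cache_sets is hypothetical and not part of Valgrind or PAPI; it only divides the numbers quoted in the messages above.

```python
def cache_sets(size_bytes: int, associativity: int, line_bytes: int) -> int:
    """Return the number of sets implied by a size,assoc,line-size triple.

    Hypothetical helper for illustration only; not a Valgrind or PAPI API.
    """
    total_lines, remainder = divmod(size_bytes, line_bytes)
    if remainder:
        raise ValueError("cache size is not a multiple of the line size")
    sets, remainder = divmod(total_lines, associativity)
    if remainder:
        raise ValueError("line count is not a multiple of the associativity")
    return sets


# L2 as first passed to cachegrind: 1 MiB, 16-way, 16384-byte lines.
l2_original = cache_sets(1048576, 16, 16384)

# L2 with the 64-byte line size from the CPU specs.
l2_corrected = cache_sets(1048576, 16, 64)

# L1 (instruction or data) as passed: 64 KiB, 2-way, 64-byte lines.
l1 = cache_sets(65536, 2, 64)

print(l2_original, l2_corrected, l1)
```

With 16384-byte lines the 1 MiB L2 would hold only 64 lines in 4 sets, which is why that value looks implausible. Note that 16384 is also the number of 64-byte lines in a 1 MiB cache, so the reported figure may have been a line count rather than a line size; that is a guess, not something the thread confirms.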