From: Julian S. <js...@ac...> - 2012-09-24 18:35:50
On Monday, September 24, 2012, Carl E. Love wrote:
> I was just looking into the POWER architectures a bit more to make sure
> they would be reasonably easy to support. Just to be clear, are there
> any Valgrind restrictions on the cache sizes, specifically must be a
> power of 2?
>
> I see the Power 5 has an L2 unified cache of 1.875MB and an L3 unified,
> shared cache of size 36MB. I was doing some cache studies last year and
> I remember there being issues where the cache size must be a power of 2.
> I don't remember what tool it was now that had that restriction.
>
> Similarly, must the Valgrind cache associativity be a power of 2? The
> POWER 5 processor's L2 cache is 10-way set associative.

The same thing happens with server-level CPUs from Intel and AMD.

I think Florian is presenting a general mechanism that allows recording
of the cache details, regardless of whether they are something the
various tools can handle, or not. And I think that's the right approach.

As you say, though, some of the cache simulators have problems with
non-power-of-2 sizes or associativities (can't remember which), where
the number of cache sets isn't a power of 2. So far that has been
kludged up by postprocessing the cache info so as (in the right
circumstances) to increase the stated associativity by 50% (ie, a factor
of 3/2) and decrease the number of sets by the same factor, so as to
make the number of sets a power of 2 whilst not changing the overall
capacity of the cache that is simulated.

This kind of gets around the problem for cache sizes of (eg) 12MB (viz,
3/2 * 8MB) but does not fix it for cache sizes of (eg) 10MB, since there
is no code to do rescaling for the ratio 5/4.

This stuff (+ big comment) is in get_caches_from_CPUID in
cachegrind/cg-x86-amd64.c. If you or anybody else wants to do the 5/4
rescaling case, pls feel free :)

I suppose this stuff should get lifted out, as part of Florian's reorg,
and made general.

J
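
To make the rescaling arithmetic described above concrete, here is a minimal sketch in Python. It is not the code in get_caches_from_CPUID. The function names, the `ratios` parameter, and the 16-way / 64-byte-line geometry used in the usage lines are all hypothetical, chosen only for illustration. The 5/4 entry is an assumption about how the missing case might be handled, following the same idea as the 3/2 case.

```python
def is_power_of_two(n: int) -> bool:
    """True if n is a positive power of 2."""
    return n > 0 and (n & (n - 1)) == 0


def rescale_cache(size: int, assoc: int, line_size: int,
                  ratios=((3, 2), (5, 4))):
    """Return (sets, assoc) for the cache to simulate.

    If the number of sets is num/den times a power of 2, and the
    associativity is divisible by den, scale the associativity up by
    num/den and the number of sets down by the same factor. The total
    capacity (sets * assoc * line_size) is unchanged. If no ratio
    applies, the original geometry is returned untouched.
    """
    sets, rem = divmod(size, assoc * line_size)
    if rem != 0:
        raise ValueError("size is not a multiple of assoc * line_size")
    if is_power_of_two(sets):
        return sets, assoc
    for num, den in ratios:
        if sets % num == 0 and assoc % den == 0:
            new_sets = sets // num * den
            if is_power_of_two(new_sets):
                return new_sets, assoc // den * num
    return sets, assoc


MB = 1024 * 1024
# Hypothetical geometries: 16-way, 64-byte lines.
print(rescale_cache(12 * MB, 16, 64))  # the 3/2 case (12MB = 3/2 * 8MB)
print(rescale_cache(10 * MB, 16, 64))  # the 5/4 case (10MB = 5/4 * 8MB)
```

With this geometry the 12MB cache goes from 12288 sets of 16 ways to 8192 sets of 24 ways, and the 10MB cache from 10240 sets of 16 ways to 8192 sets of 20 ways. A size such as 7MB matches neither ratio and is passed through unchanged, which is the situation the simulator would still have to cope with.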