From: Carl E. L. <ce...@li...> - 2012-09-24 19:11:45
On Mon, 2012-09-24 at 20:34 +0200, Julian Seward wrote:
> On Monday, September 24, 2012, Carl E. Love wrote:
>
> > I was just looking into the POWER architectures a bit more to make sure
> > they would be reasonably easy to support.  Just to be clear, are there
> > any Valgrind restrictions on the cache sizes, specifically must be a
> > power of 2?
> >
> > I see the Power 5 has an L2 unified cache of 1.875MB and an L3 unified,
> > shared cache of size 36MB.  I was doing some cache studies last year and
> > I remember there being issues where the cache size must be a power of 2.
> > I don't remember what tool it was now that had that restriction.
> >
> > Similarly, must the Valgrind cache associativity be a power of 2?  The
> > POWER 5 processor's L2 cache is 10-way set associative.
>
> The same thing happens with server-level CPUs from Intel and AMD.
>
> I think Florian is presenting a general mechanism that allows recording of
> the cache details, regardless of whether they are something the various
> tools can handle, or not.  And I think that's the right approach.
>
> As you say, though, some of the cache simulators have problems with
> non-power-of-2 sizes or associativities (can't remember which), so that
> the number of cache sets isn't a power of 2.  So far that has been kludged
> up by postprocessing the cache info so as (in the right circumstances)
> to increase the stated associativity by 50% (eg, a factor of 3/2) and
> decrease the number of lines by the same factor, so as to make the
> number of lines be a power of 2 whilst not changing the overall capacity
> of the cache that is simulated.
>
> This kind of gets around the problem for cache sizes of (eg) 12MB
> (viz, 3/2 * 8MB) but does not fix it for cache sizes of (eg) 10MB
> since there is no code to do rescaling for the ratio 5/4.
>
> This stuff (+ big comment) is in get_caches_from_CPUID in
> cachegrind/cg-x86-amd64.c.  If you or anybody else wants to do the 5/4
> rescaling case, pls feel free :)
>
> I suppose this stuff should get lifted out, as part of Florian's reorg,
> and made general.
>
> J

Yup, from the general recording of cache size, line size, type, and
associativity, the data structures seem to cover all of the info needed
for POWER.  The hope would be to see if some of the lower-level code
requirements on powers of two could be removed with the code
restructuring.  I haven't really dived into the cachegrind and other
tool implementations to see why the restrictions are there or what it
would take to change the restrictions.

Good to know that the power of 2 issue is not just a POWER problem.
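
The rescaling Julian describes can be sketched as follows. This is only an
illustration of the arithmetic, not the code in cachegrind/cg-x86-amd64.c:
the function name rescale_assoc, its parameters, and the generalisation to
an arbitrary ratio num/den (so that 5/4 can be tried) are assumptions made
for the example. The idea is to multiply the associativity by num/den and
divide the number of sets by the same factor, which leaves the capacity
unchanged.

```python
from typing import Optional, Tuple

MIB = 1024 * 1024


def is_power_of_two(n: int) -> bool:
    """True if n is a positive power of two."""
    return n > 0 and (n & (n - 1)) == 0


def rescale_assoc(size: int, assoc: int, line_size: int,
                  num: int = 3, den: int = 2) -> Optional[Tuple[int, int]]:
    """Hypothetical helper: return (assoc, n_sets) with n_sets a power of two.

    If the set count is already a power of two, the geometry is returned
    unchanged. Otherwise the associativity is scaled up by num/den and the
    set count down by the same ratio, provided both results are integers
    and the new set count is a power of two. Returns None when this ratio
    cannot fix the geometry.
    """
    if size <= 0 or assoc <= 0 or line_size <= 0:
        return None
    if size % (line_size * assoc) != 0:
        return None  # geometry does not divide evenly into sets
    n_sets = size // (line_size * assoc)
    if is_power_of_two(n_sets):
        return assoc, n_sets
    # n_sets must be exactly (num/den) * 2^k, and assoc divisible by den.
    if (n_sets * den) % num != 0 or assoc % den != 0:
        return None
    new_sets = n_sets * den // num
    if not is_power_of_two(new_sets):
        return None
    new_assoc = assoc * num // den
    # Capacity is preserved: new_sets * new_assoc == n_sets * assoc.
    return new_assoc, new_sets
```

With an assumed 16-way cache and 64-byte lines (neither figure comes from
this thread), a 12MB cache has 12288 sets, and the 3/2 ratio turns that
into 24-way with 8192 sets. A 10MB cache with the same assumed geometry has
10240 sets, so the 3/2 ratio returns None, while passing num=5, den=4
gives 20-way with 8192 sets. For the POWER 5 L2 mentioned above (1.875MB,
10-way), assuming a 128-byte line, the set count is 1536 and the 3/2 ratio
would already give 15-way with 1024 sets.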