From: Paul M. <pa...@sa...> - 2005-10-15 07:47:34
Nicholas Nethercote writes:

> If this part of the simulation is taking a lot of time (and I'd like to
> see profiling evidence) I'd suggest keeping the current algorithm and
> having several specialised versions, for associativities of 1, 2, 4, and
> 8. It should be ugly but doable with some macro magic.

My point was that if the CPU we are running on uses this pseudo-LRU algorithm, cachegrind should use it too, because it will give results that more accurately reflect what the actual hardware will do.

I'm pretty sure the PPC970 (G5) uses pseudo-LRU replacement for the L2 cache, which is 8-way set associative. I have no idea what Intel or AMD CPUs use, though.

Paul.
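
For readers unfamiliar with the scheme, here is a minimal sketch of tree-based pseudo-LRU for a single 8-way set. It assumes the common binary-tree variant with 7 bits per set; the message does not say which variant the PPC970 actually implements, and this is not cachegrind's code. The names `plru_set`, `plru_touch` and `plru_victim` are hypothetical.

```c
#include <stdint.h>

/* Sketch only: tree pseudo-LRU for ONE 8-way set (assumed variant).
   Seven tree bits are packed into a byte. Bit 0 is the root and node i
   has children 2i+1 and 2i+2. A bit value of 0 means "the next victim
   is in the left subtree", 1 means "in the right subtree". */
typedef struct {
    uint8_t plru_bits;
} plru_set;

/* Record an access to `way` (0..7): every node on the path from the
   root to that way is made to point away from it. */
static void plru_touch(plru_set *s, unsigned way)
{
    unsigned node = 0;
    for (int level = 2; level >= 0; level--) {
        unsigned went_right = (way >> level) & 1u;
        if (went_right)
            s->plru_bits &= (uint8_t)~(1u << node);  /* point left  */
        else
            s->plru_bits |= (uint8_t)(1u << node);   /* point right */
        node = 2 * node + 1 + went_right;
    }
}

/* Choose a victim by following the bits from the root down to a leaf. */
static unsigned plru_victim(const plru_set *s)
{
    unsigned node = 0, way = 0;
    for (int level = 0; level < 3; level++) {
        unsigned dir = (s->plru_bits >> node) & 1u;
        way = (way << 1) | dir;
        node = 2 * node + 1 + dir;
    }
    return way;
}
```

The point of the sketch is that the victim is only an approximation of the true least-recently-used line: after touching way 0 and then way 4 in an empty set, this scheme picks way 2 next, whereas exact LRU over the ways touched so far would behave differently. That difference is why matching the hardware's policy could change the simulated miss counts.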