Re: [Valgrind-users] Cache conflict detection support in cachegrind

On 2023-01-29, Paul Floyd wrote:

> My recommendations for this are:
> 
> 1/ PMU/PMC (performance monitoring unit/counter) event counting tools (perf record on Linux, pmcstat on FreeBSD, Oracle Studio collect on Solaris; I don't know about macOS). These can record events such as cache misses together with the associated call stacks. You can then view the results with tools such as HotSpot and
> perfgrind/kcachegrind (I have used HotSpot but not perfgrind).
> 
> The big advantage of this is that the PMCs are part of the hardware and the overhead of doing this is minor. The only slight limitation is that the number of counters is limited.

Another disadvantage: the hardware does not know which accesses
belong to the target code versus which accesses belong to
the code of valgrind itself.

Even if the hardware could separate accesses on that basis, it does not know
about stack frames.  Allocating a stack frame shortly after CALL, and
discarding it shortly before RETURN, can be significant reasons for
cache misses, either immediately or in the near future.
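To make the stack-frame point concrete, here is a toy model, not a measurement. It assumes a deliberately tiny direct-mapped cache (8 lines of 64 bytes), and the addresses for the data buffer and the callee's frame are made up for illustration. The class and function names are mine, not anything from cachegrind.

```python
LINE_SIZE = 64   # bytes per cache line (assumed)
NUM_LINES = 8    # deliberately tiny direct-mapped cache (assumed)


class DirectMappedCache:
    """Toy direct-mapped cache that only tracks which block sits in each slot."""

    def __init__(self, num_lines=NUM_LINES, line_size=LINE_SIZE):
        self.num_lines = num_lines
        self.line_size = line_size
        self.tags = [None] * num_lines
        self.misses = 0

    def access(self, addr):
        block = addr // self.line_size
        index = block % self.num_lines
        if self.tags[index] != block:
            self.tags[index] = block   # evict whatever was there
            self.misses += 1


def touch(cache, base, size):
    """Touch every 8th byte in [base, base + size); return the misses incurred."""
    before = cache.misses
    for addr in range(base, base + size, 8):
        cache.access(addr)
    return cache.misses - before


DATA_BASE = 0x10000    # hypothetical 512-byte buffer used by the caller
FRAME_BASE = 0x7FEF80  # hypothetical 128-byte frame allocated after CALL

cache = DirectMappedCache()
data_cold = touch(cache, DATA_BASE, 512)      # caller warms the cache
frame_misses = touch(cache, FRAME_BASE, 128)  # callee touches its new frame
data_again = touch(cache, DATA_BASE, 512)     # caller resumes after RETURN

print(data_cold, frame_misses, data_again)
```

In this model the frame costs two misses immediately, and the caller pays two more after the return because the frame evicted two of its lines. A real cache is set-associative and much larger, so the numbers will differ, but the mechanism is the same.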

Then there are system calls, which might significantly alter cache contents.
Sometimes the resulting cache misses should be included (they most certainly
do affect wall clock time), but in some other cases you may wish that the
operating system was ignored.

If the target program uses threads, then using memory for inter-thread
communication (semaphore, mutex, pipeline, etc.) becomes another factor.
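A similar toy model shows why memory used for inter-thread communication matters. It assumes a simple write-invalidate scheme and caches of unlimited capacity, so every miss counted is caused by the other core's write. The addresses and names are hypothetical, and real coherence protocols are more involved than this.

```python
LINE_SIZE = 64  # bytes per cache line (assumed)


class Core:
    """Toy per-core cache: unlimited capacity, tracks valid blocks only."""

    def __init__(self):
        self.lines = set()
        self.misses = 0


def write(core, others, addr):
    """Write addr from one core, invalidating the line in all other cores."""
    block = addr // LINE_SIZE
    if block not in core.lines:
        core.misses += 1
        core.lines.add(block)
    for other in others:
        other.lines.discard(block)


def ping_pong(rounds, addr_a, addr_b):
    """Two cores alternately write addr_a and addr_b; return their miss counts."""
    a, b = Core(), Core()
    for _ in range(rounds):
        write(a, [b], addr_a)
        write(b, [a], addr_b)
    return a.misses, b.misses


# Both threads update the same word, as with a mutex or semaphore.
shared = ping_pong(100, 0x1000, 0x1000)
# Each thread updates its own word on a separate cache line.
private = ping_pong(100, 0x1000, 0x1040)

print(shared, private)
```

With the shared word every write misses on both cores (100 misses each), while the private case misses only once per core. A single-threaded cache simulation sees none of this traffic.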