From: Josef W. <Jos...@gm...> - 2003-05-05 17:39:19
Hi Nick,

what's needed in cachegrind to support multiple processor caches and coherency protocols among them? I have a wish item here, and perhaps it's quite easy to implement.

Motivation: Multithreaded (pthread) programs are handled quite well by cachegrind, but the results can be misleading because only one cache hierarchy is simulated. If the real program runs on a 2-processor machine with 2 threads, 2 cache hierarchies (one per processor) should be simulated. The default configuration could be to simulate as many caches as there are processors in your machine, and to use a simple static round-robin mapping from threads to the simulated caches.

Items I think have to be done:

1. Reserve some bits from the tag value of each cache entry as state bits of the coherence protocol for this entry (this should always be fine because there's no direct-mapped cache with a cache line size of 1 byte).
2. Allocate multiple sets of "static cache_t2 I1, D1, L2", one per simulated processor.
3. Switch the cache_t2 structures on a thread switch.
4. Change cachesim_##L##_doref to handle a cache coherence protocol (e.g. invalidating cache entries of remote caches on writes).

Do you think this is doable/useful at all, or am I overlooking something?

Josef
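
To make the state-bit and invalidate-on-write ideas above concrete, here is a minimal standalone sketch in C. It is not cachegrind code: sim_cache, sim_ref, cpu_of_thread and all constants are hypothetical names chosen for illustration, the cache is direct-mapped for brevity, and the "protocol" is only a single valid bit rather than a full MESI-style scheme. The low bit of each line-aligned tag is free because lines are larger than 1 byte, so it is reused as the state bit.

```c
#include <assert.h>
#include <stdint.h>

#define N_CPUS     2     /* assumption: one simulated cache per processor */
#define N_SETS     64    /* assumption: direct-mapped, 64 lines */
#define LINE_BITS  6     /* assumption: 64-byte cache lines */
#define VALID_BIT  1u    /* low tag bit reused as the coherence state bit */

typedef struct {
    uintptr_t tags[N_SETS];   /* line-aligned address | state bit */
} sim_cache;

/* One cache per simulated processor; zero-initialised means "all invalid". */
static sim_cache caches[N_CPUS];

/* Static round-robin mapping from thread id to simulated cache. */
static unsigned cpu_of_thread(unsigned tid)
{
    return tid % N_CPUS;
}

/* Simulate one data reference; returns 1 on a miss, 0 on a hit. */
static int sim_ref(unsigned tid, uintptr_t addr, int is_write)
{
    unsigned  cpu   = cpu_of_thread(tid);
    uintptr_t line  = addr >> LINE_BITS;
    unsigned  set   = (unsigned)(line % N_SETS);
    uintptr_t entry = (line << LINE_BITS) | VALID_BIT;
    unsigned  other;
    int       miss;

    assert(cpu < N_CPUS);

    /* An entry hits only if the address matches and the valid bit is set. */
    miss = (caches[cpu].tags[set] != entry);
    caches[cpu].tags[set] = entry;

    if (is_write) {
        /* Coherence step: clear the valid bit of remote copies of the line. */
        for (other = 0; other < N_CPUS; other++) {
            if (other != cpu && caches[other].tags[set] == entry)
                caches[other].tags[set] &= ~(uintptr_t)VALID_BIT;
        }
    }
    return miss;
}
```

With this sketch, a read by thread 0 followed by a write to the same line by thread 1 makes the next read by thread 0 count as a miss, which is exactly the effect a single shared cache hierarchy would hide.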