From: Howard C. <hy...@sy...> - 2006-11-30 10:33:56
Josef Weidendorfer wrote:
> On Tuesday 26 September 2006 11:50, Howard Chu wrote:
>> Has anyone written a cache simulator for cachegrind that tracks caches
>> in separate CPU cores?
>
> No.
> It would be nice to have it, and it should not be very difficult to
> add to cachegrind/callgrind.
>
> However, there are two caveats:
> - You have to map VG threads to processors. This needs the simulation
> of some scheduling strategy inside of valgrind. An easy one is
> roundrobin assignment: proc = thread % procnumber, but this usually does
> not match reality because a sane scheduler takes the load of a thread
> into account with the goal of equally distributing threads loads to
> processors.
> - In VG, threads are scheduled in a sequentially order, which does not
> match any reality on a SMP machine where threads can run simultaneously.
> This can have large influences on the number of coherency misses you are
> interested in (false cache sharing). Moreover, VGs scheduling
> interval should be small to get reasonable results, which slowdowns the
> simulation even more.

This seems to be the greater problem, since VG's scheduler only runs one
thread at a time. Right, using a smaller scheduling interval would
probably help. We'd want to make sure that each thread executes for
exactly the same number of cycles, to have any hope of simulating
simultaneous execution.

>> The main thing I'm interested in would be to see
>> how often cache line sharing occurs in a multithreaded program.

-- 
  -- Howard Chu
  Chief Architect, Symas Corp.  http://www.symas.com
  Director, Highland Sun        http://highlandsun.com/hyc
  OpenLDAP Core Team            http://www.openldap.org/project/
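
For illustration only, here is a rough sketch of the round-robin mapping quoted above (proc = thread % procnumber) combined with a count of cache line sharing between cores. It is not cachegrind or callgrind code. The function names, the 64-byte line size, and the trace format (thread id, address, is_write) are all assumptions made up for this example. Caches are treated as infinite, so only coherency effects are counted, and the trace is processed strictly in order, so the serialization caveat discussed above applies to it as well.

```python
from collections import defaultdict

LINE_SIZE = 64  # bytes per cache line; an assumed value, not taken from the thread


def core_of(thread_id, num_cores):
    """Round-robin mapping from the thread: proc = thread % procnumber."""
    return thread_id % num_cores


def count_line_sharing(trace, num_cores, line_size=LINE_SIZE):
    """Count coherency events in a sequential access trace.

    trace is an iterable of (thread_id, address, is_write) tuples
    (a hypothetical format). Caches are modelled as infinite, so a core
    only loses a line when another core writes to it.
    """
    holders = defaultdict(set)      # line -> cores holding a valid copy
    invalidated = defaultdict(set)  # line -> cores whose copy was invalidated
    stats = {"invalidations": 0, "coherency_misses": 0}

    for thread_id, addr, is_write in trace:
        core = core_of(thread_id, num_cores)
        line = addr // line_size

        # Touching a line that another core invalidated is a coherency miss.
        if core in invalidated[line]:
            stats["coherency_misses"] += 1
            invalidated[line].discard(core)

        if is_write:
            # A write invalidates every other core's copy of the line.
            others = holders[line] - {core}
            stats["invalidations"] += len(others)
            invalidated[line] |= others
            holders[line] = {core}
        else:
            holders[line].add(core)

    return stats
```

With two threads alternately writing to addresses 0 and 8 (same line, different words) on two cores, i.e. the trace [(0, 0, True), (1, 8, True), (0, 0, True), (1, 8, True)], this model reports 3 invalidations and 2 coherency misses. Moving the second thread's writes to address 64 puts them in a different line and both counts drop to zero.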