In the SPAA paper submission, we claimed that we expected the vast majority of cache misses to be in public memory.
We can now confirm that stats from cachegrind back up this claim.
Running the red-black-tree demo, with a tree depth of 10, we found that all but 0.5% of cache misses were for addresses in public memory.
These stats are for a uniprocessor only (cachegrind doesn't do SMP yet); however, we see no reason why the results should not transfer to SMP.
Here are some provisional performance counter stats from a 4-way SPARC box. We hope to get similar stats for the 106-way box soon and produce some pretty graphs.
When running on 4 processors with a set power of 20, we get the following counts per executed transaction:
L2 misses per transaction:
L1 misses per transaction:
TLB misses per transaction:
Fraser=13.0 Ennals=2.9 ...
We didn't quite have time to put cache miss stats in the submitted SPAA paper. We should hopefully be putting those stats up soon...
We recently submitted a paper to SPAA05 describing the core algorithm behind LibLTX.
The paper itself and the source code for the STM are now available for download.
I'll be putting up a first version of the source code as soon as I've tidied it up a bit and finished off the paper.
Watch this space...