From: Philippe W. <phi...@sk...> - 2012-06-17 16:46:52
On Sun, 2012-06-17 at 08:14 -0700, John Reiser wrote:
> On 06/17/2012, Philippe Waroquiers wrote:
>
> > I am however not at all convinced that only protecting the STORE8
> > will avoid false positive (we could maybe tolerate a small rate
> > of false negative when running in parallel).
> > Even if protecting the STORE8 is good enough, then any parallel
> > algorithm which works a lot with single bytes might be slowed
> > down by a factor 10 or similar.
>
> Use LoadLocked and StoreConditional instructions, as on MIPS.
> These sense the state of the cache line, and implement a "greedy"
> solution. Your code provides the fixup when greedy fails
> (usually: try again, after re-fetch and re-modify.)

Do the LL/SC instructions not suffer from the same performance
degradation as the x86/amd64 atomic instructions? (cf. the unacceptable
performance degradation in memcheck when adding one atomic instruction
in the STORE helperc).
I know very little about these atomic operations, but I am guessing
that there is no order-of-magnitude difference in performance
between these approaches.
From what I understood, on x86/amd64, we would need such operations
to be faster by two orders of magnitude to be usable for memcheck.

Philippe
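
For reference, the "greedy" scheme quoted above (try the update, and on failure re-fetch, re-modify and try again) can be sketched in portable C11 as below. This is only an illustration under stated assumptions, not memcheck code: the function name `store_byte_greedy` and the idea of packing four bytes into one 32-bit word are hypothetical. A weak compare-exchange typically compiles to an LL/SC pair on architectures that have one and to a locked `cmpxchg` on x86/amd64, so the sketch says nothing about whether the cost would be acceptable.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical helper (not Valgrind code): replace one byte inside a
   32-bit word without disturbing the other three bytes. */
static void store_byte_greedy(_Atomic uint32_t *word,
                              unsigned byte_index, uint8_t value)
{
    assert(byte_index < 4);
    unsigned shift = byte_index * 8;
    uint32_t mask = (uint32_t)0xFF << shift;

    /* Fetch the current word once; a failed exchange refreshes `old`. */
    uint32_t old = atomic_load_explicit(word, memory_order_relaxed);
    uint32_t desired;
    do {
        /* Re-modify: splice the new byte into the latest fetched value. */
        desired = (old & ~mask) | ((uint32_t)value << shift);
        /* Greedy attempt: succeeds only if nobody changed the word
           since it was fetched (or, being weak, may fail spuriously). */
    } while (!atomic_compare_exchange_weak_explicit(
                 word, &old, desired,
                 memory_order_relaxed, memory_order_relaxed));
}
```

The retry loop is the "fixup" part: when the exchange fails, `old` already holds the freshly fetched word, so the loop only has to recompute `desired` and try again.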