Re: [Valgrind-users] Execution of a dirty helper: atomic?

On Mon, Dec 31, 2012 at 4:48 AM, Philippe Waroquiers
<phi...@sk...> wrote:
>
> It is even not ok to use an atomic instruction : first tests have
> shown that having one atomic instruction on this path makes a
> multi-threaded Valgrind slower than a serialised Valgrind.

You mean a multi-threaded Valgrind is slower even when running
multiple threads?  Wow.

In that case, I see only one way to handle this:  take advantage of
the fact that the vast majority of memory accesses (e.g. those on the
stack) are per-thread, and that most other memory is effectively
"owned" by a single thread at any given point in time.

So I think you need to introduce some concept of "memory pool",
"memory pool owner", and "transfer of ownership".  Each thread would
tend to own the pool corresponding to its own stack (most of the
time).  When a thread locks a mutex and then accesses a bunch of data,
ownership of that data would tend to transfer to that thread.  And so
on.

This will still require the use of atomic instructions at least, if
not mutexes.  (Mutexes have the advantage of implicitly handling the
"ownership" concept...)  But atomic instructions, and perhaps even
mutexes, should be reasonably fast as long as they do not involve any
contention between cores.  The trick here will be to parallelize
access to the relevant data structures (i.e. the V bits for each
pool).
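
To make the idea a bit more concrete, here is a rough sketch in C11.
Everything in it (Pool, pool_try_acquire, pool_release,
pool_set_shadow, the fixed pool size) is made up for illustration and
is not anything that exists in Valgrind.  It assumes ownership is only
handed over at synchronization points, so the common case (the owner
touching its own pool) needs just a relaxed load, and the
compare-and-swap only runs on the rare transfer.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define POOL_BYTES 64   /* hypothetical pool size, for illustration only */
#define NO_OWNER   0u   /* thread ids are assumed to be nonzero */

typedef struct {
    _Atomic uint32_t owner;       /* id of the owning thread, or NO_OWNER */
    uint8_t shadow[POOL_BYTES];   /* per-byte shadow state, owner-only access */
} Pool;

static inline void pool_init(Pool *p) {
    atomic_init(&p->owner, NO_OWNER);
    memset(p->shadow, 0, sizeof p->shadow);
}

/* Returns true if `self` owns the pool after the call. */
static inline bool pool_try_acquire(Pool *p, uint32_t self) {
    /* Fast path: only `self` can store or clear its own id, so a relaxed
       load is enough to recognise "I already own this". */
    if (atomic_load_explicit(&p->owner, memory_order_relaxed) == self)
        return true;

    /* Slow path: claim the pool only if nobody owns it right now.
       Acquire ordering makes the previous owner's shadow writes visible. */
    uint32_t expected = NO_OWNER;
    return atomic_compare_exchange_strong_explicit(
        &p->owner, &expected, self,
        memory_order_acquire, memory_order_relaxed);
}

/* Give the pool up, e.g. when the guarding mutex is unlocked. */
static inline void pool_release(Pool *p, uint32_t self) {
    assert(atomic_load_explicit(&p->owner, memory_order_relaxed) == self);
    /* Release ordering publishes our shadow writes to the next owner. */
    atomic_store_explicit(&p->owner, NO_OWNER, memory_order_release);
}

/* Update one shadow byte; refuses (returns false) if `self` is not the owner. */
static inline bool pool_set_shadow(Pool *p, uint32_t self,
                                   size_t off, uint8_t value) {
    if (off >= POOL_BYTES || !pool_try_acquire(p, self))
        return false;
    p->shadow[off] = value;   /* plain store: no other thread may touch it */
    return true;
}
```

A thread would call pool_set_shadow on every tracked access and
pool_release when it unlocks the mutex guarding that data.  What to do
when the acquire fails (wait, fall back to a global lock, ...) is left
open here, and that is probably where the real difficulty lies.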

Just my $0.02, which is about what it is worth.  Good luck :-)

 - Pat