|
From: Emilio C. <er...@gm...> - 2012-12-27 16:41:42
|
Hi all,

I would like to know if Valgrind assures atomic execution of a dirty
helper. In more detail, I am interested in the situation:

---
access to a memory cell
call to a dirty helper X() [inserted by unsafeIRDirty_0_N]
execution of X()
---

Is this executed atomically by Valgrind?

I know that Valgrind serialises execution so that only one thread is
running at a time, but this is another problem.

Emilio.
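(For concreteness, the kind of dirty-helper insertion described above is
usually done in a tool's instrument() callback. The following is a
non-compilable fragment, not a complete tool: helperX, addrExpr and
sbOut are placeholder names, while unsafeIRDirty_0_N, mkIRExprVec_1,
IRStmt_Dirty, addStmtToIRSB and VG_(fnptr_to_fnentry) are the actual
VEX/Valgrind names.)

```c
/* The helper the tool wants to run after the memory access. */
static void helperX ( Addr addr ) { /* ... */ }

/* In the tool's instrument(): after emitting the statement for the
   memory access, insert a call to helperX with the address as the
   single argument. */
IRDirty* di = unsafeIRDirty_0_N(
                 1 /* regparms */, "helperX",
                 VG_(fnptr_to_fnentry)( &helperX ),
                 mkIRExprVec_1( addrExpr ) );
addStmtToIRSB( sbOut, IRStmt_Dirty(di) );
```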
|
From: Julian S. <js...@ac...> - 2012-12-28 10:42:42
|
It is hard to answer this question since it is not really clear what
you mean by "atomic" here. Can you clarify?

J

On Thursday, December 27, 2012, Emilio Coppa wrote:
> I would like to know if Valgrind assures atomic execution of a dirty
> helper. [...]
>
> Is this executed atomically by Valgrind?
|
From: Emilio C. <er...@gm...> - 2012-12-28 11:02:30
|
Hi,

> It is hard to answer this question since it is not really clear what
> you mean by "atomic" here. Can you clarify?

I mean "atomic" in the sense of "no thread switching/interleaving".

Emilio.
|
From: Julian S. <js...@ac...> - 2012-12-28 15:57:46
|
> I mean "atomic" in the sense of "no thread switching/interleaving".

In that case, yes, it is atomic. That is, V will not switch threads
within a superblock. It can only switch threads at superblock
boundaries.

J
|
From: John R. <jr...@bi...> - 2012-12-28 16:51:48
|
On 12/28/2012 07:57 AM, Julian Seward wrote:
>> I mean "atomic" in the sense of "no thread switching/interleaving".
>
> In that case, yes it is atomic. That is, V will not switch threads
> within a superblock. It can only switch threads at superblock
> boundaries.

Today this is true. However, there are serious investigations and
efforts into making coregrind/memcheck multi-threaded. One likely
candidate will allow multiple simultaneous threads on multiple CPU
cores. Each CPU core may switch logical threads only at a superblock
boundary, but mutual exclusion between threads on different CPU cores
is not guaranteed. For some purposes this will look like
"interleaving".

--
|
From: Emilio C. <er...@gm...> - 2012-12-30 21:46:01
|
Thank both of you for your answers.

> Each CPU core may switch logical threads only at a superblock
> boundary, but mutual exclusion between threads on different CPU
> cores is not guaranteed.
> For some purposes this will look like "interleaving".

I will be very interested in how memcheck will approach this
interleaving :)

Emilio.
|
From: Philippe W. <phi...@sk...> - 2012-12-31 12:49:01
|
On Sun, 2012-12-30 at 22:45 +0100, Emilio Coppa wrote:
> Thank both of you for your answers.
>
> I will be very interested in how memcheck will approach this
> interleaving :)

Currently, the "really" multi-threaded Valgrind (see
https://bugs.kde.org/show_bug.cgi?id=301830) is blocked.

There are many global data structures in Valgrind which are not
thread safe. I think most of them can easily be made thread safe
(typically with a mutex). However, the main memcheck data structure
(which maintains the V-bits) is accessed so often that it is not
acceptable (performance-wise) to protect it with a mutex. It is not
even OK to use an atomic instruction: first tests have shown that
having one atomic instruction on this path makes a multi-threaded
Valgrind slower than a serialised Valgrind.

So, in summary, there is no solution yet for a multi-threaded
memcheck. Ideas welcome :). (A prototype of the "none" tool worked
reasonably well multi-threaded, but that is quite useless.)

Philippe
|
From: Patrick J. L. <lop...@gm...> - 2012-12-31 17:42:52
|
On Mon, Dec 31, 2012 at 4:48 AM, Philippe Waroquiers
<phi...@sk...> wrote:
> It is even not ok to use an atomic instruction : first tests have
> shown that having one atomic instruction on this path makes a
> multi-threaded Valgrind slower than a serialised Valgrind.

You mean a multi-threaded Valgrind is slower even when running
multiple threads? Wow.

In that case, there is only one way to handle this: take advantage of
the fact that the vast majority of memory accesses (e.g. on the stack)
are per-thread, and that the others are "owned" by one thread at any
point in time.

So I think you need to introduce some concept of "memory pool",
"memory pool owner", and "transfer of ownership". Each thread would
tend to own the pool corresponding to its own stack (most of the
time). A thread locking a mutex and then accessing a bunch of data
would tend to transfer ownership of that data to itself. And so on.

This will still require the use of atomic instructions at least, if
not mutexes. (Mutexes have the advantage of implicitly handling the
"ownership" concept...) But atomic instructions, and perhaps even
mutexes, should be reasonably fast as long as they do not involve any
contention between cores. The trick here will be to parallelise access
to the relevant data structures (i.e. the V-bits for each pool).

Just my $0.02, which is about what it is worth. Good luck :-)

- Pat