|
From: Bart V. A. <bar...@gm...> - 2008-05-22 12:58:37
|
Hello Julian,

If I remember correctly you added some time ago Imbe_BusLock /
Imbe_BusUnlock to VEX. Were these intended to be instruction-set
independent, or only to handle x86 bus locking ? In the latter case, how
should tools like Helgrind or exp-drd handle atomic ppc instructions ?
See also http://bugs.kde.org/show_bug.cgi?id=162354.

Bart.
|
|
From: Julian S. <js...@ac...> - 2008-05-27 23:28:17
|
> If I remember correctly you added some time ago Imbe_BusLock /
> Imbe_BusUnlock to VEX. Were these intended to be instruction-set
> independent, or only to handle x86 bus locking ? In the latter case, how
> should tools like Helgrind or exp-drd handle atomic ppc instructions ?
> See also http://bugs.kde.org/show_bug.cgi?id=162354.

Well, the difficulty is that the powerpc way of doing atomic
test-and-set (etc) is completely different from the x86/amd64
way. x86 and amd64 make it easy, by allowing a 1-byte LOCK prefix
byte (0xF0) in front of the instruction. Vex sees that and puts
Imbe_BusLock and Imbe_BusUnlock around the translation of the
instruction, so helgrind and drd (and anybody else who cares) can
see the locking.

ppc has no direct equivalent. There is nothing to indicate that
any single instruction is atomic. In any case, since ppc only
allows simple load and store instructions, and not load-op-store
insns, it would be useless to say that a single load or store is
atomic (so what?).

Instead ppc supplies what amounts to a programmable bus snoop
mechanism, the lwarx and stwcx. instructions. These can be used
to create atomic test-and-set sequences, etc, all the primitives
you need. Google for these; there are many examples on the net.

lwarx and stwcx. are used together to create such sequences, but
there is no real constraint on what insns go between them, either
statically or dynamically. So there is no easy way, at JIT time,
to guarantee to observe that a given sequence represents an
atomic test-and-set (or whatever). At a guess I'd say it's
undecidable in general. That said, it is probably possible to
do better than at present by using some kind of idiom recognition
scheme, or IR analysis. But neither of those will be simple or
completely robust.

J
|
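As an illustration of the kind of primitive under discussion, here is a minimal sketch of an atomic test-and-set written with C11 atomics. It is not taken from VEX or from any Valgrind tool, and the function name test_and_set is made up for this example. On x86/amd64 a compiler will typically turn the compare-exchange into a single LOCK-prefixed instruction, whereas on ppc it typically becomes a lwarx / stwcx. retry loop, which is exactly the difference described above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: atomically change *flag from 0 to 1.
   Returns true if this caller performed the change, false if the
   flag was already set by somebody else. */
static bool test_and_set(atomic_int *flag)
{
    int expected = 0;

    assert(flag != NULL);

    /* The weak compare-exchange may fail spuriously (on ppc, for
       instance, when the reservation is lost), so it is retried
       for as long as the observed value is still 0. */
    while (!atomic_compare_exchange_weak(flag, &expected, 1)) {
        if (expected != 0)
            return false;   /* flag really was set already */
        /* spurious failure: expected is still 0, try again */
    }
    return true;
}
```

Calling test_and_set twice on a flag initialised to 0 returns true the first time and false the second time.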
|
From: Bart V. A. <bar...@gm...> - 2008-05-28 06:10:23
|
On Wed, May 28, 2008 at 1:22 AM, Julian Seward <js...@ac...> wrote:
> lwarx and stwcx. are used together to create such sequences, but
> there is no real constraint on what insns go between them, either
> statically or dynamically. So there is no easy way, at JIT time,
> to guarantee to observe that a given sequence represents an
> atomic test-and-set (or whatever). At a guess I'd say it's
> undecidable in general. That said, it is probably possible to
> do better than at present by using some kind of idiom recognition
> scheme, or IR analysis. But neither of those will be simple or
> completely robust.

I am familiar with how lwarx and stwcx work. But as far as I
understand the VEX source code currently no information is passed by
VEX to Valgrind tools about the bus snoop mechanism used by lwarx and
stwcx. Do you think it would be a good idea to modify VEX such that it
passes the following information to tools:
* For lwarx instructions, the address being watched on the bus.
* For stwcx instructions, the address for which the bus has been
  watched and whether or not another CPU has accessed that address since
  the bus watch started.

Bart.
|
|
From: Julian S. <js...@ac...> - 2008-06-02 09:07:26
|
On Wednesday 28 May 2008 08:10, Bart Van Assche wrote:
> On Wed, May 28, 2008 at 1:22 AM, Julian Seward <js...@ac...> wrote:
> > lwarx and stwcx. are used together to create such sequences, but
> > there is no real constraint on what insns go between them, either
> > statically or dynamically. So there is no easy way, at JIT time,
> > to guarantee to observe that a given sequence represents an
> > atomic test-and-set (or whatever). At a guess I'd say it's
> > undecidable in general. That said, it is probably possible to
> > do better than at present by using some kind of idiom recognition
> > scheme, or IR analysis. But neither of those will be simple or
> > completely robust.
>
> I am familiar with how lwarx and stwcx work. But as far as I
> understand the VEX source code currently no information is passed by
> VEX to Valgrind tools about the bus snoop mechanism used by lwarx and
> stwcx. Do you think it would be a good idea to modify VEX such that it
> passes the following information to tools:
> * For lwarx instructions, the address being watched on the bus.
> * For stwcx instructions, the address for which the bus has been
> watched and whether or not another CPU has accessed that address since
> the bus watch started.

It would be easy enough to modify vex to supply the relevant info,
for example

* for load instructions, whether they are a normal load or a lwarx
* for store instructions, the same

plus the tool gets to see all loads and stores anyway, if it wants.

It seems to me that the above is not the real problem. The real problem
is, even if you have all that information available, how can it be used
to infer which pieces of memory are being atomically modified?

J
|
|
From: Bart V. A. <bar...@gm...> - 2008-06-03 18:33:42
|
On Mon, Jun 2, 2008 at 11:01 AM, Julian Seward <js...@ac...> wrote:
> It would be easy enough to modify vex to supply the relevant info,
> for example
>
> * for load instructions, whether they are a normal load or a lwarx
> * for store instructions, the same
>
> plus the tool gets to see all loads and stores anyway, if it wants.
>
> It seems to me that the above is not the real problem. The real problem
> is, even if you have all that information available, how can it be used
> to infer which pieces of memory are being atomically modified?
The aforementioned information could be used by a tool as follows:
1. Assume that lwarx and stwcx instructions are always used as
follows: do { lwarx ...; stwcx ...; } while (reservation failed);
2. Given the above assumption, if the stwcx instruction performed a
store, then it was an atomic store.
I'm not sure however that (1) is the only way in which lwarx and stwcx
instructions are used.
Bart.
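The assumption in (1) and the conclusion in (2) can be sketched with a small single-threaded model of the reservation mechanism. Everything here is hypothetical and for illustration only: the Reservation type and the model_* functions are not VEX, Valgrind or hardware interfaces, and real hardware reserves a larger granule than a single word.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one CPU's reservation. */
typedef struct {
    const uint32_t *addr;   /* reserved address, NULL if none */
} Reservation;

/* lwarx: load the word and start watching its address. */
static uint32_t model_lwarx(Reservation *r, const uint32_t *p)
{
    assert(r != NULL && p != NULL);
    r->addr = p;
    return *p;
}

/* A store by another CPU cancels a reservation on that address. */
static void model_other_cpu_store(Reservation *r, uint32_t *p, uint32_t v)
{
    *p = v;
    if (r->addr == p)
        r->addr = NULL;
}

/* stwcx.: store only if the reservation is still held.  The
   reservation is consumed either way.  A true result corresponds
   to case (2): the store happened and was atomic with respect to
   the preceding lwarx. */
static bool model_stwcx(Reservation *r, uint32_t *p, uint32_t v)
{
    bool ok = (r->addr == p);
    r->addr = NULL;
    if (ok)
        *p = v;
    return ok;
}

/* Assumption (1): do { lwarx; stwcx. } while (reservation failed). */
static uint32_t model_atomic_inc(Reservation *r, uint32_t *p)
{
    uint32_t old;
    do {
        old = model_lwarx(r, p);
    } while (!model_stwcx(r, p, old + 1));
    return old;
}
```

In this model a stwcx. that follows an intervening store from another CPU fails and leaves memory untouched, so only the successful iteration of the loop performs a store.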
|
|
From: Bart V. A. <bar...@gm...> - 2008-06-26 07:46:35
|
On Wed, May 28, 2008 at 1:22 AM, Julian Seward <js...@ac...> wrote:
>
> Well, the difficulty is that the powerpc way of doing atomic
> test-and-set (etc) is completely different from the x86/amd64
> way. x86 and amd64 make it easy, by allowing a 1-byte LOCK prefix
> byte (0xF0) in front of the instruction. Vex sees that and puts
> Imbe_BusLock and Imbe_BusUnlock around the translation of the
> instruction, so helgrind and drd (and anybody else who cares) can
> see the locking.
Hello Julian,
What is your opinion about the patch below ? This patch allows
Helgrind and DRD to recognize stwcx instructions as atomic: if stwcx
performs a store, bus lock and bus unlock events are passed to
Valgrind tools around the actual store.
Index: priv/guest-ppc/toIR.c
===================================================================
--- priv/guest-ppc/toIR.c (revision 1856)
+++ priv/guest-ppc/toIR.c (working copy)
@@ -4896,7 +4896,9 @@
          whether rS is stored is dependent on that value. */
       /* Success? Do the (32bit) store */
+      stmt( IRStmt_MBE(Imbe_BusLock) );
       storeBE( mkexpr(EA), mkSzNarrow32(ty, mkexpr(rS)) );
+      stmt( IRStmt_MBE(Imbe_BusUnlock) );
       // Set CR0[LT GT EQ S0] = 0b001 || XER[SO]
       putCR321(0, mkU8(1<<1));
Bart.
|
|
From: Julian S. <js...@ac...> - 2008-06-30 10:33:42
|
> What is your opinion about the patch below ? This patch allows
> Helgrind and DRD to recognize stwcx instructions as atomic: if stwcx
> performs a store, bus lock and bus unlock events are passed to
> Valgrind tools around the actual store.
> /* Success? Do the (32bit) store */
> + stmt( IRStmt_MBE(Imbe_BusLock) );
> storeBE( mkexpr(EA), mkSzNarrow32(ty, mkexpr(rS)) );
> + stmt( IRStmt_MBE(Imbe_BusUnlock) );
Well, I see that it would cause drd to not complain about stwcx
accesses, by causing it to ignore them:
   case Ist_Store:
      if (instrument && ! bus_locked)
      {
         instrument_store(bb, [...]
Problem is this really just uses Imbe_Bus{Lock,Unlock} as markers
around stores which you want to ignore. They are however intended
to state that an imaginary lock is held (on an x86 machine) for
the duration of a read-modify-write instruction, and I don't want to
re-use them for a different purpose. So I've added Imbe_SnoopedStoreBegin
and Imbe_SnoopedStoreEnd for that purpose instead. (r1857, r8316).
J
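To make the distinction concrete, here is a minimal sketch of how a tool might keep the two kinds of marker apart while walking a block of statements. It is a self-contained model, not VEX code: the StmtKind enum and classify_stores function are stand-ins invented for this example, with names chosen to mirror Imbe_BusLock / Imbe_BusUnlock and Imbe_SnoopedStoreBegin / Imbe_SnoopedStoreEnd. A real tool would inspect the VEX IR statements instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in statement kinds, for illustration only. */
typedef enum {
    EV_BUS_LOCK,             /* models Imbe_BusLock */
    EV_BUS_UNLOCK,           /* models Imbe_BusUnlock */
    EV_SNOOPED_STORE_BEGIN,  /* models Imbe_SnoopedStoreBegin */
    EV_SNOOPED_STORE_END,    /* models Imbe_SnoopedStoreEnd */
    EV_STORE,
    EV_OTHER
} StmtKind;

typedef struct {
    size_t plain;    /* ordinary stores */
    size_t locked;   /* stores inside a bus-locked region */
    size_t snooped;  /* stores that model a successful stwcx. */
} StoreCounts;

/* Walk the statements once, tracking both marker states, and
   count each store under the region it falls in. */
static StoreCounts classify_stores(const StmtKind *stmts, size_t n)
{
    StoreCounts c = { 0, 0, 0 };
    bool bus_locked = false;
    bool in_snooped = false;

    assert(stmts != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        switch (stmts[i]) {
        case EV_BUS_LOCK:            bus_locked = true;  break;
        case EV_BUS_UNLOCK:          bus_locked = false; break;
        case EV_SNOOPED_STORE_BEGIN: in_snooped = true;  break;
        case EV_SNOOPED_STORE_END:   in_snooped = false; break;
        case EV_STORE:
            if (in_snooped)      c.snooped++;
            else if (bus_locked) c.locked++;
            else                 c.plain++;
            break;
        case EV_OTHER:
            break;
        }
    }
    return c;
}
```

Keeping the two states separate lets a tool choose a different policy for each, for example skipping snooped stores while still treating bus-locked regions as holding an imaginary lock.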
|
|
From: Bart V. A. <bar...@gm...> - 2008-06-30 13:49:49
|
On Mon, Jun 30, 2008 at 12:26 PM, Julian Seward <js...@ac...> wrote:
> Well, I see that it would cause drd to not complain about stwcx
> accesses, by causing it to ignore them:

It is on purpose that drd "ignores" atomic stores: in order to detect
data races on atomic variables, it is sufficient to detect races
between loads and regular stores w.r.t. atomic variables. It is not
necessary to record any information about atomic stores.

Bart.
|
|
From: Julian S. <js...@ac...> - 2008-07-01 04:18:16
|
On Monday 30 June 2008 15:49, Bart Van Assche wrote:
> On Mon, Jun 30, 2008 at 12:26 PM, Julian Seward <js...@ac...> wrote:
> > Well, I see that it would cause drd to not complain about stwcx
> > accesses, by causing it to ignore them:
>
> It is on purpose that drd "ignores" atomic stores: in order to detect
> data races on atomic variables, it is sufficient to detect races
> between loads and regular stores w.r.t. atomic variables. It is not
> necessary to record any information about atomic stores.

Do you have an example to illustrate why that is so, or perhaps
a semi-formal argument?

J
|
|
From: Bart V. A. <bar...@gm...> - 2008-06-30 15:22:34
|
On Mon, Jun 30, 2008 at 12:26 PM, Julian Seward <js...@ac...> wrote:
> So I've added Imbe_SnoopedStoreBegin
> and Imbe_SnoopedStoreEnd for that purpose instead. (r1857, r8316).

If I understand the VEX modifications correctly, all stwcx
instructions will be translated into regular stores. Is this safe ?
Will Valgrind never schedule another thread between lwarx and stwcx
instructions ?

Bart.
|
|
From: Julian S. <js...@ac...> - 2008-07-01 04:26:31
|
On Monday 30 June 2008 17:22, Bart Van Assche wrote:
> On Mon, Jun 30, 2008 at 12:26 PM, Julian Seward <js...@ac...> wrote:
> > So I've added Imbe_SnoopedStoreBegin
> > and Imbe_SnoopedStoreEnd for that purpose instead. (r1857, r8316).
>
> If I understand the VEX modifications correctly, all stwcx
> instructions will be translated into regular stores. Is this safe ?

What's the definition of "safe" here? All this change does is make
it possible for tools to identify stores that come from stwcx
instructions. What they do with that information is up to them.

> Will Valgrind never schedule another thread between lwarx and stwcx
> instructions ?

Unlikely but not guaranteed. If it manages to translate the entire
loop containing lwarx, stwcx. and the back branch in a single IRSB
then no other thread will run in between, and that should usually be
the case. At least, no other thread can be scheduled providing that
none of the memory instructions in the loop take a segfault.

J
|