From: Josef W. <Jos...@gm...> - 2012-11-20 18:23:10
On 20.11.2012 17:04, Petar Jovanovic wrote:
> begin:
> ll $v1, 0($s0) -------------------
> bne $v1, $t9, kraj
> move $a0, $zero
> move $a0, $v0
> sc $a0, 0($s0) ------------------
> beqzl $a0, begin
> move $at, $at
>
> will fail. More precisely it will fail with Callgrind. More more precisely, it
> will fail with Callgrind and option "--cacheuse=yes". Some instrumentation that
> happens to the code between LL and SC will cause the subsequent SC to fail.
Hmm. In your example, there is no memory access in the RMW region.
However, Cachegrind/Callgrind do not call cache simulation functions
synchronously, but collect them and call them in batches. The only
way I see "--cacheuse=yes" making a difference is that the simulator
calls for previous memory accesses are moved into the RMW region,
which makes sense as there is a branch there.
It may help if outstanding simulator calls get flushed before entering
the RMW region.
diff --git a/callgrind/main.c b/callgrind/main.c
index 41fcd9e..a68f069 100644
--- a/callgrind/main.c
+++ b/callgrind/main.c
@@ -1073,6 +1073,8 @@ IRSB* CLG_(instrument)( VgCallbackClosure*
dataTy = typeOfIRTemp(sbIn->tyenv, st->Ist.LLSC.result);
addEvent_Dr( &clgs, curr_inode,
sizeofIRType(dataTy), st->Ist.LLSC.addr );
+ /* flush events before LL, should help SC to succeed */
+ flushEvents( &clgs );
} else {
/* SC */
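
To illustrate the batching idea behind the patch above, here is a minimal
self-contained sketch in C. It is not Callgrind's code: EventQueue,
add_event, add_ll_event and flush_events_sketch are hypothetical names, and
the fixed-size arrays are assumptions made only to keep the example short.
It shows that flushing right after queuing the LL read leaves nothing
pending that could later be emitted between LL and SC.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a batched event queue; NOT Callgrind's code. */
enum { MAX_PENDING = 8, MAX_EMITTED = 64 };

typedef struct {
    int    pending[MAX_PENDING];  /* events collected but not yet emitted */
    size_t n_pending;
    int    emitted[MAX_EMITTED];  /* order in which "simulator calls" ran */
    size_t n_emitted;
} EventQueue;

/* Emit all collected events in order and empty the pending list. */
static void flush_events_sketch(EventQueue *q)
{
    assert(q != NULL);
    for (size_t i = 0; i < q->n_pending; i++) {
        assert(q->n_emitted < MAX_EMITTED);
        q->emitted[q->n_emitted++] = q->pending[i];
    }
    q->n_pending = 0;
}

/* Collect an event; only emit when the batch is full. */
static void add_event(EventQueue *q, int id)
{
    assert(q != NULL);
    if (q->n_pending == MAX_PENDING)
        flush_events_sketch(q);
    q->pending[q->n_pending++] = id;
}

/* Queue the read event for an LL, then flush immediately so that no
   outstanding call is left to be emitted inside the RMW region. */
static void add_ll_event(EventQueue *q, int id)
{
    add_event(q, id);
    flush_events_sketch(q);
}
```

With ordinary events the calls are deferred; after add_ll_event the pending
list is empty, which is the property the patch is trying to get.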
> Two ideas have been talked about to resolve the issue:
>
>
> A) leave RMW region in one translation block (i.e. if a branch is placed between
> LL and SC, do not stop there) as long as it fits under max-size block;
I do not think this is supported by VEX without larger changes.
>
> B) try to emulate LL/SC differently.
As Valgrind is serializing threads, it should be enough to check if
there was a schedule point within the RMW region, and make SC fail only
in this case.
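
A rough sketch of that idea in plain C, under stated assumptions: LlscState,
emulate_ll, emulate_sc and schedule_point are hypothetical names, not
Valgrind or VEX APIs, and the counter stands in for whatever the scheduler
would really record. The ll_active flag (SC fails without a preceding LL)
is an extra assumption of mine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-thread state; NOT a Valgrind data structure. */
typedef struct {
    uint64_t sched_points;     /* bumped at every possible thread switch */
    uint64_t ll_sched_points;  /* value of sched_points at the last LL */
    bool     ll_active;        /* true between an LL and its SC */
} LlscState;

/* Called wherever the (serializing) scheduler may switch threads. */
static void schedule_point(LlscState *s)
{
    assert(s != NULL);
    s->sched_points++;
}

/* LL: load the word and remember how many schedule points we have seen. */
static uint32_t emulate_ll(LlscState *s, const uint32_t *addr)
{
    assert(s != NULL && addr != NULL);
    s->ll_active = true;
    s->ll_sched_points = s->sched_points;
    return *addr;
}

/* SC: store and succeed only if no schedule point occurred since the LL. */
static bool emulate_sc(LlscState *s, uint32_t *addr, uint32_t value)
{
    assert(s != NULL && addr != NULL);
    bool ok = s->ll_active && s->ll_sched_points == s->sched_points;
    s->ll_active = false;
    if (ok)
        *addr = value;
    return ok;
}
```

In this sketch, instrumentation inserted between LL and SC cannot make SC
fail; only a call to schedule_point in between does.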
Josef