From: Petar J. <mip...@gm...> - 2012-11-20 16:04:42
Hi everyone,
First of all, I apologize for a rather lengthy email.
Here is an issue I would like to share and hopefully get some advice on.
Like other architectures, MIPS has a load-link/store-conditional instruction
pair, namely LL and SC.
We have been seeing cases in which a program ends up in an infinite loop
because SC fails every time. The probability of failure depends closely on
which compiler was used to build Valgrind, but it will eventually fail with any
of them. With some native compilers, Valgrind always fails (i.e. stays in the
loop).
The MIPS documentation lists some conditions under which SC will fail (see the
excerpt further below). It also says that SC may succeed or *fail* if "a memory
access instruction (load, store, or prefetch) is executed on the processor
executing the LL/SC."
We have not been able to isolate the issue by writing a sequence that fails, no
matter how many memory reads and writes we put in an LL/SC region. Yet, in
some programs a sequence like this in the guest code:
lui $s0, 0x41
ori $s0, 0x00f0
move $t9, $zero
li $v0, 1
begin:
ll $v1, 0($s0) -------------------
bne $v1, $t9, kraj
move $a0, $zero
move $a0, $v0
sc $a0, 0($s0) ------------------
beqzl $a0, begin
move $at, $at
will fail. More precisely, it fails under Callgrind; even more precisely, it
fails under Callgrind with the option "--cacheuse=yes". Some of the
instrumentation added to the code between LL and SC causes the subsequent SC to
fail.
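For readers less familiar with MIPS, the guest loop above reads, in C terms,
roughly as a "change 0 to 1" compare-and-swap retry loop. This is only a sketch
of the intent; the function name try_acquire is made up for illustration, and
the weak compare-exchange stands in for an SC that may fail spuriously.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Rough C equivalent of the guest loop (hypothetical name): try to change
   *lock from 0 to 1. Returns true if the store happened, false if *lock
   already held a nonzero value (the "bne ... kraj" exit). */
static bool try_acquire(atomic_int *lock)
{
    int expected = 0;   /* plays the role of $t9 */

    /* The weak form may fail even when *lock == expected, much like SC.
       On failure it writes the value it saw into expected. */
    while (!atomic_compare_exchange_weak(lock, &expected, 1)) {
        if (expected != 0)
            return false;   /* value really differs: leave the loop */
        /* spurious failure: retry, as "beqzl $a0, begin" does */
    }
    return true;
}
```

If the compare-exchange failed on every attempt, this loop would never
terminate, which is the behaviour described for the guest code.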
Two ideas have been discussed to resolve the issue:
A) keep the read-modify-write (RMW) region in one translation block (i.e. if a
branch is placed between LL and SC, do not end the block there) as long as it
fits within the maximum block size;
B) try to emulate LL/SC differently.
A) would be quick, but would it have any side effects?
Any other ideas? Has anybody had a similar issue on another architecture?
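To make idea B a bit more concrete, here is one possible reading of it: remember
the address and value seen by LL, and turn SC into a compare-and-swap against
that remembered value. This is only a sketch under assumptions. The type
LLSCState and the functions emulate_ll / emulate_sc are hypothetical names, not
existing Valgrind code, and a 32-bit word is assumed.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-thread guest state (names are illustrative only). */
typedef struct {
    _Atomic uint32_t *ll_addr;   /* address of the last LL, NULL if none */
    uint32_t          ll_value;  /* value that LL observed */
} LLSCState;

/* LL: load the word and remember where it came from and what it held. */
static uint32_t emulate_ll(LLSCState *st, _Atomic uint32_t *addr)
{
    st->ll_addr  = addr;
    st->ll_value = atomic_load(addr);
    return st->ll_value;
}

/* SC: store only if the word still holds the value LL saw.
   Returns 1 on success and 0 on failure, as SC does in its register. */
static uint32_t emulate_sc(LLSCState *st, _Atomic uint32_t *addr,
                           uint32_t new_value)
{
    bool ok = false;
    if (st->ll_addr == addr) {
        uint32_t expected = st->ll_value;
        ok = atomic_compare_exchange_strong(addr, &expected, new_value);
    }
    st->ll_addr = NULL;   /* the reservation is consumed either way */
    return ok ? 1u : 0u;
}
```

With this scheme, host memory accesses added by instrumentation between the two
calls cannot make the store fail. The trade-off is that it compares values, not
reservations: if another thread changes the word and then restores the old
value, this SC succeeds where a real one would fail.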
It may also be worth saying that GDB does LL/SC in one step, which means that
'si' will step from LL to SC directly, passing all instructions in between.
Any advice is welcome!
Thanks.
Petar
Part of MIPS documentation on LL/SC:
"If either of the following events occurs between the execution of LL and SC,
the SC fails:
• A coherent store is completed by another processor or coherent I/O module into
the block of synchronizable physical memory containing the word. The size and
alignment of the block is implementation dependent, but it is at least one
word and at most the minimum page size.
• An ERET instruction is executed.
If either of the following events occurs between the execution of LL and SC,
the SC may succeed or it may fail; the success or failure is not predictable.
Portable programs should not cause one of these events.
• A memory access instruction (load, store, or prefetch) is executed on the
processor executing the LL/SC.
• The instructions executed starting with the LL and ending with the SC do not
lie in a 2048-byte contiguous region of virtual memory. (The region does not
have to be aligned, other than the alignment required for instruction
words.)
The following conditions must be true or the result of the SC is
UNPREDICTABLE:
• Execution of SC must have been preceded by execution of an LL instruction.
• An RMW sequence executed without intervening events that would cause the SC to
fail must use the same address in the LL and SC. The address is the same if
the virtual address, physical address, and cacheability & coherency attribute
are identical."