|
From: Randy M. <rwm...@gm...> - 2006-05-03 15:16:51
|
Hi,
I'm trying to use valgrind on an embedded ppc system using a custom kernel.
First, because our kernel doesn't have Altivec enabled and the valgrind/VEX
altivec detection in ./coregrind/m_machine.c wasn't working,
I commented out the Altivec detection and hardcoded: have_vmx = False;
If you are interested, I can reproduce this without the patch and file a bug.
So now, valgrind /bin/ls works! Yipee!
I tried valgrind on a binary of interest and found an unhandled instruction
error as shown below. I looked at the svn latest src but it didn't appear
that these issues had been fixed. Please confirm.
dis_int_ldst_rev(PPC32)(opc2)
disInstr(ppc32): unhandled instruction: 0x7D204E2C
primary 31(0x1F), secondary 1580(0x62C)
==18449== Your program just tried to execute an instruction that Valgrind
==18449== did not recognise. There are two possible reasons for this.
==18449== 1. Your program has a bug and erroneously jumped to a non-code
==18449== location. If you are running Memcheck and you just saw a
==18449== warning about a bad jump, it's probably your program's fault.
no.
==18449== 2. The instruction is legitimate but Valgrind doesn't handle it,
==18449== i.e. it's Valgrind's fault. If you think this is the case or
==18449== you are not sure, please let us know.
Here I am.
==18449== Either way, Valgrind will now raise a SIGILL signal which will
==18449== probably kill your program.
==18449==
==18449== Process terminating with default action of signal 4 (SIGILL): dumping core
==18449== Illegal opcode at address 0x10001CD4
==18449== at 0x10001CD4: testRead16le (ovlBitField.h:646)
==18449== by 0xFF73058: __cUnit_run_test (cUnit.c:407)
==18449== by 0xFF739B8: g_node_traverse_pre_order (gnode.c:476)
==18449== by 0xFF7393C: g_node_traverse_pre_order (gnode.c:472)
==18449== by 0xFF7393C: g_node_traverse_pre_order (gnode.c:472)
==18449== by 0xFF729F4: cUnit_run_session (cUnit.c:197)
==18449== by 0xFF733FC: cUnit_TestRunner_int (cUnitRunnerUnix.c:125)
==18449== by 0x1000F3DC: main (ovlUnitTest.c:34)
If I run objdump on my test program, I see that the instruction is:
10001d24: 7d 20 4e 2c lhbrx r9,r0,r9
and I see that that comes from
OVL_INLINE uint_fast16_t ovlRead16le(const uint8_t *s) OVL_THROW
{
uint_fast16_t result;
#if __BYTE_ORDER == __BIG_ENDIAN
__asm volatile(
" lhbrx %0,0,%1 \n" // Get half word and reverse the bytes
: "=r" (result) // %0 - Output operand
: "r" (s) // %1 - Input operand
: "memory" // Consider memory clobbered for aliasing
);
);
...
So, I tried to hack in support for lhbrx:
[VEX]$ ls ./priv/guest-ppc32/toIR.c*
./priv/guest-ppc32/toIR.c ./priv/guest-ppc32/toIR.c.orig
[VEX]$ diff ./priv/guest-ppc32/toIR.c*
2612,2618c2612
< case 0x62C: // lhbrx (Load HW, Byte-Reverse Indexed)
< DIP("lwzx r%d,r%d,r%d\n", rD_addr, rA_addr, rB_addr);
< vex_printf("rwm: fixed lhbrx?....\n");
< // ?? rwm: putIReg( rD_addr, loadBE(Ity_I32, mkexpr(EA_reg)) );
< putIReg( rD_addr, unop(Iop_16Uto32,
< loadBE(Ity_I16, mkexpr(EA_reg))) );
< break;
---
>
7261,7263d7254
< case 0x62C: // lhbrx
< if (dis_int_load( theInstr )) goto decode_success;
< goto decode_failure;
but I don't really understand valgrind's design. Does my putIReg call look
right? I'm reading the manual and docs/internals now...
Thanks,
// Randy
|
|
From: Cerion Armour-B. <ce...@op...> - 2006-05-03 16:02:43
|
On Wednesday 03 May 2006 17:16, Randy Macleod wrote:
> If I run objdump on my test program, I see that the instruction is:
> 10001d24: 7d 20 4e 2c lhbrx r9,r0,r9
>
...
> So, I tried to hack in support for lhbrx:
>
> [VEX]$ ls ./priv/guest-ppc32/toIR.c*
> ./priv/guest-ppc32/toIR.c ./priv/guest-ppc32/toIR.c.orig
> [VEX]$ diff ./priv/guest-ppc32/toIR.c*
> 2612,2618c2612
> < case 0x62C: // lhbrx (Load HW, Byte-Reverse Indexed)
> < DIP("lwzx r%d,r%d,r%d\n", rD_addr, rA_addr, rB_addr);
> < vex_printf("rwm: fixed lhbrx?....\n");
> < // ?? rwm: putIReg( rD_addr, loadBE(Ity_I32, mkexpr(EA_reg)) );
> < putIReg( rD_addr, unop(Iop_16Uto32,
> < loadBE(Ity_I16, mkexpr(EA_reg))) );
> < break;
> ---
>
> 7261,7263d7254
> < case 0x62C: // lhbrx
> < if (dis_int_load( theInstr )) goto decode_success;
> < goto decode_failure;
>
>
> but I don't really understand valgrind's design. Does my putIReg call
> look right? I'm reading the manual and docs/internals now...
Getting there, but not quite... this insn does this (numbers are bytes, -'s
are cleared bytes)
[3|2|1|0] -> [-|-|2|3]
This insn was implemented a while back, but not tested... Take a look at
toIR.c:5126 from the trunk:
//zz case 0x316: // lhbrx (Load Half Word Byte-Reverse Indexed, PPC32 p449)
//zz vassert(0);
//zz
//zz DIP("lhbrx r%u,r%u,r%u\n", rD_addr, rA_addr, rB_addr);
//zz assign( byte0, loadBE(Ity_I8, mkexpr(EA)) );
//zz assign( byte1, loadBE(Ity_I8, binop(Iop_Add32, mkexpr(EA),mkU32(1))) );
//zz assign( rD, binop(Iop_Or32,
//zz binop(Iop_Shl32, mkexpr(byte1), mkU8(8)),
//zz mkexpr(byte0)) );
//zz putIReg( rD_addr, mkexpr(rD));
//zz break;
Looks like it'll work, but not very nice, what with two loads... perhaps
something like this might do better (just a thought - not tested):
case 0x316: // lhbrx (Load Half Word Byte-Reverse Indexed, PPC32 p449)
DIP("lhbrx r%u,r%u,r%u\n", rD_addr, rA_addr, rB_addr);
assign( w1, loadBE(Ity_I32, mkexpr(EA)) );
assign( w2, binop(Iop_Or32,
binop(Iop_And32, mkU32(0x0000FF00),
binop(Iop_Shr32, mkexpr(w1), mkU8(8))),
binop(Iop_Shr32, mkexpr(w1), mkU8(24))
) );
putIReg( rD_addr, mkSzWiden32(ty, mkexpr(w2),
/* Signed */False) );
break;
Hope that gets you started.
btw, the reason I get to opcode 0x316 (not 0x62C) is because opc2 here doesn't
include bit0:
/* Extract 10-bit secondary opcode, instr[10:1] */
static UInt ifieldOPClo10 ( UInt instr) {
return IFIELD( instr, 1, 10 );
}
If you felt like going further and hacking up a test for this insn (something
similar to lwbrx, I guess), it'll get added much more quickly!
Cerion
|
|
From: Julian S. <js...@ac...> - 2006-05-05 13:29:24
|
> I trying to use valgrind on an embedded ppc system using a custom kernel.
Good. Various people have already used V on embedded ppcs with some success.
> First, because our kernel doesn't have Altivec enabled and the
> valgrind/VEX altivec detection in ./coregrind/m_machine.c wasn't working,
> I commented out the Altivec detection hardcoded: have_vmx = False;
> If you are interested, I can reproduce this without the patch and file a
> bug.
This is a bug - it should work properly with the svn trunk. If it doesn't,
please file a bug report.
> If I run objdump on my test program, I see that the instruction is:
> 10001d24: 7d 20 4e 2c lhbrx r9,r0,r9
Fixed (vex r1608). Please update, rebuild V, try again, and let us know
any other insns that you need (sthbrx?)
J
|
|
From: Julian S. <js...@ac...> - 2006-05-05 13:52:09
|
> > If I run objdump on my test program, I see that the instruction is:
> > 10001d24: 7d 20 4e 2c lhbrx r9,r0,r9
>
> Fixed (vex r1608).
I also did sthbrx. So {ld,st}{h,w}brx should now work.
J
|
|
From: Randy M. <rwm...@gm...> - 2006-05-09 20:11:32
Attachments:
testvg.c
|
Hi,
On 5/5/06, Julian Seward <js...@ac...> wrote:
>
> > > If I run objdump on my test program, I see that the instruction is:
> > > 10001d24: 7d 20 4e 2c lhbrx r9,r0,r9
> >
> > Fixed (vex r1608).
>
> I also did sthbrx. So {ld,st}{h,w}brx should now work.
Thanks, those changes work great.
I can now run valgrind on my embedded ppc apps!
Now to add to the suppressions list...
The altivec probing/signal handling is a bug in our 2.4.22 kernel.
I'm running on a G4 system but (for historical reasons) the
kernel hasn't been compiled with CONFIG_ALTIVEC.
This patch/email: <http://lkml.org/lkml/2004/5/15/19>
explains that older Linux kernels didn't
handle a user-mode Altivec instruction being executed
on a kernel without CONFIG_ALTIVEC.
I assume that valgrind doesn't want to worry about
a 2.4.22 bug.
I'll include my testvg.c code in case someone else
runs into this sort of problem.
// Randy
I've extracted a simple test program (attached) and see the vector
instruction generates a sigtrap rather than a sigill.
(I'm running 2.4.22 + patches --
don't talk to me about how old the kernel is, or my head will explode ;-) )
% gcc -static -o testvg-static testvg.c
# ./testvg-static
test float
done float
test fsqrt
sigill fsqrt
test frsqrte
done frsqrte
test altivec
Trace/breakpoint trap (core dumped)
[root@10.0.7.1 tmp]# dmesg -c
Bad trap at PC: 100006ec, SR: f932, vector=f00
On a ppc405 (no Altivec), I get the sigill altivec path.
All this makes sense if you read: <http://lkml.org/lkml/2004/5/15/19>
|