From: Carl E. L. <ce...@li...> - 2014-01-23 20:59:45
IBM is working on supporting both a PPC Little Endian (LE) and a Big
Endian (BE) storage model. I am working to port Valgrind's current PPC
BE support to LE. In addition to the storage model change there are some
ABI changes with regard to function calls. Specifically, they used to
store the function entry address in the first location of the function,
followed by the code. The first address of the function in the new ABI
no longer holds the start address. This has resulted in my having to
modify the PPC64 assembly routines in Valgrind to remove the extra load
instruction so that 64 bits' worth of instructions are not loaded into
the PC.
Currently when I run valgrind with the ls command, it reports that the
workload segfaults on a bad address. Specifically,
local host # valgrind ls
==2629== Memcheck, a memory error detector
==2629== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==2629== Using Valgrind-3.9.0 and LibVEX; rerun with -h for copyright info
==2629== Command: ls
==2629==
==2629== Jump to the invalid address stated on the next line
==2629== at 0x38426520404C0004: ???
==2629== Address 0x38426520404c0004 is not stack'd, malloc'd or (recently) free'd
==2629==
==2629==
==2629== Process terminating with default action of signal 11 (SIGSEGV)
==2629== Bad permissions for mapped region at address 0x38426520404C0004
==2629== at 0x38426520404C0004: ???
==2629==
==2629== HEAP SUMMARY:
==2629== in use at exit: 0 bytes in 0 blocks
==2629== total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==2629==
==2629== All heap blocks were freed -- no leaks are possible
==2629==
==2629== For counts of detected and suppressed errors, rerun with: -v
==2629== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
Segmentation fault
Notice the bad address is a 64-bit value which looks to me, based on
experience, to be instructions. I have looked through the assembly code
in Valgrind and at this point don't see any suspicious loads but clearly
I am missing it somewhere. I have tried multiple workloads and always
get the same segfault on the same address.
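
One way to check the hunch that the address is really instruction bytes is to split it into two 32-bit words and look at the primary opcode field (the top 6 bits of a Power ISA instruction word). The following is only a minimal sketch: the helper names are made up for illustration, and only a handful of primary opcodes are named.

```c
#include <assert.h>
#include <stdint.h>

/* Upper and lower 32-bit halves of a 64-bit value. */
static uint32_t high_word(uint64_t v) { return (uint32_t)(v >> 32); }
static uint32_t low_word(uint64_t v)  { return (uint32_t)(v & 0xFFFFFFFFu); }

/* Primary opcode: the top 6 bits of a 32-bit Power ISA instruction word. */
static unsigned primary_opcode(uint32_t insn) { return insn >> 26; }

/* Hypothetical helper: names a few primary opcodes, leaves the rest undecoded. */
static const char *opcode_name(unsigned op)
{
    assert(op < 64);  /* a 6-bit field */
    switch (op) {
    case 14: return "addi";
    case 15: return "addis";
    case 16: return "bc";
    case 18: return "b";
    default: return "(not decoded in this sketch)";
    }
}
```

Applied to 0x38426520404C0004, the two halves are 0x38426520 and 0x404C0004, with primary opcodes 14 (addi) and 16 (bc). That is consistent with the value being code rather than a pointer, although it does not say where the value was loaded from.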
I suspect that the "bad code" is being run on the guest and is not part
of the Valgrind binary running on the native machine. I have found and
fixed this type of issue in the Valgrind code running on the native
machine. I am wondering how to track what function is being run on the
guest and examine the register state after each guest instruction is
executed. Effectively, I want to single step through the execution of
the guest code one instruction at a time to see in which function, and
where, the bad address gets generated. Any ideas? Thanks.
Carl Love
From: Philippe W. <phi...@sk...> - 2014-01-24 20:38:39
On Thu, 2014-01-23 at 12:59 -0800, Carl E. Love wrote:
> I suspect that the "bad code" is being run on the guest and is not part
> of the Valgrind binary running on the native machine. I have found and
> fixed this type of issue in the Valgrind code running on the native
> machine. I am wondering how to track what function is being run on the
> guest and examine the register state after each guest instruction is
> executed. Effectively, I want to single step through the execution of
> the guest code one instruction at a time to see in which function, and
> where, the bad address gets generated. Any ideas? Thanks.

Not too sure what 'one instruction at a time' means.

If you want to step one instruction at a time of the guest code,
then use the options
  --vgdb-error=0 --vgdb=full
and connect gdb to valgrind.
The gdb will debug the guest program, and the guest registers
will be up to date at each instruction, as --vgdb=full automatically
enables
  --vex-iropt-register-updates=allregs-at-each-insn
With that, gdb will allow e.g. stepi and examination of all registers
after each stepi instruction.

Now, if you want to step one "JIT-ted" instruction at a time in the code
valgrind has JIT-ted for a guest instruction, then I guess you will have
to follow the scheduler jumps and all that stuff to reach and step in
the JIT-ted code.
That does not look to be a pleasant moment :(.

Philippe
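
To try the guest-level stepping described above on something smaller than ls, a tiny guest with one indirect call gives a convenient place to stop. This is only a sketch: the function names and the binary name ./guest are made up, and the gdb session in the comment assumes the functions are called from a main of your own and built with -g.

```c
#include <assert.h>

/* Hypothetical guest code: a function reached only through a pointer. */
typedef long (*unary_fn)(long);

static long add_one(long x) { return x + 1; }

/* The indirect call below is a natural spot to start single-stepping. */
static long call_indirect(unary_fn fn, long x)
{
    assert(fn != 0);
    return fn(x);
}

/*
 * Assumed workflow, with call_indirect(add_one, 41) called from main:
 *
 *   terminal 1:  valgrind --vgdb=full --vgdb-error=0 ./guest
 *   terminal 2:  gdb ./guest
 *     (gdb) target remote | vgdb
 *     (gdb) break call_indirect
 *     (gdb) continue
 *     (gdb) stepi
 *     (gdb) info registers
 */
```

Repeating stepi and info registers from the breakpoint shows the guest register state after each guest instruction, including the point where the branch target for the indirect call is loaded.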
From: Julian S. <js...@ac...> - 2014-02-06 18:35:46
Carl

> regards to function calls. Specifically they used to store the function
> entry address in the first location of the function, followed by the
> code. The first address of the function in the new ABI no longer has
> the start address.

Can you clarify what you mean? For ppc64-linux, I remember the ABI requires
that taking the address of a function gives a pointer to a three word
descriptor, the first word of which points to the code itself, the second
is the TOC pointer, and the third I am not sure what it is for. Dealing
with these descriptors is a lot of hassle in various bits of the code.

Is this what has changed? If so, what did it change to?

J
From: Carl E. L. <ce...@li...> - 2014-02-06 19:25:09
On Thu, 2014-02-06 at 19:35 +0100, Julian Seward wrote:
> Carl
>
> > regards to function calls. Specifically they used to store the function
> > entry address in the first location of the function, followed by the
> > code. The first address of the function in the new ABI no longer has
> > the start address.
>
> Can you clarify what you mean? For ppc64-linux, I remember the ABI requires
> that taking the address of a function gives a pointer to a three word
> descriptor, the first word of which points to the code itself, the second
> is the TOC pointer, and the third I am not sure what it is for. Dealing
> with these descriptors is a lot of hassle in various bits of the code.
>
> Is this what has changed? If so, what did it change to?
Yes, that is what has changed. They no longer have the three word
descriptor. Now the function pointer actually points to the entry point,
basically the way it is done on the other architectures. I have been
making the changes to the inline assembly code, the #defines that
specify big endian versus little endian, the guest state data structure
that stores the endianness, etc. I am making progress. I am still getting
a few errors, but at least the simple test case "valgrind ls" now runs to
completion without crashing. It does report some issues that I am
working to track down.
Carl Love
From: Peter B. <be...@vn...> - 2014-02-06 19:46:11
On Thu, 2014-02-06 at 19:35 +0100, Julian Seward wrote:
> Carl
>
> > regards to function calls. Specifically they used to store the function
> > entry address in the first location of the function, followed by the
> > code. The first address of the function in the new ABI no longer has
> > the start address.
>
> Can you clarify what you mean? For ppc64-linux, I remember the ABI requires
> that taking the address of a function gives a pointer to a three word
> descriptor, the first word of which points to the code itself, the second
> is the TOC pointer, and the third I am not sure what it is for. Dealing
> with these descriptors is a lot of hassle in various bits of the code.
>
> Is this what has changed? If so, what did it change to?

That is correct. In the old ELFv1 ABI, a function pointer was the address
of the 3 element function descriptor you describe. In the new ELFv2 ABI,
the function pointer is now just the address of the function code like
most other architectures/ABIs.

Peter
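
The difference between the two ABIs can be modelled in a few lines of C. This is only an illustrative sketch, not Valgrind code: the type and function names are invented, and the descriptor is an ordinary struct rather than a real linker-generated one.

```c
#include <assert.h>
#include <stdint.h>

/* ELFv1-style function descriptor: three doublewords. Field names are
   illustrative; the ABI document calls the third an environment pointer. */
typedef struct {
    uint64_t entry;  /* address of the function's first instruction */
    uint64_t toc;    /* TOC pointer for the function */
    uint64_t env;    /* third word */
} func_descr_v1;

typedef enum { ABI_ELFV1, ABI_ELFV2 } ppc64_abi;

/* Given the value held in a function pointer, return the code address.
   ELFv1: the pointer refers to a descriptor, so one extra load is needed.
   ELFv2: the pointer already is the code address. */
static uint64_t code_address(ppc64_abi abi, uint64_t fn_ptr_value)
{
    assert(fn_ptr_value != 0);
    if (abi == ABI_ELFV1) {
        const func_descr_v1 *d =
            (const func_descr_v1 *)(uintptr_t)fn_ptr_value;
        return d->entry;
    }
    return fn_ptr_value;
}
```

Applying the ELFv1 branch to an ELFv2 function pointer loads the first 64 bits of code and treats them as an address. That is the kind of mistake that would produce a jump to something like 0x38426520404C0004.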