From: Carl L. <ce...@us...> - 2023-08-31 22:38:32
On Thu, 2023-08-31 at 10:31 -0700, Carl Love wrote:
> Mark, Aaron:
>
> So, I tried running the doublefree test by hand with the intention of
> then adding some debug prints to see which routines were being called.
> I am seeing the following:
>
> valgrind --tool=memcheck -q ./memcheck/tests/doublefree > out-current
>
> valgrind: m_debuginfo/image.c:1106 (vgModuleLocal_img_valid):
> Assertion 'img != NULL' failed.
> Segmentation fault
>
> I rolled back the git tree to the commit prior to the initial patch
> to do the lazy load,
>
> commit 6ce0979884a8f246c80a098333ceef1a7b7f694d
> Author: Paul Floyd <pj...@wa...>
> Date: Mon Jul 24 22:06:00 2023 +0200
>
> Bug 472219 - Syscall param ppoll(ufds.events) points to
> uninitialised byte(s
>
> Add checks that (p)poll fd is not negative. If it is negative,
> don't check the events field.
>
> I re-compiled, re-installed and tested again and get:
>
> valgrind --tool=memcheck -q ./memcheck/tests/doublefree > out-current
> ==124807== Invalid free() / delete / delete[] / realloc()
> ==124807== at 0x409B680: free (vg_replace_malloc.c:974)
> ==124807== by 0x1000063B: main (doublefree.c:10)
> ==124807== Address 0x42f0040 is 0 bytes inside a block of size 177 free'd
> ==124807== at 0x409B680: free (vg_replace_malloc.c:974)
> ==124807== by 0x1000063B: main (doublefree.c:10)
> ==124807== Block was alloc'd at
> ==124807== at 0x409858C: malloc (vg_replace_malloc.c:431)
> ==124807== by 0x1000061B: main (doublefree.c:8)
> ==124807==
>
> So it seems with the initial patch and the PPC patch we are hitting
> an assertion issue. I will try and pursue a bit more.

The system I was testing on is a Power 8 BE system running Red Hat
Enterprise Linux Server 7.9 (Maipo).

The assertion is in function ML_(img_valid), file
coregrind/m_debuginfo/image.c. I put a print statement in before each
of the 18 calls to determine which of the calls fails. The failure is
in readelf.c, around line 609:

Bool get_elf_symbol_info (... )
{
   ...
   /* Now we want to know what's at that offset in the .opd
      section. We can't look in the running image since it won't
      necessarily have been mapped. But we can consult the oimage.
      opd_img is the start address of the .opd in the oimage.
      Hence: */
   ULong fn_descr[2]; /* is actually 3 words, but we need only 2 */
   VG_(printf)("CARLL, img_valid 2\n");
   if (!ML_(img_valid)(escn_opd->img, escn_opd->ioff + offset_in_opd,
                       sizeof(fn_descr))) {
      if (TRACE_SYMTAB_ENABLED) {
         HChar* sym_name = ML_(img_strdup)(escn_strtab->img,
                                           "di.gesi.6b", sym_name_ioff);
         TRACE_SYMTAB(" ignore -- invalid OPD fn_descr offset: %s\n",
                      sym_name);
         if (sym_name) ML_(dinfo_free)(sym_name);
      }
      return False;
   }
   ...
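
To make the failure mode concrete, here is a small standalone sketch of what happens when a validity check that asserts on a NULL image is reached with a slice whose image was never loaded, and what a caller-side guard would look like. This is not Valgrind code: Image, Slice, image_valid and slice_range_valid are hypothetical stand-ins, and whether skipping the symbol is the right behaviour for the real reader is an open question.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; these are NOT the
   real Valgrind DiImage / DiSlice types. */
typedef struct { size_t size; } Image;
typedef struct { Image* img; size_t ioff; size_t szB; } Slice;

/* Mimics the contract seen in the report: passing a NULL image is
   treated as a programming error and trips an assertion. */
bool image_valid(const Image* img, size_t off, size_t len)
{
   assert(img != NULL);
   return off <= img->size && len <= img->size - off;
}

/* Guarded variant: a slice whose image has not been loaded is
   reported as "not valid" instead of reaching the assertion.
   (Overflow of ioff + off is ignored to keep the sketch short.) */
bool slice_range_valid(const Slice* s, size_t off, size_t len)
{
   if (s == NULL || s->img == NULL)
      return false;
   return image_valid(s->img, s->ioff + off, len);
}
```

With a guard like this, a slice with a NULL image simply fails the range check, so the caller would take its existing "ignore this symbol" path rather than aborting.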
The function is called from

static
__attribute__((unused)) /* not referred to on all targets */
void read_elf_symtab__ppc64be_linux(
        struct _DebugInfo* di, const HChar* tab_name,
        DiSlice* escn_symtab,
        DiSlice* escn_strtab,
        DiSlice* escn_opd, /* ppc64be-linux only */
        Bool symtab_in_debug
     )
{
   ...
}

in the same file. There is an #if defined() block that selects which of
the two symbol table readers to use

#  if defined(VGP_ppc64be_linux)
   read_elf_symtab = read_elf_symtab__ppc64be_linux;
#  else
   read_elf_symtab = read_elf_symtab__normal;
#  endif

in function read_elf_object, which is called from
di_notify_ACHIEVE_ACCEPT_STATE in debuginfo.c.
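
For readers not familiar with this pattern, the selection above can be sketched in isolation as follows. The names here (DEMO_PPC64BE_LINUX, ReaderKind, select_reader and so on) are made up for the illustration and are not part of Valgrind; the point is only that the choice is made at compile time, so the BE-only reader is never even considered on an LE build.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative only: which reader ran. */
typedef enum { READER_NORMAL, READER_PPC64BE } ReaderKind;

/* Both readers share one signature so they can be called through
   the same function pointer type. */
typedef ReaderKind (*SymtabReader)(void);

ReaderKind read_symtab_ppc64be(void) { return READER_PPC64BE; }
ReaderKind read_symtab_normal(void)  { return READER_NORMAL; }

/* Compile-time selection: DEMO_PPC64BE_LINUX is a hypothetical
   macro standing in for the platform macro used in the real code. */
SymtabReader select_reader(void)
{
#if defined(DEMO_PPC64BE_LINUX)
   return read_symtab_ppc64be;
#else
   return read_symtab_normal;
#endif
}

/* Call whichever reader was selected for this build. */
ReaderKind run_selected_reader(void)
{
   SymtabReader reader = select_reader();
   assert(reader != NULL);
   return reader();
}
```

Building the sketch with -DDEMO_PPC64BE_LINUX switches run_selected_reader() over to the BE variant, which is the kind of per-platform difference that would explain a code path being exercised on one system and not the other.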
I believe we need to call read_elf_debug to actually load the image. I
am not seeing any calls to read_elf_debug. It is called in load_di,
addr_load_di and load_all_debuginfo. I don't see any of these
functions getting called. describe_IP calls load_di or addr_load_di;
find_DiCfSI will call load_di. Again, I don't see describe_IP or
find_DiCfSI being called.
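
The general lazy-load idea can be sketched as below: the image pointer stays NULL until some accessor triggers the load, so any code path that touches the image without going through that accessor sees NULL. This is a generic illustration under that assumption, not the actual Valgrind implementation; DemoDebugInfo, demo_load_di and demo_get_image are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical debug-info record: the image is NULL until loaded. */
typedef struct {
   bool        loaded;
   int         load_count;   /* how many times the load really ran */
   const char* image;        /* stand-in for the mapped object image */
} DemoDebugInfo;

/* Load on first use; later calls are no-ops. */
void demo_load_di(DemoDebugInfo* di)
{
   if (di->loaded)
      return;
   di->image = "symbols";    /* placeholder for real image contents */
   di->loaded = true;
   di->load_count++;
}

/* Accessor that guarantees the load has happened before the image
   is used.  Code that reads di->image directly, bypassing this,
   can still observe NULL. */
const char* demo_get_image(DemoDebugInfo* di)
{
   demo_load_di(di);
   assert(di->image != NULL);
   return di->image;
}
```

In this sketch, reading di->image directly before any call to demo_get_image() yields NULL, which is analogous to the symptom described above if the symbol table reader runs before anything has triggered the load.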
----------------------------------------
So, I then tried to run the same test on a Power 8 LE system running
Ubuntu 20.04.5 LTS (Focal Fossa). I get:

valgrind --tool=memcheck -q ./memcheck/tests/doublefree > out-current
valgrind: Fatal error at startup: a function redirection
valgrind: which is mandatory for this platform-tool combination
valgrind: cannot be set up. Details of the redirection are:
valgrind:
valgrind: A must-be-redirected function
valgrind: whose name matches the pattern: strlen
valgrind: in an object with soname matching: ld64.so.2
valgrind: was not found whilst processing
valgrind: symbols from the object with soname: ld64.so.2
valgrind:
valgrind: Possible fixes: (1, short term): install glibc's debuginfo
valgrind: package on this machine. (2, longer term): ask the packagers
valgrind: for your Linux distribution to please in future ship a non-
valgrind: stripped ld.so (or whatever the dynamic linker .so is called)
valgrind: that exports the above-named function using the standard
valgrind: calling conventions for this platform. The package you need
valgrind: to install for fix (1) is called
valgrind:
valgrind: On Debian, Ubuntu: libc6-dbg
valgrind: On SuSE, openSuSE, Fedora, RHEL: glibc-debuginfo
valgrind:
valgrind: Note that if you are debugging a 32 bit process on a
valgrind: 64 bit system, you will need a corresponding 32 bit debuginfo
valgrind: package (e.g. libc6-dbg:i386).
valgrind:
valgrind: Cannot continue -- exiting now. Sorry.

When I put in my print statements, I see the call to
read_elf_symtab__normal instead of read_elf_symtab__ppc64be_linux, as
expected. It appears that some of the image file is read, as I see a
second call to di_notify_ACHIEVE_ACCEPT_STATE and read_elf_object, which
I don't see on the BE system before the run fails.

Carl