From: John R. <jr...@bi...> - 2022-06-29 14:27:31
> I did not make up those strace logs in my head, all I am trying to do
> is Debian bug triaging. Turns out I did a pretty bad job at it:
>
> 1. The original Debian bug report seems to be PEBCAK, and I'll close
> the bug as wontfix ASAP,
> 2. I was not paying attention to the gcc version I was using.

The original bug report
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=928224
specified the reproducing case "valgrind /bin/true".
That now works for me:
-----
$ valgrind /bin/true
==399== Memcheck, a memory error detector
==399== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==399== Using Valgrind-3.20.0.GIT and LibVEX; rerun with -h for copyright info
==399== Command: /bin/true
==399==
==399==
==399== HEAP SUMMARY:
==399==     in use at exit: 0 bytes in 0 blocks
==399==   total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==399==
==399== All heap blocks were freed -- no leaks are possible
==399==
==399== For lists of detected and suppressed errors, rerun with: -s
==399== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
-----
in the environment:
-----
$ valgrind --version
valgrind-3.20.0.GIT
$ gcc --version
gcc (Debian 10.2.1-6) 10.2.1 20210110
$ uname -a
Linux rpi2-20220121 5.10.0-15-armmp #1 SMP Debian 5.10.120-1 (2022-06-09) armv7l GNU/Linux
-----
so the original bug report can be closed with "fixed in newer version"
or something like that.

> So if my understanding is correct I can make valgrind produce this
> "Illegal instruction" using either gcc-11 or gcc-12 (Debian package
> from sid), BUT I can make valgrind run using gcc-10 (again Debian
> package from sid). This also seems to be hardware specific since armhf
> binary + gcc-12 runs properly on arm64 (armhf chroot).

Is it easy to install several versions (gcc-10, gcc-11, gcc-12, clang-13)
at the same time, and switch among them by using something like
    CC=/path/to/gcc-12 ./configure
Where can I find hints about this?
>
> Would you kindly indicate if you believe the bug should be reported
> back to valgrind bug tracker or gcc bug tracker ? If that matters,
> clang 13.0 seems to also mess up valgrind code and binaries produced
> return this "Illegal instruction".

SIGILL should be diagnosed using gdb to print the instruction stream
and register contents:
-----
(gdb) run args...
Program received signal SIGILL, Illegal instruction.
(gdb) x/i $pc            ## the faulting instruction
(gdb) x/12i $pc-6*4      ## disassemble the surrounding instructions
(gdb) x/12xw $pc-6*4     ## and in 32-bit raw hexadecimal
(gdb) info reg           ## content of all registers
(gdb) x/16xw $sp         ## dump the active end of the stack
(gdb) bt                 ## source-level backtrace
-----

But with valgrind you must just "continue" the deliberate SIGILL
and SIGSEGV that valgrind uses. Here is an actual run:
-----
$ gdb valgrind
GNU gdb (Debian 10.1-1.7) 10.1.90.20210103-git
Reading symbols from valgrind...
(gdb) run /bin/true
Starting program: /usr/local/bin/valgrind /bin/true
process 426 is executing new program: /usr/local/libexec/valgrind/memcheck-arm-linux

Program received signal SIGILL, Illegal instruction.
vgPlain_machine_get_hwcaps () at m_machine.c:1719
1719        __asm__ __volatile__(".word 0xF3044F54"); /* VMAXNM.F32 q2,q2,q2 */
## Notice that this SIGILL is from valgrind trying to determine
## the actual hardware capabilities. Valgrind knows what it is doing,
## so just 'continue' to let valgrind handle the SIGILL.
(gdb) c
Continuing.
==426== Memcheck, a memory error detector
==426== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==426== Using Valgrind-3.20.0.GIT and LibVEX; rerun with -h for copyright info
==426== Command: /bin/true
==426==

Program received signal SIGSEGV, Segmentation fault.  ## valgrind deliberate
0x62c68cc0 in ?? ()
(gdb) x/i $pc
=> 0x62c68cc0: str r3, [r9]
(gdb) p $r9
$1 = 3187663772
(gdb) p/x $r9
$2 = 0xbdffe39c
(gdb) x/12i $pc-6*4
   0x62c68ca8: ldr r3, [r8, #424] ; 0x1a8
   0x62c68cac: mov r1, r3
   0x62c68cb0: movw r2, #62156 ; 0xf2cc
   0x62c68cb4: movt r2, #22528 ; 0x5800
   0x62c68cb8: blx r2
   0x62c68cbc: ldr r3, [r8, #24]
=> 0x62c68cc0: str r3, [r9]
   0x62c68cc4: add r7, r9, #4
   0x62c68cc8: mov r0, r7
   0x62c68ccc: ldr r3, [r8, #428] ; 0x1ac
   0x62c68cd0: mov r1, r3
   0x62c68cd4: movw r2, #62156 ; 0xf2cc
(gdb) c
Continuing.

Program received signal SIGSEGV, Segmentation fault.  ## valgrind deliberate
0x62c6cc0c in ?? ()
(gdb) x/i $pc
=> 0x62c6cc0c: str r9, [r11]
(gdb) x/12i $pc-6*4
   0x62c6cbf4: ldr r9, [r8, #416] ; 0x1a0
   0x62c6cbf8: mov r1, r9
   0x62c6cbfc: movw r2, #62156 ; 0xf2cc
   0x62c6cc00: movt r2, #22528 ; 0x5800
   0x62c6cc04: blx r2
   0x62c6cc08: ldr r9, [r8, #16]
=> 0x62c6cc0c: str r9, [r11]
   0x62c6cc10: add r9, r11, #4
   0x62c6cc14: mov r0, r9
   0x62c6cc18: ldr r3, [r8, #420] ; 0x1a4
   0x62c6cc1c: mov r1, r3
   0x62c6cc20: movw r2, #62156 ; 0xf2cc
(gdb) c
Continuing.
==426==
==426== HEAP SUMMARY:
==426==     in use at exit: 0 bytes in 0 blocks
==426==   total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==426==
==426== All heap blocks were freed -- no leaks are possible
==426==
==426== For lists of detected and suppressed errors, rerun with: -s
==426== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
[Inferior 1 (process 426) exited normally]
-----

You can also use something like
    objdump --disassemble=subroutine_name
to be sure that the executing process matches the built software file.

Right now I cannot reproduce SIGILL, so I cannot dig in further.
Based on the software that I built and ran: valgrind is not to blame;
the problem lies with the compiler, operating system, or hardware.
(In the last two months I have had four hardware failures:
a 5-port Ethernet switch, the sound output on a 12-year-old consumer
desktop PC, the sound output on a 5-year-old self-built x86_64 desktop,
and the power brick for a Raspberry Pi.)
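
The gdb checklist and the objdump check described above could be scripted roughly as follows. This is only a sketch, not something taken from the session logs: the function names sigill_gdb_commands, sigill_report, and disas_func are hypothetical, and it assumes gdb plus a binutils objdump recent enough to accept --disassemble=SYMBOL. It dumps state at the first signal stop only, so under valgrind it will show the deliberate hardware-capabilities SIGILL rather than a later fault.

```shell
#!/bin/sh
# Hypothetical helpers; the gdb commands are the checklist from the message.

# Print the gdb commands to run, one per line.
sigill_gdb_commands() {
    cat <<'EOF'
run
x/i $pc
x/12i $pc-6*4
x/12xw $pc-6*4
info reg
x/16xw $sp
bt
EOF
}

# Run "$@" under gdb in batch mode and dump the state at the first stop.
# If the program exits without a signal, gdb reports an error at the
# first examine command and stops processing the command file.
sigill_report() {
    cmdfile=$(mktemp) || return 1
    sigill_gdb_commands > "$cmdfile"
    gdb -batch -x "$cmdfile" --args "$@"
    status=$?
    rm -f "$cmdfile"
    return "$status"
}

# Disassemble a single function ($2) from a built binary ($1), to compare
# against what gdb shows in the running process.
disas_func() {
    objdump --disassemble="$2" "$1"
}
```

With the paths and symbol from the run above, this might be used as
    sigill_report /usr/local/bin/valgrind /bin/true
    disas_func /usr/local/libexec/valgrind/memcheck-arm-linux vgPlain_machine_get_hwcaps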