From: Jacek M. H. <jac...@gm...> - 2018-01-09 13:15:46
|
Dear Sirs,
this is CentOS Linux release 7.4.1708 (Core) /
3.10.0-693.11.6.el7.x86_64 kernel / gcc (GCC) 4.8.5 20150623 (Red Hat
4.8.5-16).
I tried to use two recent versions of valgrind, the most current GIT (as
of today) and then the last release (3.13.0).
Both versions refuse to run "valgrind --tool=exp-sgcheck" in the same
way (I also tried to completely disable SELinux, if that matters):

----------------------------------------------------------------------
[...]$ valgrind --tool=exp-sgcheck /bin/ls
==20624== exp-sgcheck, a stack and global array overrun detector
==20624== NOTE: This is an Experimental-Class Valgrind Tool
==20624== Copyright (C) 2003-2017, and GNU GPL'd, by OpenWorks Ltd et al.
==20624== Using Valgrind-3.14.0.GIT and LibVEX; rerun with -h for copyright info
==20624== Command: /bin/ls
==20624==

exp-sgcheck: sg_main.c:2332 (sg_instrument_IRStmt): the 'impossible' happened.

host stacktrace:
==20624==    at 0x580179CD: show_sched_status_wrk (m_libcassert.c:355)
==20624==    by 0x58017AE4: report_and_quit (m_libcassert.c:426)
==20624==    by 0x58017C71: vgPlain_assert_fail (m_libcassert.c:492)
==20624==    by 0x58010033: sg_instrument_IRStmt (sg_main.c:2332)
==20624==    by 0x5800AE7F: h_instrument (h_main.c:683)
==20624==    by 0x580340C1: tool_instrument_then_gdbserver_if_needed (m_translate.c:232)
==20624==    by 0x58106EE1: LibVEX_FrontEnd (main_main.c:650)
==20624==    by 0x581076EB: LibVEX_Translate (main_main.c:1185)
==20624==    by 0x5803691C: vgPlain_translate (m_translate.c:1805)
==20624==    by 0x58077A96: vgPlain_scheduler (scheduler.c:1056)
==20624==    by 0x5808970A: run_a_thread_NORETURN (syswrap-linux.c:103)

sched status:
  running_tid=1

Thread 1: status = VgTs_Runnable (lwpid 20624)
==20624==    at 0x4015F5A: _dl_runtime_resolve_xsave (in /usr/lib64/ld-2.17.so)

Note: see also the FAQ in the source distribution.
It contains workarounds to several common problems.
In particular, if Valgrind aborted or crashed after
identifying problems in your program, there's a good chance
that fixing those problems will prevent Valgrind aborting or
crashing, especially if it happened in m_mallocfree.c.

If that doesn't help, please report this bug to: www.valgrind.org

In the bug report, send all the above text, the valgrind
version, and what OS and version you are using.  Thanks.
----------------------------------------------------------------------

I haven't found any hints in the documentation about such an issue.
Could you, please, help me,
Best regards,
Jacek.
|
From: Philippe W. <phi...@sk...> - 2018-01-09 18:48:47
|
On Tue, 2018-01-09 at 14:15 +0100, Jacek M. Holeczek wrote:
> Dear Sirs,
> this is CentOS Linux release 7.4.1708 (Core) /
> 3.10.0-693.11.6.el7.x86_64 kernel / gcc (GCC) 4.8.5 20150623 (Red Hat
> 4.8.5-16).
> I tried to use two recent versions of valgrind, the most current GIT (as
> of today) and then the last release (3.13.0).
> Both versions refuse to run "valgrind --tool=exp-sgcheck" in the same
> way (I also tried to completely disable SELinux, if that matters):
>
> ----------------------------------------------------------------------
> [...]$ valgrind --tool=exp-sgcheck /bin/ls
> ==20624== exp-sgcheck, a stack and global array overrun detector
> ==20624== NOTE: This is an Experimental-Class Valgrind Tool
> ==20624== Copyright (C) 2003-2017, and GNU GPL'd, by OpenWorks Ltd et al.
> ==20624== Using Valgrind-3.14.0.GIT and LibVEX; rerun with -h for
> copyright info
> ==20624== Command: /bin/ls
> ==20624==
>
> exp-sgcheck: sg_main.c:2332 (sg_instrument_IRStmt): the 'impossible'
> happened.
The switch statement around that line handles all possible values except these 3:
Ist_LoadG, Ist_StoreG, Ist_LLSC
What is funny is that these 3 values are not really new (they were
introduced in 2012 and 2009).
So, I guess _dl_runtime_resolve_xsave contains an instruction at or around
0x4015F5A that is translated into one of the 3 unhandled values above.
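
To illustrate the failure mode described above, here is a minimal sketch in C. It is not the code from sg_main.c: the enum, its DEMO_ prefix and the function demo_instrument are hypothetical stand-ins, and the sketch assumes the failing line is a catch-all branch that is reached when a statement kind has no case of its own.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the IR statement tags. These are NOT the
   real LibVEX definitions; only StoreG, LoadG and LLSC are named in
   this thread, the other two are illustrative. */
typedef enum {
    DEMO_Ist_NoOp,
    DEMO_Ist_Store,
    DEMO_Ist_LoadG,
    DEMO_Ist_StoreG,
    DEMO_Ist_LLSC
} DemoStmtTag;

/* Returns 1 when the tag has its own case, 0 when it falls through to
   the catch-all branch. A real tool would assert in that branch, which
   is what produces a "the 'impossible' happened" style abort. */
static int demo_instrument(DemoStmtTag tag)
{
    switch (tag) {
    case DEMO_Ist_NoOp:
    case DEMO_Ist_Store:
        return 1;   /* handled statement kinds */
    default:
        fprintf(stderr, "unhandled statement tag %d\n", (int)tag);
        return 0;   /* LoadG, StoreG and LLSC end up here */
    }
}
```

In this sketch a plain store is handled, while a guarded store is not, which mirrors what the xsave translation appears to trigger.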
I could reproduce something similar with
valgrind --tool=exp-sgcheck ./memcheck/tests/amd64/xsave-avx
Can you do:
valgrind --tool=exp-sgcheck --trace-flags=11000000 /bin/ls
This will output a bunch of lines like:
==== SB 1639 (evchecks 10601) [tid 1] 0x108b73 do_setup_then_xsave /home/philippe/valgrind/git/trunk_untouched/memcheck/tests/amd64/xsave-avx+0xb73
==== SB 1640 (evchecks 10602) [tid 1] 0x108ae5 do_xsave /home/philippe/valgrind/git/trunk_untouched/memcheck/tests/amd64/xsave-avx+0xae5
==== SB 1641 (evchecks 10603) [tid 1] 0x108b19 do_xsave+52 /home/philippe/valgrind/git/trunk_untouched/memcheck/tests/amd64/xsave-avx+0xb19
Then redo the command, adding --trace-notbelow=1635
(where 1635 is somewhat below the failing SB number; in my case that is 1641).
Then create a bug in bugzilla and attach the trace obtained.
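
As a rough aid for the steps above, the following C sketch pulls the superblock number out of a trace line of the form shown earlier and derives a value for --trace-notbelow. The function names parse_sb_number and notbelow_for, and the idea of a fixed margin, are assumptions made for illustration; they are not part of Valgrind.

```c
#include <assert.h>
#include <stdio.h>

/* Extract the SB number from a line such as
   "==== SB 1641 (evchecks 10603) [tid 1] ...".
   Returns -1 if the line is not an SB header. */
static long parse_sb_number(const char *line)
{
    long sb;
    if (sscanf(line, "==== SB %ld", &sb) == 1)
        return sb;
    return -1;
}

/* Choose a --trace-notbelow value a few superblocks before the failing
   one. The margin is arbitrary; the result is clamped at 0. */
static long notbelow_for(long failing_sb, long margin)
{
    long value = failing_sb - margin;
    return value < 0 ? 0 : value;
}
```

With the last SB line printed before the crash being number 1641 and a margin of 6, this gives 1635, so the second run would be something like: valgrind --tool=exp-sgcheck --trace-flags=11000000 --trace-notbelow=1635 /bin/ls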
In my case, the problem is created by the instruction
0x108B26: xsave (%rsi)
which generates a bunch of guarded stores (Ist_StoreG).
I suppose (seeing the name of the function that causes the crash for you)
that it will similarly be the xsave instruction.
By having a bug in bugzilla with this info, you increase the chance
that this problem is not forgotten, and who knows, even solved one day :).
Thanks
Philippe
|
From: Jacek M. H. <jac...@gm...> - 2018-01-10 14:08:13
|
Dear Sirs,
please find attached the output from:
valgrind --tool=exp-sgcheck --trace-flags=11000000 /bin/ls
Hope it helps,
Best regards,
Jacek.