|
From: Radoslaw K. <ku...@9l...> - 2016-12-06 15:54:34
|
Hi everyone,

we're trying to run Valgrind on a multi-threaded binary that uses FUSE to emulate a filesystem. We found the --sim-hints=fuse-compatible flag, which is great (it resolves some problems with deadlocks), but then Valgrind fails on an assert:

vg_assert(0 == (sci->flags & ~(SfPollAfter | SfYieldAfter | SfNoWriteResult))); (in syswrap-main.c)

In our case (sci->flags & SfMayBlock) is true. When we added SfMayBlock to the accepted flags and recompiled Valgrind, everything worked fine.

Do you have any idea what we can do to run our binary in a simple way (without any hacks in assertions...)? Is it a configuration issue to allow blocking operations, or is it a bug in Valgrind itself?

Cheers,
Radek

I attach the valgrind log:

--12525-- Valgrind options:
--12525-- --tool=memcheck
--12525-- -v
--12525-- --sim-hints=fuse-compatible
--12525-- --run-libc-freeres=no
--12525-- Contents of /proc/version:
--12525-- Linux version 2.6.32-504.el6.x86_64 (moc...@c6...) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-11) (GCC) ) #1 SMP Wed Oct 15 04:27:16 UTC 2014
--12525--
--12525-- Arch and hwcaps: AMD64, LittleEndian, amd64-cx16-sse3
--12525-- Page sizes: currently 4096, max supported 4096
--12525-- Valgrind library directory: /home2/me/tmp/valgrindXXX/lib/valgrind
--12525-- Reading syms from /home2/me/binary.exe
--12525-- Reading syms from /home2/me/tmp/valgrindXXX/lib/valgrind/memcheck-amd64-linux
--12525-- object doesn't have a dynamic symbol table
--12525-- Reading syms from /lib64/ld-2.12.so
--12525-- Scheduler: using generic scheduler lock implementation.
--12525-- Reading suppressions file: /home2/me/tmp/valgrindXXX/lib/valgrind/default.supp
==12525== embedded gdbserver: reading from /tmp/vgdb-pipe-from-vgdb-to-12525-by-kot-on-beta35
==12525== embedded gdbserver: writing to /tmp/vgdb-pipe-to-vgdb-from-12525-by-kot-on-beta35
==12525== embedded gdbserver: shared mem /tmp/vgdb-pipe-shared-mem-vgdb-12525-by-kot-on-beta35
==12525==
==12525== TO CONTROL THIS PROCESS USING vgdb (which you probably
==12525== don't want to do, unless you know exactly what you're doing,
==12525== or are doing some strange experiment):
==12525== /home2/me/tmp/valgrindXXX/lib/valgrind/../../bin/vgdb --pid=12525 ...command...
==12525==
==12525== TO DEBUG THIS PROCESS USING GDB: start GDB like this
==12525== /path/to/gdb /home2/me/binary.exe
==12525== and then give GDB the following command
==12525== target remote | /home2/me/tmp/valgrindXXX/lib/valgrind/../../bin/vgdb --pid=12525
==12525== --pid is optional if only one valgrind process is running
==12525==
--12525-- REDIR: 0x37458176d0 (ld-linux-x86-64.so.2:strlen) redirected to 0x380550c1 (vgPlain_amd64_linux_REDIR_FOR_strlen)
--12525-- REDIR: 0x37458174e0 (ld-linux-x86-64.so.2:index) redirected to 0x380550db (vgPlain_amd64_linux_REDIR_FOR_index)

valgrind: m_syswrap/syswrap-main.c:1938 (vgPlain_client_syscall): Assertion '0 == (sci->flags & ~(SfPollAfter | SfYieldAfter | SfNoWriteResult))' failed.

host stacktrace:
==12525== at 0x38039548: show_sched_status_wrk (m_libcassert.c:343)
==12525== by 0x38039824: report_and_quit (m_libcassert.c:419)
==12525== by 0x38039A40: vgPlain_assert_fail (m_libcassert.c:485)
==12525== by 0x380933E0: vgPlain_client_syscall (syswrap-main.c:1938)
==12525== by 0x38090EB4: handle_syscall (scheduler.c:1118)
==12525== by 0x38090EB4: vgPlain_scheduler (scheduler.c:1435)
==12525== by 0x380C6F6F: thread_wrapper (syswrap-linux.c:103)
==12525== by 0x380C6F6F: run_a_thread_NORETURN (syswrap-linux.c:156)

sched status:
running_tid=1

Thread 1: status = VgTs_Runnable (lwpid 12525)
==12525== at 0x3745811BD1: _dl_get_origin (in /lib64/ld-2.12.so)
==12525== by 0x3745806385: expand_dynamic_string_token (in /lib64/ld-2.12.so)
==12525== by 0x37458063EE: decompose_rpath (in /lib64/ld-2.12.so)
==12525== by 0x37458068BC: _dl_init_paths (in /lib64/ld-2.12.so)
==12525== by 0x3745803169: dl_main (in /lib64/ld-2.12.so)
==12525== by 0x3745815B4D: _dl_sysdep_start (in /lib64/ld-2.12.so)
==12525== by 0x37458014A3: _dl_start (in /lib64/ld-2.12.so)
==12525== by 0x3745800B07: ??? (in /lib64/ld-2.12.so)
|
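A minimal sketch of how a mask check of the shape quoted in the message above behaves, written as standalone C rather than Valgrind code. The flag names come from the assertion text; the numeric bit values and both function names are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag names as they appear in the quoted assertion.
   The bit values are assumptions for this sketch. */
enum {
    SfMayBlock      = 1 << 1,
    SfPollAfter     = 1 << 3,
    SfYieldAfter    = 1 << 4,
    SfNoWriteResult = 1 << 5
};

/* True when no flag outside the allowed set is present.
   Same shape as the expression inside the failing vg_assert. */
bool only_allowed_flags(unsigned flags)
{
    const unsigned allowed = SfPollAfter | SfYieldAfter | SfNoWriteResult;
    return 0 == (flags & ~allowed);
}

/* Hypothetical stand-in for the assertion itself: aborts the process
   when a flag outside the allowed set is found. */
void check_flags_or_abort(unsigned flags)
{
    assert(only_allowed_flags(flags));
}
```

With these assumed values, `only_allowed_flags(SfPollAfter | SfNoWriteResult)` holds, while any value that includes `SfMayBlock` fails the check, which matches the failure reported in the log.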
|
From: Ivo R. <iv...@iv...> - 2016-12-06 16:23:37
|
2016-12-06 15:39 GMT+00:00 Radoslaw Kujawa <ku...@9l...>:

> Hi everyone,
>
> we're trying to run Valgrind on a multi-threaded binary that uses FUSE to emulate a filesystem. We found the --sim-hints=fuse-compatible flag, which is great (it resolves some problems with deadlocks), but then Valgrind fails on an assert:
>
> vg_assert(0 == (sci->flags & ~(SfPollAfter | SfYieldAfter | SfNoWriteResult))); (in syswrap-main.c)
>
> In our case (sci->flags & SfMayBlock) is true. When we added SfMayBlock to the accepted flags and recompiled Valgrind, everything worked fine.
>
> Do you have any idea what we can do to run our binary in a simple way (without any hacks in assertions...)? Is it a configuration issue to allow blocking operations, or is it a bug in Valgrind itself?

Can you isolate this problem in a very simple program so that there is a reproducible test case?
Alternatively, please provide the output of running Valgrind with '--trace-syscalls=yes'.

I.
|
|
From: Radoslaw K. <ku...@9l...> - 2016-12-07 12:54:20
Attachments:
sfMayBlockExample.cpp
|
Hi Ivo,

here is the log that appeared when we switched on trace-syscalls:

SYSCALL[28094,1](12) sys_brk ( 0x0 ) --> [pre-success] Success(0x4000000)
--28094-- REDIR: 0x37458176d0 (ld-linux-x86-64.so.2:strlen) redirected to 0x380550c1 (vgPlain_amd64_linux_REDIR_FOR_strlen)
SYSCALL[28094,1](63) sys_newuname ( 0xffefffb10 )[sync] --> Success(0x0)
--28094-- REDIR: 0x37458174e0 (ld-linux-x86-64.so.2:index) redirected to 0x380550db (vgPlain_amd64_linux_REDIR_FOR_index)
SYSCALL[28094,1](89) sys_readlink ( 0x374581b667(/proc/self/exe), 0xffeffec10, 4096 ) --> [pre-success] Success(0x52)

valgrind: m_syswrap/syswrap-main.c:1938 (vgPlain_client_syscall): Assertion '0 == (sci->flags & ~(SfPollAfter | SfYieldAfter | SfNoWriteResult))' failed.

We also managed to create a simple file that exposes the problem (see attachment).
The exact command we used to reproduce the problem:

valgrind --tool=memcheck -v --trace-syscalls=yes --sim-hints=fuse-compatible ./a.out

Radek

On 06.12.2016 at 17:23, Ivo Raisr wrote:
> Can you isolate this problem in a very simple program so that there is a reproducible test case?
> Alternatively, please provide the output of running Valgrind with '--trace-syscalls=yes'.
> I.
|
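The attached sfMayBlockExample.cpp is not reproduced in this archive. As a rough illustration only, the trace above shows that the failing call is a readlink on /proc/self/exe, and a program can issue the same call explicitly as sketched below. The function name `read_self_exe` is hypothetical, and this is not claimed to be the content of the attachment.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Resolves /proc/self/exe into buf and NUL-terminates the result.
   Returns the path length, or -1 if readlink() fails
   (for example on a system without /proc). */
ssize_t read_self_exe(char *buf, size_t size)
{
    assert(buf != NULL && size > 1);
    ssize_t n = readlink("/proc/self/exe", buf, size - 1);
    if (n >= 0)
        buf[n] = '\0';   /* readlink() does not terminate the string */
    return n;
}
```

Calling this from `main` and running the binary with the command quoted in the message would exercise the same sys_readlink path. In the log above the call is already made by the dynamic loader (`_dl_get_origin`) before `main` runs, so the explicit call may not even be needed to hit the assertion.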
|
From: Ivo R. <iv...@iv...> - 2016-12-08 10:12:43
|
2016-12-07 13:54 GMT+01:00 Radoslaw Kujawa <ku...@9l...>:

> Hi Ivo,
>
> here is the log that appeared when we switched on trace-syscalls:
>
> SYSCALL[28094,1](89) sys_readlink ( 0x374581b667(/proc/self/exe), 0xffeffec10, 4096 ) --> [pre-success] Success(0x52)
> valgrind: m_syswrap/syswrap-main.c:1938 (vgPlain_client_syscall): Assertion '0 == (sci->flags & ~(SfPollAfter | SfYieldAfter | SfNoWriteResult))' failed.

Hi Radek,

Thanks to the log you provided, I was able to quickly analyse the situation and get to the root cause.
The pre-syscall wrapper for sys_readlink in syswrap-generic.c sets SfMayBlock unconditionally (via FUSE_COMPATIBLE_MAY_BLOCK) in anticipation that the subsequent real syscall may block.
However, in some cases, such as when operating on /proc/self/exe, the wrapper calls the real syscall itself, and therefore SfMayBlock should not have been set.

If you are able to compile Valgrind from sources, I can prepare a small patch for you to test.
Unfortunately, I don't have an environment to test with FUSE for you.

I.
|
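The root cause described above can be modelled with a small toy program. This is a simplified sketch, not Valgrind source and not the patch that was later attached: the `PreResult` type, all function names, and the bit value of `SfMayBlock` are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Flag named in the thread; the bit value is an assumption. */
enum { SfMayBlock = 1 << 1 };

/* Hypothetical, simplified outcome of a pre-syscall wrapper. */
typedef struct {
    unsigned flags;     /* flags the wrapper leaves set */
    bool     completed; /* true if the wrapper already did the syscall */
} PreResult;

/* Behaviour as described in the thread: the flag is set up front,
   even when the wrapper then completes the call itself. */
PreResult pre_readlink_unconditional(const char *path)
{
    assert(path != NULL);
    PreResult r = { SfMayBlock, false };
    if (strcmp(path, "/proc/self/exe") == 0)
        r.completed = true;      /* flag stays set on a finished call */
    return r;
}

/* One possible shape of a fix: set the flag only on the path that
   leaves the call for the kernel to perform. */
PreResult pre_readlink_conditional(const char *path)
{
    assert(path != NULL);
    PreResult r = { 0, false };
    if (strcmp(path, "/proc/self/exe") == 0)
        r.completed = true;
    else
        r.flags |= SfMayBlock;
    return r;
}

/* A call that is already complete must not be marked as may-block. */
bool state_is_consistent(PreResult r)
{
    return !(r.completed && (r.flags & SfMayBlock));
}
```

In this model, `pre_readlink_unconditional("/proc/self/exe")` yields the inconsistent state that the assertion catches, while the conditional variant keeps `SfMayBlock` for ordinary paths and clears it for the special case.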
|
From: Radoslaw K. <ku...@9l...> - 2016-12-08 10:28:22
|
On 08.12.2016 at 11:12, Ivo Raisr wrote:
> If you are able to compile Valgrind from sources, I can prepare a small patch for you to test.
> Unfortunately, I don't have an environment to test with FUSE for you.
> I.

Yes, we can recompile Valgrind. We would be grateful if you could prepare a patch for us.
If it turns out that this patch resolves the problem, do you think we should report it as a bug via Bugzilla?

Radek
|
|
From: Ivo R. <iv...@iv...> - 2016-12-08 16:25:36
Attachments:
fuse.patch-01
|
2016-12-08 11:28 GMT+01:00 Radoslaw Kujawa <ku...@9l...>:

> Yes, we can recompile Valgrind. We would be grateful if you could prepare a patch for us.
> If it turns out that this patch resolves the problem, do you think we should report it as a bug via Bugzilla?

Please find attached a patch. File a bug anyway and report your findings.

I.
|
|
From: Radoslaw K. <ku...@9l...> - 2016-12-12 14:08:39
|
On 08.12.2016 at 17:24, Ivo Raisr wrote:
> Please find attached a patch. File a bug anyway and report your findings.
> I.

Thanks for the patch, it helped. I've reported this issue via Bugzilla.

Radek
|