From: Jacek M. H. <jac...@gm...> - 2018-01-09 19:05:22
|
Dear Sirs,

> * and we have a bunch of other places where a similar strtok_r loop
> is used, with similar variables not initialised.

I do not know what is so "specific" about these two places but these are the only ones which generate any compiler warnings.

Best regards,
Jacek. |
From: Philippe W. <phi...@sk...> - 2018-01-09 18:59:27
|
The compiler warning messages look somewhat fishy:

* they speak about a variable (e.g. tokens_saveptr) but pointing at a line in a function where there is no such variable (e.g. the 'for (p = s' loop).
* Maybe this is because the compiler does not understand strtok_r. Here is an extract of the man page:

    The strtok_r() function is a reentrant version strtok(). The saveptr argument is a pointer to a char * variable that is used internally by strtok_r() in order to maintain context between successive calls that parse the same string.

    On the first call to strtok_r(), str should point to the string to be parsed, and the value of saveptr is ignored. In subsequent calls, str should be NULL, and saveptr should be unchanged since the previous call.

* and we have a bunch of other places where a similar strtok_r loop is used, with similar variables not initialised.

So, not very clear what is exactly the warning about.

Philippe

On Tue, 2018-01-09 at 14:10 +0100, Jacek M. Holeczek wrote:
> Dear Sirs,
> this is Ubuntu 14.04.5 LTS / x86_64 / gcc (Ubuntu
> 4.8.4-2ubuntu1~14.04.3) 4.8.4.
> While compiling the most current GIT version (as of today), I get some
> warnings:
>
> ----------------------------------------------------------------------
> gcc -DHAVE_CONFIG_H -I. -I.. -I.. -I../include -I../include
> -I../VEX/pub -I../VEX/pub -DVGA_amd64=1 -DVGO_linux=1
> -DVGP_amd64_linux=1 -DVGPV_amd64_linux_vanilla=1 -I../coregrind
> -DVG_LIBDIR="\"/usr/local/lib/valgrind"\" -DVG_PLATFORM="\"amd64-linux\""
> -m64 -O2 -finline-functions -g -std=gnu99 -Wall
> -Wmissing-prototypes -Wshadow -Wpointer-arith -Wstrict-prototypes
> -Wmissing-declarations -Wcast-align -Wcast-qual -Wwrite-strings
> -Wempty-body -Wformat -Wformat-security -Wignored-qualifiers
> -Wmissing-parameter-type -Wold-style-declaration
> -fno-stack-protector -fno-strict-aliasing -fno-builtin
> -fomit-frame-pointer -DENABLE_LINUX_TICKET_LOCK -MT
> libcoregrind_amd64_linux_a-m_libcbase.o -MD -MP -MF
> .deps/libcoregrind_amd64_linux_a-m_libcbase.Tpo -c -o
> libcoregrind_amd64_linux_a-m_libcbase.o `test -f 'm_libcbase.c' || echo
> './'`m_libcbase.c
> m_libcbase.c: In function ‘vgPlain_parse_enum_set’:
> m_libcbase.c:645:19: warning: ‘tokens_saveptr’ may be used uninitialized
> in this function [-Wmaybe-uninitialized]
>       for (p = s; *p != '\0'; ++p) {
>                   ^
> m_libcbase.c:572:11: note: ‘tokens_saveptr’ was declared here
>    HChar *tokens_saveptr;
>           ^
> m_libcbase.c:645:19: warning: ‘input_saveptr’ may be used uninitialized
> in this function [-Wmaybe-uninitialized]
>       for (p = s; *p != '\0'; ++p) {
>                   ^
> m_libcbase.c:580:11: note: ‘input_saveptr’ was declared here
>    HChar *input_saveptr;
>           ^
> ----------------------------------------------------------------------
>
> Hope it helps,
> Best regards,
> Jacek.
>
> ------------------------------------------------------------------------------
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> _______________________________________________
> Valgrind-users mailing list
> Val...@li...
> https://lists.sourceforge.net/lists/listinfo/valgrind-users |
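For readers following the strtok_r discussion above, here is a minimal sketch of the loop pattern in question. It uses the C library's strtok_r rather than Valgrind's own implementation in m_libcbase.c, and the helper name count_tokens is hypothetical. Because the man page extract says the value of saveptr is ignored on the first call, initialising it to NULL should not change behaviour; it is shown only as one common way to quiet -Wmaybe-uninitialized, not as the fix the Valgrind developers chose.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the Valgrind sources): counts the
   comma-separated tokens in a writable, NUL-terminated buffer.
   strtok_r overwrites delimiters in the buffer with '\0'. */
size_t count_tokens(char *input)
{
    assert(input != NULL);

    /* The first strtok_r call ignores this value; the NULL initialiser
       is here only so the compiler sees an initialised variable. */
    char *saveptr = NULL;
    size_t n = 0;

    for (char *tok = strtok_r(input, ",", &saveptr);
         tok != NULL;
         tok = strtok_r(NULL, ",", &saveptr)) {
        n++;
    }
    return n;
}
```

Calling count_tokens on a buffer holding "alpha,beta,,gamma" yields 3, since strtok_r skips empty fields between adjacent delimiters.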
From: Philippe W. <phi...@sk...> - 2018-01-09 18:48:47
|
On Tue, 2018-01-09 at 14:15 +0100, Jacek M. Holeczek wrote:
> Dear Sirs,
> this is CentOS Linux release 7.4.1708 (Core) /
> 3.10.0-693.11.6.el7.x86_64 kernel / gcc (GCC) 4.8.5 20150623 (Red Hat
> 4.8.5-16).
> I tried to use two recent versions of valgrind, the most current GIT (as
> of today) and then the last release (3.13.0).
> Both versions refuse to run "valgrind --tool=exp-sgcheck" in the same
> way (I also tried to completely disable SELinux, if that matters):
>
> ----------------------------------------------------------------------
> [...]$ valgrind --tool=exp-sgcheck /bin/ls
> ==20624== exp-sgcheck, a stack and global array overrun detector
> ==20624== NOTE: This is an Experimental-Class Valgrind Tool
> ==20624== Copyright (C) 2003-2017, and GNU GPL'd, by OpenWorks Ltd et al.
> ==20624== Using Valgrind-3.14.0.GIT and LibVEX; rerun with -h for
> copyright info
> ==20624== Command: /bin/ls
> ==20624==
>
> exp-sgcheck: sg_main.c:2332 (sg_instrument_IRStmt): the 'impossible'
> happened.

The switch statement around that line handles all possible values except 3:
   Ist_LoadG, Ist_StoreG, Ist_LLSC
What is funny is that these 3 values are not really new (they have been introduced in 2012 and 2009).

So, I guess _dl_runtime_resolve_xsave contains an instruction at or around 0x4015F5A that is translated in one of the 3 (unhandled) above values.

I could reproduce something similar with
   valgrind --tool=exp-sgcheck ./memcheck/tests/amd64/xsave-avx

Can you do:
   valgrind --tool=exp-sgcheck --trace-flags=11000000 /bin/ls
This will output a bunch of lines like:
   ==== SB 1639 (evchecks 10601) [tid 1] 0x108b73 do_setup_then_xsave /home/philippe/valgrind/git/trunk_untouched/memcheck/tests/amd64/xsave-avx+0xb73
   ==== SB 1640 (evchecks 10602) [tid 1] 0x108ae5 do_xsave /home/philippe/valgrind/git/trunk_untouched/memcheck/tests/amd64/xsave-avx+0xae5
   ==== SB 1641 (evchecks 10603) [tid 1] 0x108b19 do_xsave+52 /home/philippe/valgrind/git/trunk_untouched/memcheck/tests/amd64/xsave-avx+0xb19

Then redo the command but adding --trace-notbelow=1635 (where the 1635 is somewhat before the failing SB nr (in my case 1641).

Then create a bug in bugzilla and attach the trace obtained.

In my case, the problem is created by the instruction
   0x108B26: xsave (%rsi)
which generates a bunch of guarded store Ist_StoreG.

I suppose (seeing the name of the function that causes the crash for you) that it will similarly be the xsave instruction.

By having a bug in bugzilla with this info, you increase the chance to have this problem not forgotten, and who knows, even solved one day :).

Thanks
Philippe |
From: Jacek M. H. <jac...@gm...> - 2018-01-09 13:15:46
|
Dear Sirs,
this is CentOS Linux release 7.4.1708 (Core) / 3.10.0-693.11.6.el7.x86_64 kernel / gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-16).
I tried to use two recent versions of valgrind, the most current GIT (as of today) and then the last release (3.13.0).
Both versions refuse to run "valgrind --tool=exp-sgcheck" in the same way (I also tried to completely disable SELinux, if that matters):

----------------------------------------------------------------------
[...]$ valgrind --tool=exp-sgcheck /bin/ls
==20624== exp-sgcheck, a stack and global array overrun detector
==20624== NOTE: This is an Experimental-Class Valgrind Tool
==20624== Copyright (C) 2003-2017, and GNU GPL'd, by OpenWorks Ltd et al.
==20624== Using Valgrind-3.14.0.GIT and LibVEX; rerun with -h for copyright info
==20624== Command: /bin/ls
==20624==

exp-sgcheck: sg_main.c:2332 (sg_instrument_IRStmt): the 'impossible' happened.

host stacktrace:
==20624==    at 0x580179CD: show_sched_status_wrk (m_libcassert.c:355)
==20624==    by 0x58017AE4: report_and_quit (m_libcassert.c:426)
==20624==    by 0x58017C71: vgPlain_assert_fail (m_libcassert.c:492)
==20624==    by 0x58010033: sg_instrument_IRStmt (sg_main.c:2332)
==20624==    by 0x5800AE7F: h_instrument (h_main.c:683)
==20624==    by 0x580340C1: tool_instrument_then_gdbserver_if_needed (m_translate.c:232)
==20624==    by 0x58106EE1: LibVEX_FrontEnd (main_main.c:650)
==20624==    by 0x581076EB: LibVEX_Translate (main_main.c:1185)
==20624==    by 0x5803691C: vgPlain_translate (m_translate.c:1805)
==20624==    by 0x58077A96: vgPlain_scheduler (scheduler.c:1056)
==20624==    by 0x5808970A: run_a_thread_NORETURN (syswrap-linux.c:103)

sched status:
  running_tid=1

Thread 1: status = VgTs_Runnable (lwpid 20624)
==20624==    at 0x4015F5A: _dl_runtime_resolve_xsave (in /usr/lib64/ld-2.17.so)

Note: see also the FAQ in the source distribution.
It contains workarounds to several common problems.
In particular, if Valgrind aborted or crashed after identifying problems in your program, there's a good chance that fixing those problems will prevent Valgrind aborting or crashing, especially if it happened in m_mallocfree.c.

If that doesn't help, please report this bug to: www.valgrind.org

In the bug report, send all the above text, the valgrind version, and what OS and version you are using. Thanks.
----------------------------------------------------------------------

I haven't found any hints in the documentation about such an issue.
Could you, please, help me,
Best regards,
Jacek. |
From: Jacek M. H. <jac...@gm...> - 2018-01-09 13:11:07
|
Dear Sirs,
this is Ubuntu 14.04.5 LTS / x86_64 / gcc (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4.
While compiling the most current GIT version (as of today), I get some warnings:

----------------------------------------------------------------------
gcc -DHAVE_CONFIG_H -I. -I.. -I.. -I../include -I../include
-I../VEX/pub -I../VEX/pub -DVGA_amd64=1 -DVGO_linux=1
-DVGP_amd64_linux=1 -DVGPV_amd64_linux_vanilla=1 -I../coregrind
-DVG_LIBDIR="\"/usr/local/lib/valgrind"\" -DVG_PLATFORM="\"amd64-linux\""
-m64 -O2 -finline-functions -g -std=gnu99 -Wall
-Wmissing-prototypes -Wshadow -Wpointer-arith -Wstrict-prototypes
-Wmissing-declarations -Wcast-align -Wcast-qual -Wwrite-strings
-Wempty-body -Wformat -Wformat-security -Wignored-qualifiers
-Wmissing-parameter-type -Wold-style-declaration
-fno-stack-protector -fno-strict-aliasing -fno-builtin
-fomit-frame-pointer -DENABLE_LINUX_TICKET_LOCK -MT
libcoregrind_amd64_linux_a-m_libcbase.o -MD -MP -MF
.deps/libcoregrind_amd64_linux_a-m_libcbase.Tpo -c -o
libcoregrind_amd64_linux_a-m_libcbase.o `test -f 'm_libcbase.c' || echo './'`m_libcbase.c
m_libcbase.c: In function ‘vgPlain_parse_enum_set’:
m_libcbase.c:645:19: warning: ‘tokens_saveptr’ may be used uninitialized in this function [-Wmaybe-uninitialized]
      for (p = s; *p != '\0'; ++p) {
                  ^
m_libcbase.c:572:11: note: ‘tokens_saveptr’ was declared here
   HChar *tokens_saveptr;
          ^
m_libcbase.c:645:19: warning: ‘input_saveptr’ may be used uninitialized in this function [-Wmaybe-uninitialized]
      for (p = s; *p != '\0'; ++p) {
                  ^
m_libcbase.c:580:11: note: ‘input_saveptr’ was declared here
   HChar *input_saveptr;
          ^
----------------------------------------------------------------------

Hope it helps,
Best regards,
Jacek. |
From: 'Mark W. <ma...@kl...> - 2018-01-05 18:42:03
|
On Fri, Jan 05, 2018 at 09:36:16AM -0800, Mark Roberts wrote:
> On Tue, Jan 02, 2018 at 02:39:36PM -0800, Mark Roberts wrote:
> > The problem with valgrind/memcheck/tests/linux/stack_changes.c
> > appears to be an actual problem with the source. In June of
> > 2017 the gnu c header file sys/ucontext.h was changed.
> > typedef struct ucontext is now typedef struct ucontext_t. This change
> > is included in the latest release of glibc 2.26 (Aug 2017).
>
> There is already a patch in git:
> https://sourceware.org/git/?p=valgrind.git;a=commitdiff;h=2b5eab6a8db1b0487a3ad7fc4e7eeda6d3513626;hp=02b719e7b2f4c88eedd8b5689d842a62118cb47a
>
> This patch just changes the definition to the 2.26 version. What about older versions? Doesn't this change need to be
> dependent on a GLIBC_VERSION check in configure?

No. ucontext_t works for older glibc versions too.

Cheers,

Mark |
From: Mark R. <ma...@cs...> - 2018-01-05 17:36:22
|
This patch just changes the definition to the 2.26 version. What about older versions? Doesn't this change need to be dependent on a GLIBC_VERSION check in configure?

-----Original Message-----
From: Mark Wielaard [mailto:ma...@kl...]
Sent: Tuesday, January 02, 2018 3:27 PM
To: Mark Roberts
Cc: 'John Reiser'; val...@li...
Subject: Re: [Valgrind-users] problems with Ubuntu 17.10

On Tue, Jan 02, 2018 at 02:39:36PM -0800, Mark Roberts wrote:
> The problem with valgrind/memcheck/tests/linux/stack_changes.c
> appears to be an actual problem with the source. In June of
> 2017 the gnu c header file sys/ucontext.h was changed.
> typedef struct ucontext is now typedef struct ucontext_t. This change
> is included in the latest release of glibc 2.26 (Aug 2017).

There is already a patch in git:
https://sourceware.org/git/?p=valgrind.git;a=commitdiff;h=2b5eab6a8db1b0487a3ad7fc4e7eeda6d3513626;hp=02b719e7b2f4c88eedd8b5689d842a62118cb47a

But no release yet.

Cheers,

Mark |
From: Mark W. <ma...@kl...> - 2018-01-02 23:44:48
|
On Tue, Jan 02, 2018 at 02:39:36PM -0800, Mark Roberts wrote:
> The problem with valgrind/memcheck/tests/linux/stack_changes.c appears
> to be an actual problem with the source. In June of
> 2017 the gnu c header file sys/ucontext.h was changed.
> typedef struct ucontext is now typedef struct ucontext_t. This change
> is included in the latest release of glibc 2.26 (Aug 2017).

There is already a patch in git:
https://sourceware.org/git/?p=valgrind.git;a=commitdiff;h=2b5eab6a8db1b0487a3ad7fc4e7eeda6d3513626;hp=02b719e7b2f4c88eedd8b5689d842a62118cb47a

But no release yet.

Cheers,

Mark |
From: Mark R. <ma...@cs...> - 2018-01-02 23:11:38
|
I'm so sorry - I left out some information and my message was not clear.

Valgrind itself builds and runs fine; it's the regression tests that failed to compile and/or link.

During my further investigation, the pie/no-pie linking problem turned out to be a bad file in our repository.

The problem with valgrind/memcheck/tests/linux/stack_changes.c appears to be an actual problem with the source. In June of 2017 the gnu c header file sys/ucontext.h was changed. typedef struct ucontext is now typedef struct ucontext_t. This change is included in the latest release of glibc 2.26 (Aug 2017).

Thank you,
Mark Roberts

-----Original Message-----
From: John Reiser [mailto:jr...@bi...]
Sent: Tuesday, January 02, 2018 9:02 AM
To: val...@li...
Subject: Re: [Valgrind-users] problems with Ubuntu 17.10

> But the larger, more pervasive problem is with this release of Ubuntu
> the gcc (7.2.0) compiler has changed to emitting position independent
> code by default. I have tried to add -no-pie to compiler options but
> have not been successful.

Why is that a problem? And is it a problem in building valgrind itself, or in applying an old valgrind to newly-compiled code? Please give an explicit example (copy+paste) of a new complaint, and analyze it as best you can.

-- |
From: John R. <jr...@bi...> - 2018-01-02 17:19:12
|
> But the larger, more pervasive problem is with this release of Ubuntu > the gcc (7.2.0) compiler has changed to emitting position > independent code by default. I have tried to add -no-pie > to compiler options but have not been successful. Why is that a problem? And is it a problem in building valgrind itself, or in applying an old valgrind to newly-compiled code? Please give an explicit example (copy+paste) of a new complaint, and analyze it as best you can. -- |
From: Tom H. <to...@co...> - 2018-01-02 17:06:18
|
On 02/01/18 16:20, Mark Roberts wrote:

> But the larger, more pervasive problem is with this release of Ubuntu the gcc (7.2.0) compiler has changed to emitting position
> independent code by default. I have tried to add -no-pie to compiler options but have not been successful.

Why is that a problem? Fedora 27 uses gcc 7.2 as well and I have no trouble building valgrind there.

Tom

--
Tom Hughes (to...@co...)
http://compton.nu/ |
From: Mark R. <ma...@cs...> - 2018-01-02 16:49:26
|
I am having lots of problems getting Valgrind to run on Ubuntu 17.10 (on x86_64 hardware).

First, valgrind/memcheck/tests/linux/stack_changes.c would not compile. I had to change the typedef on line 13 from ucontext to ucontext_t.

But the larger, more pervasive problem is with this release of Ubuntu the gcc (7.2.0) compiler has changed to emitting position independent code by default. I have tried to add -no-pie to compiler options but have not been successful.

Has anybody had these problems and/or have a solution?

Thank you,
Mark Roberts |
From: Philippe W. <phi...@sk...> - 2017-12-20 19:42:40
|
Does it help to run with --expensive-definedness-checks=yes ?

You might also try 3.14 GIT version, as some work was recently done in the area of definedness checking and the above option was also extended with a 3rd value (auto), which is less expensive.

Philippe

On Wed, 2017-12-20 at 15:37 +0000, Jason Vas Dias wrote:
> Good day -
>
> Please could anyone explain why valgrind v3.13.0, built for x86_64 under Linux
> (RHEL 7.4), is complaining about
> "Conditional jump or move depends on uninitialised value(s)"
> in this case - I cannot see how any memory accessed by this
> code is uninitialized, and inspecting the V bits and shadow
> registers also does not show any 0 bits - the program always
> stops with the above error, at the line
>
> ==26770== Thread 4:
> ==26770== Conditional jump or move depends on uninitialised value(s)
> ==26770==    at 0x5C3EF46: lround (s_llround.c:42)
>
> which is entered via the line in our code:
>
>   const uint32_t delta_time = uint32_t(std::lround(sensor.time * 2e9));
>                                        ^^^^^^^^^^^^^^^^^^^^^^^^
> This is a call to GLIBC v2.17's lround, in glibc source code file:
>   sysdeps/ieee754/dbl-64/wordsize-64/s_llround.c,
> @ line 28:
> long long int
> __llround (double x)
> { // I recompiled glibc to add initializers for
>   // these auto variables, but it made no difference:
>   int32_t j0=0;
>   int64_t i0=0;
>   long long int result=0;
>   int sign=0;
>
>   EXTRACT_WORDS64 (i0, x);
>   j0 = ((i0 >> 52) & 0x7ff) - 0x3ff;
>   sign = i0 < 0 ? -1 : 1;
>   i0 &= UINT64_C(0xfffffffffffff);
>   i0 |= UINT64_C(0x10000000000000);
> @ line 42:
> ==> if (j0 < (int32_t) (8 * sizeof (long long int)) - 1)
>     {
>
> EXTRACT_WORDS64 resolves to an asm statement defined in
> sysdeps/x86_64/fpu/math_private.h:
> /* Direct movement of float into integer register. */
> #define EXTRACT_WORDS64(i, d) \
>   do { \
>     int64_t i_; \
>     asm (MOVD " %1, %0" : "=rm" (i_) : "x" ((double) (d))); \
>     (i) = i_; \
>   } while (0)
>
> .
>
> When I run valgrind with options:
>
>   --tool=memcheck --track-origins=yes --vgdb-shadow-registers=yes
>   --vgdb=yes \
>   --vgdb-error=0 my_program ....
>
> it invariably stops at the same s_llround.c:42 place shown above .
>
> Inspecting the valid bits for both 'j0' (in glibc's __llround) and 'sensor.time'
> (in our code) in GDB shows ALL VALID BITS set :
>
> (gdb is stopped at s_llround.c, line 42):
>
> (gdb) p &j0
> Address requested for identifier "j0" which is in register $rdx
> (gdb) p/x $rdxs1
> $1 = 0xffffffff
> (gdb) p j0
> $1 = 6
>
> // so the j0 variable appears to be valid, according to valgrind's
> shadow register V-bits.
> // So why did valgrind stop at that particular line, where no variable
> or memory other
> // than j0 is being accessed ?
>
> (gdb) up
> ... ( back to our code: delta_time =
> uint32_t(std::lround(sensor.time * 2e9));
> ... sensor is a structure reference variable
> ... )
>
> (gdb) p &sensor->time
> $16 = (double *) 0x10ea9088
> (gdb) mo xb 0x10ea9088 8
>               ff   ff   ff   ff   ff   ff   ff   ff
> 0x10EA9088: 0xef 0xd9 0x0e 0x32 0x57 0x0e 0x6a 0x3e
>
> So how can I tell which valid bit valgrind is complaining about being 0 here ?
> No relevant valid bits appear to be 0 ?
>
> Yes, not all bits for the whole 40 byte 'sensor' structure are valid
> yet (it is in the processof being constructed here) but the 8 bytes
> referenced by 'sensor.time' ARE VALID , and no other bits can be
> accessed by the statement at which valgrind stops.
>
> It just says at the end:
> ==26770== Uninitialised value was created by a stack allocation
> ==26770==    at 0x4E2979: main (Main.cpp:88)
>
> Yes, the 'sensor' structure is part of a 200MB array created at
> program initialization , which is populated by SPI + GPIO + DMA reads
> from an embedded device, in the multi-threaded program. But the memory
> being accessed by the statement above HAS ALL VALID BITS SET, so I
> cannot see what valgrind is complaining about here .
>
> I'd really appreciate some kind of '--show-valid-bits-and-addresses'
> option to valgrind, which would make it display exactly the valid bits
> it found to be 0, and which memory addresses / registers they
> correspond to .
>
> I believe the above behavior represents a BUG in latest version of
> valgrind, because
> NO RELEVANT VALID BITS ARE ZERO , AFAICS.
>
> valgrind-3.12.0 (the RHEL-7.4 default version) displays the same behavior , and
> stops at the same place with the same error.
>
> I'd really like to test our program with valgrind, but false positives such as
> the above are blocking this - I am having to abandon valgrind testing because of
> this issue , because valgrind appears to be too buggy to use. The program runs
> fine outside of valgrind without any errors (usually) - but as I am changing it
> I'd like to run it under valgrind as part of standard automated testing.
>
> Any ideas / suggestions how to resolve this false positive, or proof that it is
> not a false positive, would be most gratefully received.
>
> Thanks in advance & Best Regards .
> Jason Vas Dias |
From: Jason V. D. <jas...@gm...> - 2017-12-20 15:37:16
|
Good day -

Please could anyone explain why valgrind v3.13.0, built for x86_64 under Linux (RHEL 7.4), is complaining about
"Conditional jump or move depends on uninitialised value(s)"
in this case - I cannot see how any memory accessed by this code is uninitialized, and inspecting the V bits and shadow registers also does not show any 0 bits - the program always stops with the above error, at the line

==26770== Thread 4:
==26770== Conditional jump or move depends on uninitialised value(s)
==26770==    at 0x5C3EF46: lround (s_llround.c:42)

which is entered via the line in our code:

  const uint32_t delta_time = uint32_t(std::lround(sensor.time * 2e9));
                                       ^^^^^^^^^^^^^^^^^^^^^^^^
This is a call to GLIBC v2.17's lround, in glibc source code file:
  sysdeps/ieee754/dbl-64/wordsize-64/s_llround.c,
@ line 28:
long long int
__llround (double x)
{ // I recompiled glibc to add initializers for
  // these auto variables, but it made no difference:
  int32_t j0=0;
  int64_t i0=0;
  long long int result=0;
  int sign=0;

  EXTRACT_WORDS64 (i0, x);
  j0 = ((i0 >> 52) & 0x7ff) - 0x3ff;
  sign = i0 < 0 ? -1 : 1;
  i0 &= UINT64_C(0xfffffffffffff);
  i0 |= UINT64_C(0x10000000000000);
@ line 42:
==> if (j0 < (int32_t) (8 * sizeof (long long int)) - 1)
    {

EXTRACT_WORDS64 resolves to an asm statement defined in sysdeps/x86_64/fpu/math_private.h:
/* Direct movement of float into integer register. */
#define EXTRACT_WORDS64(i, d) \
  do { \
    int64_t i_; \
    asm (MOVD " %1, %0" : "=rm" (i_) : "x" ((double) (d))); \
    (i) = i_; \
  } while (0)

.

When I run valgrind with options:

  --tool=memcheck --track-origins=yes --vgdb-shadow-registers=yes --vgdb=yes \
  --vgdb-error=0 my_program ....

it invariably stops at the same s_llround.c:42 place shown above .

Inspecting the valid bits for both 'j0' (in glibc's __llround) and 'sensor.time' (in our code) in GDB shows ALL VALID BITS set :

(gdb is stopped at s_llround.c, line 42):

(gdb) p &j0
Address requested for identifier "j0" which is in register $rdx
(gdb) p/x $rdxs1
$1 = 0xffffffff
(gdb) p j0
$1 = 6

// so the j0 variable appears to be valid, according to valgrind's shadow register V-bits.
// So why did valgrind stop at that particular line, where no variable or memory other
// than j0 is being accessed ?

(gdb) up
... ( back to our code: delta_time = uint32_t(std::lround(sensor.time * 2e9));
... sensor is a structure reference variable
... )

(gdb) p &sensor->time
$16 = (double *) 0x10ea9088
(gdb) mo xb 0x10ea9088 8
              ff   ff   ff   ff   ff   ff   ff   ff
0x10EA9088: 0xef 0xd9 0x0e 0x32 0x57 0x0e 0x6a 0x3e

So how can I tell which valid bit valgrind is complaining about being 0 here ?
No relevant valid bits appear to be 0 ?

Yes, not all bits for the whole 40 byte 'sensor' structure are valid yet (it is in the process of being constructed here) but the 8 bytes referenced by 'sensor.time' ARE VALID , and no other bits can be accessed by the statement at which valgrind stops.

It just says at the end:
==26770== Uninitialised value was created by a stack allocation
==26770==    at 0x4E2979: main (Main.cpp:88)

Yes, the 'sensor' structure is part of a 200MB array created at program initialization , which is populated by SPI + GPIO + DMA reads from an embedded device, in the multi-threaded program. But the memory being accessed by the statement above HAS ALL VALID BITS SET, so I cannot see what valgrind is complaining about here .

I'd really appreciate some kind of '--show-valid-bits-and-addresses' option to valgrind, which would make it display exactly the valid bits it found to be 0, and which memory addresses / registers they correspond to .

I believe the above behavior represents a BUG in latest version of valgrind, because
NO RELEVANT VALID BITS ARE ZERO , AFAICS.

valgrind-3.12.0 (the RHEL-7.4 default version) displays the same behavior , and stops at the same place with the same error.

I'd really like to test our program with valgrind, but false positives such as the above are blocking this - I am having to abandon valgrind testing because of this issue , because valgrind appears to be too buggy to use. The program runs fine outside of valgrind without any errors (usually) - but as I am changing it I'd like to run it under valgrind as part of standard automated testing.

Any ideas / suggestions how to resolve this false positive, or proof that it is not a false positive, would be most gratefully received.

Thanks in advance & Best Regards .
Jason Vas Dias |
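To make the arithmetic in the quoted __llround excerpt easier to follow, here is a minimal, portable sketch of its first two steps: copying the bits of the double into a 64-bit integer and extracting the unbiased exponent j0. It substitutes memcpy for glibc's EXTRACT_WORDS64 asm macro, which is an assumption made for portability and not how glibc does it, and the function name unbiased_exponent is hypothetical. It says nothing about which V bits Memcheck tracks; it only shows what value the branch at line 42 compares.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper mirroring the start of the quoted __llround:
   returns the unbiased binary exponent of x (the value called j0
   in the excerpt). Assumes IEEE 754 binary64 doubles. */
int32_t unbiased_exponent(double x)
{
    int64_t i0;

    assert(sizeof x == sizeof i0);

    /* Portable stand-in for EXTRACT_WORDS64 (i0, x): copy the raw bits. */
    memcpy(&i0, &x, sizeof i0);

    /* Bits 52..62 hold the biased exponent; subtract the bias 0x3ff. */
    return (int32_t)((i0 >> 52) & 0x7ff) - 0x3ff;
}
```

Any argument in the range [64, 128) gives 6, which matches the `p j0` output of 6 shown in the gdb session for sensor.time * 2e9.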
From: Silva J. <joa...@al...> - 2017-12-10 21:10:05
|
Then you have to understand what this task is doing. Isn't the backtrace pointing at what the code is doing and what this read could be ? Look at the file descriptor on which it is reading and see what this fd is ? Is it a real file ? (unlikely to be blocking then) Is it a pipe ? A tcp/ip connection ? Use lsof if you cannot determine in gdb what this fd is for.

[JMSS] It's a file. After I added some prints for debugging, the main thread seem to get unblocked (?)

This can be false positive of course, and of course, this can be a true positive :). With only an address, no access to the code, no backtrace, no reproducer, there is not much feedback we can give. Let me just tell that at my work, we have added for helgrind a few suppression entries related to the 'low level implementation of the gnat runtime', to suppress false positive created by the low level inner working of the runtime. To see what you case is, the minimum needed would be the stack traces of the error msg.

In summary, at this point, it looks like you have to debug your application when running under valgrind, and then you might determine if what you see is a real application bug, or a valgrind bug/limitation e.g. in the valgrind scheduler/signal handling/syscall handling or whatever. At this state, without further info, let's assume you have an application bug :)

[JMSS] Yes, I think that we are now able to debug the application. It seems to run under nulgrind, helgrind and memcheck, so it should be "good" to go.

[JMSS] I'll try to provide the patch for the configuration now.

João M. S. Silva |
From: Ivo R. <iv...@iv...> - 2017-12-09 19:26:28
|
2017-12-08 22:18 GMT+01:00 Rob Boehne <ro...@da...>:
> All,
>
> I built valgrind 3.13.0 from source with prefix=/opt/valgrind-3.13 and
> installed there on a 64-bit Intel Solaris machine.
> It seems to work well for the program passed on the command line, but
> tracing children doesn’t.
>
> Whenever I pass --trace-children=yes on the command line I get this message:
>
> valgrind: failed to start tool 'memcheck' for platform 'x86-solaris': No
> such file or directory
>
> I see files under the lib directory /opt/valgrind-3.12/lib/ that are labeled
> “amd64-solaris”

Hi Rob,

It seems the 32-bit binaries were not built or installed.
Perhaps you specified '--enable-only64bit' with configure?
If yes, then do not use this option.

Alternatively, please post the configure summary. For me it looks like:

         Maximum build arch: amd64
         Primary build arch: amd64
       Secondary build arch: x86
                   Build OS: solaris
       Primary build target: AMD64_SOLARIS
     Secondary build target: X86_SOLARIS
           Platform variant: vanilla
      Primary -DVGPV string: -DVGPV_amd64_solaris_vanilla=1
         Default supp files: solaris11.supp

I. |
From: Rob B. <ro...@da...> - 2017-12-09 00:52:00
|
All,

I built valgrind 3.13.0 from source with prefix=/opt/valgrind-3.13 and installed there on a 64-bit Intel Solaris machine.
It seems to work well for the program passed on the command line, but tracing children doesn’t.

Whenever I pass --trace-children=yes on the command line I get this message:

valgrind: failed to start tool 'memcheck' for platform 'x86-solaris': No such file or directory

I see files under the lib directory /opt/valgrind-3.12/lib/ that are labeled “amd64-solaris”

robb@solaris11-x64:~$ ls -l /opt/valgrind-3.13/lib/valgrind/ | grep solaris
-rwxrwxr-x 1 root devadmin 11091192 Oct 28 04:25 cachegrind-amd64-solaris
-rwxrwxr-x 1 root devadmin 11630792 Oct 28 04:25 callgrind-amd64-solaris
-rwxrwxr-x 1 root devadmin 11671896 Oct 28 04:26 drd-amd64-solaris
-rwxrwxr-x 1 root devadmin 10936944 Oct 28 04:26 exp-bbv-amd64-solaris
-rwxrwxr-x 1 root devadmin 10976704 Oct 28 04:26 exp-dhat-amd64-solaris
-rwxrwxr-x 1 root devadmin 11136272 Oct 28 04:26 exp-sgcheck-amd64-solaris
-rwxrwxr-x 1 root devadmin    16080 Oct 28 04:26 getoff-amd64-solaris
-rwxrwxr-x 1 root devadmin 11705496 Oct 28 04:26 helgrind-amd64-solaris
-rwxrwxr-x 1 root devadmin 10954104 Oct 28 04:26 lackey-amd64-solaris
<SNIP>

but nothing ‘x86-solaris’ - perhaps that’s the issue?

Thanks,

Rob Boehne |
From: David C. <dcc...@ac...> - 2017-12-08 10:08:09
|
On 12/7/2017 1:36 PM, Yusuf Pisan wrote:
> valgrind report memory as being reachable when I think it has been
> properly freed in the below program. Is this a bug, a feature, a
> misunderstanding of how to use delete by me?
>
> Thanks
> Yusuf
>
> ===================
> #include <iostream>
>
> using namespace std;
>
> int test() {
>   int* p = new int[5];
>   delete [] p;
>   return 0;
> }
>
> int main() {
>   test();
>   return 0;
> }
> ===================
>
> $ uname -a
> Linux uw1-320-10 4.4.0-98-generic #121-Ubuntu SMP Tue Oct 10 14:24:03 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
>
> $ valgrind --version
> valgrind-3.11.0
>
> $ g++ --version
> g++ (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609
>
> $ g++ -g -Wall -Wextra valgrind-example.cpp -o valgrind-example
>
> $ valgrind --leak-check=full --show-leak-kinds=all ./valgrind-example
> ==11597== Memcheck, a memory error detector
> ==11597== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
> ==11597== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
> ==11597== Command: ./valgrind-example
> ==11597==
> ==11597==
> ==11597== HEAP SUMMARY:
> ==11597==     in use at exit: 72,704 bytes in 1 blocks
> ==11597==   total heap usage: 2 allocs, 1 frees, 72,724 bytes allocated
> ==11597==
> ==11597== 72,704 bytes in 1 blocks are still reachable in loss record 1 of 1
> ==11597==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==11597==    by 0x4EC3EFF: ??? (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21)
> ==11597==    by 0x40106B9: call_init.part.0 (dl-init.c:72)
> ==11597==    by 0x40107CA: call_init (dl-init.c:30)
> ==11597==    by 0x40107CA: _dl_init (dl-init.c:120)
> ==11597==    by 0x4000C69: ??? (in /lib/x86_64-linux-gnu/ld-2.23.so)
> ==11597==
> ==11597== LEAK SUMMARY:
> ==11597==    definitely lost: 0 bytes in 0 blocks
> ==11597==    indirectly lost: 0 bytes in 0 blocks
> ==11597==      possibly lost: 0 bytes in 0 blocks
> ==11597==    still reachable: 72,704 bytes in 1 blocks
> ==11597==         suppressed: 0 bytes in 0 blocks
> ==11597==
> ==11597== For counts of detected and suppressed errors, rerun with: -v
> ==11597== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
> $

Look at the call chain - the still-reachable memory is allocated outside of your code. It looks like the fault is in the C++ startup library that is provided by your Linux installation. You could add a valgrind suppression to hide this "false positive" (as far as your program is concerned) but I've never bothered; I simply learn to ignore them.

--
David Chapman dcc...@ac...
Chapman Consulting -- San Jose, CA
EDA Software Developer, Expert Witness
www.chapman-consulting-sj.com |
From: Yusuf P. <pi...@uw...> - 2017-12-07 21:52:00
|
valgrind reports memory as being reachable when I think it has been properly freed in the below program. Is this a bug, a feature, a misunderstanding of how to use delete by me?

Thanks
Yusuf

===================
#include <iostream>

using namespace std;

int test() {
  int* p = new int[5];
  delete [] p;
  return 0;
}

int main() {
  test();
  return 0;
}
===================

$ uname -a
Linux uw1-320-10 4.4.0-98-generic #121-Ubuntu SMP Tue Oct 10 14:24:03 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

$ valgrind --version
valgrind-3.11.0

$ g++ --version
g++ (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609

$ g++ -g -Wall -Wextra valgrind-example.cpp -o valgrind-example

$ valgrind --leak-check=full --show-leak-kinds=all ./valgrind-example
==11597== Memcheck, a memory error detector
==11597== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==11597== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==11597== Command: ./valgrind-example
==11597==
==11597==
==11597== HEAP SUMMARY:
==11597==     in use at exit: 72,704 bytes in 1 blocks
==11597==   total heap usage: 2 allocs, 1 frees, 72,724 bytes allocated
==11597==
==11597== 72,704 bytes in 1 blocks are still reachable in loss record 1 of 1
==11597==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==11597==    by 0x4EC3EFF: ??? (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21)
==11597==    by 0x40106B9: call_init.part.0 (dl-init.c:72)
==11597==    by 0x40107CA: call_init (dl-init.c:30)
==11597==    by 0x40107CA: _dl_init (dl-init.c:120)
==11597==    by 0x4000C69: ??? (in /lib/x86_64-linux-gnu/ld-2.23.so)
==11597==
==11597== LEAK SUMMARY:
==11597==    definitely lost: 0 bytes in 0 blocks
==11597==    indirectly lost: 0 bytes in 0 blocks
==11597==      possibly lost: 0 bytes in 0 blocks
==11597==    still reachable: 72,704 bytes in 1 blocks
==11597==         suppressed: 0 bytes in 0 blocks
==11597==
==11597== For counts of detected and suppressed errors, rerun with: -v
==11597== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
$ |
From: Philippe W. <phi...@sk...> - 2017-12-07 19:40:00
|
On Thu, 2017-12-07 at 12:39 +0000, Silva João wrote:
> > If you have 39 tasks in Runnable state, I guess that they are not all
> > blocked in libc read?
> > So, you might investigate which task(s) are really still doing something
> > by doing e.g.
> >   thread apply all bt
> > and/or put breakpoints at places that you know should be soon encountered
> > by the runnable tasks and continue the execution.
> > And then control-c, and redo the above to see if some tasks are/have still
> > progressed.
> >
> > Also, from gdb, you can do
> >   monitor v.info scheduler
> > to have the valgrind status of the tasks/threads.
>
> Thanks for the commands.
>
> All 38 threads are waiting on the 1st one (pthread_cond_wait). The first one is blocked on the file read.

Then you have to understand what this task is doing. Isn't the backtrace pointing at what the code is doing and what this read could be? Look at the file descriptor on which it is reading and see what this fd is. Is it a real file (unlikely to be blocking then)? Is it a pipe? A TCP/IP connection? Use lsof if you cannot determine in gdb what this fd is for.
And then you have to guess why this read does not return.

> > You can also use the option --trace-sched=yes to see how and if valgrind
> > still schedules the threads.
> >
> > Note that at my work, we are using valgrind + Ada tasks without
> > any particular problem.
>
> The trace-sched option produces a lot of output continuously:
>
> --20282-- SCHED[1]: releasing lock (VG_(scheduler):timeslice) -> VgTs_Yielding
> --20282-- SCHED[1]: acquired lock (VG_(scheduler):timeslice)
> --20282-- SCHED[1]: releasing lock (VG_(client_syscall)[async]) -> VgTs_WaitSys
> --20282-- SCHED[1]: acquired lock (VG_(client_syscall)[async])

This shows that on the valgrind side, task 1 is not blocked and valgrind still lets it run.
Now, you might understand what is happening with the above suggestions (gdb backtrace, lsof, ...).
If the above does not clarify, you might learn more/have a hint by adding
  --trace-syscalls=yes --trace-signals=yes
You might also compare the syscalls executed natively and under valgrind by using strace. In this comparison, you will have to take into account that valgrind changes the way threads are scheduled, and valgrind introduces some syscalls for its own internal kitchen. So, the comparison is not mechanical ...

> > Possibly also, the change in the way the threads are scheduled causes an
> > application deadlock.
> >
> > You might thus also try --tool=helgrind just in case this would reveal some
> > non thread safe bug ...
>
> There seems to be some thread errors like:
>
> Lock at 0xD92BC0 was first observed
> Possible data race during read of size 8 at 0x7B459FD8 by thread #1
> This conflicts with a previous write of size 8 by thread #2
>
> Or can these be false positives?

This can be a false positive of course, and of course, this can be a true positive :). With only an address, no access to the code, no backtrace, no reproducer, there is not much feedback we can give.
Let me just say that at my work, we have added for helgrind a few suppression entries related to the 'low level implementation of the gnat runtime', to suppress false positives created by the low-level inner workings of the runtime.
To see what your case is, the minimum needed would be the stack traces of the error msg.

In summary, at this point, it looks like you have to debug your application when running under valgrind, and then you might determine if what you see is a real application bug, or a valgrind bug/limitation e.g. in the valgrind scheduler/signal handling/syscall handling or whatever.
At this stage, without further info, let's assume you have an application bug :)

Philippe |
From: Silva J. <joa...@al...> - 2017-12-07 12:40:10
|
> If you have 39 tasks in Runnable state, I guess that they are not all
> blocked in libc read?
> So, you might investigate which task(s) are really still doing something
> by doing e.g.
>   thread apply all bt
> and/or put breakpoints at places that you know should be soon encountered
> by the runnable tasks and continue the execution.
> And then control-c, and redo the above to see if some tasks are/have still
> progressed.
>
> Also, from gdb, you can do
>   monitor v.info scheduler
> to have the valgrind status of the tasks/threads.

Thanks for the commands.

All 38 threads are waiting on the 1st one (pthread_cond_wait). The first one is blocked on the file read.

> You can also use the option --trace-sched=yes to see how and if valgrind
> still schedules the threads.
>
> Note that at my work, we are using valgrind + Ada tasks without
> any particular problem.

The trace-sched option produces a lot of output continuously:

--20282-- SCHED[1]: releasing lock (VG_(scheduler):timeslice) -> VgTs_Yielding
--20282-- SCHED[1]: acquired lock (VG_(scheduler):timeslice)
--20282-- SCHED[1]: releasing lock (VG_(client_syscall)[async]) -> VgTs_WaitSys
--20282-- SCHED[1]: acquired lock (VG_(client_syscall)[async])

> Possibly also, the change in the way the threads are scheduled causes an
> application deadlock.
>
> You might thus also try --tool=helgrind just in case this would reveal some
> non thread safe bug ...

There seem to be some thread errors like:

Lock at 0xD92BC0 was first observed
Possible data race during read of size 8 at 0x7B459FD8 by thread #1
This conflicts with a previous write of size 8 by thread #2

Or can these be false positives?

João M. S. Silva |
From: Philippe W. <phi...@sk...> - 2017-12-06 18:29:23
|
On Wed, 2017-12-06 at 13:00 +0000, Silva João wrote:
> From info task, all tasks are in "Runnable" state. There are 39 in this list.
>
> From gdb the program seems blocked reading a file (libc read()).

If you have 39 tasks in Runnable state, I guess that they are not all blocked in libc read?
So, you might investigate which task(s) are really still doing something by doing e.g.
  thread apply all bt
and/or put breakpoints at places that you know should be soon encountered by the runnable tasks and continue the execution.
And then control-c, and redo the above to see if some tasks are/have still progressed.

Also, from gdb, you can do
  monitor v.info scheduler
to have the valgrind status of the tasks/threads.

You can also use the option --trace-sched=yes to see how and if valgrind still schedules the threads.

Note that at my work, we are using valgrind + Ada tasks without any particular problem.

> But when I list (l) in gdb it does not show the correct line. It seems the synchronization with the source file is not correct.
>
> > As you are using Ada/SPARK, I guess you use a special tasking profile
> > (e.g. ravenscar or similar). I have no idea how such a tasking profile
> > interacts with the single thread at a time model of Valgrind.
> >
> > You might try --fair-sched=yes to see if that helps.
>
> I tried but without success.
>
> > Also try --tool=none, just to see if the problem/blockage status
> > is linked to memcheck, or to the valgrind scheduling and Ada/SPARK
> > interaction.
>
> With nulgrind it still "hangs" so it seems related to the thread issue you mention.

Possibly also, the change in the way the threads are scheduled causes an application deadlock.

You might thus also try --tool=helgrind just in case this would reveal some non thread safe bug ...

Philippe |
From: Silva J. <joa...@al...> - 2017-12-06 13:00:28
|
Thanks.

> The above error is triggered because you are using the gnat 'used stack'
> measurement package. This package 'paints' the stack to see what has
> been consumed. It paints more than the program really uses (of course :),
> and so it is completely normal that memcheck reports an error.

OK, I have removed the -fstack-usage switch and the error vanished.

> As discussed above, the error 'looks' normal, and should IMO be ignored
> (you might want to disable the stack measurement functionality).
>
> To see why your program is blocked, use vgdb as suggested by Ivo.
> Then do 'info task' to see the status of the Ada tasks, and if they
> are blocked, and on what.

From info task, all tasks are in "Runnable" state. There are 39 in this list.

From gdb the program seems blocked reading a file (libc read()).

But when I list (l) in gdb it does not show the correct line. It seems the synchronization with the source file is not correct.

> As you are using Ada/SPARK, I guess you use a special tasking profile
> (e.g. ravenscar or similar). I have no idea how such a tasking profile
> interacts with the single thread at a time model of Valgrind.
>
> You might try --fair-sched=yes to see if that helps.

I tried but without success.

> Also try --tool=none, just to see if the problem/blockage status
> is linked to memcheck, or to the valgrind scheduling and Ada/SPARK
> interaction.

With nulgrind it still "hangs" so it seems related to the thread issue you mention.

João M. S. Silva |
From: Philippe W. <phi...@sk...> - 2017-12-05 19:15:38
|
On Tue, 2017-12-05 at 15:41 +0000, Silva João wrote:
> > > ==44156== Thread 2 FDP_MRP_Recover_:
> > > ==44156== Invalid write of size 4
> > > ==44156==    at 0x78BB33: system__stack_usage__fill_stack (in /u/wh/rel/ifaplrel/pw_fwp_engine.eab)
> > > ==44156==    by 0x75C49B: system__tasking__stages__task_wrapper (in /u/wh/rel/ifaplrel/pw_fwp_engine.eab)
> > > ==44156==    by 0x79E1CDC4: start_thread (in /usr/lib64/libpthread-2.17.so)
> > > ==44156==    by 0x7ADCC76C: clone (in /usr/lib64/libc-2.17.so)
> > > ==44156==  Address 0x78972a08 is on thread 2's stack
> > > ==44156==  272 bytes below stack pointer
> >
> > This is probably a valid complaint.

The above error is triggered because you are using the gnat 'used stack' measurement package. This package 'paints' the stack to see what has been consumed. It paints more than the program really uses (of course :), and so it is completely normal that memcheck reports an error.

> > Does the program complete afterwards?
>
> It does not stop, but seems to stay in a "numb" state.
>
> But if running outside of Valgrind it seems to run OK.
>
> > To check for that problem, start Valgrind with vgdb enabled as per:
> > Assuming this problem is the first one reported, pass "--vgdb-error=1".

As discussed above, the error 'looks' normal, and should IMO be ignored (you might want to disable the stack measurement functionality).

To see why your program is blocked, use vgdb as suggested by Ivo.
Then do 'info task' to see the status of the Ada tasks, and if they are blocked, and on what.

As you are using Ada/SPARK, I guess you use a special tasking profile (e.g. ravenscar or similar). I have no idea how such a tasking profile interacts with the single thread at a time model of Valgrind.

You might try --fair-sched=yes to see if that helps.

Also try --tool=none, just to see if the problem/blockage status is linked to memcheck, or to the valgrind scheduling and Ada/SPARK interaction.

Philippe |
From: Silva J. <joa...@al...> - 2017-12-05 15:42:15
|
> Are you running ./autogen.sh && ./configure so that changes from
> configure.ac get in effect?
> You can also do 'make distclean' before './autogen.sh && ./configure'
> so as to clean up everything.

Yes, I am but it doesn't work. I guess this might be a bug in configure or the makefile?

Running make clean requires rebuilding everything, which takes some minutes.

> > ==44156== Thread 2 FDP_MRP_Recover_:
> > ==44156== Invalid write of size 4
> > ==44156==    at 0x78BB33: system__stack_usage__fill_stack (in /u/wh/rel/ifaplrel/pw_fwp_engine.eab)
> > ==44156==    by 0x75C49B: system__tasking__stages__task_wrapper (in /u/wh/rel/ifaplrel/pw_fwp_engine.eab)
> > ==44156==    by 0x79E1CDC4: start_thread (in /usr/lib64/libpthread-2.17.so)
> > ==44156==    by 0x7ADCC76C: clone (in /usr/lib64/libc-2.17.so)
> > ==44156==  Address 0x78972a08 is on thread 2's stack
> > ==44156==  272 bytes below stack pointer
>
> This is probably a valid complaint.
> Does the program complete afterwards?

It does not stop, but seems to stay in a "numb" state.

But if running outside of Valgrind it seems to run OK.

> To check for that problem, start Valgrind with vgdb enabled as per:
> Assuming this problem is the first one reported, pass "--vgdb-error=1".
>
> After gdb stops on encountering that particular problem, print the
> Valgrind address space layout (SEGMENTS) with:
> (gdb) monitor v.info memory [aspacemgr]
>
> Also print the registers, in particular stack pointer (rsp) and the
> function disassembly (disas in gdb).
>
> You can either correlate this output by yourself or post it here.

These are the results:

(gdb) monitor v.info memory
2,110,115,840 bytes have already been mmap-ed ANONYMOUS.
--50867-- core : 8,388,608/ 8,388,608 max/curr mmap'd, 0/0 unsplit/split sb unmmap'd, 3,927,928/ 3,804,944 max/curr, 23732/ 29302048 totalloc-blocks/bytes, 23720 searches 8 rzB
--50867-- dinfo : 26,296,320/ 19,283,968 max/curr mmap'd, 2/15 unsplit/split sb unmmap'd, 25,664,864/ 16,859,824 max/curr, 331843/ 102928896 totalloc-blocks/bytes, 346351 searches 8 rzB
--50867-- client : 4,194,304/ 4,194,304 max/curr mmap'd, 0/0 unsplit/split sb unmmap'd, 1,738,032/ 1,738,032 max/curr, 11/ 1738032 totalloc-blocks/bytes, 10 searches 24 rzB
--50867-- demangle: 65,536/ 65,536 max/curr mmap'd, 0/0 unsplit/split sb unmmap'd, 800/ 544 max/curr, 24/ 3584 totalloc-blocks/bytes, 23 searches 8 rzB
--50867-- ttaux : 221,184/ 221,184 max/curr mmap'd, 0/1 unsplit/split sb unmmap'd, 167,616/ 115,584 max/curr, 695/ 306496 totalloc-blocks/bytes, 694 searches 8 rzB

(gdb) monitor v.info memory aspacemgr
2,110,115,840 bytes have already been mmap-ed ANONYMOUS.
--50867-- core : 8,388,608/ 8,388,608 max/curr mmap'd, 0/0 unsplit/split sb unmmap'd, 3,927,928/ 3,804,944 max/curr, 23747/ 29435632 totalloc-blocks/bytes, 23735 searches 8 rzB
--50867-- dinfo : 26,296,320/ 19,283,968 max/curr mmap'd, 2/15 unsplit/split sb unmmap'd, 25,664,864/ 16,859,824 max/curr, 331843/ 102928896 totalloc-blocks/bytes, 346351 searches 8 rzB
--50867-- client : 4,194,304/ 4,194,304 max/curr mmap'd, 0/0 unsplit/split sb unmmap'd, 1,738,032/ 1,738,032 max/curr, 11/ 1738032 totalloc-blocks/bytes, 10 searches 24 rzB
--50867-- demangle: 65,536/ 65,536 max/curr mmap'd, 0/0 unsplit/split sb unmmap'd, 800/ 544 max/curr, 24/ 3584 totalloc-blocks/bytes, 23 searches 8 rzB
--50867-- ttaux : 221,184/ 221,184 max/curr mmap'd, 0/1 unsplit/split sb unmmap'd, 167,616/ 115,584 max/curr, 695/ 306496 totalloc-blocks/bytes, 694 searches 8 rzB

(gdb) p $rsp
$1 = (access void) 0x78972b18

(gdb) disas
Dump of assembler code for function system__stack_usage__fill_stack:
   0x000000000078bae0 <+0>: movslq 0x2c(%rdi),%rdx
   0x000000000078bae4 <+4>: mov 0x20(%rdi),%rsi
   0x000000000078bae8 <+8>: lea -0xc(%rsp),%r8
   0x000000000078baed <+13>: mov %rsi,%rcx
   0x000000000078baf0 <+16>: sub %rdx,%rcx
   0x000000000078baf3 <+19>: mov %rdx,%rax
   0x000000000078baf6 <+22>: lea -0x10c(%rsp),%rdx
   0x000000000078bafe <+30>: cmp %rdx,%rcx
   0x000000000078bb01 <+33>: ja 0x78bb40 <system__stack_usage__fill_stack+96>
   0x000000000078bb03 <+35>: cmp %rdx,%rsi
   0x000000000078bb06 <+38>: mov %rcx,0x38(%rdi)
   0x000000000078bb0a <+42>: jbe 0x78bb18 <system__stack_usage__fill_stack+56>
   0x000000000078bb0c <+44>: lea -0x100(%r8),%eax
   0x000000000078bb13 <+51>: sub %ecx,%eax
   0x000000000078bb15 <+53>: mov %eax,0x2c(%rdi)
   0x000000000078bb18 <+56>: lea 0x3(%rax),%edx
   0x000000000078bb1b <+59>: test %eax,%eax
   0x000000000078bb1d <+61>: mov %rcx,0x48(%rdi)
   0x000000000078bb21 <+65>: cmovns %eax,%edx
   0x000000000078bb24 <+68>: sar $0x2,%edx
   0x000000000078bb27 <+71>: test %edx,%edx
   0x000000000078bb29 <+73>: movslq %edx,%rax
   0x000000000078bb2c <+76>: jle 0x78bb3d <system__stack_usage__fill_stack+93>
   0x000000000078bb2e <+78>: xchg %ax,%ax
   0x000000000078bb30 <+80>: mov 0x30(%rdi),%edx
=> 0x000000000078bb33 <+83>: mov %edx,-0x4(%rcx,%rax,4)
   0x000000000078bb37 <+87>: sub $0x1,%rax
   0x000000000078bb3b <+91>: jne 0x78bb30 <system__stack_usage__fill_stack+80>
   0x000000000078bb3d <+93>: retq
   0x000000000078bb3e <+94>: xchg %ax,%ax
   0x000000000078bb40 <+96>: movl $0x0,0x2c(%rdi)
   0x000000000078bb47 <+103>: retq
End of assembler dump.

I've never gone so deep in analyzing Valgrind's output, so this is a bit unknown to me :P

> P.S. I would rather double check again why pw_fwp_engine.eab is
> allocating such a huge BSS segment. That executable is quite a tiny
> one (approx 4MB file size)?

It's from what I said before: in this program all memory is allocated statically. No dynamic memory allocation is allowed. That applies to the Ada/SPARK part. The C/C++ parts, which relate to the use of libxerces, may use dynamic memory. This is the part we want to analyze.

So every bit of memory that the program needs is allocated from the beginning. For instance, the arrays are allocated with the worst case length.

João M. S. Silva |