From: Julian S. <jse...@gm...> - 2022-08-05 15:22:04

On 05/08/2022 16:08, Tom Hughes via Valgrind-users wrote:
> If you want to know for sure who killed it then strace it while
> it runs and it should show you who sends the signal, but my bet is
> that it's the kernel.

Or possibly watch `dmesg -w` running in another shell.

J
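A minimal sketch combining both suggestions (./app stands in for the real
program, and the exact siginfo line strace prints varies by version):

    $ strace -f -e trace=signal -o strace.log valgrind --leak-check=full ./app
    $ grep -B1 'killed by SIGKILL' strace.log
    # a preceding line such as
    #   --- SIGKILL {si_signo=SIGKILL, si_code=SI_USER, si_pid=1234, ...} ---
    # names the sending process; a kernel OOM kill shows up in the kernel log:
    $ dmesg -w | grep -iE 'out of memory|oom|killed process'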
From: John R. <jr...@bi...> - 2022-08-05 15:11:27

> When running memcheck on a massive monolith embedded executable (237MB
> stripped, 1.8GiB unstripped), after I stop the executable under valgrind
> I see the "HEAP SUMMARY" but then valgrind dies before any leak reports
> are printed.

If finding memory leaks is the only goal (for instance, if you are satisfied
that memcheck has found all the overrun blocks, uninitialized reads, etc.)
then https://github.com/KDE/heaptrack is the best tool. The data-gathering
phase runs in any Linux process using LD_PRELOAD and libunwind. The analysis
phase runs a GUI under KDE, and/or generates *useful* text reports: leaks by
individual size, leaks by total size for a given traceback, allocations
(leaked or not) by frequency or total size, etc. I like the text-only
analysis, which avoids the requirement for KDE.

Heaptrack CPU overhead tends to be around 20% or less, so it does not take
forever. Heaptrack does require disk space to record data (sequential access
only), so you may need several gigabytes (locally or via network).
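A sketch of that workflow (file names are illustrative; the data file's name
and compression suffix depend on the heaptrack version):

    $ heaptrack ./app                    # records allocations via LD_PRELOAD
    $ heaptrack_print heaptrack.app.12345.zst > report.txt   # text-only reports
    $ heaptrack --analyze heaptrack.app.12345.zst            # or the GUI, if installed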
From: Tom H. <to...@co...> - 2022-08-05 14:08:25

On 05/08/2022 14:09, Bresalier, Rob (Nokia - US/Murray Hill) wrote:
> When running memcheck on a massive monolith embedded executable (237MB
> stripped, 1.8GiB unstripped), after I stop the executable under valgrind
> I see the "HEAP SUMMARY" but then valgrind dies before any leak reports
> are printed. The parent process sees that the return status of memcheck
> is that it was SIGKILLed (status returned in waitpid call is '9'). I am
> 99.9% sure that the parent process is not the one sending the SIGKILL.
> Is it possible that valgrind SIGKILLs itself? Is there a reason that the
> linux kernel (Wind River Linux) could be sending a SIGKILL to
> valgrind/memcheck? I do not see any messages about Out of Memory/OOM
> killer killing valgrind. Previous experience with this executable is
> that there are almost 3 million leak reports (most of them are "still
> reachable"); could that be occupying too much memory? Any ideas/advice
> to figure out what is going on?

Almost certainly the kernel OOM killed it.

If you want to know for sure who killed it then strace it while it runs
and it should show you who sends the signal, but my bet is that it's the
kernel.

> One thing I see in the logs is about "unhandled ioctl 0xa5 with no
> size/direction hints". Could this be a trigger for this crash/sigkill?

Not really, no.

Tom

-- 
Tom Hughes (to...@co...) http://compton.nu/
From: Bresalier, R. (N. - US/M. Hill) <rob...@no...> - 2022-08-05 13:42:43

When running memcheck on a massive monolith embedded executable (237MB
stripped, 1.8GiB unstripped), after I stop the executable under valgrind I
see the "HEAP SUMMARY" but then valgrind dies before any leak reports are
printed. The parent process sees that the return status of memcheck is that
it was SIGKILLed (status returned in waitpid call is '9'). I am 99.9% sure
that the parent process is not the one sending the SIGKILL. Is it possible
that valgrind SIGKILLs itself? Is there a reason that the linux kernel (Wind
River Linux) could be sending a SIGKILL to valgrind/memcheck? I do not see
any messages about Out of Memory/OOM killer killing valgrind. Previous
experience with this executable is that there are almost 3 million leak
reports (most of them are "still reachable"); could that be occupying too
much memory? Any ideas/advice to figure out what is going on?

We don't seem to get the SIGKILL if valgrind/memcheck is stopped earlier in
the life of this executable, but to find the leak I need it to run past that
point. I've tried many different versions of valgrind that have worked to
find leaks on this executable in the past (3.16.1, 3.18.1, 3.19.0), but they
all have this same issue of being SIGKILLed before any leaks get printed.

One thing I see in the logs is about "unhandled ioctl 0xa5 with no
size/direction hints". Could this be a trigger for this crash/sigkill?

Would appreciate any ideas/advice.

Thanks,
Rob
From: Paul F. <pj...@wa...> - 2022-08-04 06:50:58

Hi

Sgcheck never got beyond experimental and was removed from Valgrind a few
versions ago. My advice is simply to not use it.

A+
Paul

> On 4 Aug 2022, at 07:45, Pahome Chen via Valgrind-users
> <val...@li...> wrote:
>
> Dear all,
>
> I read the sgcheck's doc and know it's an experimental tool, but it seems
> to find no error even in a very simple program.
> Does this still work or need to wait for another version?
> [...]
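If the goal is still to catch stack and global array overruns, a commonly
suggested alternative outside Valgrind is the compiler's AddressSanitizer.
A sketch, assuming gcc 4.8+ or clang, using the test_valgrind.c program
quoted in the message below:

    $ gcc -g -O0 -fsanitize=address -o run test_valgrind.c
    $ ./run
    # expect a stack-buffer-overflow report pointing at the val[15] access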
From: Pahome C. <Pei...@sy...> - 2022-08-04 05:44:46

Dear all,

I read the sgcheck's doc and know it's an experimental tool, but it seems to
find no error even in a very simple program. Does this still work, or do I
need to wait for another version?

Below is my test program, which fails in
Valgrind-3.8.1/3.9.0/3.10.0/3.11.0/3.12.0. E.g.:

>$ cat test_valgrind.c
#include <stdio.h>
#include <stdlib.h>
int main()
{
    int val[10] = {0};
    int tmp = val[1], i = 0;
    tmp += val[15];                                 // array overrun
    tmp *= val[20];                                 // array overrun
    for (i = 0; i < 20; ++i) { int tmp = val[i]; }  // array overrun
    return 0;
}

When I run any of the Valgrind versions mentioned above, it always produces
the following message:

==11673== exp-sgcheck, a stack and global array overrun detector
==11673== NOTE: This is an Experimental-Class Valgrind Tool
==11673== Copyright (C) 2003-2015, and GNU GPL'd, by OpenWorks Ltd et al.
==11673== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==11673== Command: ./run
==11673==

exp-sgcheck: sg_main.c:2332 (sg_instrument_IRStmt): the 'impossible' happened.

host stacktrace:
==11673==    at 0x3800CC09: show_sched_status_wrk (m_libcassert.c:343)
==11673==    by 0x3800CEF4: report_and_quit (m_libcassert.c:415)
==11673==    by 0x3800D127: vgPlain_assert_fail (m_libcassert.c:481)
==11673==    by 0x38004A03: sg_instrument_IRStmt (sg_main.c:2332)
==11673==    by 0x380003B3: h_instrument (h_main.c:683)
==11673==    by 0x3802968D: tool_instrument_then_gdbserver_if_needed (m_translate.c:238)
==11673==    by 0x380D3290: LibVEX_Translate (main_main.c:934)
==11673==    by 0x380271BF: vgPlain_translate (m_translate.c:1765)
==11673==    by 0x3805F857: vgPlain_scheduler (scheduler.c:1048)
==11673==    by 0x38090445: run_a_thread_NORETURN (syswrap-linux.c:102)

sched status:
  running_tid=1

Thread 1: status = VgTs_Runnable (lwpid 11673)
==11673==    at 0x40169EA: _dl_runtime_resolve_xsave (in /usr/lib64/ld-2.17.so)
==11673==    by 0x1B: ???
==11673==    by 0x40057F: ??? (in /PATH/peihung/test/run)
==11673==    by 0xFFEFFF517: ???

Note: see also the FAQ in the source distribution.
It contains workarounds to several common problems.
In particular, if Valgrind aborted or crashed after
identifying problems in your program, there's a good chance
that fixing those problems will prevent Valgrind aborting or
crashing, especially if it happened in m_mallocfree.c.

My machine environment is CentOS 7 and x86_64.

Thanks.
Best regards,
Pahome
From: Mark W. <ma...@kl...> - 2022-08-03 17:00:55

Hi,

On Wed, Aug 03, 2022 at 05:32:56PM +0100, Tom Hughes wrote:
> No, it's clone3:
>
> https://bugs.kde.org/show_bug.cgi?id=420906

So please upgrade to valgrind 3.18.0 or higher (latest is 3.19.0).

Cheers,

Mark
From: Mark R. <ma...@cs...> - 2022-08-03 16:58:33

Just tried running the Valgrind test suite on WSL2 (win 10, Ubuntu 22.04).
I'm not surprised that there were lots of failures. But the majority were:

    WARNING: unhandled amd64-linux syscall: 435

I suspect WSL is not a platform you care much about, but looking at syswrap
for Darwin I see this might be pid hibernate? Would it be difficult to add
support for this?

Thank you,
Mark

-----Original Message-----
From: Mark Wielaard [mailto:ma...@kl...]
Sent: Monday, August 1, 2022 4:56 AM
To: Tom Hughes <to...@co...>; Mark Roberts <ma...@cs...>;
val...@li...
Subject: Re: [Valgrind-users] new error message from Valgrind
[...]
From: Tom H. <to...@co...> - 2022-08-03 16:33:26

No, it's clone3:

https://bugs.kde.org/show_bug.cgi?id=420906

Tom

On 03/08/2022 17:30, Mark Roberts wrote:
> Just tried running the Valgrind test suite on WSL2 (win 10, Ubuntu 22.04).
> I'm not surprised that there were lots of failures. But the majority were:
> WARNING: unhandled amd64-linux syscall: 435
>
> I suspect WSL is not a platform you care much about, but looking at
> syswrap for Darwin I see this might be pid hibernate? Would it be
> difficult to add support for this?
> [...]

-- 
Tom Hughes (to...@co...) http://compton.nu/
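As an aside, an unhandled syscall number can usually be resolved locally
from the kernel headers; a sketch (header locations vary by distribution):

    $ grep -rw 435 /usr/include --include='unistd*.h'
    /usr/include/asm-generic/unistd.h:#define __NR_clone3 435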
From: Mark W. <ma...@kl...> - 2022-08-01 11:56:17

On Thu, 2022-07-28 at 22:22 +0100, Tom Hughes via Valgrind-users wrote:
> On 28/07/2022 21:39, Mark Roberts wrote:
> > I recently upgraded from Ubuntu 20.04 to 22.04 and am now getting a new
> > error message from Valgrind:
> >
> > --915-- WARNING: unhandled amd64-linux syscall: 334
> > --915-- You may be able to write your own handler.
> > --915-- Read the file README_MISSING_SYSCALL_OR_IOCTL.
> > --915-- Nevertheless we consider this a bug. Please report
> > --915-- it at http://valgrind.org/support/bug_reports.html.
> >
> > Using same version of Valgrind as before (3.17).
> >
> > Any ideas as to what's happening?
>
> Yes, your libc has started trying to use rseq.
>
> It's harmless - the next version of valgrind will silently
> reject it with ENOSYS which is what is happening now anyway
> just with a warning.

Where the next version of valgrind is 3.19.0, which is already released (in
April). So you might just want to upgrade your valgrind.

If you want to backport to older versions then the commit that got rid of
the warning was:

commit 1024237358f01009fe233cb1294f3b8211304eaa
Author: Mark Wielaard <ma...@kl...>
Date:   Fri Dec 10 17:41:59 2021 +0100

    Implement linux rseq syscall as ENOSYS

    This implements rseq for amd64, arm, arm64, ppc32, ppc64,
    s390x and x86 linux as ENOSYS (without warning).

    glibc will start using rseq to accelerate sched_getcpu, if
    available. This would cause a warning from valgrind every
    time a new thread is started.

    Real rseq (restartable sequences) support is pretty hard, so
    for now just explicitly return ENOSYS (just like we do for clone3).

https://sourceware.org/pipermail/libc-alpha/2021-December/133656.html

Cheers,

Mark
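For anyone stuck on an older valgrind, a possible stopgap is to stop glibc
from registering rseq at all (a sketch; the glibc.pthread.rseq tunable only
exists in glibc 2.35 and later):

    $ GLIBC_TUNABLES=glibc.pthread.rseq=0 valgrind ./app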
From: Tom H. <to...@co...> - 2022-07-28 21:23:19

On 28/07/2022 21:39, Mark Roberts wrote:
> I recently upgraded from Ubuntu 20.04 to 22.04 and am now getting a new
> error message from Valgrind:
>
> --915-- WARNING: unhandled amd64-linux syscall: 334
> --915-- You may be able to write your own handler.
> --915-- Read the file README_MISSING_SYSCALL_OR_IOCTL.
> --915-- Nevertheless we consider this a bug. Please report
> --915-- it at http://valgrind.org/support/bug_reports.html.
>
> Using same version of Valgrind as before (3.17).
>
> Any ideas as to what's happening?

Yes, your libc has started trying to use rseq.

It's harmless - the next version of valgrind will silently reject it with
ENOSYS, which is what is happening now anyway, just with a warning.

Tom

-- 
Tom Hughes (to...@co...) http://compton.nu/
From: Mark R. <ma...@cs...> - 2022-07-28 21:09:58

I recently upgraded from Ubuntu 20.04 to 22.04 and am now getting a new
error message from Valgrind:

--915-- WARNING: unhandled amd64-linux syscall: 334
--915-- You may be able to write your own handler.
--915-- Read the file README_MISSING_SYSCALL_OR_IOCTL.
--915-- Nevertheless we consider this a bug. Please report
--915-- it at http://valgrind.org/support/bug_reports.html.

Using the same version of Valgrind as before (3.17).

Any ideas as to what's happening?

Thank you,
Mark
From: Mark W. <ma...@kl...> - 2022-07-27 11:43:43

It has been twenty years today since Valgrind 1.0 was released. Make sure to
read Nicholas Nethercote's Twenty years of Valgrind:

https://nnethercote.github.io/2022/07/27/twenty-years-of-valgrind.html

And learn about the early days, Valgrind "skins", the influence Valgrind had
on raising the bar when it comes to correctness for C and C++ programs, and
why a hacker on the Rust programming language still uses Valgrind.

Happy birthday, Valgrind!
From: John R. <jr...@bi...> - 2022-07-26 02:21:01

> Does memcheck include support for per-thread caching (now enabled by
> default in glibc)?

Memcheck intercepts all calls on allocation and free-ing subroutines
(malloc, free, calloc, realloc, memalign, posix_memalign, brk, sbrk,
mmap, ...) and totally replaces each one with code that is internal to
memcheck. The glibc routines are never called at all, so glibc's per-thread
caching never comes into play. Also, memcheck serializes all execution, and
executes code from only one thread at a time.
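One way to watch those interceptions happen is a sketch like the following
(REDIR lines are printed at higher verbosity levels, and the exact flags
needed may differ across valgrind versions):

    $ valgrind --tool=memcheck -v -v -v ./app 2>&1 | grep REDIR
    # expect lines of the form:
    #   REDIR: 0x... (libc.so.6:malloc) redirected to 0x... (malloc)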
From: Glenn H. <gle...@gm...> - 2022-07-26 01:52:48

Does memcheck include support for per-thread caching (now enabled by default
in glibc)?
From: Julian S. <jse...@gm...> - 2022-07-14 09:51:05

On 13/07/2022 16:38, Bresalier, Rob (Nokia - US/Murray Hill) wrote:
> We are trying to track down a suspected place in our code that keeps
> accumulating memory in a 'still reachable'.

It sounds like you're trying to track down a "process lifetime" leak. You'd
be better off using one of the heap profiling tools for that, either massif
(--tool=massif) or dhat (--tool=dhat), but probably massif. You'll need to
ask your package manager to install the massif-visualizer GUI.

Run with --tool=massif --num-callers=12 (or 16 or whatever). Use the GUI to
look at the resulting profile. After a bit of poking around it should be
obvious where all your allocation is coming from.

J
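Concretely, a sketch of that massif run (./app is a placeholder, and the pid
suffix on the output file will differ; ms_print is the text-mode viewer
shipped with valgrind, for those who would rather skip the GUI):

    $ valgrind --tool=massif --num-callers=16 ./app
    $ ms_print massif.out.12345 | less
    # or open massif.out.12345 in massif-visualizer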
From: John R. <jr...@bi...> - 2022-07-14 02:13:28

> We are trying to track down a suspected place in our code that keeps
> accumulating memory in a 'still reachable'.
>
> When I turn on still reachable and run my process for a few hours and then
> stop the process to get the valgrind reports there are over 2.7 million
> loss records which are mostly still reachables. It would take forever for
> valgrind to print this out.

It would take around one hour or less to *produce* the complete report
without printing it. Re-direct stderr to a file, or use command-line options
--xml-fd= or --xml-file=. See "valgrind --help" and/or the user manual for
other options to control error reporting. Using any text editor on the
report file, or inserting the 'sed' (or 'awk') stream editor into the
pipeline of error output, enables filtering the error reports.

> The large majority of "still reachable" that I want to ignore allocate
> just a few blocks. I would like to suppress these and only output "still
> reachables" that allocated 100 blocks or more.

Note that all the excluded reports (counts 1 through 99) have only 1 or 2
characters in their decimal representation, so you don't even need to
convert the field to a number to decide.

> If not possible without patching valgrind, any hints on where I could
> patch valgrind to accomplish this?

Find the source location which prints the instance count, and adjust the
code. This is "standard engineering effort" for a programmer who is adept at
using 'grep', even without prior experience with the source code of
valgrind.
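A sketch of that stream-filtering idea, keeping only loss records of 100
blocks or more (it keys on memcheck's usual record header, "... bytes in N
blocks are ... in loss record ..."; adjust if your output format differs):

    $ valgrind --leak-check=full --show-reachable=yes ./app 2> vg.log
    $ awk '/ blocks are .* in loss record /{
            for (i = 2; i <= NF; i++) if ($i == "blocks") { n = $(i-1); break }
            gsub(/,/, "", n)               # strip thousands separators
            keep = (n + 0 >= 100)          # toggle at each record header
          }
          keep' vg.log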
From: Bresalier, R. (N. - US/M. Hill) <rob...@no...> - 2022-07-13 20:12:02

We are trying to track down a suspected place in our code that keeps
accumulating memory in a 'still reachable'.

When I turn on still reachable and run my process for a few hours and then
stop the process to get the valgrind reports, there are over 2.7 million
loss records, which are mostly still reachables. It would take forever for
valgrind to print this out. The large majority of "still reachable" that I
want to ignore allocate just a few blocks. I would like to suppress these
and only output "still reachables" that allocated 100 blocks or more.

The suppression mechanism seems to only be able to suppress particular
backtraces. But I would like to suppress based on the number of blocks
instead: suppress loss records with a small number of blocks. Is this
possible without patching valgrind?

If not possible without patching valgrind, any hints on where I could patch
valgrind to accomplish this?

Thanks,
Rob
From: Cédric P. <cp...@se...> - 2022-07-12 15:46:13

> > I have 2 different boards running QNX 6.5 and mounting the exact same
> > file system.
> > One board is based on an NXP iMX53 SoC and the other one on a Texas
> > Instruments AM3352.
> > Since both SoCs share the same instruction set (Cortex A8 - armv7le),
> > they can run the same binaries.
> >
> > However, whereas Valgrind 3.10.1 is working perfectly on the iMX53, it
> > crashes on the AM3352 :
> > ==1863697== Process terminating with default action of signal 11
> > (SIGSEGV): dumping core
> > ==1863697==  Bad permissions for mapped region at address 0x245C
> > ==1863697==    at 0x1E4CC: mprotect (mprotect.c:33 in /proc/boot/libc.so.3)
>
> Run valgrind with "-d -d -d -v -v -v" and compare the two systems, paying
> particular attention to differences that involve "aspacem".

Comparing output for iMX53 and AM3352, here is what I noticed:

1) aspacem lines are very similar (some values change slightly, but not by
much), except for these 2 lines:

AM3352:
aspacem  5: file 0000100000-0000100fff 4096 r-x-- d=0x409 i=103748 o=0 m=0 fnIdx=1 fname="/bin/echo"
aspacem  6: file 0000101000-0000101fff 4096 rw--- d=0x409 i=103748 o=0 m=0 fnIdx=1 fname="/bin/echo"

iMX53:
aspacem  5: file 0000100000-0000100fff 4096 r-x-- d=0x803 i=2951 o=0 m=0 fnIdx=1 fname="/bin/echo"
aspacem  6: file 0000101000-0000101fff 4096 rw--- d=0x803 i=2951 o=0 m=0 fnIdx=1 fname="/bin/echo"

2) After the list of "Adding active redirection: ..." lines for each
intercepted function (free, malloc, mallopt, bcopy, memcmp, ...), it seems
that we reach the crash:

AM3352:
REDIR: 0x261a8 (libc.so.3:mallopt) redirected to 0x85b74 (mallopt)
gdbsrv VG core calling VG_(gdbserver_report_signal) vki_nr 11 SIGSEGV gdb_nr 11 SIGSEGV tid 1

iMX53:
REDIR: 0x5c484 (libc.so.3:memset) redirected to 0x88ce8 (memset)
REDIR: 0x5cb34 (libc.so.3:strlen) redirected to 0x881c8 (strlen)
REDIR: 0x5c6b4 (libc.so.3:strcmp) redirected to 0x886bc (strcmp)
REDIR: 0x28714 (libc.so.3:malloc) redirected to 0x86928 (malloc)
mallocfr newSuperblock at 0x102000 (pszB 4194288) owner CLIENT/client
...

To be honest, I don't know how to interpret this. I don't even understand
why REDIR is not called for "mallopt" on iMX53.

> Does your QNX have tools such as gdb and strace or dtrace?
> It will be helpful to know the address mappings at the time of the SIGSEGV.

Yes, QNX has such tools. I tried to use vgdb and gdb, but I don't get ...
anything ... (no backtrace, no info address, no info mem ...):

(gdb) target remote 192.168.98.50:2346
Remote debugging using 192.168.98.50:2346
warning: Can not parse XML target description; XML support was disabled at compile time
[New Thread 1]
[Switching to Thread 1]
0x0003a190 in ?? ()
(gdb) continue
Continuing.

Program received signal SIGSEGV, Segmentation fault.
0x0001e4cc in ?? ()
(gdb) bt
#0  0x0001e4cc in ?? ()
(gdb) info meminfo
(gdb)

I also recorded kernel and scheduling events, but the only information I got
is that for 5 seconds memcheck-arm-nto works a lot, doing many read(),
write() and fseek() calls. Finally, a signal 11 is received, a lot of
close() calls are done, and ... that's all. I suppose it doesn't help you
that much ...
From: John R. <jr...@bi...> - 2022-07-11 14:34:13

> I have 2 different boards running QNX 6.5 and mounting the exact same
> file system.
> One board is based on an NXP iMX53 SoC and the other one on a Texas
> Instruments AM3352.
> Since both SoCs share the same instruction set (Cortex A8 - armv7le),
> they can run the same binaries.
>
> However, whereas Valgrind 3.10.1 is working perfectly on the iMX53, it
> crashes on the AM3352 :
> ==1863697== Process terminating with default action of signal 11
> (SIGSEGV): dumping core
> ==1863697==  Bad permissions for mapped region at address 0x245C
> ==1863697==    at 0x1E4CC: mprotect (mprotect.c:33 in /proc/boot/libc.so.3)

Run valgrind with "-d -d -d -v -v -v" and compare the two systems, paying
particular attention to differences that involve "aspacem".

Does your QNX have tools such as gdb and strace or dtrace?
It will be helpful to know the address mappings at the time of the SIGSEGV.
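For example (a sketch: --tool=none keeps the run cheap, /bin/echo is just a
convenient victim, and the log names are placeholders):

    # run the same trivial command on each board:
    $ valgrind -d -d -d -v -v -v --tool=none /bin/echo 2> imx53.log
    $ valgrind -d -d -d -v -v -v --tool=none /bin/echo 2> am3352.log
    # then compare the address-space manager traces:
    $ grep aspacem imx53.log > a.txt
    $ grep aspacem am3352.log > b.txt
    $ diff -u a.txt b.txt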
From: Cédric P. <cp...@se...> - 2022-07-11 12:53:09

> > Hi Valgrind community,
> >
> > I have 2 different boards running QNX 6.5 and mounting the exact same
> > file system.
> > One board is based on an NXP iMX53 SoC and the other one on a Texas
> > Instruments AM3352.
> > Since both SoCs share the same instruction set (Cortex A8 - armv7le),
> > they can run the same binaries.
> >
> > However, whereas Valgrind 3.10.1 is working perfectly on the iMX53, it
> > crashes on the AM3352 :
>
> Hi Cedric
> 3.10 is quite old, can you try a newer version?

No I cannot. The version of Valgrind I use was ported to QNX in 2015 by the
open QNX community
(https://community.qnx.com/sf/go/projects.valgrind/frs.valgrind.valgrind_3_10)
and, as far as I know, no newer version has been ported since.

> Also I recommend starting with the simplest thing possible (I see that
> you tried with echo, but I recommend using just "--tool=none" and no
> other options).

Excellent idea. If I use "--tool=none", valgrind does not crash anymore (and
I suppose it doesn't do much). At least, it seems to mean that the problem
is linked to the tool I use. However, for all other tools I try (massif,
drd, helgrind, exp-dhat) I face the same crash.

> A+
> Paul
From: Floyd, P. <pj...@wa...> - 2022-07-11 08:16:42

On 2022-07-11 09:43, Cédric Perles wrote:
> Hi Valgrind community,
>
> I have 2 different boards running QNX 6.5 and mounting the exact same
> file system.
> One board is based on an NXP iMX53 SoC and the other one on a Texas
> Instruments AM3352.
> Since both SoCs share the same instruction set (Cortex A8 - armv7le),
> they can run the same binaries.
>
> However, whereas Valgrind 3.10.1 is working perfectly on the iMX53, it
> crashes on the AM3352 :

Hi Cedric

3.10 is quite old, can you try a newer version?

Also I recommend starting with the simplest thing possible (I see that you
tried with echo, but I recommend using just "--tool=none" and no other
options).

A+
Paul
From: Cédric P. <cp...@se...> - 2022-07-11 07:59:12

Hi Valgrind community,

I have 2 different boards running QNX 6.5 and mounting the exact same file
system. One board is based on an NXP iMX53 SoC and the other one on a Texas
Instruments AM3352. Since both SoCs share the same instruction set (Cortex
A8 - armv7le), they can run the same binaries.

However, whereas Valgrind 3.10.1 is working perfectly on the iMX53, it
crashes on the AM3352:

pendant:/bin>/usr/bin/valgrind --tool=memcheck --allow-mismatched-debuginfo=yes --extra-debuginfo-path=/usr/lib/debug/ echo
==1863697== Memcheck, a memory error detector
==1863697== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==1863697== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
==1863697== Command: echo
==1863697==
==1863697==
==1863697== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==1863697==  Bad permissions for mapped region at address 0x245C
==1863697==    at 0x1E4CC: mprotect (mprotect.c:33 in /proc/boot/libc.so.3)
==1863697==    by 0x25B03: _band_get_aligned (band.c:450 in /proc/boot/libc.so.3)
==1863697==    by 0x77383: ??? (in /proc/boot/libc.so.3)
==1863697==
==1863697== HEAP SUMMARY:
==1863697==     in use at exit: 0 bytes in 0 blocks
==1863697==   total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==1863697==
==1863697== All heap blocks were freed -- no leaks are possible
==1863697==
==1863697== For counts of detected and suppressed errors, rerun with: -v
==1863697== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Memory fault

It always crashes the same way, whatever binary I analyse. I am clueless; I
don't understand how it can work on the iMX53 and fail on the AM3352 with
the same file system (same binaries, same libs, same scripts, same conf ...).

Does somebody have an idea?

Best regards,
Cédric
From: Mathieu M. <ma...@de...> - 2022-07-01 10:17:42

On Wed, Jun 29, 2022 at 8:49 PM John Reiser <jr...@bi...> wrote:
>
> > Program received signal SIGILL, Illegal instruction.
> > vgPlain_am_startup (sp_at_startup=3204445696) at
> > m_aspacemgr/aspacemgr-linux.c:1626
> > 1626        init_nsegment(&seg);
> > (gdb) x/i $pc
> > => 0x58071090 <vgPlain_am_startup+20>: vmov.i32 d16, #0 ; 0x00000000
>
> > As a reminder I do not have neon on this machine:
> >
> > Features : half thumb fastmult vfp edsp thumbee vfpv3 tls idiva
> > idivt vfpd32 lpae
>
> Therefore this is a gcc configuration problem. Whoever configured the gcc
> that was used to compile your valgrind assumed that neon would be present,
> but your machine lacks neon.
> [...]

Discussing the issue with the Debian/gcc/armhf maintainer resulted in:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1014091#12

So here is my suggested patch for valgrind:
https://bugs.kde.org/show_bug.cgi?id=456200#c1

Thanks everyone for your help!
From: John R. <jr...@bi...> - 2022-06-29 18:48:33

> Program received signal SIGILL, Illegal instruction.
> vgPlain_am_startup (sp_at_startup=3204445696) at
> m_aspacemgr/aspacemgr-linux.c:1626
> 1626        init_nsegment(&seg);
> (gdb) x/i $pc
> => 0x58071090 <vgPlain_am_startup+20>: vmov.i32 d16, #0 ; 0x00000000

> As a reminder I do not have neon on this machine:
>
> Features : half thumb fastmult vfp edsp thumbee vfpv3 tls idiva
> idivt vfpd32 lpae

Therefore this is a gcc configuration problem. Whoever configured the gcc
that was used to compile your valgrind assumed that neon would be present,
but your machine lacks neon.

Run "gcc --verbose". On my RaspberryPi Model 2B I get (wrapped by hand to
reasonable line length):

-----
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/arm-linux-gnueabihf/10/lto-wrapper
Target: arm-linux-gnueabihf
Configured with: ../src/configure -v --with-pkgversion='Debian 10.2.1-6' \
  --with-bugurl=file:///usr/share/doc/gcc-10/README.Bugs \
  --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --prefix=/usr \
  --with-gcc-major-version-only --program-suffix=-10 --program-prefix=arm-linux-gnueabihf- \
  --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext \
  --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-bootstrap --enable-clocale=gnu \
  --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new \
  --enable-gnu-unique-object --disable-libitm --disable-libquadmath --disable-libquadmath-support \
  --enable-plugin --enable-default-pie --with-system-zlib --enable-libphobos-checking=release \
  --with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch --disable-sjlj-exceptions \
  --with-arch=armv7-a --with-fpu=vfpv3-d16 --with-float=hard --with-mode=thumb --disable-werror \
  --enable-checking=release --build=arm-linux-gnueabihf --host=arm-linux-gnueabihf \
  --target=arm-linux-gnueabihf
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 10.2.1 20210110 (Debian 10.2.1-6)
-----

where the important part is "--with-arch=armv7-a --with-fpu=vfpv3-d16", in
which the "-d16" part restricts gcc to 16 double-precision registers even
though my hardware has "vfpd32".

Consulting "info gcc", then searching for "neon" and examining "ARM
Options", it seems to me that the fix (making gcc assume vfp3 floating
point with 32 double-precision registers and the half-precision
floating-point conversion operations, but omitting neon) is the gcc
command-line parameter "-march=armv7-a+vfpv3-fp16+nosimd". You may want to
use "-d16" somewhere, too.
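One way to sanity-check such a flag combination before rebuilding valgrind
(a sketch for an arm-linux-gnueabihf gcc; __ARM_NEON is the predefined macro
gcc sets when NEON code generation is enabled):

    $ gcc -march=armv7-a+vfpv3-fp16+nosimd -dM -E - </dev/null | grep -i neon
    $ # (no output: NEON is off)
    $ gcc -march=armv7-a -mfpu=neon -dM -E - </dev/null | grep -w __ARM_NEON
    #define __ARM_NEON 1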