From: Tomas V. <tv...@fu...> - 2022-09-04 19:17:41

On 9/4/22 13:18, Philippe Waroquiers wrote:
> On Sun, 2022-09-04 at 00:14 +0200, Tomas Vondra wrote:
>> Clearly, this is not an issue valgrind is meant to detect (like invalid
>> memory access, etc.) but an application issue. I've tried reproducing it
>> without valgrind, but it only ever happens with valgrind - my theory is
>> it's some sort of race condition, and valgrind changes the timing in a
>> way that makes it much more likely to hit. I need to analyze the core to
>> inspect the state more closely, etc.
>>
>> Any ideas what I might be doing wrong? Or how do I load the core file?
>
> Rather than have the core dump and analyse it, you might interactively debug
> your program under valgrind.
> E.g. you might put a breakpoint on the assert or at some interesting points
> before the assert.
>
> See https://valgrind.org/docs/manual/manual-core-adv.html#manual-core-adv.gdbserver
> for more info.

I know, and I've used the vgdb gdbserver before. But sometimes that's not
very practical, for a number of reasons:

1) Our tests run mostly unattended, possibly even on CI machines that we
don't have access to. We don't want the machine to just sit there waiting
for someone to debug it interactively; it's better to report the failure.
Being able to inspect the core later would be helpful, though.

2) The error may be quite rare and/or hard to trigger - we regularly see
race conditions that happen once in 1000 runs. True, I could automate that
using a gdb script (a sketch follows below).

3) I'd bet it's not so simple in a multi-process system that forks various
processes, any of which can trigger the issue. I'd have to attach a gdb to
each of those.

4) It's already pretty slow under valgrind; I'd bet it'll be even worse
with gdb attached, but maybe it's not that bad. The rpi4 is very
constrained, though.

5) Race conditions are often very sensitive to changes in timing. For
example, I've never seen this particular issue without valgrind. I can
easily imagine gdb changing the timing just enough for the race condition
not to happen.

regards
Tomas
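A sketch of the kind of gdb script point 2 alludes to, for grabbing a
backtrace unattended (the script name and pid are hypothetical; the
breakpoint is the assert location from the report further down the thread):

    # vgdb-bt.gdb -- batch script; run as: gdb -batch -x vgdb-bt.gdb ./postgres
    set pagination off
    # attach to the process running under valgrind's gdbserver
    target remote | vgdb --pid=12345
    # stop at the failing assert (location taken from the TRAP log)
    break reorderbuffer.c:902
    commands
      backtrace full
      detach
      quit
    end
    continue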
From: Philippe W. <phi...@sk...> - 2022-09-04 11:18:16

On Sun, 2022-09-04 at 00:14 +0200, Tomas Vondra wrote:
> Clearly, this is not an issue valgrind is meant to detect (like invalid
> memory access, etc.) but an application issue. I've tried reproducing it
> without valgrind, but it only ever happens with valgrind - my theory is
> it's some sort of race condition, and valgrind changes the timing in a
> way that makes it much more likely to hit. I need to analyze the core to
> inspect the state more closely, etc.
>
> Any ideas what I might be doing wrong? Or how do I load the core file?

Rather than have the core dump and analyse it, you might interactively debug
your program under valgrind.
E.g. you might put a breakpoint on the assert or at some interesting points
before the assert.

See https://valgrind.org/docs/manual/manual-core-adv.html#manual-core-adv.gdbserver
for more info.

Philippe
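For reference, the basic workflow that manual section describes looks
roughly like this (a sketch; the breakpoint reuses the assert location from
the report further down the thread):

    # terminal 1: run under valgrind with the embedded gdbserver,
    # pausing before the first instruction so breakpoints can be set
    valgrind --vgdb=yes --vgdb-error=0 ./postgres ...

    # terminal 2: attach gdb through vgdb, then let the program run
    gdb ./postgres
    (gdb) target remote | vgdb
    (gdb) break reorderbuffer.c:902
    (gdb) continue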
From: John R. <jr...@bi...> - 2022-09-04 02:17:10

> Any ideas what I might be doing wrong? Or how do I load the core file?

Why does use of valgrind cause programmers to forget general debugging
technique?

1. Describe the environment completely. The report does not say which
compilers and compiler versions were used, or whether the compiler commands
contained any directives about debugging format. Such information is
necessary to help understand what might be happening with regard to
debugging and tracebacks.

2. Get debugging information whenever invoking a compiler. Traceback lines
such as "(+0x57a574)[0x682574]", which lack the name of a symbol or file,
suggest that "-g" debugging info was not requested for *all* compilations.
Start over ("make clean; rm -rf '*.[oa]'"), then re-compile every source
file, being sure to specify "-g" and no variant of "-O" or "-On", except
possibly "-O0".

3. Optimizing for speed comes after achieving correct execution. If
'inline' is used anywhere, then re-compile with the compile-time argument
"-Dinline=/*empty*/" in order to #define 'inline' as a one-word comment.
If the behavior of the program changes (any difference at all, excepting
only slower execution), then there is a *design error* in the source code.
Fix that first.

4. Walk before attempting to run. Did you try a simple example? Write a
half-page program with 5 subroutines, each of which calls the next one,
and where the last one sends SIGABRT to the process. Does the .core file,
when the program is run under valgrind, give the correct traceback using
gdb? (A sketch of such a program follows this message.)

5. (Learn and) use the built-in tools where possible. Run the process
interactively, invoking valgrind with "--vgdb-error=0", and giving the
debugger command "(gdb) continue" after establishing connectivity between
vgdb and the process. See the valgrind manual, section 3.2.9 "vgdb command
line options". When the SIGABRT happens, vgdb will allow you to use all the
ordinary gdb commands to get a backtrace, go up and down the stack, examine
variables and other memory, and run

    (gdb) info proc
    (gdb) shell cat /proc/$PID/maps

to see exactly the layout of process memory, etc. There are also special
commands to access valgrind functionality interactively, such as checking
for memory leaks.
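A minimal version of the test program point 4 describes might look like
this (a sketch; any five-deep call chain that raises SIGABRT will do):

    /* chain.c -- five subroutines, each calling the next; the last one
     * sends SIGABRT to the process, so the resulting core's traceback
     * can be checked under valgrind + gdb.
     * Build with: gcc -g -O0 chain.c -o chain
     */
    #include <signal.h>

    static void f5(void) { raise(SIGABRT); }  /* send SIGABRT to ourselves */
    static void f4(void) { f5(); }
    static void f3(void) { f4(); }
    static void f2(void) { f3(); }
    static void f1(void) { f2(); }

    int main(void)
    {
        f1();
        return 0;
    }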
From: Tomas V. <tv...@fu...> - 2022-09-03 22:38:50

Hi,

I'm having some issues with analyzing cores generated from valgrind. I do
get the core file, but when I try opening it in gdb it just shows some
entirely bogus information / backtrace etc.

This is a rpi4 machine with 64-bit debian, running a local build of
valgrind 3.19.0 (built from sources, not a package).

This is how I run the program (postgres binary):

    valgrind --quiet --trace-children=yes --track-origins=yes \
      --read-var-info=yes --num-callers=20 --leak-check=no \
      --gen-suppressions=all --error-limit=no \
      --log-file=/tmp/valgrind.543917.log postgres \
      -D /home/debian/postgres/contrib/test_decoding/tmp_check_iso/data \
      -F -c listen_addresses= -k /tmp/pg_regress-n7HodE

I get a ~200MB core file in /tmp, which I try loading like this:

    gdb src/backend/postgres /tmp/valgrind.542299.log.core.542391

but all I get is this:

    Reading symbols from src/backend/postgres...
    [New LWP 542391]
    Cannot access memory at address 0xcc10cc00cbf0cc6
    Cannot access memory at address 0xcc10cc00cbf0cbe
    Core was generated by `'.
    Program terminated with signal SIGABRT, Aborted.
    #0  0x00000000049d42ac in ?? ()
    (gdb) bt
    #0  0x00000000049d42ac in ?? ()
    #1  0x0000000000400000 in dshash_dump (hash_table=0x0) at dshash.c:782
    #2  0x0000000000400000 in dshash_dump (hash_table=0x49c0e44) at dshash.c:782
    #3  0x0000000000000000 in ?? ()
    Backtrace stopped: previous frame identical to this frame (corrupt stack?)

So the stack might be corrupt, for some reason? The first part looks
entirely bogus too, though. The file size seems about right - with 128MB
shared buffers, 200MB might be about right.

The core is triggered by an "assert" in the source, and we even log a
backtrace into the log - and that seems much more plausible:

    TRAP: FailedAssertion("prev_first_lsn < cur_txn->first_lsn", File: "reorderbuffer.c", Line: 902, PID: 536049)
    (ExceptionalCondition+0x98)[0x8f5cec]
    (+0x57a574)[0x682574]
    (+0x579edc)[0x681edc]
    (ReorderBufferAddNewTupleCids+0x60)[0x6864dc]
    (SnapBuildProcessNewCid+0x94)[0x68b6a4]
    (heap2_decode+0x17c)[0x671584]
    (LogicalDecodingProcessRecord+0xbc)[0x670cd0]
    (+0x570f88)[0x678f88]
    (pg_logical_slot_get_changes+0x1c)[0x6790fc]
    (ExecMakeTableFunctionResult+0x29c)[0x4a92c0]
    (+0x3be638)[0x4c6638]
    (+0x3a2c14)[0x4aac14]
    (ExecScan+0x8c)[0x4aaca8]
    (+0x3bea14)[0x4c6a14]
    (+0x39ea60)[0x4a6a60]
    (+0x392378)[0x49a378]
    (+0x39520c)[0x49d20c]
    (standard_ExecutorRun+0x214)[0x49aad8]
    (ExecutorRun+0x64)[0x49a8b8]
    (+0x62e2ac)[0x7362ac]
    (PortalRun+0x27c)[0x735f08]
    (+0x626be8)[0x72ebe8]
    (PostgresMain+0x9a0)[0x733e9c]
    (+0x547be8)[0x64fbe8]
    (+0x547540)[0x64f540]
    (+0x542d30)[0x64ad30]
    (PostmasterMain+0x1460)[0x64a574]
    (+0x418888)[0x520888]

Clearly, this is not an issue valgrind is meant to detect (like invalid
memory access, etc.) but an application issue. I've tried reproducing it
without valgrind, but it only ever happens with valgrind - my theory is
it's some sort of race condition, and valgrind changes the timing in a way
that makes it much more likely to hit. I need to analyze the core to
inspect the state more closely, etc.

Any ideas what I might be doing wrong? Or how do I load the core file?

thanks
Tomas
From: John R. <jr...@bi...> - 2022-09-03 11:42:31

> ==123254== HEAP SUMMARY:
> ==123254==     in use at exit: 0 bytes in 0 blocks
> ==123254==   total heap usage: 6 allocs, 6 frees, 2,084 bytes allocated

"2,084 bytes allocated" is the sum of all 6 arguments that were passed to
malloc(), calloc() [possibly by calling malloc()], realloc() [at least the
increase], etc.
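For illustration, the counting rule in a minimal sketch (made-up numbers,
not the 2,084 from the report). As for where the 2,084 plausibly comes from
in the program below: on a typical glibc/Linux system, stdio lazily
malloc()s a 1,024-byte buffer each for stdout and stdin on their first use
(2,048 bytes), and the program itself makes two reallocs of the pointer
array (8 then 16 bytes on a 64-bit system) plus two 6-byte mallocs for
"test1" and "test2": 2,048 + 8 + 16 + 6 + 6 = 2,084 bytes across 6 allocs.
This is a reconstruction, not something valgrind reports directly.

    /* count.c -- a sketch: memcheck's "total heap usage" line sums the
     * requested sizes. Running this under valgrind reports:
     *   total heap usage: 3 allocs, 3 frees, 60 bytes allocated
     * (a growing realloc counts as one alloc of the full new size plus
     * one free of the old block).
     */
    #include <stdlib.h>

    int main(void)
    {
        char *p = malloc(10);     /* alloc #1: +10 bytes          */
        p = realloc(p, 20);       /* alloc #2: +20 bytes, free #1 */
        char *q = calloc(3, 10);  /* alloc #3: +30 bytes          */
        free(p);                  /* free #2 */
        free(q);                  /* free #3 */
        return 0;
    }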
From: jian he <jia...@gm...> - 2022-09-03 07:25:42

    helloc$ valgrind ./a.out
    ==123254== Memcheck, a memory error detector
    ==123254== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
    ==123254== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
    ==123254== Command: ./a.out
    ==123254==
    enter string (EOF) to quit): test1
    enter string (EOF) to quit): test2
    enter string (EOF) to quit): (all done)

    test1
    test2
    ==123254==
    ==123254== HEAP SUMMARY:
    ==123254==     in use at exit: 0 bytes in 0 blocks
    ==123254==   total heap usage: 6 allocs, 6 frees, 2,084 bytes allocated
    ==123254==
    ==123254== All heap blocks were freed -- no leaks are possible
    ==123254==
    ==123254== For lists of detected and suppressed errors, rerun with: -s
    ==123254== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

Simple C program source code:

    #include <stdio.h>
    #include <string.h>
    #include <stdlib.h>

    #define MAXC 1024

    int main(void)
    {
        char buf[MAXC], **arr = NULL;
        size_t nstr = 0;            /* counter for number of strings stored */

        for (;;) {
            size_t len;             /* length of string after newline removal */
            fputs("enter string (EOF) to quit): ", stdout);
            if (!fgets(buf, MAXC, stdin)) {
                puts("(all done)\n");
                break;
            }
            buf[len = strcspn(buf, "\r\n")] = 0;
            /* always realloc using a temp pointer to avoid a memory leak
             * on realloc failure */
            void *tmp = realloc(arr, (nstr + 1) * sizeof *arr);
            if (!tmp) {
                perror("realloc-tmp");
                break;
            }
            arr = tmp;
            if (!(arr[nstr] = malloc(len + 1))) {
                perror("malloc-arr[nstr]");
                break;
            }
            memcpy(arr[nstr++], buf, len + 1);
        }

        for (size_t i = 0; i < nstr; i++) {
            puts(arr[i]);
            free(arr[i]);
        }
        free(arr);
        return 0;
    }

New to C, I am not sure about the following line:

    total heap usage: 6 allocs, 6 frees, 2,084 bytes allocated

I guess the 6 allocs are the 3 times malloc is called plus the 3 times the
puts function is called? But I don't know where 2,084 comes from.

--
I recommend David Deutsch's <<The Beginning of Infinity>>

Jian
From: Tom H. <to...@co...> - 2022-09-01 06:05:59

On 01/09/2022 01:03, Bresalier, Rob (Nokia - US/Murray Hill) wrote:
> Don't understand why the strace log has exit(0) without the underscore;
> I know for a fact that it was with the underscore.

Because exit() and _exit() are C library functions, but both call the
SYS_exit system call, and that is what strace shows. The difference is
that _exit() doesn't run atexit() handlers or do any other cleanup before
calling SYS_exit.

Tom

--
Tom Hughes (to...@co...)
http://compton.nu/
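A minimal sketch illustrating the difference (hypothetical file name; swap
the final call to see the handler disappear):

    /* exits.c -- atexit() handlers run on exit(), but not on _exit() */
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>   /* for _exit() */

    static void handler(void)
    {
        puts("atexit handler ran");  /* printed by exit(0), not _exit(0) */
    }

    int main(void)
    {
        atexit(handler);
        exit(0);     /* runs handlers and flushes stdio, then SYS_exit  */
        /* _exit(0);   skips handlers and stdio flush, straight to SYS_exit */
    }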
From: Bresalier, R. (N. - US/M. Hill) <rob...@no...> - 2022-09-01 00:18:20

> Normally, if it is the OOM that kills a process, you should find a trace
> of this in the system logs.

I looked in every system log I could find; there was no indication of OOM
killing it in any system log.

> I do not understand what you mean by reducing the nr of callers from 12 to 6.
> What are these callers ? Is that some threads of the process you are running
> under valgrind ?

I mean the --num-callers core option to valgrind. By default this is 12,
and I didn't specify it. I tried using --num-callers=6 to reduce memory
consumption. From the valgrind manual this means "Specifies the maximum
number of entries shown in stack traces that identify program locations."
By reducing it to 6 I was hoping to reduce valgrind's memory consumption
in case it really was the OOM killer, which I really doubt now.

> And just in case: are you using the last version of Valgrind ?

Yes, I used the latest version of valgrind, and many earlier versions.

> You might use "strace" on valgrind to see what is going on at the time
> _exit(0) is called.

I did use 'strace' and dmesg. Neither indicated it was the OOM killer. I
did happen to save the strace log when the SIGKILL happened. Here is the
part around the _exit(0):

    read(2040, "R", 1)  = 1
    gettid()            = 3332
    rt_sigprocmask(SIG_SETMASK, ~[ILL TRAP BUS FPE KILL SEGV STOP SYS], NULL, 8) = 0
    rt_sigprocmask(SIG_SETMASK, ~[], ~[ILL TRAP BUS FPE KILL SEGV STOP SYS], 8) = 0
    rt_sigprocmask(SIG_SETMASK, ~[ILL TRAP BUS FPE KILL SEGV STOP SYS], NULL, 8) = 0
    gettid()            = 3332
    write(2041, "S", 1) = 1
    exit(0)             = ?
    +++ killed by SIGKILL +++

Don't understand why the strace log has exit(0) without the underscore; I
know for a fact that it was with the underscore. The strace log doesn't
indicate anything special happening around the _exit(0). When I removed it,
the SIGKILL went away.

> You might also start valgrind with some debug trace e.g. -d -d -d -d -v -v -v -v

Was not aware of this and didn't try it. Don't have time to try it now.

Regards,
Rob
From: Philippe W. <phi...@sk...> - 2022-08-31 22:14:27

On Wed, 2022-08-31 at 17:42 +0000, Bresalier, Rob (Nokia - US/Murray Hill) wrote:
> > When running memcheck on a massive monolith embedded executable
> > (237MB stripped, 1.8GiB unstripped), after I stop the executable under
> > valgrind I see the "HEAP SUMMARY" but then valgrind dies before any leak
> > reports are printed. The parent process sees that the return status of
> > memcheck is that it was SIGKILLed (status returned in waitpid call is '9').
>
> We found that removing a call to _exit(0) made it so that valgrind is no
> longer SIGKILLed.
>
> Any ideas why using _exit(0) may get rid of valgrind getting SIGKILLed?
>
> Previously exit(0) was called, without the leading underscore, but changed
> it to _exit(0) to really make sure no memory was being deallocated. This
> worked well on a different process, so we carried it over to this one,
> that is why we did it.
>
> Even with exit(0) (no underscore), in this process there is not much
> deallocation going on in exit handlers, so have lots of doubts that
> valgrind/memcheck was using too much memory and invoking the OOM killer.
>
> Using strace and dmesg while we had _exit(0) in use didn't show that OOM
> killer was SIGKILLing valgrind.
>
> I also tried reducing number of callers from 12 to 6 when using _exit(0),
> still got the SIGKILL.
>
> Also tried using a system that had an additional 4GByte of memory, and
> also got the SIGKILL there.
>
> So I have many doubts that Valgrind was getting SIGKILLed due to too much
> memory usage.
>
> Don't know why removing _exit(0) got rid of the SIGKILL. Was wondering if
> anyone had any ideas?

Normally, if it is the OOM that kills a process, you should find a trace of
this in the system logs.

I do not understand what you mean by reducing the nr of callers from 12 to 6.
What are these callers ? Is that some threads of the process you are running
under valgrind ?

And just in case: are you using the last version of Valgrind ?

You might use "strace" on valgrind to see what is going on at the time
_exit(0) is called.

You might also start valgrind with some debug trace, e.g. -d -d -d -d -v -v -v -v

Philippe
From: Bresalier, R. (N. - US/M. Hill) <rob...@no...> - 2022-08-31 19:16:21

> When running memcheck on a massive monolith embedded executable
> (237MB stripped, 1.8GiB unstripped), after I stop the executable under
> valgrind I see the "HEAP SUMMARY" but then valgrind dies before any leak
> reports are printed. The parent process sees that the return status of
> memcheck is that it was SIGKILLed (status returned in waitpid call is '9').

We found that removing a call to _exit(0) made it so that valgrind is no
longer SIGKILLed.

Any ideas why removing _exit(0) may get rid of valgrind getting SIGKILLed?

Previously exit(0) was called, without the leading underscore, but we
changed it to _exit(0) to really make sure no memory was being deallocated.
This worked well on a different process, so we carried it over to this one;
that is why we did it.

Even with exit(0) (no underscore), in this process there is not much
deallocation going on in exit handlers, so I have lots of doubts that
valgrind/memcheck was using too much memory and invoking the OOM killer.

Using strace and dmesg while we had _exit(0) in use didn't show that the
OOM killer was SIGKILLing valgrind.

I also tried reducing the number of callers from 12 to 6 when using
_exit(0), and still got the SIGKILL.

Also tried using a system that had an additional 4GByte of memory, and got
the SIGKILL there too.

So I have many doubts that valgrind was getting SIGKILLed due to too much
memory usage.

Don't know why removing _exit(0) got rid of the SIGKILL. Was wondering if
anyone had any ideas?
From: Philippe W. <phi...@sk...> - 2022-08-06 08:35:01

> > > Is there anything that can be done with memcheck to make it consume less memory?
> >
> > No.

In fact, Yes :). Or more precisely, yes, memory can be somewhat reduced :).
See my other mail.

Philippe
From: Philippe W. <phi...@sk...> - 2022-08-06 08:32:56

On Fri, 2022-08-05 at 15:34 +0000, Bresalier, Rob (Nokia - US/Murray Hill) wrote:
> > If finding memory leaks is the only goal (for instance, if you are satisfied that
> > memcheck has found all the overrun blocks, uninitialized reads, etc.) then
> > https://github.com/KDE/heaptrack is the best tool.
>
> Thanks! I didn't know about heaptrack. I will definitely look into that.
> Does heaptrack also show the 'still reachable' types of leaks that
> memcheck does?
>
> Any chance that the 'massif' tool would survive the OOM killer? This may
> be easier for me to get going as I already have valgrind built.
>
> Is there anything that can be done with memcheck to make it consume less
> memory?

You might be interested in looking at the slides of the FOSDEM presentation
'Tuning Valgrind for your workload':
https://archive.fosdem.org/2015/schedule/event/valgrind_tuning/attachments/slides/743/export/events/attachments/valgrind_tuning/slides/743/tuning_V_for_your_workload.pdf

There are several things you can do to reduce memcheck's memory usage.

Note also that you can run a leak search while your program runs, either
via memcheck client requests or from the shell, using vgdb.

Philippe
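For illustration, both routes might look like this (a sketch; the pid is
hypothetical, and the monitor command is documented in the memcheck
manual):

    /* in the program itself, via a client request (memcheck.h ships with
     * valgrind; the macro is a no-op when not running under valgrind) */
    #include <valgrind/memcheck.h>

    void checkpoint(void)
    {
        VALGRIND_DO_LEAK_CHECK;  /* run a full leak search right now */
    }

    /* equivalently, from the shell, while the process runs under
     * valgrind --vgdb=yes:
     *     vgdb --pid=12345 leak_check full reachable any
     */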
From: Julian S. <jse...@gm...> - 2022-08-06 06:43:36

> Is there anything that can be done with memcheck to make it consume less memory?

First of all, figure out whether memcheck got SIGKILLed because the machine
ran out of space, or because you hit some shell limit/ulimit. In the former
case, you can then try adding swap space to the machine. In the latter case
you'll need to mess with the shell's ulimit settings.

You could also try reducing the (data) size of the workload.

Massif and Memcheck are different tools and do largely different things.
Whether or not you can use one or the other depends a lot on the specifics
of what problem you're trying to solve.

J
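For reference, sketches of both checks (the swap recipe is one common
approach, assuming root access and spare disk; the size is hypothetical):

    # was it a shell limit? inspect the current limits:
    ulimit -a

    # or did the machine run out of memory? add some swap space:
    sudo fallocate -l 8G /swapfile
    sudo chmod 600 /swapfile
    sudo mkswap /swapfile
    sudo swapon /swapfile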
From: Eliot M. <mo...@cs...> - 2022-08-06 02:05:51

On 8/5/2022 8:47 PM, G N Srinivasa Prasanna wrote:
> Thanks for this information.
>
> We are doing a memory system simulation, and need the address stream. At
> this point of time, we don't care if we need a Terabyte even, we can
> delete the files later.
>
> Is there anything we can use from Valgrind?

The lackey tool does just that - output a trace of memory references.

--
Eliot Moss
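For reference, a minimal lackey invocation might look like this (a sketch;
./a.out stands in for the program under test):

    # print one line per memory access to valgrind's log (stderr by
    # default): "I" = instruction fetch, "L" = load, "S" = store,
    # "M" = modify, each with a virtual address and access size
    valgrind --tool=lackey --trace-mem=yes ./a.out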
From: G N S. P. <gns...@ii...> - 2022-08-06 01:49:20

Thanks, will check it out.

Best
From: G N S. P. <gns...@ii...> - 2022-08-06 00:47:29

Thanks for this information.

We are doing a memory system simulation, and need the address stream. At
this point of time, we don't care if we need a Terabyte even; we can delete
the files later.

Is there anything we can use from Valgrind?

Best
From: Bresalier, R. (N. - US/M. Hill) <rob...@no...> - 2022-08-05 23:37:06

I tried 'massif' on a simple program, shown below, where there are
"definitely lost" leaks. massif doesn't seem to find "definitely lost"
leaks - is this correct? I tried with both the 3.19.0 and 3.15.0 versions
of valgrind/massif, with the same result: "definitely lost" leaks are not
found.

I launch massif via:

    valgrind --tool=massif --sigill-diagnostics=no --error-limit=no \
      --massif-out-file=definitely.%p.massif definitely.elf

When I use memcheck it does find these definite leaks, as below:

    ==29917== 60 bytes in 3 blocks are definitely lost in loss record 1 of 1
    ==29917==    at 0x402F67C: malloc (vg_replace_malloc.c:381)
    ==29917==    by 0x80491D1: f2() (definitely.cpp:11)
    ==29917==    by 0x804920F: f1() (definitely.cpp:17)
    ==29917==    by 0x8049262: main (definitely.cpp:25)

But massif doesn't find them at all. Is this correct? When I use massif on
a program with "still reachable" memory it does find the still reachable
blocks, but it isn't finding definite leaks. Shouldn't massif also find
definite leaks?

The C code for "definitely.elf" is below:

    #include <stdlib.h>
    #include <stdio.h>
    #include <string.h>

    void* f2()
    {
        return malloc(20);
    }

    void f1()
    {
        f2();
    }

    int main()
    {
        for (int i = 1; i <= 3; i++) {
            f1();
        }
        return 0;
    }

Thanks,
Rob
From: John R. <jr...@bi...> - 2022-08-05 19:49:06

>> if we can get a list of all the physical addresses the program used, in
>> the order the program accessed them, and whether read/write.
>
> For any real world application the size of the log would be overwhelmingly
> huge ... (unless you only want unique addresses).

Of course this is the purpose of data compression (such as gzip, etc.). You
get some/much/most of the benefit of restricting to unique addresses while
still capturing the entire stream of references.

But as Paul noted, valgrind works in virtual addresses. Getting all the
actual physical addresses is close to impossible. If you are working in an
embedded device environment and care only about a small handful of
memory-mapped device registers, then you can (must) process the mapping
yourself.
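Combining this with the lackey invocation shown earlier, a compressed trace
of the full reference stream might be captured like so (a sketch; ./a.out
is hypothetical):

    # valgrind writes the trace to stderr; the program's own stdout is
    # discarded, and the trace is compressed on the fly
    valgrind --tool=lackey --trace-mem=yes ./a.out 2>&1 >/dev/null | gzip > trace.gz

    # inspect later with:
    zcat trace.gz | less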
From: John R. <jr...@bi...> - 2022-08-05 19:39:20

>> Is there anything that can be done with memcheck to make it consume less memory?
>
> No.

Well, you can use the command-line argument "--num-callers=<number>" to
reduce the length of the tracebacks that are stored in the "red zones" just
before and after an allocated block. This might help enough if you have
zillions of "still reachable" blocks. But you get shorter tracebacks, which
might not give enough information to find and fix the leak quickly.

If you do not have zillions of "still reachable" blocks, then --num-callers
will not help so much; but it probably would not be needed anyway.
From: Paul F. <pj...@wa...> - 2022-08-05 19:30:44

> On 5 Aug 2022, at 20:53, G N Srinivasa Prasanna <gns...@ii...> wrote:
>
> This is the first time we are using Valgrind, and we want to know if we
> can get a list of all the physical addresses the program used, in the
> order the program accessed them, and whether read/write.
>
> Please let us know if we can get this from Valgrind - the webpage
> information is not clear.

Hi

Why do you need this? I'm not sure how one would translate from virtual to
physical addresses. Do you really mean physical addresses?

For any real world application the size of the log would be overwhelmingly
huge, and I suspect it would very rapidly fill most disks (unless you only
want unique addresses).

A+
Paul
From: John R. <jr...@bi...> - 2022-08-05 19:28:04

> Does heaptrack also show the 'still reachable' types of leaks that memcheck does?

Heaptrack intercepts malloc+free+etc., then logs the parameters, result,
and traceback, but otherwise lets the process-original malloc+free+etc. do
the work. Heaptrack does not notice, and does not care, what you do with
the result of malloc(), except whether or not the pointer returned by
malloc() ever gets passed as an argument to free(). When heaptrack performs
analysis, any result from malloc() that has not been free()d is a "leak" as
far as heaptrack is concerned. So that includes what memcheck calls "still
reachable" but not (yet) a leak.

> Any chance that the 'massif' tool would survive the OOM killer? This may
> be easier for me to get going as I already have valgrind built.

Worth a try if you have a day or so to spend. Like all valgrind tools,
massif relies on emulating the instruction stream, so the basic ~10X
run-time slowdown applies.

> Is there anything that can be done with memcheck to make it consume less memory?

No.
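For reference, a minimal heaptrack session might look roughly like this (a
sketch; myprog is hypothetical, and the exact output file name and
compression vary by heaptrack version):

    # record: heaptrack preloads its interceptors and runs the program
    heaptrack ./myprog
    # ... writes something like heaptrack.myprog.12345.gz

    # analyze: summarize allocations and "leaks" (un-freed blocks)
    heaptrack_print heaptrack.myprog.12345.gz | less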
From: Bresalier, R. (N. - US/M. Hill) <rob...@no...> - 2022-08-05 19:16:05

> > If you want to know for sure who killed it then strace it while it
> > runs and it should show you who sends the signal, but my bet is that
> > it's the kernel.

I tried strace -p <pid> on my process before I triggered its exit. The
strace output ends with "+++ killed by SIGKILL +++", but I don't find
anything about who sent it.

> Or possibly watch `dmesg -w` running in another shell.

I tried 'dmesg -w' but it didn't say anything about the SIGKILL. Is there
something that has to be configured for dmesg to say the source of the
SIGKILL?
From: G N S. P. <gns...@ii...> - 2022-08-05 18:52:33

This is the first time we are using Valgrind, and we want to know if we can
get a list of all the physical addresses the program used, in the order the
program accessed them, and whether each access was a read or a write.

Please let us know if we can get this from Valgrind - the webpage
information is not clear.

Thanks
Prasanna
From: Bresalier, R. (N. - US/M. Hill) <rob...@no...> - 2022-08-05 18:49:19

Thanks Tom. Do you think I'd have better luck using the "massif" tool?
Would "massif" be able to avoid the OOM killer? Or is there a way to reduce
the amount of memory that memcheck will use?

-----Original Message-----
From: Tom Hughes <to...@co...>
Sent: Friday, August 5, 2022 10:08 AM
To: Bresalier, Rob (Nokia - US/Murray Hill) <rob...@no...>; val...@li...
Subject: Re: memcheck is getting SIGKILLed before leak report is output

On 05/08/2022 14:09, Bresalier, Rob (Nokia - US/Murray Hill) wrote:
> When running memcheck on a massive monolith embedded executable (237MB
> stripped, 1.8GiB unstripped), after I stop the executable under
> valgrind I see the "HEAP SUMMARY" but then valgrind dies before any
> leak reports are printed. The parent process sees that the return
> status of memcheck is that it was SIGKILLed (status returned in
> waitpid call is '9').

I am 99.9% sure that the parent process is not the one sending the SIGKILL.

> Is it possible that valgrind SIGKILLs itself? Is there a reason that
> the linux kernel (Wind River Linux) could be sending a SIGKILL to
> valgrind/memcheck? I do not see any messages about Out of Memory/OOM
> killer killing valgrind. Previous experience with this executable is
> that there are almost 3 million leak reports (most of them are "still
> reachable"), could that be occupying too much memory? Any ideas/advice
> to figure out what is going on?

Almost certainly the kernel OOM killed it. If you want to know for sure
who killed it then strace it while it runs and it should show you who
sends the signal, but my bet is that it's the kernel.

> One thing I see in the logs is about "unhandled ioctl 0xa5 with no
> size/direction hints". Could this be a trigger for this crash/sigkill?

Not really, no.

Tom

--
Tom Hughes (to...@co...)
http://compton.nu/
From: Bresalier, R. (N. - US/M. Hill) <rob...@no...> - 2022-08-05 15:34:58

> If finding memory leaks is the only goal (for instance, if you are
> satisfied that memcheck has found all the overrun blocks, uninitialized
> reads, etc.) then https://github.com/KDE/heaptrack is the best tool.

Thanks! I didn't know about heaptrack. I will definitely look into that.
Does heaptrack also show the 'still reachable' types of leaks that memcheck
does?

Any chance that the 'massif' tool would survive the OOM killer? This may be
easier for me to get going as I already have valgrind built.

Is there anything that can be done with memcheck to make it consume less
memory?