From: Luke H. <lu...@PA...> - 2003-04-10 10:32:45

>The reason for this is that I have a program with sockets and threads
>that works fine in 1.0.4 but has problems in 1.9.4. I haven't tried
>1.9.5 yet.

No luck with 1.9.5 or 1.0.4 I'm afraid. It does appear that the stack gets smashed (see below) when siglongjmp()/sigsetjmp() are called:

(gdb) bt
#0  vg_do_syscall2 (syscallno=1074473960, arg1=1073822720, arg2=0) at vg_mylibc.c:76
#1  0x00000004 in ?? ()
#2  0x40064136 in vgPlain_main () at vg_main.c:1173
(gdb)

And that this is potentially the problem.

-- Luke

--
Luke Howard | PADL Software Pty Ltd | www.padl.com
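[For reference, the API pair being exercised here is sigsetjmp()/siglongjmp(). A minimal standalone reproducer along these lines (not from the original thread) is useful for checking whether Valgrind's non-local-jump handling alone is at fault, independent of the DCE RPC runtime:]

------------------------------------------------
#include <setjmp.h>
#include <stdio.h>

static sigjmp_buf env;

static void raise_exc(void)
{
    siglongjmp(env, 1);            /* unwind back to the sigsetjmp point */
}

int main(void)
{
    if (sigsetjmp(env, 1) == 0) {  /* 1 = save the signal mask too */
        printf("normal path\n");
        raise_exc();
    } else {
        printf("exception path\n"); /* reached via siglongjmp */
    }
    return 0;
}
------------------------------------------------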
From: David E. <da...@2g...> - 2003-04-10 10:21:27

On Thu, 2003-04-10 at 12:14, Luke Howard wrote:
> >What valgrind version are you using?
>
> 1.9.3. I will try 1.9.5 and report any improvements.

May I suggest that you also try the old 1.0.4 version. The reason for this is that I have a program with sockets and threads that works fine in 1.0.4 but has problems in 1.9.4. I haven't tried 1.9.5 yet.

--
-\- David Eriksson -/- www.2GooD.nu

"I personally refuse to use inferior tools because of ideology."
 - Linus Torvalds
From: Luke H. <lu...@PA...> - 2003-04-10 10:15:25

>What valgrind version are you using?

1.9.3. I will try 1.9.5 and report any improvements.

-- Luke

--
Luke Howard | PADL Software Pty Ltd | www.padl.com
From: David E. <da...@2g...> - 2003-04-10 10:12:51

On Thu, 2003-04-10 at 11:56, Luke Howard wrote:
> I'm attempting to run valgrind on a DCE RPC server. The problem is
> that, while under valgrind, the server doesn't seem to receive any
> client requests. It does work normally, of course, but there are
> a couple of bugs for which valgrind will be very useful in fixing.

What valgrind version are you using?

--
-\- David Eriksson -/- www.2GooD.nu

"I personally refuse to use inferior tools because of ideology."
 - Linus Torvalds
From: Luke H. <lu...@PA...> - 2003-04-10 09:57:12

I'm attempting to run valgrind on a DCE RPC server. The problem is that, while under valgrind, the server doesn't seem to receive any client requests. It does work normally, of course, but there are a couple of bugs for which valgrind will be very useful in fixing.

There are some interesting aspects of the DCE RPC runtime that, while not necessarily related to this problem, may give valgrind some grief. The RPC library does the following "weird" things:

 o wraps the pthread API to provide POSIX Threads Draft 4 API semantics
   (however, the symbols do not conflict with LinuxThreads)

 o wraps common system calls with jackets that enable asynchronous
   thread cancellation for the duration of the system call

 o implements exception handling on top of setjmp()/longjmp() and
   pthread cancels (see below):

/*
 * Note that the rough schema for the exception handler is:
 *
 *      do {
 *          pthread_cleanup_push("routine that will longjmp back here");
 *          val = setjmp(...);
 *          if (val == 0)
 *              ...normal code...
 *          else
 *              ...exception code...
 *          ...finally code...
 *          if ("exception happened")
 *              if ("exception handled")
 *                  break;
 *              else
 *                  ...re-raise exception...
 *          pthread_cleanup_pop(...);
 *      } while (0);
 *
 * Exceptions are raised by doing a pthread_cancel against one's self
 * and then doing a pthread_testcancel. This causes the topmost cleanup
 * routine to be popped and called. This routine (_exc_cleanup_handler)
 * longjmp's back to the exception handler. This approach means we can
 * leverage off the fact that the push/pop routines are maintaining some
 * per-thread state (hopefully [but likely not] more efficiently than
 * we could ourselves). We need this state to string together the dynamic
 * stack of exception handlers.
 */

Disabling the system call jackets didn't seem to help. A backtrace of the under-valgrind server follows:

(gdb) bt
#0  0x401924a2 in vgPlain_do_syscall () from /usr/local/lib/valgrind/valgrind.so
#1  0x4052a1e2 in recvmsg (fd=9, message=0x496ca89c, flags=0) at libc_jacket.c:221
#2  0x45fd1078 in receive_packet (assoc=0x4179608c, fragbuf_p=0x496ca9e4, ovf_fragbuf_p=0x496ca9e8, st=0x496ca9f0) at ../../../ncklib/include/comsoc_bsd.h:339
#3  0x45fcf9ea in receive_dispatch (assoc=0x4179608c) at cnrcvr.c:508
#4  0x45fcf248 in rpc__cn_network_receiver (assoc=0x4179608c) at cnrcvr.c:274
#5  0x4053254c in thread_wrapper (info=0x417961e0) at vg_libpthread.c:667
#6  0x4016c778 in do__apply_in_new_thread_bogusRA () at vg_scheduler.c:2122

Then, probably after the siglongjmp() call:

(gdb) bt
#0  vg_do_syscall2 (syscallno=1079218620, arg1=1077504048, arg2=0) at vg_mylibc.c:76
#1  0x00000004 in ?? ()
#2  0x4017eff2 in vgPlain_main () at vg_main.c:1384
(gdb)

Valgrind shows:

04/10 09:56:09 [RPC] Registered endpoint ncacn_ip_tcp:127.0.0.1[1035]
04/10 09:56:09 [RPC] Registered endpoint ncadg_ip_udp:127.0.0.1[1035]
...
==3670== valgrind's libpthread.so: KLUDGED call to: siglongjmp (cleanup handlers are ignored)

Any ideas?

cheers,

-- Luke

--
Luke Howard | PADL Software Pty Ltd | www.padl.com
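[For readers unfamiliar with the idiom: a stripped-down, runnable sketch of setjmp()-based exception handling in the spirit of the schema above. It deliberately omits the pthread_cancel/cleanup-handler machinery, which is precisely the part Valgrind's "KLUDGED siglongjmp (cleanup handlers are ignored)" warning concerns:]

------------------------------------------------
#include <setjmp.h>
#include <stdio.h>

static jmp_buf handler;          /* one frame of the dynamic handler stack */

static void raise_exception(int code)
{
    longjmp(handler, code);      /* the real runtime reaches this via a
                                    pthread cleanup handler instead */
}

static void do_work(void)
{
    raise_exception(42);
}

int main(void)
{
    int val;
    do {
        val = setjmp(handler);
        if (val == 0) {
            do_work();                               /* ...normal code...    */
        } else {
            printf("caught exception %d\n", val);    /* ...exception code... */
        }
        /* ...finally code... */
    } while (0);
    return 0;
}
------------------------------------------------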
From: Geoff A. <gal...@nc...> - 2003-04-10 02:27:59

This appears to be the same libstdc++ "memory leak" I mentioned in the "still reachable" memory from g++'s std::string thread, which I started on 31 March 2003. You can check the Valgrind-users archive at http://sourceforge.net/mailarchive/forum.php?thread_id=1908802&forum_id=32038. Note that the thread continues into April, so you'll need to search both the March and April archives. In one of my later postings, I give suppression files for suppressing these libstdc++ "still reachable" messages for both gcc 2.95.2 and 3.2.

Note that this is working as intended. There are a number of discussions on this in the libstdc++ mailing list. For example, see http://gcc.gnu.org/ml/libstdc++/2002-10/msg00038.html, http://gcc.gnu.org/ml/libstdc++/2002-08/msg00105.html, and http://gcc.gnu.org/ml/libstdc++/2002-10/msg00044.html. It's not considered a bug, but rather part of a memory cache optimization. From http://gcc.gnu.org/ml/libstdc++/2002-08/msg00105.html:

    Both libstdc++-v2 and libstdc++-v3 cache memory internally to greatly
    aid performance yet, as far as we know, no pointers are ever lost as
    in foo().

I believe there are compile options for disabling the libstdc++ memory cache optimization if you don't like the behavior. But if you do, expect significant performance degradation on some platforms.

Geoff Alexander

----- Original Message -----
From: "Bastien Chevreux" <ba...@ch...>
To: "valgrind users" <val...@li...>
Sent: Wednesday, April 09, 2003 9:45 AM
Subject: [Valgrind-users] "Reachable" memory, where's the bug (g++, STL, valgrind)?

> [quoted message snipped; Bastien Chevreux's original posting appears
> in full below in this archive]
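[Geoff's suppression files aren't reproduced in this thread, but a suppression for these libstdc++ pool-allocator blocks would look roughly like the block below. This is a sketch: the exact kind/frame syntax varies with the Valgrind version, and the frame names should be taken from your own --show-reachable output rather than copied blindly:]

------------------------------------------------
{
   libstdc++-pool-still-reachable
   Memcheck:Leak
   fun:__builtin_new
   fun:*
   fun:*__default_alloc_template*_S_chunk_alloc*
}
------------------------------------------------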
From: Jeremy F. <je...@go...> - 2003-04-09 23:13:26

Quoting Julian Seward <js...@ac...>:
> Try the patch called 09-rdtsc-calibration from
> http://www.goop.org/~jeremy/valgrind/

I doubt that will help much - that just stops an assertion failure when getting the calibration. The basic problem is that the TSC is variable-rate, and therefore useless as a timebase.

	J
From: Bastien C. <ba...@ch...> - 2003-04-09 22:10:20

On Wednesday 09 April 2003 19:37, you wrote:
> My guess is that the STL allocator keeps the memory around. What happens
> if you call f1() several times in main()?

No change, at least for this small example. But I continued to play a bit with containers my program uses and came out with this gem:

------------------------------------------------
#include<iostream>
#include<deque>
#include<set>

using namespace std;

void f1(int ic)
{
  set<char> n;
  for(char c='a'; c < 'z'; c++) n.insert(c);

  deque<set<char> > v;
  for(int i=0; i<ic; i++) v.push_back(n);
}

int main(){
  f1(1000);
  cout << "The memory footprint ..." << endl;
  f1(20000);
  cout << "... should be ..." << endl;
  f1(50000);
  cout << "... near zero exactly now! (it isn't *sigh*)" << endl;
  //while(1);
  return 0;
}
------------------------------------------------

Everyone's invited to let this run on their system ...

------------------------------------------------
==13669== LEAK SUMMARY:
==13669==    definitely lost: 16 bytes in 1 blocks.
==13669==    possibly lost:   0 bytes in 0 blocks.
==13669==    still reachable: 32568232 bytes in 127 blocks.
==13669==         suppressed: 0 bytes in 0 blocks.
------------------------------------------------

... and play with it: the numbers get lower when stopping after f1(2000) or f1(20000). Best thing is, when one uncomments the while(1); statement the memory footprint is around 80M (where it shouldn't be much greater than the size of the executable).

I dug around the news a bit and found this:
http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&oe=utf-8&frame=right&th=432bcc216e83d78f&seekm=slrnaa3v4d.lt.mixtim_nospam%40taco.mixtim.ispwest.com#link3

Here's one interesting part:

> The default allocator for many c++ standard libraries (such as the one that
> ships with gcc) never actually "frees" memory. It just adds the memory back
> to a pool for later use. So, if you allocate a map that contains 80 MB
> worth of data and then destroy the map, your application still has that 80
> MB allocated and will until your program exits.

On the other hand, after the program exited, valgrind should not find any leaks (the STL pool should have been freed, right?). Any ideas?

So the STL is to 'blame'. The description of the pool behaviour should go into the FAQ of valgrind, though; I'm sure other people tripped (are tripping, will trip) over that too.

And now for something completely related (but going off topic): is there any way to "flush" that pool?

Regards,
  Bastien

PS: Did I already thank the valgrind author? No? I longed for a tool like that for Linux since I first worked with purify on a SUN. Thanks a lot.

--
-- The universe has its own cure for stupidity. --
--   Unfortunately, it doesn't always apply it. --
From: David E. <da...@2g...> - 2003-04-09 17:37:55

On Wed, 2003-04-09 at 15:45, Bastien Chevreux wrote:
> Hello there,
>
> I have serious troubles with possible memory leaks in programs heavily
> using the STL. I am not able to tell whether it is a problem with the
> compiler, the STL or wrong valgrind output, so I'll start here before
> filing in a gcc bug report.
>
> [rest of quoted message snipped; it is reproduced in full in the
> original posting below]
>
> Now my question: any idea where to start searching? is valgrind at
> fault (which I don't think, but one never knows)? the gnu STL? the gnu
> g++ compiler?
>
> Any suggestion welcome.

My guess is that the STL allocator keeps the memory around. What happens if you call f1() several times in main()?

--
-\- David Eriksson -/- www.2GooD.nu

"I personally refuse to use inferior tools because of ideology."
 - Linus Torvalds
From: Bastien C. <ba...@ch...> - 2003-04-09 13:45:07

Hello there,

I have serious troubles with possible memory leaks in programs heavily using the STL. I am not able to tell whether it is a problem with the compiler, the STL or wrong valgrind output, so I'll start here before filing in a gcc bug report.

One of my programs grew larger and larger without my knowing why, so I took valgrind to look and started building testcases to find out what was happening. I have a SuSE 8.1 distribution, that's kernel 2.4.x, glibc2.2 and gcc version 3.2.

Consider this test case example:

-------------------------------------------------------
#include <vector>
#include <ext/hash_map>
using namespace __gnu_cxx;

void f1()
{
  int n=42;
  vector<int> v;
  for(int i=0; i<1000000; i++) v.push_back(n);
}

int main(){
  f1();
  return 0;
}
----------------------------------------------------------

g++ -g -o test test.C

and then

valgrind --leak-resolution=high --num-callers=20 --show-reachable=yes --leak-check=yes ./test

I will get the following summary:

----------------------------------------------------------
==16880== LEAK SUMMARY:
==16880==    definitely lost: 16 bytes in 1 blocks.
==16880==    possibly lost:   0 bytes in 0 blocks.
==16880==    still reachable: 6912 bytes in 4 blocks.
----------------------------------------------------------

The number which troubles me is the bytes that are still reachable. Here's the detail:

----------------------------------------------------------
==19169== 6848 bytes in 3 blocks are still reachable in loss record 3 of 3
==19169==    at 0x4015DE3B: __builtin_new (vg_clientfuncs.c:129)
==19169==    by 0x4015DE76: operator new(unsigned) (vg_clientfuncs.c:142)
==19169==    by 0x40278E00: std::__default_alloc_template<true, 0>::_S_chunk_alloc(unsigned, int&) (in /usr/lib/libstdc++.so.5.0.0)
==19169==    by 0x40278D1C: std::__default_alloc_template<true, 0>::_S_refill(unsigned) (in /usr/lib/libstdc++.so.5.0.0)
==19169==    by 0x402788EF: std::__default_alloc_template<true, 0>::allocate(unsigned) (in /usr/lib/libstdc++.so.5.0.0)
==19169==    by 0x8049008: std::__simple_alloc<int, std::__default_alloc_template<true, 0> >::allocate(unsigned) (/usr/include/g++/bits/stl_alloc.h:224)
==19169==    by 0x8048D7E: std::_Vector_alloc_base<int, std::allocator<int>, true>::_M_allocate(unsigned) (/usr/include/g++/bits/stl_vector.h:121)
==19169==    by 0x8048A45: std::vector<int, std::allocator<int> >::_M_insert_aux(__gnu_cxx::__normal_iterator<int*, std::vector<int, std::allocator<int> > >, int const&) (/usr/include/g++/bits/stl_vector.h:898)
==19169==    by 0x804884C: std::vector<int, std::allocator<int> >::push_back(int const&) (/usr/include/g++/bits/stl_vector.h:496)
==19169==    by 0x80486A1: f1() (test2.C:10)
==19169==    by 0x80487A2: main (test2.C:21)
==19169==    by 0x403094A1: __libc_start_main (in /lib/libc.so.6)
==19169==    by 0x8048580: (within /home/bach/work/assembly/htga/src/progs/test)
----------------------------------------------------------

Regarding the program above, I sincerely do think that there should be no leak at all, not even in "reachable" parts.

Now, a few bytes don't hurt. Unfortunately, when I let my real program run, here's what I get (for really small data sets):

----------------------------------------------------------
==698== LEAK SUMMARY:
==698==    definitely lost: 24825 bytes in 3492 blocks.
==698==    possibly lost:   1398 bytes in 3 blocks.
==698==    still reachable: 1125492 bytes in 65 blocks.
----------------------------------------------------------

(Please note that I don't care about the definitely and possibly lost numbers; these I can trace back to real oversights in my code.)

The "still reachable" 1M number is about 40 times greater than the other two numbers added together and I have the distinct impression that the memory is really eaten away somewhere:

 1) all valgrind detail messages are more or less similar to the one of
    the test case above; all have something to do with containers

 2) putting a "while(1)" loop at a distinctive point in my program
    where everything should have been more or less freed after some
    heavy computation (using about any existing STL container type
    that exists with dozens of different classes) gives me remaining
    memory footprints of >1G (yes, that's gigabyte).

Now my question: any idea where to start searching? is valgrind at fault (which I don't think, but one never knows)? the gnu STL? the gnu g++ compiler?

Any suggestion welcome.

Regards,
  Bastien

--
-- The universe has its own cure for stupidity. --
--   Unfortunately, it doesn't always apply it. --
From: Julian S. <js...@ac...> - 2003-04-09 07:45:22

On Wednesday 09 April 2003 7:17 am, Sefer Tov wrote:
> Indeed, you were both correct.
> I am running it on a laptop (compaq armada m300) with apm enabled.
>
> I must admit that this proves to be quite an annoyance, since I do most of
> my work on a laptop. Does that affect the timing of other functions as
> well?
>
> I'm unfamiliar with these high resolution time counters in x86, but it
> strikes me odd that Intel wouldn't provide an equivalent, reliable
> mechanism for laptops as well (maybe in exchange of accuracy or slight
> performance impact).
>
> I'm curious, is there any way around this?

Try the patch called 09-rdtsc-calibration from
http://www.goop.org/~jeremy/valgrind/

	J

> [remainder of quoted thread snipped; see the earlier messages below]
From: Sefer T. <se...@ho...> - 2003-04-09 07:18:16

Indeed, you were both correct.
I am running it on a laptop (compaq armada m300) with apm enabled.

I must admit that this proves to be quite an annoyance, since I do most of my work on a laptop. Does that affect the timing of other functions as well?

I'm unfamiliar with these high resolution time counters in x86, but it strikes me odd that Intel wouldn't provide an equivalent, reliable mechanism for laptops as well (maybe in exchange of accuracy or slight performance impact).

I'm curious, is there any way around this?

Thanks in advance,
Sefer.

>From: Julian Seward <js...@ac...>
>To: Jeremy Fitzhardinge <je...@go...>, Sefer Tov <se...@ho...>
>CC: val...@li...
>Subject: Re: [Valgrind-users] Scheduler problem
>Date: Tue, 8 Apr 2003 21:24:24 +0000
>
> [quoted message snipped; see Julian Seward's message below]
From: Julian S. <js...@ac...> - 2003-04-08 21:25:00

Sefer,

I tested the program you sent me (below) and it behaves identically running on V and normally; no timing anomalies. This is running on a 1133Mhz PIII-T (desktop) machine.

I suspect Jeremy may be right about the power management thing; he's had a patch available for that for a while. Can you clarify the situation re power management on your platform? Thanks.

J

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

void *start(void *p)
{
    printf("Hi!\n");
    sleep(1);
    printf("Here\n");
    sleep(1);

    return 0;
}

int main()
{
    pthread_t tid;
    void *p;
    int i;

    for ( i = 0; i < 5; ++i ) {
        pthread_create(&tid, 0, start, 0);
    }
    pthread_join(tid, &p);

    return 0;
}

On Tuesday 08 April 2003 10:12 am, Jeremy Fitzhardinge wrote:
> Quoting Sefer Tov <se...@ho...>:
> > Hi!
> >
> > I've been testing a short threaded program, and I noticed that sleep(x),
> > although utilizes no cpu, it schedules poorly (the program slows down
> > almost to a halt).
>
> Is your machine a laptop, or a desktop with power management enabled?
> Valgrind uses the TSC register as its timebase, and I've noticed on my
> laptop the TSC doesn't advance when the machine is idle. You can easily
> tell if this is the case: if you run a CPU-bound program at the same time,
> then the TSC advances at near full speed and the sleeps are for the right
> time.
>
> J
From: Jeremy F. <je...@go...> - 2003-04-08 10:13:01

Quoting Sefer Tov <se...@ho...>:
> Hi!
>
> I've been testing a short threaded program, and I noticed that sleep(x),
> although utilizes no cpu, it schedules poorly (the program slows down
> almost to a halt).

Is your machine a laptop, or a desktop with power management enabled? Valgrind uses the TSC register as its timebase, and I've noticed on my laptop the TSC doesn't advance when the machine is idle. You can easily tell if this is the case: if you run a CPU-bound program at the same time, then the TSC advances at near full speed and the sleeps are for the right time.

	J
From: Sefer T. <se...@ho...> - 2003-04-08 07:54:51

Hi!

I've been testing a short threaded program, and I noticed that sleep(x), although it utilizes no cpu, schedules poorly (the program slows down almost to a halt).

This is a simple threaded queue handler program, which sleeps 2 sec. between each update to the queue; however, after 2-3 such updates, the sleep seems to triple (or worse), often waiting 10 seconds or more (while being completely idle).

This happens to me on Mandrake 9.1 on i686 running valgrind, both 1.9.4 and a recent snapshot from the CVS.

This might indicate some problem in scheduling blocking commands in general (or sleep in particular); after all, why should a sleep(2) wait for 10 seconds (while the cpu is entirely idle)?

If anyone has any ideas or can offer an explanation, I'd appreciate your comments.

Thanks,
Sefer.

#### Makefile
all:
	gcc -g -Wall -D_REENTRANT -o go *.c -lpthread

#### test.c
#include <pthread.h>
#include <unistd.h>
#include <stdio.h>
#include <string.h>

#define MAXT 4
#define MAXD 10

typedef struct {
    pthread_mutex_t my_mutex;
    pthread_cond_t  my_cond;
    int count;
} data_t;

static data_t gdata[MAXT];

static void data_add(data_t *data)
{
    pthread_mutex_lock(&data->my_mutex);
    while ( data->count >= MAXD ) {
        pthread_cond_wait(&data->my_cond, &data->my_mutex);
    }
    if ( !data->count )
        pthread_cond_signal(&data->my_cond);
    ++data->count;
    pthread_mutex_unlock(&data->my_mutex);
}

static void data_remove(data_t *data)
{
    pthread_mutex_lock(&data->my_mutex);
    while ( data->count == 0 ) {
        pthread_cond_wait(&data->my_cond, &data->my_mutex);
    }
    if ( data->count >= MAXD )
        pthread_cond_signal(&data->my_cond);
    --data->count;
    pthread_mutex_unlock(&data->my_mutex);
}

static void *thread_proc(void *ptr)
{
    printf("Thread created!\n");

    for ( ; ; ) {
        data_remove((data_t *) ptr);
        printf("Got some data.\n");
    }

    return NULL;
}

static void run()
{
    int i, rc;
    pthread_t tid;

    for ( i = 0; i < MAXT; ++i ) {
        pthread_mutex_init(&gdata[i].my_mutex, NULL);
        pthread_cond_init(&gdata[i].my_cond, NULL);
        gdata[i].count = 0;
        rc = pthread_create(&tid, NULL, thread_proc, gdata + i);
    }

    for ( ; ; ) {
        data_add(gdata + 0);
        data_add(gdata + 1);
        sleep(2);
    }
}

int main()
{
    run();
    return 0;
}
From: Nicholas N. <nj...@ca...> - 2003-04-08 00:11:44

On Tue, 18 Mar 2003, Nicholas Nethercote wrote:
> I've just hacked up support for automatic suppression generation. If
> everyone likes it, I will commit it to the HEAD.

I just committed it along with some other changes. I hope I haven't broken anything. Please try it out.

> Some notes:
> - option name is --gen-suppressions=yes

N
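[Typical use of the new option looks roughly like this (a sketch; ./myprog and myprog.supp are placeholder names). With --gen-suppressions=yes, Valgrind offers to print a ready-made suppression block for each error, which can be pasted into a .supp file and fed back on later runs via --suppressions:]

------------------------------------------------
valgrind --gen-suppressions=yes --leak-check=yes ./myprog

# paste the printed { ... } blocks into myprog.supp, then:
valgrind --suppressions=myprog.supp --leak-check=yes ./myprog
------------------------------------------------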
From: Julian S. <js...@ac...> - 2003-04-07 23:27:37

... from the usual place, http://developer.kde.org/~sewardj

The release notes follow.

J

Version 1.9.5 (7 April 2003)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

It occurs to me that it would be helpful for valgrind users to record in the source distribution the changes in each release. So I now attempt to mend my errant ways :-) Changes in this and future releases will be documented in the NEWS file in the source distribution.

Major changes in 1.9.5:

- (Critical bug fix): Fix a bug in the FPU simulation. This was causing
  some floating point conditional tests not to work right. Several people
  reported this. If you had floating point code which didn't work right
  on 1.9.1 to 1.9.4, it's worth trying 1.9.5.

- Partial support for Red Hat 9. RH9 uses the new Native Posix Threads
  Library (NPTL), instead of the older LinuxThreads. This potentially
  causes problems with V which will take some time to correct. In the
  meantime we have partially worked around this, and so 1.9.5 works on
  RH9. Threaded programs still work, but they may deadlock, because some
  system calls (accept, read, write, etc) which should be nonblocking,
  in fact do block. This is a known bug which we are looking into.

  If you can, your best bet (unfortunately) is to avoid using 1.9.5 on a
  Red Hat 9 system, or on any NPTL-based distribution. If your glibc is
  2.3.1 or earlier, you're almost certainly OK.

Minor changes in 1.9.5:

- Added some #errors to valgrind.h to ensure people don't include it
  accidentally in their sources. This is a change from 1.0.X which was
  never properly documented. The right thing to include is now
  memcheck.h. Some people reported problems and strange behaviour when
  (incorrectly) including valgrind.h in code with 1.9.1 -- 1.9.4. This
  is no longer possible.

- Add some __extension__ bits and pieces so that gcc configured for
  valgrind-checking compiles even with -Werror. If you don't understand
  this, ignore it. Of interest to gcc developers only.

- Removed a pointless check which caused problems interworking with
  Clearcase. V would complain about shared objects whose names did not
  end ".so", and refuse to run. This is now fixed. In fact it was fixed
  in 1.9.4 but not documented.

- Fixed a bug causing an assertion failure of "waiters == 1" somewhere
  in vg_scheduler.c, when running large threaded apps, notably MySQL.

- Add support for the munlock system call (124).

Some comments about future releases:

1.9.5 is, we hope, the most stable Valgrind so far. It pretty much supersedes the 1.0.X branch. If you are a valgrind packager, please consider making 1.9.5 available to your users. You can regard the 1.0.X branch as obsolete: 1.9.5 is stable and vastly superior. There are no plans at all for further releases of the 1.0.X branch.

If you want a leading-edge valgrind, consider building the cvs head (from SourceForge), or getting a snapshot of it. Current cool stuff going in includes MMX support (done); SSE/SSE2 support (in progress), a significant (10-20%) performance improvement (done), and the usual large collection of minor changes. Hopefully we will be able to improve our NPTL support, but no promises.
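[As a usage sketch of the valgrind.h/memcheck.h split mentioned above (the macro names here are the Memcheck-era ones; the exact set shipped in 1.9.x may differ, so treat this as an assumption to check against your installed header):]

------------------------------------------------
/* compile with the valgrind include directory on the include path;
   the macros compile to cheap no-ops when run outside Valgrind */
#include "memcheck.h"   /* not valgrind.h, which now #errors out */

int main(void)
{
    char buf[8];
    VALGRIND_MAKE_NOACCESS(buf, sizeof buf);  /* mark bytes inaccessible */
    VALGRIND_DO_LEAK_CHECK;                   /* run a leak scan right now */
    return 0;
}
------------------------------------------------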
From: Nicholas N. <nj...@ca...> - 2003-04-07 15:36:47

On Mon, 7 Apr 2003, Josef Weidendorfer wrote:

> > Another approach: I've been thinking about, and half-implemented, this
> > solution: when a program munmap()s an executable segment, Valgrind doesn't
> > really munmap it, but just leaves it in that address space. Then the
> > symbols don't get unloaded. The only downside is that it wastes address
> > space, but hopefully code sizes aren't so big that this is a problem.
>
> I really see a problem if an application loads the same plugin multiple times
> in a row, and you thus get one plugin mapped multiple times simultaneously.
> E.g. in KDevelop (a C++ IDE), every feature for project development is in a
> plugin. Now every time you open another project, the plugins of the old
> project are unmapped and the plugins of the new one are mapped (around 20).

Urgh.

> > One problem is that debug info is currently read in lazily, which clashes
> > with this, for tedious reasons. Getting back to this is somewhere in the
>
> Why? If the same plugin is loaded at most once, I don't see a problem.

It's a problem for Memcheck: if you have a leak in foo.so, often foo.so is dlclose()d before the program ends; currently the leak checker gives hopeless error messages about foo.so because the debug info has been unloaded.

Now imagine I've implemented the don't-unload-the-symbols scheme, but there are no other errors in the program using foo.so. So the first time the debug info for foo.so is needed is during the leak check. But by this time, foo.so has already been loaded and unloaded, its debug info was never read (because it wasn't needed) and the error message is still uninformative.

If debug info was needed even once before foo.so was dlclose()d there would be no problem, but we can't rely on that.

N
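[A concrete shape of the scenario Nicholas describes, with a hypothetical plugin name and symbol: the leak's stack trace points into foo.so, whose symbols and debug info are gone by the time the exit-time leak check runs.]

------------------------------------------------
#include <dlfcn.h>
#include <stdio.h>

int main(void)
{
    void *h = dlopen("./foo.so", RTLD_NOW);
    if (!h) { fprintf(stderr, "%s\n", dlerror()); return 1; }

    /* suppose this leaks a malloc'd block somewhere inside foo.so */
    void (*leaky)(void) = (void (*)(void)) dlsym(h, "leaky");
    if (leaky) leaky();

    dlclose(h);   /* foo.so unmapped: its symbols/debug info unload here */
    return 0;     /* ...so the leak report can't name the foo.so frames */
}
------------------------------------------------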
From: Josef W. <Jos...@in...> - 2003-04-07 15:21:20

On Monday 07 April 2003 16:55, Nicholas Nethercote wrote:
> On Mon, 7 Apr 2003, Josef Weidendorfer wrote:
> > > Hmm, in theory there should be no problems, Valgrind (and hence
> > > Cachegrind) can handle dlopen'd plugins ok...
> >
> > But they are unmapped before program termination. [...]
>
> Another approach: I've been thinking about, and half-implemented, this
> solution: when a program munmap()s an executable segment, Valgrind doesn't
> really munmap it, but just leaves it in that address space. Then the
> symbols don't get unloaded. The only downside is that it wastes address
> space, but hopefully code sizes aren't so big that this is a problem.

I really see a problem if an application loads the same plugin multiple times in a row, and you thus get one plugin mapped multiple times simultaneously. E.g. in KDevelop (a C++ IDE), every feature for project development is in a plugin. Now every time you open another project, the plugins of the old project are unmapped and the plugins of the new one are mapped (around 20).

It's easy to keep a list of objects the application has unmapped, and in a mmap, return the already mapped object if a plugin is loaded a second time.

> One problem is that debug info is currently read in lazily, which clashes
> with this, for tedious reasons. Getting back to this is somewhere in the

Why? If the same plugin is loaded at most once, I don't see a problem.

Josef

> lower half of my ever-growing todo list...
>
> N

--
Dipl.-Inform. Josef Weidendorfer
LRR-TUM - Raum 01.06.056 - Tel. +49 / (0)89 / 289 - 18454
From: Nicholas N. <nj...@ca...> - 2003-04-07 14:55:32

On Mon, 7 Apr 2003, Josef Weidendorfer wrote:

> > Hmm, in theory there should be no problems, Valgrind (and hence
> > Cachegrind) can handle dlopen'd plugins ok...
>
> But they are unmapped before program termination. You only get one cost line
> "discarded" for all plugins. The solution would be to allow dumping while the
> plugin is mapped. Or you dump the information of BBCCs when the according
> object is unmapped.
>
> Or you don't delete BBCCs and accumulate cost to "discarded", but keep them.
> Now you have the problem that a BB start address isn't unique to get to the
> right BBCC, but you have to store the SegInfo* in the BBCCs to distinguish
> among BBs from different plugins at the same address.
>
> BTW, my Calltree skin could be buggy in this regard, as it doesn't delete
> any BBCCs on unmapping, but still only uses the BB address. But checking the
> SegInfo* should be easy enough for a fix.
> Perhaps it's even better to take BBCCs of unmapped objects from the hash into
> a list, and put them in again if the object is mapped again, after adjusting
> the addresses if needed. The same object could be mapped on a different base
> address.

Another approach: I've been thinking about, and half-implemented, this solution: when a program munmap()s an executable segment, Valgrind doesn't really munmap it, but just leaves it in that address space. Then the symbols don't get unloaded. The only downside is that it wastes address space, but hopefully code sizes aren't so big that this is a problem.

One problem is that debug info is currently read in lazily, which clashes with this, for tedious reasons. Getting back to this is somewhere in the lower half of my ever-growing todo list...

N
From: Nicholas N. <nj...@ca...> - 2003-04-07 14:50:53

Hi,

I just committed a change to the CVS head that speeds up the Memcheck and Addrcheck skins a lot: by up to 28% and 36% respectively. Here are some figures for the SPEC2000 benchmark suite, using the "test" inputs:

======== memcheck ========
164.gzip:   vg1:  46.00s, vg2:  37.12s, speed-up: 19.3%
300.twolf:  vg1:   7.34s, vg2:   6.63s, speed-up:  9.7%
197.parser: vg1:  70.60s, vg2:  57.24s, speed-up: 18.9%
186.crafty: vg1: 166.83s, vg2: 156.80s, speed-up:  6.0%
255.vortex: vg1: 392.94s, vg2: 283.37s, speed-up: 27.9%
256.bzip2:  vg1: 152.97s, vg2: 141.28s, speed-up:  7.6%
176.gcc:    vg1:  60.27s, vg2:  52.06s, speed-up: 13.6%
181.mcf:    vg1:   4.24s, vg2:   4.11s, speed-up:  3.1%
254.gap:    vg1:  31.40s, vg2:  24.73s, speed-up: 21.2%
--------
179.art:    vg1: 364.32s, vg2: 373.53s, speed-up: -2.5%
183.equake: vg1:  68.86s, vg2:  63.54s, speed-up:  7.7%
188.ammp:   vg1: 463.18s, vg2: 455.46s, speed-up:  1.7%
177.mesa:   vg1: 113.69s, vg2: 100.01s, speed-up: 12.0%

======== addrcheck ========
164.gzip:   vg1:  35.97s, vg2:  27.37s, speed-up: 23.9%
300.twolf:  vg1:   5.18s, vg2:   4.48s, speed-up: 13.5%
197.parser: vg1:  57.92s, vg2:  40.95s, speed-up: 29.3%
186.crafty: vg1: 119.23s, vg2:  93.99s, speed-up: 21.2%
255.vortex: vg1: 349.38s, vg2: 223.08s, speed-up: 36.1%
256.bzip2:  vg1: 105.23s, vg2:  90.89s, speed-up: 13.6%
176.gcc:    vg1:  43.52s, vg2:  33.97s, speed-up: 21.9%
181.mcf:    vg1:   2.20s, vg2:   2.07s, speed-up:  5.9%
254.gap:    vg1:  22.90s, vg2:  17.82s, speed-up: 22.2%
--------
179.art:    vg1: 298.79s, vg2: 294.92s, speed-up:  1.3%
183.equake: vg1:  59.73s, vg2:  57.94s, speed-up:  3.0%
188.ammp:   vg1: 404.55s, vg2: 399.96s, speed-up:  1.1%
177.mesa:   vg1:  92.93s, vg2:  79.65s, speed-up: 14.3%

For those of you interested in the details: the speed-up was possible because the skins' handling of %esp updates was decidedly sub-optimal... every %esp adjustment requires updating the accessibility bits of the relevant stack words. This was handled with the core events {new,die}_stack_mem{_aligned,}, after the core computed the %esp-delta at run-time. And %esp is updated very frequently. But most of the time the %esp-delta can be computed at compile time, which saves an extra function call.

Now %esp updates are handled with {new,die}_stack_mem, and optionally the specialised versions {new,die}_stack_mem_{4,8,12,16,32}. The core uses these specialised versions if they are present, or falls back to the generic version if not, or if the delta cannot be determined at compile time. Addrcheck and Memcheck have unrolled-loop versions for the specialised cases. Thanks to Julian for pointing out the inefficiency in the old approach.

I haven't tested this super-thoroughly, so I would appreciate it if others could try it out. I would also be interested to hear what kind of speed-ups others get on different machines, environments, etc.

Also, since a few things got moved around in the source, this might cause some CVS clashes with those of you actively hacking Valgrind; apologies if so, but I hope you agree the performance improvements are worth it.

N
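[To make the generic-vs-specialised distinction concrete, an illustrative sketch in plain C with made-up helper and type names, not the actual Valgrind core interface: the generic handler loops over a run-time length, while a specialised handler for a compile-time-known delta collapses to straight-line code that an instrumented "sub $4, %esp" can call directly.]

------------------------------------------------
typedef unsigned int Addr;

/* assumed primitive: mark one aligned word of stack accessible */
extern void set_word_accessible(Addr a);

/* generic case: delta only known at run time, so we loop */
void new_stack_mem(Addr a, unsigned int len)
{
    unsigned int i;
    for (i = 0; i < len; i += 4)
        set_word_accessible(a + i);
}

/* specialised, compile-time-known delta: no length argument,
   no loop -- just unrolled straight-line updates */
void new_stack_mem_4(Addr a)
{
    set_word_accessible(a);
}

void new_stack_mem_16(Addr a)
{
    set_word_accessible(a);
    set_word_accessible(a + 4);
    set_word_accessible(a + 8);
    set_word_accessible(a + 12);
}
------------------------------------------------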
From: Josef W. <Jos...@gm...> - 2003-04-07 14:48:57

On Monday 07 April 2003 14:40, Nicholas Nethercote wrote:
> On Fri, 4 Apr 2003, Charlls Quarra wrote:
> > I wonder if there are issues about profiling
> > dynamically loaded plugins? valgrind --skin=cachegrind
> > doesnt seem to produce any relevant info about code in
> > those plugins, only code in statically linked
> > libraries
>
> Hmm, in theory there should be no problems, Valgrind (and hence
> Cachegrind) can handle dlopen'd plugins ok...

But they are unmapped before program termination. You only get one cost line "discarded" for all plugins. The solution would be to allow dumping while the plugin is mapped. Or you dump the information of BBCCs when the according object is unmapped.

Or you don't delete BBCCs and accumulate cost to "discarded", but keep them. Now you have the problem that a BB start address isn't unique to get to the right BBCC, but you have to store the SegInfo* in the BBCCs to distinguish among BBs from different plugins at the same address.

BTW, my Calltree skin could be buggy in this regard, as it doesn't delete any BBCCs on unmapping, but still only uses the BB address. But checking the SegInfo* should be easy enough for a fix. Perhaps it's even better to take BBCCs of unmapped objects from the hash into a list, and put them in again if the object is mapped again, after adjusting the addresses if needed. The same object could be mapped on a different base address.

Josef
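[Josef's point about BB addresses no longer being unique amounts to widening the hash key. A hypothetical sketch, with names invented for illustration rather than taken from Calltree's actual data structures:]

------------------------------------------------
/* key a basic-block cost centre by (object, address) rather than by
   address alone, so two plugins mapped at the same address in
   succession get distinct cost centres */
typedef struct _SegInfo SegInfo;   /* opaque: one mmap'd object */

typedef struct {
    unsigned int  bb_addr;   /* BB start address */
    SegInfo      *object;    /* which mapping the BB belongs to */
} BBKey;

static unsigned int bbkey_hash(const BBKey *k, unsigned int nbuckets)
{
    /* mix both fields so a re-mapped object doesn't collide with the
       plugin previously mapped at the same address */
    unsigned int h = k->bb_addr ^ (unsigned int)(unsigned long)k->object;
    return h % nbuckets;
}
------------------------------------------------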
From: Nicholas N. <nj...@ca...> - 2003-04-07 12:40:40

On Fri, 4 Apr 2003, Charlls Quarra wrote:

> I wonder if there are issues about profiling
> dynamically loaded plugins? valgrind --skin=cachegrind
> doesnt seem to produce any relevant info about code in
> those plugins, only code in statically linked
> libraries

Hmm, in theory there should be no problems, Valgrind (and hence Cachegrind) can handle dlopen'd plugins ok...

N
From: Robert W. <rj...@du...> - 2003-04-07 06:12:52

> IIRC, the wine folks have a reasonable configure time check for NPTL in
> CVS, so maybe you can simply swipe their code ?

I can't find that. I can see a --with-nptl switch, but no automated check. If doing it the automated way is too difficult, maybe having a --with-nptl switch is the way to go?

Regards,
Robert.

--
Robert Walsh
Amalgamated Durables, Inc. - "We don't make the things you buy."
Email: rj...@du...
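[If swiping Wine's check proves awkward, glibc itself can report which thread library it carries. A hedged sketch of a configure-time probe: it relies on _CS_GNU_LIBPTHREAD_VERSION, which only exists in glibc 2.3.2 and later, so a configure script would still need a fallback for older libcs.]

------------------------------------------------
#include <unistd.h>
#include <stdio.h>

int main(void)
{
#ifdef _CS_GNU_LIBPTHREAD_VERSION
    char buf[128];
    if (confstr(_CS_GNU_LIBPTHREAD_VERSION, buf, sizeof buf) > 0) {
        /* prints e.g. "NPTL 0.29" or "linuxthreads-0.10" */
        printf("%s\n", buf);
        return 0;
    }
#endif
    printf("unknown\n");   /* old glibc: assume LinuxThreads */
    return 1;
}
------------------------------------------------

[On such systems the same information is available from the shell via "getconf GNU_LIBPTHREAD_VERSION".]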
From: Robert W. <rj...@du...> - 2003-04-07 06:07:39

> Now what? If the symbol is in the DSO, why can't I link to it?
> Mystified.

You seem to be saying "Mystified" a lot about the NPTL :-)

This does seem strange, though. Are you sure you're picking up the correct library? Maybe you need to say "extern int pthread_join_np();" instead of just "extern int pthread_join_np;" (i.e. declare it as an external function and not a variable.) I haven't checked, so this may be a red herring.

BTW: my file-descriptor leakage support is almost ready. I'm just testing the "passing fds over a unix-domain socket" support at the moment. Sockets are weird.

Regards,
Robert.

--
Robert Walsh
Amalgamated Durables, Inc. - "We don't make the things you buy."
Email: rj...@du...
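[The suggested distinction matters at link time. A tiny illustration follows; note that pthread_join_np is a non-standard draft-4-era name and may simply not exist in a given libpthread, which would itself explain the link failure:]

------------------------------------------------
extern int pthread_join_np();   /* function declaration: resolves against
                                   the code symbol a DSO exports for a
                                   function */

/* By contrast,
       extern int pthread_join_np;
   declares an int VARIABLE of that name -- a data-object reference that
   the function symbol in libpthread does not satisfy. */
------------------------------------------------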