From: Bernd E. <eid...@we...> - 2008-11-19 14:33:38
|
Hey ya,

just a quick question: Is "make memcheck" currently supposed to report "0 errors" in the error summary and "0 bytes definitely lost" in the leak summary?

I compiled the trunk against Tcl 8.5.5, the first time with "vtmalloc", the second time with the original, untouched Tcl.

But the results on my local box are... a little bit scary right now.

Bernd.
|
From: Vasiljevic Z. <zv...@ar...> - 2008-11-19 14:41:02
|
On 19.11.2008, at 15:13, Bernd Eidenschink wrote:

> I compiled the trunk against TCL 8.5.5, the first time with
> "vtmalloc", the
> second time with original untouched Tcl.

Cannot comment on 8.5.5. Still using the 8.4 branch (conservative user-base, heh). But...

> But the results on my local box are... a little bit scary right now.

... scary is relative. So, how scary is your "scary", really?

(BTW, if I were you, I would turn off ANY "clever" memory allocator and use malloc/free everywhere when debugging memory.)

In our experience... if we leak (which is not unlikely), then it is far from scary, as otherwise I'd already have pulled out all my hair. However, when looking in the mirror, I can say we are not leaking.

Cheers
Zoran
|
From: Bernd E. <eid...@we...> - 2008-11-19 16:50:27
|
> > But the results on my local box are... a little bit scary right now.
>
> ... scary is relative. So, how scary is your "scary", really?

Scary for me, the ignorant :-)

==3781== ERROR SUMMARY: 18 errors from 9 contexts (suppressed: 106 from 1)
==3781== malloc/free: in use at exit: 148,617,748 bytes in 4,027 blocks.
==3781== malloc/free: 6,481 allocs, 2,454 frees, 183,008,167 bytes allocated.
==3781== For counts of detected errors, rerun with: -v
==3781== searching for pointers to 4,027 not-freed blocks.
==3781== checked 258,234,212 bytes.

==3781== LEAK SUMMARY:
==3781== definitely lost: 72 bytes in 5 blocks.
==3781== indirectly lost: 120 bytes in 10 blocks.
==3781== possibly lost: 139,672,920 bytes in 2,504 blocks.
==3781== still reachable: 8,944,636 bytes in 1,508 blocks.
==3781== suppressed: 0 bytes in 0 blocks.

(this is the non-vtmalloc, default result)

> (BTW, if I were you, I would turn ANY "clever" memory allocator and use
> malloc/free everywhere when debugging memory.)

My idea was: give vtmalloc a try (as freeing memory is always nice, and we have lots of memory-intensive XML import stuff to do). But as I had never used it before, I ran into this appealing "memcheck" option...

> As from our experience... if we leak (which is not unlikely) then
> it is far from scary, as otherwise I'd already pull-out all my hair.
> However, when looking in the mirror, I can say we are not leaking.

There is no correlation, believe me: we are not leaking either, but our hair... :-)

cu
BE
|
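The figures in that summary are internally consistent, which helps when reading it: the five leak categories add up to the "in use at exit" totals, and the number of not-freed blocks is simply allocs minus frees. A minimal C sketch of that arithmetic follows; the struct and function names are made up for illustration and are not part of valgrind or naviserver.

```c
#include <assert.h>

/* One column (bytes or blocks) of a memcheck LEAK SUMMARY.
 * Hypothetical type, for illustration only. */
struct leak_summary {
    long definitely_lost;
    long indirectly_lost;
    long possibly_lost;
    long still_reachable;
    long suppressed;
};

/* Sum of all leak categories; this should match "in use at exit". */
static long leak_total(const struct leak_summary *s)
{
    return s->definitely_lost + s->indirectly_lost + s->possibly_lost
         + s->still_reachable + s->suppressed;
}

/* Blocks still allocated at exit: every alloc that was never freed. */
static long blocks_in_use(long allocs, long frees)
{
    assert(allocs >= frees);
    return allocs - frees;
}
```

With the numbers quoted above, the byte column sums to 148,617,748 and the block column to 4,027, and 6,481 allocs minus 2,454 frees is again 4,027. Almost all of it sits in the "possibly lost" and "still reachable" rows.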
From: Vasiljevic Z. <zv...@ar...> - 2008-11-19 17:02:00
|
On 19.11.2008, at 16:45, Bernd Eidenschink wrote:

> ==3781== LEAK SUMMARY:
> ==3781== definitely lost: 72 bytes in 5 blocks.
> ==3781== indirectly lost: 120 bytes in 10 blocks.
> ==3781== possibly lost: 139,672,920 bytes in 2,504 blocks.
> ==3781== still reachable: 8,944,636 bytes in 1,508 blocks.
> ==3781== suppressed: 0 bytes in 0 blocks.

The Valgrind manual says:

"possibly lost" means your program is probably leaking memory,
-> unless you're doing funny things with pointers <-

(Other memory tools like Purify do the same.)

So we have few real leaks. Most of it is due to some external extension that allocates something, then points into the allocated block and then releases the memory. This could mean a leak/growth, but it need not.

To find out whether it does, fire up "top" and observe how the process's virtual size behaves when you repeatedly hit your pages (or use ab to simulate load). If it grows visibly and steadily as the requests come in, then you have a leak.

I would say that naviserver is reasonably leak-free under most circumstances, and Tcl 8.4 as well. I can also speak for some extensions like Tdom and Xotcl. We use them *extensively* and have absolutely no memory issues (ok, perhaps we do, but not like the above).
|
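The "funny things with pointers" case can be sketched in C: an allocator wrapper that stores a header in front of the payload and hands out only the address past that header. The program then holds no pointer to the start of the block, only an interior one, and a leak checker that finds just the interior pointer at exit will typically file such a block under "possibly lost" even though it can still be freed. All names below are hypothetical; this is not code from any extension mentioned in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical bookkeeping header kept in front of the caller's bytes. */
struct header {
    size_t length;
    size_t refcount;
};

/* Allocate n payload bytes and return a pointer just past the header. */
static char *payload_alloc(size_t n)
{
    struct header *h = malloc(sizeof *h + n);
    if (h == NULL) {
        return NULL;
    }
    h->length = n;
    h->refcount = 1;
    memset(h + 1, 0, n);
    return (char *)(h + 1);   /* interior pointer: block start is not kept */
}

/* Step back from the payload to the header that precedes it. */
static struct header *payload_header(char *p)
{
    return (struct header *)p - 1;
}

static size_t payload_length(char *p)
{
    return payload_header(p)->length;
}

/* Free the whole block by recovering its real start address. */
static void payload_free(char *p)
{
    if (p != NULL) {
        free(payload_header(p));
    }
}
```

Because such blocks are still owned and freeable, a large "possibly lost" figure on its own does not prove growth, which is why watching the process size under load is the more telling test.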
From: Stephen D. <sd...@gm...> - 2008-11-19 17:28:26
|
On Wed, Nov 19, 2008 at 3:45 PM, Bernd Eidenschink <eid...@we...> wrote:

>> > But the results on my local box are... a little bit scary right now.
>>
>> ... scary is relative. So, how scary is your "scary", really?
>
> Scary for me, the ignorant :-)
>
>
> ==3781== ERROR SUMMARY: 18 errors from 9 contexts (suppressed: 106 from 1)

This is the important bit. There should be no errors. Scroll back up the log output and you should see details of the errors. Unlike the summary, valgrind outputs these as they happen.

I haven't tested against 8.5 in a long time, so I just tried it. I always compile Tcl for testing with --enable-symbols=mem, which you would think would interact badly with valgrind, but I've found it's like a pig sniffing out truffles when it comes to buffer overruns and depending on uninitialized memory. These are the kinds of errors valgrind is talking about, in contrast to plain memory leaks which you might expect to find.

Anyway, it won't even run with 8.5.5 here:

[19/Nov/2008:16:48:33][26547.b7ff46c0][-main-] Fatal: expected to create new entry for object map

Program received signal SIGABRT, Aborted.
0x00110416 in __kernel_vsyscall ()
Missing separate debuginfos, use: debuginfo-install gcc.i386 glibc.i686
(gdb) bt
#0 0x00110416 in __kernel_vsyscall ()
#1 0x00b42660 in raise () from /lib/libc.so.6
#2 0x00b44028 in abort () from /lib/libc.so.6
#3 0x00144f18 in Panic (fmt=0x2b0760 "expected to create new entry for object map") at log.c:617
#4 0x0025fec7 in Tcl_PanicVA (format=0x2b0760 "expected to create new entry for object map", argList=0xbfc06694 "") at /home/sd/src/tcl8.5.5/unix/../generic/tclPanic.c:93
#5 0x0025ffca in Tcl_Panic (format=0x2b0760 "expected to create new entry for object map") at /home/sd/src/tcl8.5.5/unix/../generic/tclPanic.c:132
#6 0x0025c7b6 in TclDbInitNewObj (objPtr=0x96b7220) at /home/sd/src/tcl8.5.5/unix/../generic/tclObj.c:637
#7 0x0024b919 in Tcl_DbNewListObj (objc=1, objv=0xbfc06700, file=0x2ac234 "/home/sd/src/tcl8.5.5/unix/../generic/tclFileName.c", line=787) at /home/sd/src/tcl8.5.5/unix/../generic/tclListObj.c:239
#8 0x00227d72 in Tcl_FSJoinToPath (pathPtr=0x96e4b50, objc=0, objv=0x0) at /home/sd/src/tcl8.5.5/unix/../generic/tclFileName.c:787
#9 0x0026639d in SetFsPathFromAny (interp=0x0, pathPtr=0x96e4b50) at /home/sd/src/tcl8.5.5/unix/../generic/tclPathObj.c:2441
#10 0x00265e4b in TclFSSetPathDetails (pathPtr=0x96e4b50, fsRecPtr=0x96b4640, clientData=0x0) at /home/sd/src/tcl8.5.5/unix/../generic/tclPathObj.c:2203
#11 0x0024a1c3 in Tcl_FSGetFileSystemForPath (pathPtr=0x96e4b50) at /home/sd/src/tcl8.5.5/unix/../generic/tclIOUtil.c:4437
#12 0x00248a63 in Tcl_FSGetCwd (interp=0x0) at /home/sd/src/tcl8.5.5/unix/../generic/tclIOUtil.c:2736
#13 0x00148603 in SetCwd (path=0x96f53c8 "/home/sd/ns-scratch-hg/tests") at nsmain.c:1063
#14 0x00147c24 in Ns_Main (argc=8, argv=0xbfc06b74, initProc=0x8048636 <ServerInit>) at nsmain.c:504
#15 0x0804862c in main (argc=Cannot access memory at address 0x67b3
) at main.c:64

Seems innocuous enough:

nsd/nsmain.c:

    char *
    SetCwd(char *path)
    {
        Tcl_Obj *pathObj;

        pathObj = Tcl_NewStringObj(path, -1);
        Tcl_IncrRefCount(pathObj);
        if (Tcl_FSChdir(pathObj) == -1) {
            Ns_Fatal("nsmain: chdir(%s) failed: '%s'", path, strerror(Tcl_GetErrno()));
        }
        Tcl_DecrRefCount(pathObj);
        pathObj = Tcl_FSGetCwd(NULL);
        if (pathObj == NULL) {
            Ns_Fatal("nsmain: can't resolve home directory path");
        }

        return (char *)Tcl_FSGetTranslatedStringPath(NULL, pathObj);
    }

Is this a bug though? Can Tcl really expect to have never seen this pointer before?

tcl8.5.5/generic/tclObj.c:

    void
    TclDbInitNewObj(
        register Tcl_Obj *objPtr)
    {
        objPtr->refCount = 0;
        objPtr->bytes = tclEmptyStringRep;
        objPtr->length = 0;
        objPtr->typePtr = NULL;

    #ifdef TCL_THREADS
        /*
         * Add entry to a thread local map used to check if a Tcl_Obj was
         * allocated by the currently executing thread.
         */

        if (!TclInExit()) {
            Tcl_HashEntry *hPtr;
            Tcl_HashTable *tablePtr;
            int isNew;
            ThreadSpecificData *tsdPtr = TCL_TSD_INIT(&dataKey);

            if (tsdPtr->objThreadMap == NULL) {
                tsdPtr->objThreadMap = (Tcl_HashTable *)
                        ckalloc(sizeof(Tcl_HashTable));
                Tcl_InitHashTable(tsdPtr->objThreadMap, TCL_ONE_WORD_KEYS);
            }
            tablePtr = tsdPtr->objThreadMap;
            hPtr = Tcl_CreateHashEntry(tablePtr, (char *) objPtr, &isNew);
            if (!isNew) {
                Tcl_Panic("expected to create new entry for object map");
            }
            Tcl_SetHashValue(hPtr, NULL);
        }
    #endif /* TCL_THREADS */
    }
|
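The check that fires in TclDbInitNewObj boils down to "insert this address into a per-thread set and panic if it was already there". The sketch below reproduces only that mechanic with a tiny linear array standing in for Tcl's hash table; every name in it is hypothetical, and it makes no claim about why the entry was already present in the run above. It does show one general property: since the allocator may hand out the same address again, such a map only stays valid if every object is removed from it before its memory is reused.

```c
#include <assert.h>
#include <stddef.h>

#define MAP_SLOTS 64   /* illustrative fixed capacity */

/* Stand-in for a hash table keyed on the object's address. */
struct ptr_map {
    const void *keys[MAP_SLOTS];
    size_t count;
};

/* Return the slot holding key, or -1 if it is not in the map. */
static int ptr_map_find(const struct ptr_map *m, const void *key)
{
    size_t i;
    for (i = 0; i < m->count; i++) {
        if (m->keys[i] == key) {
            return (int)i;
        }
    }
    return -1;
}

/* Return 1 if key was newly added, 0 if it was already present
 * (the situation that the quoted Tcl code treats as fatal). */
static int ptr_map_insert(struct ptr_map *m, const void *key)
{
    if (ptr_map_find(m, key) >= 0) {
        return 0;
    }
    assert(m->count < MAP_SLOTS);
    m->keys[m->count++] = key;
    return 1;
}

/* Forget key, e.g. when the object it identifies is freed. */
static void ptr_map_remove(struct ptr_map *m, const void *key)
{
    int i = ptr_map_find(m, key);
    if (i >= 0) {
        m->keys[i] = m->keys[--m->count];
    }
}
```

Inserting the same address twice without a remove in between yields 0 on the second call; with a remove in between, the second insert is "new" again.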
From: Bernd E. <eid...@we...> - 2008-11-20 08:45:36
|
> > ==3781== ERROR SUMMARY: 18 errors from 9 contexts (suppressed: 106 from 1)
>
> This is the important bit. There should be no errors. Scroll back up
> the log output and you should see details of the errors. Unlike the
> summary, valgrind outputs these as they happen.
>
> I haven't tested against 8.5 in a long time, so I just tried it.

Compiling 8.5 was just out of curiosity. I memchecked both 8.5 and 8.4 (latest sources) on naviserver trunk code and had errors with both.

It doesn't matter to me, as I just need a running server (and the initial impulse for make memcheck was just my test with vtmalloc). I was just curious whether the result is somewhat normal, as memcheck runs all the tests and one or more of them might intentionally peek and poke. (I didn't do one single click, just watched what make memcheck runs until it presents the results.)

So the result: using trunk naviserver source code and the latest Tcl 8.4 source code, compiling and memchecking spits out errors that can be ignored, unless they bark in uppercase letters or freeze the OS :-)

cu
BE
|
From: Stephen D. <sd...@gm...> - 2008-11-20 18:20:18
|
On Thu, Nov 20, 2008 at 8:46 AM, Bernd Eidenschink <eid...@we...> wrote:

>> > ==3781== ERROR SUMMARY: 18 errors from 9 contexts (suppressed: 106 from 1)
>>
>> This is the important bit. There should be no errors. Scroll back up
>> the log output and you should see details of the errors. Unlike the
>> summary, valgrind outputs these as they happen.
>>
>> I haven't tested against 8.5 in a long time, so I just tried it.
>
> Compiling 8.5 was just out of curiosity. I memchecked both 8.5 and 8.4 (latest
> sources) on naviserver trunk code and had errors with both.
>
> Doesn't matter to me as I just need a running server (and the initial impulse
> for make memcheck was just my test with vtmalloc.).
> I was just curious if the result is somewhat normal, as memcheck runs all the
> tests and one or more of them might intentionally peek and poke.
> (I didn't do one single click, just watched what make memcheck runs until it
> presents the results).
>
> So the result: Using trunk naviserver source code, using latest TCL 8.4 source
> code, compiling and memchecking spits out errors that can be ignored, unless
> barking in uppercase letters or freezing the os :-)
>

No no. Errors are bad, there should be none.

Stop teasing and show us them... :-)

( Make sure you compile Tcl and naviserver with --enable-symbols )
|
From: Bernd E. <eid...@we...> - 2008-11-21 09:25:41
|
> No no. Errors are bad, there should be none.
>
> Stop teasing and show us them... :-)

Here you go:
http://www.kinetiqa.de/naviserver/memcheck.log.txt

TCL: 8.4.19 sources
Naviserver: trunk (CVS)

/tmp/tease/naviserver$ ./autogen.sh --enable-symbols --enable-threads --prefix=/tmp/ns --with-tcl=/tmp/ns/lib

(guess that's ignorable:)

Running aclocal -I m4
/usr/share/aclocal/libmcrypt.m4:17: warning: underquoted definition of AM_PATH_LIBMCRYPT
/usr/share/aclocal/libmcrypt.m4:17: run info '(automake)Extending aclocal'
/usr/share/aclocal/libmcrypt.m4:17: or see http://sources.redhat.com/automake/automake.html#Extending-aclocal
Running autoheader

BTW: "make install" fails because of "install-docs" in the Makefile; maybe it would make sense to change the Makefile to be more aware of situations where "dtplite" is missing.

Nice: the ns_thread.test(s) double the memory usage and saturate my box with 100% CPU load... but, once done, everything falls back to where it started.

cu
BE
|
From: Stephen D. <sd...@gm...> - 2008-11-21 19:34:53
|
On 11/21/08, Bernd Eidenschink <eid...@we...> wrote:

> > No no. Errors are bad, there should be none.
> >
> > Stop teasing and show us them... :-)
>
>
> Here you go:
> http://www.kinetiqa.de/naviserver/memcheck.log.txt
>
> TCL: 8.4.19 sources

This is weird, 8.4 also has errors? (you were using 8.5.5 before?)

8.4.19 works for me.

Are you compiling 32bit on a 64bit linux box?

Anyway, these kinds of things are usually errors (the message is from valgrind):

==18662== Invalid read of size 4
==18662== at 0x40151E3: (within /lib/ld-2.7.so)
==18662== by 0x4005C59: (within /lib/ld-2.7.so)
==18662== by 0x4007A87: (within /lib/ld-2.7.so)
==18662== by 0x4011533: (within /lib/ld-2.7.so)
==18662== by 0x400D5C5: (within /lib/ld-2.7.so)
==18662== by 0x4010F4D: (within /lib/ld-2.7.so)
==18662== by 0x41E9C18: (within /lib/tls/i686/cmov/libdl-2.7.so)
==18662== by 0x400D5C5: (within /lib/ld-2.7.so)
==18662== by 0x41EA2BB: (within /lib/tls/i686/cmov/libdl-2.7.so)
==18662== by 0x41E9B50: dlopen (in /lib/tls/i686/cmov/libdl-2.7.so)
==18662== by 0x41CF245: TclpDlopen (tclLoadDl.c:78)
==18662== by 0x419195C: Tcl_FSLoadFile (tclIOUtil.c:2791)
==18662== Address 0x4fd2fc0 is 32 bytes inside a block of size 35 alloc'd
==18662== at 0x4022AB8: malloc (vg_replace_malloc.c:207)
==18662== by 0x4006FC4: (within /lib/ld-2.7.so)
==18662== by 0x40079C9: (within /lib/ld-2.7.so)
==18662== by 0x4011533: (within /lib/ld-2.7.so)
==18662== by 0x400D5C5: (within /lib/ld-2.7.so)
==18662== by 0x4010F4D: (within /lib/ld-2.7.so)
==18662== by 0x41E9C18: (within /lib/tls/i686/cmov/libdl-2.7.so)
==18662== by 0x400D5C5: (within /lib/ld-2.7.so)
==18662== by 0x41EA2BB: (within /lib/tls/i686/cmov/libdl-2.7.so)
==18662== by 0x41E9B50: dlopen (in /lib/tls/i686/cmov/libdl-2.7.so)
==18662== by 0x41CF245: TclpDlopen (tclLoadDl.c:78)
==18662== by 0x419195C: Tcl_FSLoadFile (tclIOUtil.c:2791)

> BTW: "make install" fails because of "install-docs" in Makefile,
> maybe it would make sense to change the Makefile to be more
> aware of missing "dtplite"-missing situations.

The idea is that if you are building from a released tarball then the built documentation is included and you don't need dtplite. If you're building direct from the repo, you need dtplite (and autoconf, etc.).

> Nice: The ns_thread.test(s) double the memory usage, saturate my box with
> 100% CPU load... but, once done, all falls back to where it started.

Well at least something's working!
|
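The numbers in that valgrind report can be read off directly: a 4-byte read starting 32 bytes into a 35-byte block touches offsets 32 through 35, while the block only covers offsets 0 through 34, so exactly one byte lies past the end. A small C helper makes the arithmetic explicit; the function name is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Number of bytes of a read that fall beyond the end of a block.
 * offset is the read's starting position inside the block. */
static size_t bytes_past_end(size_t block_size, size_t offset, size_t read_size)
{
    size_t end = offset + read_size;   /* one past the last byte read */
    return end > block_size ? end - block_size : 0;
}
```

Here bytes_past_end(35, 32, 4) gives 1. Every frame below dlopen in both stacks is inside /lib/ld-2.7.so, so the read happens in the dynamic loader rather than in Tcl or naviserver code; word-sized reads over short strings are a common source of this kind of report, though that is an assumption about this particular log rather than something the thread confirms.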
From: Bernd E. <eid...@we...> - 2008-11-24 07:24:50
|
> This is weird, 8.4 also has errors? (you were using 8.5.5 before?)
>
> 8.4.19 works for me.

I tried the latest 8.5.5 out of habit, but I need the latest 8.4.x to know that our app runs. So, both.

> Are you compiling 32bit on a 64bit linux box?

32bit/os

Distributor ID: Ubuntu
Description: Ubuntu 8.04.1
Release: 8.04
Codename: hardy

Linux box 2.6.24-16-server #1 SMP Thu Apr 10 13:58:00 UTC 2008 i686 GNU/Linux

processor : 0
vendor_id : AuthenticAMD
cpu family : 15
model : 39
model name : AMD Athlon(tm) 64 Processor 3700+
stepping : 1
cpu MHz : 1000.000
cache size : 1024 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext fxsr_opt lm 3dnowext 3dnow up pni lahf_lm ts fid vid ttp tm stc
bogomips : 2001.67
clflush size : 64

> > BTW: "make install" fails because of "install-docs" in Makefile,
> > maybe it would make sense to change the Makefile to be more
> > aware of missing "dtplite"-missing situations.
>
> The idea is that if you are building from a released tarball then the
> built documentation is included and you don't need dtplite. If you're
> building direct from the repo, you need dtplite (and autoconf, etc.).

That's ok. But as we do not offer released tarballs very often, maybe it's super easy for a Makefile wizard to test for DTPLITE's existence and work around its absence? For now, it just breaks the installation step...

Bernd.
|