|
From: <dan...@in...> - 2009-04-29 18:23:27
|
Hello all,

This is my first post. I am using Valgrind 3.4.1. The application I was
testing for what seems to be memory corruption is compiled along with
dlmalloc-2.8.3.c

Has anyone had experience with this? Will I be seeing "false positives"? If
I recall, I had used this in a past (3.1?) Valgrind and had to remove
dlmalloc due to too many reports from helgrind - my memory is poor but I
seem to recall something of the sort. Right now, it is only reporting a very
small number of entries and they do seem to be around 'fishy' code for the
most part.

However, one such report has me baffled; it appears in almost all
reports:
==28103== Thread #3 was created
==28103==    at 0x3329FC3AE1: clone (in /lib64/tls/libc-2.3.3.so)
==28103==    by 0x332BB0645D: pthread_create@@GLIBC_2.2.5 (in /lib64/tls/libpthread-2.3.3.so)
==28103==    by 0x4907974: pthread_create@* (hg_intercepts.c:214)
==28103==    by 0x608694: tp::Thread::start() (Thread.C:56)
==28103==    by 0x5C831E: RnrFw::RnrFactory::startRnr() (RnrFactory.cpp:242)
==28103==    by 0x4E53F2: RnrProxy::startup() (RnrProxy.C:121)
==28103==    by 0x4941FD: main (....C:894)
==28103==
==28103== Thread #1 is the program's root thread
==28103==
==28103== Possible data race during read of size 1 at 0x82bee0 by thread #3
==28103==    at 0x4A9111: free (dlmalloc-2.8.3.c:2416)
==28103==    by 0x332BDAFF28: __cxa_free_exception (in /usr/lib64/libstdc++.so.6.0.3)
==28103==    by 0x5D3051: RnrFw::SendHandler::send(char const*, unsigned long) (SendHandler.cpp:120)
..stack trace snipped
==28103==    by 0x49078C3: mythread_wrapper (hg_intercepts.c:194)
==28103==    by 0x332BB05F80: start_thread (in /lib64/tls/libpthread-2.3.3.so)
==28103==    by 0x3329FC3AF2: clone (in /lib64/tls/libc-2.3.3.so)
==28103== This conflicts with a previous write of size 1 by thread #1
==28103==    at 0x4A9B22: malloc (dlmalloc-2.8.3.c:3407)
==28103==    by 0x4D303D: SymbolDataStore::tokenAlloc(symbolRec*, MetaToken const&, int) (symboldb.C:589)
==28103==    by 0x4D33B3: SymbolDataStore::updateIToken(symbolRec*, MetaToken const&, long) (symboldb.C:635)
... more stack trace

I suspected either something I am not seeing in the way the exception is
created or, more likely (I think? ;) ), something occurring as a result of the
freeing of the exception memory after the stack was unwound - perhaps due to
either a dlmalloc bug or a false positive.

Note, dlmalloc is compiled into the app with -USE_LOCKS set.

Thanks!
Dan

Now, while writing this, I ran under DRD (I didn't know about this tool
until now) - I have not yet examined this as I have a meeting in 7 minutes.
==11369== Thread 3:
==11369== Conflicting load by thread 3/3 at 0x0082bf00 size 1
==11369==    at 0x4A9111: free (dlmalloc-2.8.3.c:2416)
==11369==    by 0x332BDAFF28: __cxa_free_exception (in /usr/lib64/libstdc++.so.6.0.3)
==11369==    by 0x5D3081: RnrFw::SendHandler::send(char const*, unsigned long) (SendHandler.cpp:121)
..snipped stack trace...
==11369==    by 0x49085BE: vg_thread_wrapper (drd_pthread_intercepts.c:193)
==11369==    by 0x332BB05F80: start_thread (in /lib64/tls/libpthread-2.3.3.so)
==11369==    by 0x3329FC3AF2: clone (in /lib64/tls/libc-2.3.3.so)
==11369== Allocation context: BSS section of /home/ddelgado/...
==11369== Other segment start (thread 1/1)
==11369==    at 0x4909096: pthread_mutex_lock (drd_pthread_intercepts.c:417)
==11369==    by 0x4A9AE0: malloc (dlmalloc-2.8.3.c:3358)
==11369==    by 0x48EE4B: zcalloc (in /home/ddelgado/...)
...snipped stack trace....
==11369==    by 0x4944A2: main (...:954)
==11369== Other segment end (thread 1/1)
==11369==    at 0x49093F4: pthread_mutex_unlock (drd_pthread_intercepts.c:463)
==11369==    by 0x4A9B31: malloc (dlmalloc-2.8.3.c:3410)
==11369==    by 0x48EE4B: zcalloc (in /home/ddelgado/....)
...snipped stack trace....
==11369==    by 0x4944A2: main (...:954)

Dan Delgado | Senior Software Engineer
Interactive Data Real-Time Services | 100 Hillside Ave. | White Plains, NY 10603 | USA
914-313-4296 | dan...@in...
|
|
From: Konstantin S. <kon...@gm...> - 2009-04-29 18:33:12
|
Neither of these tools understands custom malloc functions.
But it's easy to teach them.

Way1: hack the dlmalloc code to include helgrind's client requests
(see helgrind.h in valgrind distro)
Way2: hack helgrind to intercept the custom malloc functions.
ThreadSanitizer
(code.google.com/p/data-race-test/wiki/ThreadSanitizer) uses this way,
but it is not a part of the valgrind distro.

Let us know if you need more details (sorry, have to run too :)

--kcc
|
|
From: <dan...@in...> - 2009-04-29 20:52:46
|
Konstantin,
The first method you suggest sounds the best for maintaining the changes
necessary in the one product that necessitates them:
"Way1: hack the dlmalloc code to include helgrind's client requests
(see helgrind.h in valgrind distro)"
I apologize, however, and must admit my ignorance as to what this would
entail. I looked at the helgrind.h header and saw:
#include "valgrind.h"
typedef
enum {
VG_USERREQ__HG_CLEAN_MEMORY = VG_USERREQ_TOOL_BASE('H','G'),
/* The rest are for Helgrind's internal use. Not for end-user
use. Do not use them unless you are a Valgrind developer. */
/* Notify the tool what this thread's pthread_t is. */
_VG_USERREQ__HG_SET_MY_PTHREAD_T = VG_USERREQ_TOOL_BASE('H','G')
+ 256,
....
/* Clean memory state. This makes Helgrind forget everything it knew
about the specified memory range, and resets it to New. This is
particularly useful for memory allocators that wish to recycle
memory. */
#define VALGRIND_HG_CLEAN_MEMORY(_qzz_start, _qzz_len) \
If you mean I should add this macro to any memory location returned from
the various methods in dlmalloc, would this be the equivalent of simply
removing dlmalloc.c from the makefile and using the default allocator?
If this is the case, I have yet to see any helgrind reports when this is
done (I tried this early on). However, the line I first listed actually
DID have a race condition (since fixed) in the SendHandler code that was
not noticed before. That race cannot have been the cause of these reports,
though: instrumentation shows the call was made once at startup and never
again in testing (it was reconnect code gone awry). Yet we got those errors.
I started wondering whether it was related to something presumably
allocated in /usr/lib64/libstdc++.so.6.0.3 and freed in
dlmalloc-2.8.3.c:2416, because of:
==28103== Possible data race during read of size 1 at 0x82bee0 by thread #3
==28103==    at 0x4A9111: free (dlmalloc-2.8.3.c:2416)
==28103==    by 0x332BDAFF28: __cxa_free_exception (in /usr/lib64/libstdc++.so.6.0.3)
==28103==    by 0x5D3051: RnrFw::SendHandler::send(char const*, unsigned long) (SendHandler.cpp:120)
...
==28103== This conflicts with a previous write of size 1 by thread #1
==28103==    at 0x4A9B22: malloc (dlmalloc-2.8.3.c:3407)
==28103==    by 0x4D303D: SymbolDataStore::tokenAlloc(symbolRec*, MetaToken const&, int) (symboldb.C:589)
...
I am, as I said, ignorant in this area - but I'd assume that libstdc++ is
using dlmalloc if it is compiled in, since the library exporting
SendHandler is a librnrfw.a file (if that even makes a difference).
But I was reaching for any clues...
In the past (3.1.x?) when I used Helgrind, it found me very clear "x is
writing to y without using previously used z mutex" style errors which
helped find some sneaky errors in code I was debugging. This time, I don't
even know what is being referred to in order to infer some ideas as to the
cause.
Specifically: what is "size 1"? What is "0x82bee0" (which is the address
reported every time)? Is it the address of some dlmalloc function or
variable?
Thanks again - I'm ... unfortunately... the "valgrind expert" here, since
I wrote a suppressions system for our apps and have read the manual...
obviously that an "expert" does not make by any stretch :) so here I am
confused.
In case this helps in your answer:
... snipped from dlmalloc-2.8.3.c source file...
...
/* ------------------- Declarations of public routines ------------------- */
#ifndef USE_DL_PREFIX
#define dlcalloc calloc
#define dlfree free
#define dlmalloc malloc
#define dlmemalign memalign
#define dlrealloc realloc
#define dlvalloc valloc
#define dlpvalloc pvalloc
#define dlmallinfo mallinfo
#define dlmallopt mallopt
#define dlmalloc_trim malloc_trim
#define dlmalloc_stats malloc_stats
#define dlmalloc_usable_size malloc_usable_size
#define dlmalloc_footprint malloc_footprint
#define dlmalloc_max_footprint malloc_max_footprint
#define dlindependent_calloc independent_calloc
#define dlindependent_comalloc independent_comalloc
#endif /* USE_DL_PREFIX */
/*
malloc(size_t n)
Returns a pointer to a newly allocated chunk of at least n bytes, or
null if no space is available, in which case errno is set to ENOMEM
on ANSI C systems.
If n is zero, malloc returns a minimum-sized chunk. (The minimum
size is 16 bytes on most 32bit systems, and 32 bytes on 64bit
systems.) Note that size_t is an unsigned type, so calls with
arguments that would be negative if signed are interpreted as
requests for huge amounts of space, which will often fail. The
maximum supported value of n differs across systems, but is in all
cases less than the maximum representable value of a size_t.
*/
void* dlmalloc(size_t);
/*
free(void* p)
Releases the chunk of memory pointed to by p, that had been previously
allocated using malloc or a related routine such as realloc.
It has no effect if p is null. If p was not malloced or already
freed, free(p) will by default cause the current program to abort.
*/
void dlfree(void*);
... more stuff ...
|
|
From: Konstantin S. <kon...@gm...> - 2009-04-30 06:52:18
|
On Thu, Apr 30, 2009 at 12:52 AM, <dan...@in...> wrote:
>
> Konstantin,
>
> The first method you suggest sounds the best for maintaining the changes
> necessary in the one product that necessitates them:
> "Way1: hack the dlmalloc code to include helgrind's client requests
> (see helgrind.h in valgrind distro)"
>
> I apologize, however, and must admit my ignorance as to what this would
> entail. I looked at the helgrind.h header and saw:
>
> #include "valgrind.h"
>
> typedef
> enum {
> VG_USERREQ__HG_CLEAN_MEMORY = VG_USERREQ_TOOL_BASE('H','G'),
>
> /* The rest are for Helgrind's internal use. Not for end-user
> use. Do not use them unless you are a Valgrind developer. */
>
> /* Notify the tool what this thread's pthread_t is. */
> _VG_USERREQ__HG_SET_MY_PTHREAD_T = VG_USERREQ_TOOL_BASE('H','G')
> + 256,
>
> ....
>
>
> /* Clean memory state. This makes Helgrind forget everything it knew
> about the specified memory range, and resets it to New. This is
> particularly useful for memory allocators that wish to recycle
> memory. */
> #define VALGRIND_HG_CLEAN_MEMORY(_qzz_start, _qzz_len) \
>
> If you mean I should add this macro to any memory location returned from the
> various methods in dlmalloc,
Yes, this is the right macro.
You would probably call it at the beginning of free() as well.
(Hm. free() has no information about the size of this memory. I'm
puzzled. Probably helgrind needs annotations equivalent to those in
memcheck.h)
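
One possible way around the missing size in free() is the allocator's own usable-size query: the dlmalloc declarations quoted in this thread list malloc_usable_size (dlmalloc_usable_size under USE_DL_PREFIX). The following is only a minimal sketch under several assumptions: the wrapper names hg_malloc and hg_free are hypothetical, the calls go through glibc's malloc_usable_size from <malloc.h> so the file builds on its own, and the macro falls back to a no-op when valgrind/helgrind.h is not installed. In a real patch the two macro calls would sit inside dlmalloc's malloc and free. Whether this silences the report discussed here is untested.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <malloc.h>   /* malloc_usable_size(); assumes glibc (dlmalloc exports the same routine) */

/* Use the real client request when the Valgrind headers are installed. */
#if defined(__has_include)
#  if __has_include(<valgrind/helgrind.h>)
#    include <valgrind/helgrind.h>
#  endif
#endif
/* Otherwise compile the annotation away (assumption: build without Valgrind). */
#ifndef VALGRIND_HG_CLEAN_MEMORY
#  define VALGRIND_HG_CLEAN_MEMORY(start, len) ((void)0)
#endif

/* Hypothetical wrapper: make Helgrind forget old state for a recycled chunk. */
void *hg_malloc(size_t n)
{
    void *p = malloc(n);
    if (p != NULL) {
        VALGRIND_HG_CLEAN_MEMORY(p, malloc_usable_size(p));
    }
    return p;
}

/* Hypothetical wrapper: free() gets no size, so ask the allocator for it. */
void hg_free(void *p)
{
    if (p != NULL) {
        size_t len = malloc_usable_size(p);
        assert(len > 0);
        VALGRIND_HG_CLEAN_MEMORY(p, len);
    }
    free(p);
}
```

Outside Valgrind the client request costs almost nothing, so the annotations could stay in a normal build.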
> would this be the equivalent of simply removing
> dlmalloc.c from the makefile and using the default allocator?
No. You will still run the dlmalloc code.
But if you can easily remove dlmalloc from linking -- this will be the
simplest way for you.
>
> ie: What is "size 1"? What is "0x82bee0" (which is always the address being
> reported every time) - is it the address of some dlmalloc method, variable?
Hm. Let me advertise ThreadSanitizer once more (see the screen shot:
http://data-race-test.googlecode.com/svn/trunk/images/tsan-in-vim3.png)
--kcc
|
|
From: Bart V. A. <bar...@gm...> - 2009-04-30 06:28:48
|
On Wed, Apr 29, 2009 at 7:56 PM, <dan...@in...> wrote:
> This is my first post. I am using Valgrind 3.4.1. The application I was
> testing for what seems to be memory corruption is compiled along with
> dlmalloc-2.8.3.c

Have you considered relinking your application so that it uses
glibc's memory allocator instead of dlmalloc before analyzing it with
Helgrind or DRD?

Bart.
|
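
A build-time switch is one way to make such a relink cheap to repeat. The sketch below is an assumption-laden illustration, not the application's actual build: it supposes dlmalloc is compiled with USE_DL_PREFIX, so that its entry points keep the dlmalloc/dlfree names shown in the declarations quoted earlier, and it introduces hypothetical names (APP_USE_DLMALLOC, app_malloc, app_free, app_strdup). Without -DAPP_USE_DLMALLOC everything goes to the libc allocator, which Helgrind and DRD already intercept.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#ifdef APP_USE_DLMALLOC
/* Production build: dlmalloc compiled with USE_DL_PREFIX (assumption). */
void *dlmalloc(size_t);
void  dlfree(void *);
#  define app_malloc dlmalloc
#  define app_free   dlfree
#else
/* Analysis build: plain libc allocator, understood by Helgrind/DRD. */
#  define app_malloc malloc
#  define app_free   free
#endif

/* Hypothetical helper showing application code written against the shim. */
char *app_strdup(const char *s)
{
    assert(s != NULL);
    size_t n = strlen(s) + 1;
    char *copy = app_malloc(n);
    if (copy != NULL) {
        memcpy(copy, s, n);   /* copies the terminating NUL as well */
    }
    return copy;
}
```

With a prefix build, code in other libraries such as libstdc++ keeps using the libc allocator in both configurations, which changes behaviour compared with replacing malloc and free globally.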
|
From: Konstantin S. <kon...@gm...> - 2009-04-30 06:39:04
|
On Thu, Apr 30, 2009 at 10:28 AM, Bart Van Assche <bar...@gm...> wrote:
> On Wed, Apr 29, 2009 at 7:56 PM, <dan...@in...> wrote:
>> This is my first post. I am using Valgrind 3.4.1. The application I was
>> testing for what seems to be memory corruption is compiled along with
>> dlmalloc-2.8.3.c
>
> Have you considered to relink your application such that it uses
> glibc's memory allocator instead of dlmalloc before analyzing it with
> Helgrind or DRD ?

That may or may not work, depending on how a particular application uses its custom malloc library. In our case (we use tcmalloc) we cannot replace the custom malloc with glibc's malloc, because many of our applications rely on the fact that we use tcmalloc.

--kcc

>
> Bart.
> |
|
From: <dan...@in...> - 2009-04-30 17:37:52
|
Thus far the Helgrind reports have gone away every time I ran this application without dlmalloc. It IS a race condition, though, and while I was able to run it 3-4 times without a report, that is only 1-2 more runs than in the cases where I get one report within 1-2 runs.

The minimal test takes about 40 minutes, and the Helgrind report can appear after 3-4 minutes or after 35 minutes, even with the exact same test data (the data is fed via sockets from a capture file). So until I do extensive trials I am never sure whether the problem has "gone away"; I only know when it exists.

The thing is, though, the line in SendHandler referenced in the report led me to find an actual race condition: one thread can delete a memory area, a second thread can get that area from the allocator for a different object, and then a third thread can deallocate the memory that the first thread had just deleted. (Really, really bad design let this sneak through, but it is now fixed.) The window was quite small; 2 other threads had to do the "right" wrong thing during the window of execution between:

    delete tcpClient;
    tcpClient = 0;

I know this never happened in production (or rather, not as commonly as the problem we see), because it only occurs when we cannot connect upstream. I fixed it anyway, of course. Because the report was so close to a real issue, I thought that was the issue, but the report did not go away once I fixed the code.

In the past, when I compiled against dlmalloc (the app that uses it is from another division), I got too many reports to see the real errors, so I removed dlmalloc. This time there is only this one report, which made me think Helgrind (or dlmalloc) had changed so that the two work together, and makes me want to find out what the report means. It's the opposite of crying wolf this time: it is the only dlmalloc complaint.

Of course, it could be a false positive, and Helgrind (or DRD) may simply not be seeing the issue that is the real problem at all (obviously not unlikely).
---Dan |
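Editorial sketch: the window described in the message above, between `delete tcpClient;` and `tcpClient = 0;`, can be illustrated with a small C++ example. This is not the fix that was applied to the real code, which the thread does not show. It is only one possible way to close such a window, assuming a single mutex guards the pointer. The names `TcpClient`, `Connection` and `drop` are hypothetical.

```cpp
#include <atomic>
#include <mutex>

// Hypothetical stand-in for the real client class; counts live instances.
struct TcpClient {
    static std::atomic<int> live;
    TcpClient() { ++live; }
    ~TcpClient() { --live; }
};
std::atomic<int> TcpClient::live(0);

// Hypothetical owner of the shared pointer.
class Connection {
public:
    Connection() : tcpClient(new TcpClient) {}
    ~Connection() { drop(); }

    // Delete and clear while holding one lock, so no other thread can
    // observe the stale pointer between `delete tcpClient;` and
    // `tcpClient = 0;`. Returns true if this call did the deletion.
    bool drop() {
        std::lock_guard<std::mutex> guard(lock);
        if (tcpClient == nullptr) {
            return false;  // another caller already released the object
        }
        delete tcpClient;
        tcpClient = nullptr;
        return true;
    }

private:
    std::mutex lock;       // guards tcpClient
    TcpClient* tcpClient;  // owned; only touched while holding lock
};
```

With this shape, a second call to `drop()` from any thread finds a null pointer and returns without freeing anything, so the memory cannot be released twice after the allocator has handed it to another object.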
|
From: Bart V. A. <bar...@gm...> - 2009-05-01 12:34:47
|
On Thu, Apr 30, 2009 at 7:37 PM, <dan...@in...> wrote:
> Thusfar the helgrind reports have gone away each of the times I ran this application
> without dlmalloc.

Are you familiar with the VALGRIND_MALLOCLIKE_BLOCK() and VALGRIND_FREELIKE_BLOCK() macros defined in the header file <valgrind/valgrind.h>? These macros allow a memory allocation library to tell a Valgrind tool about its allocation and deallocation activities, and could be used to instrument the dlmalloc source code. Several Valgrind tools, including the trunk version of DRD, support these macros.

Bart. |
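Editorial sketch: the two macros mentioned above take the forms `VALGRIND_MALLOCLIKE_BLOCK(addr, sizeB, rzB, is_zeroed)` and `VALGRIND_FREELIKE_BLOCK(addr, rzB)`. The example below shows roughly where such calls would go in a custom allocator. It is not dlmalloc itself: `pool_alloc` and `pool_free` are hypothetical names for a toy bump allocator that is single-threaded and never reuses memory. The no-op fallback definitions are an assumption added so the code also builds where the Valgrind headers are not installed. Whether a given tool and version honours these requests should be checked against its documentation.

```cpp
#include <cstddef>

#if defined(__has_include)
#  if __has_include(<valgrind/valgrind.h>)
#    include <valgrind/valgrind.h>
#  endif
#endif
// Assumption: no-op fallbacks so the sketch builds without the Valgrind headers.
#ifndef VALGRIND_MALLOCLIKE_BLOCK
#  define VALGRIND_MALLOCLIKE_BLOCK(addr, sizeB, rzB, is_zeroed) ((void)0)
#  define VALGRIND_FREELIKE_BLOCK(addr, rzB) ((void)0)
#endif

namespace {
const std::size_t kArenaSize = 4096;
alignas(std::max_align_t) unsigned char arena[kArenaSize];
std::size_t arenaUsed = 0;  // bytes handed out so far
}

// Hypothetical bump allocator standing in for a custom malloc().
void* pool_alloc(std::size_t size) {
    const std::size_t align = alignof(std::max_align_t);
    const std::size_t rounded = (size + align - 1) / align * align;
    if (size == 0 || rounded < size || rounded > kArenaSize - arenaUsed) {
        return nullptr;  // empty request, overflow, or arena exhausted
    }
    void* p = arena + arenaUsed;
    arenaUsed += rounded;
    // Announce [p, p + size) as a heap block: no red zone, not reported as zeroed.
    VALGRIND_MALLOCLIKE_BLOCK(p, size, 0, 0);
    return p;
}

// Hypothetical counterpart to free(); the toy arena never reuses memory.
void pool_free(void* p) {
    if (p == nullptr) {
        return;
    }
    // Announce the release; the red-zone size must match the allocation call.
    VALGRIND_FREELIKE_BLOCK(p, 0);
}
```

In a real allocator the announcements would sit at the points where a block is handed to, or taken back from, the caller, and the allocator's own locking would still be needed for multi-threaded use.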