|
From: Dave G. <go...@mc...> - 2012-05-09 22:36:16
Attachments:
foo.c
|
I recently ran into some surprising behavior in Valgrind's mempool handling in the presence of --malloc-fill and --free-fill options. Basically, mempool client requests respect the value given to --free-fill, but not the value given to --malloc-fill. I've attached a small test case, which when run as "valgrind -q --malloc-fill=0xaf --free-fill=0xdb ./a.out 2>/dev/null" yields the following output: ----8<---- *s=0x5 after MEMPOOL_ALLOC *s=0xdb after MEMPOOL_FREE *s=0xdb after second MEMPOOL_ALLOC ----8<---- I would expect instead to see either [0xaf,0xdb,0xaf] or [0x5,0x5,0x5] as the output. I have a slight preference towards the latter since I'm already managing my own memory and this behavior makes me do extra work to patch up the free nodes. Is this behavior expected? I'll can create a bugzilla for it either way, since a clarifying doc fix makes sense to me if it is expected behavior. Should the behavior depend on the mempool's "is_zeroed" value? (it doesn't right now) It's clear how this is happening in the Valgrind code; "MC_(new_block)" and "die_and_free_mem" have asymmetric behavior when "MC_AllocCustom" is passed as "kind" to "MC_(new_block)". It looks easy enough to fix in either function, depending on what behavior is desired. -Dave |
|
From: Julian S. <js...@ac...> - 2012-05-10 07:04:57
|
> Is this behavior expected? I'll can create a bugzilla for it either way, > since a clarifying doc fix makes sense to me if it is expected behavior. Yes, please file a bug report. If you have a tentative fix pls add that as a patch. > It's clear how this is happening in the Valgrind code; "MC_(new_block)" and > "die_and_free_mem" have asymmetric behavior when "MC_AllocCustom" is > passed as "kind" to "MC_(new_block)". It looks easy enough to fix in > either function, depending on what behavior is desired. I'm unclear why you speak of the possibility of two different behaviours. Shouldn't the clreqs also simply respect the value of --malloc-fill? If so, why are there two possible fixes? J |
|
From: Dave G. <go...@mc...> - 2012-05-10 15:16:40
|
On May 10, 2012, at 2:04 AM CDT, Julian Seward wrote: >> Is this behavior expected? I'll can create a bugzilla for it either way, >> since a clarifying doc fix makes sense to me if it is expected behavior. > > Yes, please file a bug report. If you have a tentative fix pls add that > as a patch. Will do. >> It's clear how this is happening in the Valgrind code; "MC_(new_block)" and >> "die_and_free_mem" have asymmetric behavior when "MC_AllocCustom" is >> passed as "kind" to "MC_(new_block)". It looks easy enough to fix in >> either function, depending on what behavior is desired. > > I'm unclear why you speak of the possibility of two different behaviours. > Shouldn't the clreqs also simply respect the value of --malloc-fill? If > so, why are there two possible fixes? For my use case I would actually prefer that it ignore the values of both --malloc-fill and --free-fill for the clreqs, or at least it should be controllable by "is_zeroed" or another mempool creation parameter. I can deal with either behavior though, so it isn't a huge deal if you think that "always use both -fill parameters" is the correct interpretation. -Dave |
|
From: Dave G. <go...@mc...> - 2012-05-10 18:50:33
|
On May 10, 2012, at 10:16 AM CDT, Dave Goodell wrote: > On May 10, 2012, at 2:04 AM CDT, Julian Seward wrote: > >>> Is this behavior expected? I'll can create a bugzilla for it either way, >>> since a clarifying doc fix makes sense to me if it is expected behavior. >> >> Yes, please file a bug report. If you have a tentative fix pls add that >> as a patch. > > Will do. To close the email loop on this, here's the BZ link: https://bugs.kde.org/show_bug.cgi?id=299756 -Dave |
|
From: Philippe W. <phi...@sk...> - 2012-05-10 21:44:14
|
On Thu, 2012-05-10 at 10:16 -0500, Dave Goodell wrote: > On May 10, 2012, at 2:04 AM CDT, Julian Seward wrote: > > >> Is this behavior expected? I'll can create a bugzilla for it either way, > >> since a clarifying doc fix makes sense to me if it is expected behavior. > > > > Yes, please file a bug report. If you have a tentative fix pls add that > > as a patch. > > Will do. > > >> It's clear how this is happening in the Valgrind code; "MC_(new_block)" and > >> "die_and_free_mem" have asymmetric behavior when "MC_AllocCustom" is > >> passed as "kind" to "MC_(new_block)". It looks easy enough to fix in > >> either function, depending on what behavior is desired. > > > > I'm unclear why you speak of the possibility of two different behaviours. > > Shouldn't the clreqs also simply respect the value of --malloc-fill? If > > so, why are there two possible fixes? > > For my use case I would actually prefer that it ignore the values of both --malloc-fill > and --free-fill for the clreqs, or at least it should be controllable by "is_zeroed" > or another mempool creation parameter. I can deal with either behavior though, so it > isn't a huge deal if you think that "always use both -fill parameters" is the correct interpretation. > There are two different sets of client requests that can be used to inform memcheck about "client maintained" heaps. For a new block being allocated, these are: VG_USERREQ__MEMPOOL_ALLOC VG_USERREQ__MALLOCLIKE_BLOCK The MALLOCLIKE one has a is_zeroed argument (so this can be given for each block). The MEMPOOL_ALLOC has no such arg, but the mempool creation has a is_zeroed argument. I do not think that it is consistent to have memcheck using malloc-fill to fill a 'is_zeroed' MALLOCLIKE block. I understand the 'is_zeroed' means: the caller has put zero in the malloc like block, so memcheck can mark the block as initialised. MEMPOOL_ALLOC behaves similarly. If 'not is_zeroed', I think both MALLOCLIKE and MEMPOOL_ALLOC could fill the memory (so this would be a change compared to the current behaviour, but I see no problem with this change of behaviour). Not clear if the FREELIKE and MEMPOOL_FREE should fill the block. Using is_zeroed for that seems not very logical, and can only be done for mempool in any case. Having the caller of such a client request properly marking the freed block is tricky in any case (see e.g. the INNER markers in coregrind/m_mallocfree.c to run Valgrind in Valgrind). So, if the caller wants to fill the block before calling the client request, it is really trivial to do. So, from my point of view, I have a slight preference to never use free-fill for FREELIST and MEMPOOL_FREE client requests, as I believe that typically the caller will have e.g. re-inserted this block in the list of mempool blocks or similar, and this should not be erased. See coregrind/m_mallocfree.c Note: I did not try to run Valgrind in Valgrind using free-fill. I guess it will fail miserably :). Philippe |
|
From: Julian S. <js...@ac...> - 2012-05-10 22:42:21
|
On Thursday, May 10, 2012, Philippe Waroquiers wrote: > So, from my point of view, I have a slight preference to never > use free-fill for FREELIST and MEMPOOL_FREE client requests, as > I believe that typically the caller will have e.g. re-inserted this > block in the list of mempool blocks or similar, and this should > not be erased. See coregrind/m_mallocfree.c IIUC, then, by comparison with https://bugs.kde.org/show_bug.cgi?id=299756#c3, 2nd para, you and Dave have the same preference .. is that correct? J |
|
From: Philippe W. <phi...@sk...> - 2012-05-10 23:16:08
|
On Fri, 2012-05-11 at 00:41 +0200, Julian Seward wrote: > On Thursday, May 10, 2012, Philippe Waroquiers wrote: > > > So, from my point of view, I have a slight preference to never > > use free-fill for FREELIST and MEMPOOL_FREE client requests, as > > I believe that typically the caller will have e.g. re-inserted this > > block in the list of mempool blocks or similar, and this should > > not be erased. See coregrind/m_mallocfree.c > > IIUC, then, by comparison with > https://bugs.kde.org/show_bug.cgi?id=299756#c3, 2nd para, > you and Dave have the same preference .. is that correct? Yes, that is correct. We clearly have the same (explicit) preference for free-fill. For what concerns malloc-fill: it would not arm to have the non zeroed MALLOCLIKE and MEMPOOL_ALLOC filling the blocks. But like Dave, I think here that symmetry and simplicity is better: MALLOCLIKE, MEMPOOL_ALLOC, FREELIKE, MEMPOOL_FREE do not modify the block given, they only change the accessibility/definedness. I can take in charge looking the patch of Dave, and adding the relevant sentences in the documentation and/or headers. Philippe |
|
From: Julian S. <js...@ac...> - 2012-05-11 07:57:17
|
On Friday, May 11, 2012, Philippe Waroquiers wrote: > I can take in charge looking the patch of Dave, and adding > the relevant sentences in the documentation and/or headers. Yes, please do. (kind-of unrelated) There's also a configure script change from Dave w.r.t. MPI, that we should take, #274078 I think. J |
|
From: Philippe W. <phi...@sk...> - 2012-05-11 22:25:31
|
On Fri, 2012-05-11 at 09:56 +0200, Julian Seward wrote: > On Friday, May 11, 2012, Philippe Waroquiers wrote: > > I can take in charge looking the patch of Dave, and adding > > the relevant sentences in the documentation and/or headers. > > Yes, please do. Revision 12560 fixes Bug 299756 - for symmetry, --free-fill must be ignored for MEMPOOL_FREE and FREELIKE client requests > > (kind-of unrelated) > There's also a configure script change from Dave w.r.t. MPI, > that we should take, #274078 I think. I took a quick look at this but I am not too sure I understand the problem (and I have no mpi installation to experiment with). IIUC, it seems linked with static mpi library. Note that with the introduction of --soname-synonyms in rev 12559, it should be possible to support a statically linked mpi library with small modifications in the mpi wrappers. Philippe |
|
From: Dave G. <go...@mc...> - 2012-05-11 22:42:18
|
On May 11, 2012, at 5:25 PM CDT, Philippe Waroquiers wrote: > On Fri, 2012-05-11 at 09:56 +0200, Julian Seward wrote: >> On Friday, May 11, 2012, Philippe Waroquiers wrote: >>> I can take in charge looking the patch of Dave, and adding >>> the relevant sentences in the documentation and/or headers. >> >> Yes, please do. > Revision 12560 fixes Bug 299756 - for symmetry, --free-fill must be > ignored for MEMPOOL_FREE and FREELIKE client requests Thanks! >> (kind-of unrelated) >> There's also a configure script change from Dave w.r.t. MPI, >> that we should take, #274078 I think. > > I took a quick look at this but I am not too sure I understand the > problem (and I have no mpi installation to experiment with). > IIUC, it seems linked with static mpi library. The issue is that the configure logic that checks for a functional "mpicc" script doesn't match up with the actual build logic used at "make" time. So it's possible to end up with Valgrind "successfully" detecting an mpicc that can't actually build the MPI wrappers. One very common scenario is where MPICH2 is built with static libs only (the default) but the MPI wrappers are built relocatable: http://thread.gmane.org/gmane.comp.debugging.valgrind/11102/focus=11105 Another scenario is on Darwin when the MPI library has only been built for one architecture but Valgrind is building bi-arch, so one of the architectures fails: http://thread.gmane.org/gmane.comp.debugging.valgrind/12039/focus=12041 > Note that with the introduction of --soname-synonyms in rev 12559, > it should be possible to support a statically linked mpi library > with small modifications in the mpi wrappers. This would probably help reduce the number of failures due to the first scenario, but the improved configure logic should still be implemented. The MPI implementation may not be compatible with building the Valgrind MPI wrappers for a variety of reasons, and Valgrind users shouldn't get bizarre make-time failures when building Valgrind. -Dave |
|
From: Dave G. <go...@mc...> - 2012-05-15 21:32:33
|
Was my explanation sufficiently clear/compelling to get this patch included? Is there something I can do to make this easier? Maybe a reproducer build script that clearly shows the problem? There's already plenty of user pain when using most HPC software. It would be great if we could remove this pain point when building Valgrind in the presence of MPI implementations. (I'm not trying to be annoying about this, but the patch is fairly small, very helpful, and it's been sitting in the bug tracker for about a year now… Feel free to tell me to buzz off if I'm becoming obnoxious.) -Dave On May 11, 2012, at 5:42 PM CDT, Dave Goodell wrote: > On May 11, 2012, at 5:25 PM CDT, Philippe Waroquiers wrote: > >> On Fri, 2012-05-11 at 09:56 +0200, Julian Seward wrote: >>> (kind-of unrelated) >>> There's also a configure script change from Dave w.r.t. MPI, >>> that we should take, #274078 I think. >> >> I took a quick look at this but I am not too sure I understand the >> problem (and I have no mpi installation to experiment with). >> IIUC, it seems linked with static mpi library. > > The issue is that the configure logic that checks for a functional "mpicc" script doesn't match up with the actual build logic used at "make" time. So it's possible to end up with Valgrind "successfully" detecting an mpicc that can't actually build the MPI wrappers. > > One very common scenario is where MPICH2 is built with static libs only (the default) but the MPI wrappers are built relocatable: > > http://thread.gmane.org/gmane.comp.debugging.valgrind/11102/focus=11105 > > Another scenario is on Darwin when the MPI library has only been built for one architecture but Valgrind is building bi-arch, so one of the architectures fails: > > http://thread.gmane.org/gmane.comp.debugging.valgrind/12039/focus=12041 > >> Note that with the introduction of --soname-synonyms in rev 12559, >> it should be possible to support a statically linked mpi library >> with small modifications in the mpi wrappers. > > This would probably help reduce the number of failures due to the first scenario, but the improved configure logic should still be implemented. The MPI implementation may not be compatible with building the Valgrind MPI wrappers for a variety of reasons, and Valgrind users shouldn't get bizarre make-time failures when building Valgrind. |
|
From: Philippe W. <phi...@sk...> - 2012-05-16 19:44:37
|
On Tue, 2012-05-15 at 16:32 -0500, Dave Goodell wrote: > Was my explanation sufficiently clear/compelling to get this patch included? > > Is there something I can do to make this easier? Maybe a reproducer build script that clearly shows the problem? An off-line discussion with Dave has somewhat increased my courage, so I will take a look at this in the coming days and give feedback. Philippe |