From: Philippe W. <phi...@sk...> - 2013-10-09 22:26:51
On Wed, 2013-10-09 at 06:53 +0200, Matthias Schwarzott wrote:
> On 02.10.2013 23:03, Philippe Waroquiers wrote:
> > Have you tested these heuristics with a (big) C++ application?
> > Philippe
>
> Hi Philippe,
>
> I did some real tests with our application.
Thanks for the feedback below, very interesting.

> multipleinheritance: 1 matching block, this is correct
> stdstring: lots of std::string, everything correct here
> newarray: No data allocated by new[] here, only memory from
> icu_50::UnicodeString::cloneArrayIfNeeded
>
> The problem with icu_50::UnicodeString is that it has the reference
> count in the word before the location pointed at.
> That count is 1 at allocation time and later, in most cases, only a
> small integer. As long as it is 1, 2 or 4 it should always match the
> allocated size. In other cases it might match.
Is the pointer to a UnicodeString pointing one word after the ref count?
If so, the heuristic is maybe not designed for that case, but it still
correctly detects that the Unicode string is effectively reachable (and
not possibly lost).

I quickly looked at the Unicode library. It looks like the allocation
function can be redefined to sit directly on top of malloc.
So it is not possible to add the condition of seeing "new[]" in the
stack trace, as "new[]" is not necessarily the function called to
allocate the memory of a C++ array (IIUC: my C++ knowledge is very
close to 0).

>
> Then I have the heuristic "array64length" I have written. This one
> checks if the pointer has offset 8 and the 64 bits before it match the
> remaining length (a subset of newarray on a 64-bit platform).
> This one only matches memory allocated by sqlite3 (e.g. sqlite3MemMalloc).
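
To make the two length-prefix checks discussed above concrete, here is a minimal sketch in C. It is not memcheck's code: the `Block` type and both function names are hypothetical. It assumes the newarray check accepts a block when the leading word divides the remaining size evenly (which is what the "1, 2 or 4 always match" observation suggests), while array64length requires the leading 64 bits to equal the remaining size exactly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical description of a heap block; not memcheck's real type. */
typedef struct {
    const unsigned char *data;  /* start of the allocated block */
    size_t size;                /* total size of the block in bytes */
} Block;

/* newarray-style check (assumed form): ptr points one word into the
 * block, and the first word is a plausible element count, i.e. it is
 * non-zero and divides the remaining size evenly. */
static int looks_like_new_array(const Block *b, const unsigned char *ptr)
{
    size_t count;

    assert(b != NULL && ptr != NULL);
    if (b->size < sizeof(count) || ptr != b->data + sizeof(count))
        return 0;
    memcpy(&count, b->data, sizeof(count));  /* avoids unaligned reads */
    return count > 0 && (b->size - sizeof(count)) % count == 0;
}

/* array64length-style check (assumed form): ptr points 8 bytes into
 * the block, and the first 64 bits equal the remaining length. */
static int looks_like_length64(const Block *b, const unsigned char *ptr)
{
    uint64_t len;

    assert(b != NULL && ptr != NULL);
    if (b->size < sizeof(len) || ptr != b->data + sizeof(len))
        return 0;
    memcpy(&len, b->data, sizeof(len));
    return len > 0 && len == (uint64_t)(b->size - sizeof(len));
}
```

With a leading word of 1, 2 or 4, the first check passes whenever the remaining size is a multiple of that word, so a small reference count is enough to trigger it. The exact-equality check is much harder to hit by accident.
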
>
> The remaining possible leaks can be grouped like this:
> - icu_50::UnicodeString (when reference count does not match length)
> - Some internal Pool implementations that maybe should be instrumented
> - sqlite3MemMalloc (why is this not matching Array64length ??)
> - pthreads TLS: 38x (one per thread)
> ==2503== 144 bytes in 1 blocks are possibly lost in loss record 18,895 of 20,196
> ==2503==    at 0x4006256: calloc (vg_replace_malloc.c:618)
> ==2503==    by 0x4D8477B8: allocate_dtv (dl-tls.c:297)
> ==2503==    by 0x4D847F5E: _dl_allocate_tls (dl-tls.c:461)
> ==2503==    by 0x4DA1A6A0: pthread_create@@GLIBC_2.1 (allocatestack.c:572)
> […]
>
> So multipleinheritance and std::string have no false positives for me.
> Newarray is not as reliable, as each memory block that starts with a
> word of 1, 2 or 4 will match (if the pointer has offset 4).
> Maybe some kind of matching is needed for combinations of pointer offset
> and allocating callstack.
Yes, we could add a special kind of suppression that would prevent a
block from being found via a heuristic if its allocation stack trace
matches the "heuristic suppression".
This is probably not very difficult to implement, but as usual we need
to look at the balance between the additional complexity (and the CPU
cost implied) and the functionality provided.

A fully "user configurable" heuristic would allow the user to specify
an expression that should return true. This would then allow very
flexible heuristics to be defined outside of memcheck.
We would then need a (relatively fast) expression reader and evaluator
in memcheck.

Philippe