|
From: <sf...@ge...> - 2003-10-30 15:32:33
|
I saw the code that skips the stack during leak detection. I've coded and am currently testing a modification to the memory leak detection to flag lost blocks that are pointed to by unused memory (free mem or other lost blocks). E.g. a circular linked list or a double linked list will be detected. As will a leaked block that is pointed to by a freed block (it's not enough to free the head of a list, you have to walk the list). Hopefully I'll make Nicholas happy (as per his last line in http://sourceforge.net/mailarchive/message.php?msg_id=4892174 ). I will outline the algorithm below. Please let me know if you see anything incorrect. When everything is ready, I'll provide "diff -u" patches. Should I also update any docs, possibly with the info contained later in this e-mail? When I designed the change (without yet carefully reading the code) I wanted to structure the code differently than it turns out to be structured. I wanted to call a memory scanner multiple times with various start and end addresses. But vg_scan_all_valid_memory() loops through the pages and then through the addresses within the pages, and I didn't want to do a bunch of re-structuring. So I ended up with changing vg_scan_all_valid_memory() to use a slightly kludgy multi-pass algorithm where pass 0 is special. Here's the way it works. The vg_scan_all_valid_memory() main "page" loop is turned into an infinite loop which scans pages of memory in multiple passes. The first pass (pass 0) starts at the very first page and ends at the last page. It intentionally SKIPS memory addresses that are within the malloced memory area (as defined by lc_min_mallocd_addr and lc_max_mallocd_addr). All, non-malloced memory that contains pointers to malloced blocks will be passed to notify_word(), which will set the containing malloc block's lc_reachedness[] element to something > Unreached. Thus, global pointers to malloced blocks will set the malloced blocks to an in-use state (Proper or Interior). Subsequent passes (> 0) start at the first page that contains malloced memory and ends at the last page that contains malloced memory. Only addresses *inside* the malloced memory area are considered. Each address is checked to see if it is contained in a malloced block. If not (i.e. if it's part of free memory), the address is skipped. If it IS part of a malloced block, the lc_reachedness[] for that block is checked. If it is Unreached, then the address is skipped. The address will be passed to notify_word() ONLY if the address is inside an in-use block. The loop keeps track of how many malloced blocks changed their lc_reachedness[] state. When a pass is completed and one or more blocks were flagged as in-use, then another pass is scheduled. The main (infinite) loop finally exits when a pass is clean (no new blocks flagged in-use). The worst case would be a non-leaked linked list (i.e. with a valid global variable pointing to the head) with the blocks linked in reverse order in memory. Pass 0 will mark the first block as in-use, pass 1 will mark the second block (pointed to by the first) as in-use, pass 2 will mark the third block as in-use, etc. To improve efficiency somewhat, I also created a parallel array to lc_reachedness[] named lc_notifiedness[]. Once an in-use block is scanned (i.e. its contents passed to notify_word()), it won't be re-scanned in future passes. Thus, from the point of view of calls to notify_word(), the new algorithm is slightly MORE efficient (no address is passed in more than once, and addresses inside of free memory are not passed in at all). However, I suspect that the overall efficency is less since vg_scan_all_valid_memory() has to call find_shadow_for() to determine if the address is in a malloced block, determine its reachedness, and it's notifiedness. That *does* have to be done repeatedly with every pass for each address in the malloced area. If there are very many malloced blocks (which slows down binary search in find_shadow_for()), and if those blocks are arranged in memory such that many passes are required, then the memory leak check could take a while. But maybe that is ok. Good programmers free all their memory before program exit. The only programmers who will suffer from the added time will be the lazy ones. (Or more likely, the poor souls who have to maintain lazy programmers' legacies.) |
|
From: Dimitri Papadopoulos-O. <pap...@sh...> - 2003-10-30 16:01:06
|
Hi, > But maybe that is ok. Good programmers free all their memory > before program exit. The only programmers who will suffer from > the added time will be the lazy ones. (Or more likely, the poor > souls who have to maintain lazy programmers' legacies.) Sometimes good programers do not free all the memory before program exit, for efficiency reasons. Well, of course one could argue that memory should always be released, and relevant code #ifdef'ed out when needed. Anyway, I don't think it's wise to classify programers as "good" and "lazy" based on this single criterion. -- Dimitri |
|
From: Nicholas N. <nj...@ca...> - 2003-10-30 16:08:50
|
On Thu, 30 Oct 2003 sf...@ge... wrote: > I saw the code that skips the stack during leak detection. > > I've coded and am currently testing a modification to the memory leak > detection to flag lost blocks that are pointed to by unused memory > (free mem or other lost blocks). E.g. a circular linked list or a > double linked list will be detected. As will a leaked block that is > pointed to by a freed block (it's not enough to free the head of a list, > you have to walk the list). > > Hopefully I'll make Nicholas happy (as per his last line in > http://sourceforge.net/mailarchive/message.php?msg_id=4892174 ). > > I will outline the algorithm below. Please let me know if you see Have you considered more standard garbage collection algorithms, eg. tracing through all the reachable objects from a root set? I think the algorithm you described might, in effect, do the same thing, but I wonder if doing it in a more straightforward GC-style way would be simpler? > But maybe that is ok. Good programmers free all their memory > before program exit. Really? Hmm. Good programmers don't create space leaks either :) N |
|
From: <sf...@ge...> - 2003-10-30 21:17:05
|
On Thu, 30 Oct 2003, Nicholas Nethercote wrote: > Have you considered more standard garbage collection algorithms, eg. > tracing through all the reachable objects from a root set? I think the > algorithm you described might, in effect, do the same thing, but I wonder > if doing it in a more straightforward GC-style way would be simpler? I did consider it. The actual code changes I made were surprisingly minimal. Creating a non-looping N-ary tree of references would involve significantly additional coding with the only advantage being that it would run faster for programs that exit with lots of non-freed blocks. I decided it wasn't worth the effort or risk (of introducing bugs). > > But maybe that is ok. Good programmers free all their memory > > before program exit. > > Really? Hmm. Good programmers don't create space leaks either :) Heh. I never claimed to be a good programmer. During 10+ years of embedded programming, I considered it a mortal sin to not fully clean up on exit. Now, 5 years later, I'm appalled at how lazy I've gotten. --- I just made an improvement that makes sure that a block won't be counted as "reachable" unless it is either directly or indirectly properly pointed to from non-heap memory. In other words, if a non-heap pointer points inside a block, it is set to possibly lost. If that block points at new block (either properly or internally), that new block is set to possibly lost. If a fully-reachable block contains an interior pointer to a new block, that new block is set to possibly lost. Before this improvement, a lost linked list of blocks would show the first block as lost (or possibly lost), but subsequent blocks in the list would be shown as reachable. I should be done testing tomorrow. |
|
From: Olly B. <ol...@su...> - 2003-10-31 01:02:42
|
On Thu, Oct 30, 2003 at 01:16:55PM -0800, sf...@ge... wrote:
> On Thu, 30 Oct 2003, Nicholas Nethercote wrote:
>
> > > But maybe that is ok. Good programmers free all their memory
> > > before program exit.
> >
> > Really? Hmm. Good programmers don't create space leaks either :)
>
> Heh. I never claimed to be a good programmer. During 10+ years of
> embedded programming, I considered it a mortal sin to not fully clean
> up on exit.
I think this may be due to a difference between platforms. On Unix,
libc expands the data segment size using brk() and allocates memory from
this using malloc/free (or new/delete for C++).
On process exit, the whole data segment is returned to the operating
system for reallocation so carefully freeing up all malloc allocations
only results in slowing down process exit. It's perhaps useful to write
code to do it, as it makes leak checking tools (such as valgrind) easier
to use. But to enable it in production builds is senseless.
All the non-embedded operating systems I've encountered work in a
similar way to this, though the exact details differ.
In an embedded environment, it's possible the implementation might be
such that if you don't explicitly free an allocation from malloc, it
will be lost until the device is restarted (for an example of how this
might be, suppose the OS/libc allocated malloc blocks from a common
heap...)
Cheers,
Olly
|