[Valgrind-users] New mem leak detection algorithm finds more bugs

SourceForge Headquarters 1320 Columbia Street Suite 310 San Diego, CA 92101 +1 (858) 422-6466

I saw the code that skips the stack during leak detection.

I've coded and am currently testing a modification to the memory leak
detection to flag lost blocks that are pointed to by unused memory
(free mem or other lost blocks).  E.g. a circular linked list or a
double linked list will be detected.  As will a leaked block that is
pointed to by a freed block (it's not enough to free the head of a list,
you have to walk the list).

Hopefully I'll make Nicholas happy (as per his last line in
http://sourceforge.net/mailarchive/message.php?msg_id=4892174 ).

I will outline the algorithm below.  Please let me know if you see
anything incorrect.  When everything is ready, I'll provide "diff -u"
patches.  Should I also update any docs, possibly with the info
contained later in this e-mail?

When I designed the change (without yet carefully reading the code)
I wanted to structure the code differently than it turns out to be
structured.  I wanted to call a memory scanner multiple times with
various start and end addresses.  But vg_scan_all_valid_memory() loops
through the pages and then through the addresses within the pages, and
I didn't want to do a bunch of re-structuring.  So I ended up with
changing vg_scan_all_valid_memory() to use a slightly kludgy multi-pass
algorithm where pass 0 is special.

Here's the way it works.  The vg_scan_all_valid_memory() main "page"
loop is turned into an infinite loop which scans pages of memory
in multiple passes.  The first pass (pass 0) starts at the very
first page and ends at the last page.  It intentionally SKIPS memory
addresses that are within the malloced memory area (as defined by
lc_min_mallocd_addr and lc_max_mallocd_addr).  All, non-malloced
memory that contains pointers to malloced blocks will be passed to
notify_word(), which will set the containing malloc block's
lc_reachedness[] element to something > Unreached.  Thus, global
pointers to malloced blocks will set the malloced blocks to an in-use
state (Proper or Interior).

Subsequent passes (> 0) start at the first page that contains malloced
memory and ends at the last page that contains malloced memory.
Only addresses *inside* the malloced memory area are considered.  Each
address is checked to see if it is contained in a malloced block.  If not
(i.e. if it's part of free memory), the address is skipped.  If it IS
part of a malloced block, the lc_reachedness[] for that block is checked.
If it is Unreached, then the address is skipped.  The address will be
passed to notify_word() ONLY if the address is inside an in-use block.

The loop keeps track of how many malloced blocks changed their
lc_reachedness[] state.  When a pass is completed and one or more
blocks were flagged as in-use, then another pass is scheduled.  The
main (infinite) loop finally exits when a pass is clean (no new
blocks flagged in-use).

The worst case would be a non-leaked linked list (i.e. with a valid
global variable pointing to the head) with the blocks linked in reverse
order in memory.  Pass 0 will mark the first block as in-use, pass 1 will
mark the second block (pointed to by the first) as in-use, pass 2 will
mark the third block as in-use, etc.

To improve efficiency somewhat, I also created a parallel array to
lc_reachedness[] named lc_notifiedness[].  Once an in-use block is
scanned (i.e. its contents passed to notify_word()), it won't be
re-scanned in future passes.  

Thus, from the point of view of calls to notify_word(), the new
algorithm is slightly MORE efficient (no address is passed in more
than once, and addresses inside of free memory are not passed in
at all).  However, I suspect that the overall efficency is less
since vg_scan_all_valid_memory() has to call find_shadow_for()
to determine if the address is in a malloced block, determine its
reachedness, and it's notifiedness.  That *does* have to be done
repeatedly with every pass for each address in the malloced area.
If there are very many malloced blocks (which slows down binary
search in find_shadow_for()), and if those blocks are arranged in
memory such that many passes are required, then the memory leak
check could take a while.

But maybe that is ok.  Good programmers free all their memory
before program exit.  The only programmers who will suffer from
the added time will be the lazy ones.  (Or more likely, the poor
souls who have to maintain lazy programmers' legacies.)