|
From: Bryan M. <om...@br...> - 2005-10-30 16:37:11
|
Dear Valgrind Team,
Title
=====
Omega - A Valgrind tool for instant memory leak detection.
Designed by Bryan "Brain Murders" Meredith with grateful thanks to the
Valgrind team for making their tool and techniques available under the GPL.
Synopsis
========
Whilst Valgrind's MemCheck tool can currently detect and report that a memory
leak has occurred, the debugging output only shows where the memory that has
leaked was allocated. There are no clues as to where the last reference to
the allocated block was lost. Omega uses a modified version of MemCheck's
"definedness" ('V' bit tracking) technique in order to help track pointers to
allocated memory, outputting debugging information when the final (hence
Omega) pointer to an allocated block is over-written or lost through a call
to free() or the stack unwinding.
How it Works
============
The critical point in tracking leaking pointers is when a value is written
into memory from a register. This can result in the creation of a new
reference to an allocated block, the destruction of a current reference to an
allocated block or both. If we determine that either the register to be
written or the memory location to be written contains a pointer that we are
tracking, we update our tracking system and report if we are losing the last
pointer to a block.
Because checking every single write to memory is going to cause a huge
overhead, we use a modified version of MemCheck's 'V' bit tracking to give us
a simple "are we dealing with a tracked pointer" check before we go on to
more heavyweight processing.
A Simple Example
----------------
The program under test calls malloc (or one of the other heap allocation
functions). We generate an internal record to track the address and size of
the new block and set the 'V' bits of the shadow register to indicate 'valid
(0)'. Note that as with MemCheck, the default state of all shadow memory is
'invalid (1)'.
Immediately before the register is written out to memory, we create a new
temporary register that contains the result of a bitwise AND between the 'V'
bits of the register and the memory location to be written. We now call a
helper function passing this register. If the results of the AND is '0', we
know that there is a state change that we have to track and so process
accordingly, adding and/or deleting new back references in the internal
record and generating user diagnostic as required.
When we call free (or one of the other heap de-allocation functions), we do
cleanup processing on our internal record. There are two key activities that
must be performed at this point.
1) Set 'V' bits to any hanging pointers to 'invalid'.
2) Recursively remove references to any pointers that are within the memory
area that we are about to free.
As an option, during part 1 we could report on the hanging pointers.
Note that stack unwinding would also perform part 2 to ensure that we don't
leak through automatic variables going out of scope.
'V' bit Propagation
-------------------
Unlike MemCheck, we only have one size of data to track - namely sizeof(void
*). Another key difference is that we need the pointer to be unmolested for
it to be of interest to us. The upshot of these two properties are that in
the first version of Omega, not only can we further limit our checks to
transfers of the appropriate width and alignment but any operation that
changes the value during propagation causes the 'V' bits to be set to
'invalid', thus saving us from performing the expensive lookups and
processing.
As a later feature addition, it should be possible through some mechanism
(possibly the suppression functionality) to indicate that a pointer at some
positive offset to the allocated memory address should also be tracked. This
would be of benefit to those that implement their own memory management.
Optimisation
------------
As part of the tracking process, we also hold a sparse bitmap in nodes that
cover an area of memory 64kB in size. Each bit within a bitmap node
represents an aligned pointer in memory. The point of this second reference
is to assist in the detection of tracked references within the stack and
allocated memory blocks. Due to the compressed nature of the information, we
can check if an area of memory is free of tracked pointers 32 possible
pointers at a time.
What We Can Detect
==================
Using the above techniques, we can track the following leaks:
Simple over-write of a tracked pointer.
lastP = blah;
Tracked pointer being modified.
lastP++;
Automatic variable going out of scope.
{
void *lastP = malloc(64);
return;
}
Tracked pointer within allocated block being returned to OS.
{
void *arrayP = malloc(10 * sizeof(void *));
arrayP[1] = malloc(64);
free(arrayP);
}
What We Don't Detect
--------------------
This never results in the tracked pointer being saved into memory.
It should be possible to add some state tracking to check if an allocated
block has references when the stack unwinds but this sort of thing is usually
either deliberate or easily fixed after a run through MemCheck.
{
new CFred();
return;
}
What Next?
==========
I have to code it ;P
I have produced this in order to obtain feedback as soon as possible on what
will be a very useful tool both to my company and hopefully others. Any and
all feedback gratefully received. I understand that some parts have probably
been oversimplified and testing of the first implementation will reveal some
shortcomings. The main thing is that the principle of using the 'V' bits to
track pointers to allocated blocks is sound - implementation of the internal
data structures for the allocated blocks, the reverse lookups and handling
can be optimised over time as required and thus is not detailed here. This is
not to say that I don't have a solution, it's just that it is one of many
possible so not as interesting.
The above description can also be read here:
http://www.brainmurders.eclipse.co.uk/omega_introduction.txt
Thanks for your time and looking forward to any feedback that you might wish
to give.
Bryan "Brain Murders" Meredith
|
|
From: Tom H. <to...@co...> - 2005-10-30 16:50:51
|
In message <200...@br...>
Bryan Meredith <om...@br...> wrote:
> A Simple Example
> ----------------
> The program under test calls malloc (or one of the other heap allocation
> functions). We generate an internal record to track the address and size of
> the new block and set the 'V' bits of the shadow register to indicate 'valid
> (0)'. Note that as with MemCheck, the default state of all shadow memory is
> 'invalid (1)'.
>
> Immediately before the register is written out to memory, we create a new
> temporary register that contains the result of a bitwise AND between the 'V'
> bits of the register and the memory location to be written. We now call a
> helper function passing this register. If the results of the AND is '0', we
> know that there is a state change that we have to track and so process
> accordingly, adding and/or deleting new back references in the internal
> record and generating user diagnostic as required.
This sounds like you are doing the tracking at bit level which seems a
bit wasteful - surely the most you need to track at is byte level (more
like memcheck's A bits) or even word level?
It also isn't clear to me that your scheme will cope with pointers that
never make it to memory - if I allocate memory with the pointer going
into a register and then that register is overwritten without the memory
being freed will you spot it?
Presumably you could spot that because a write to a register that had
it's V bits set to zero could check that the block the register was
pointing to had other pointers?
> As a later feature addition, it should be possible through some mechanism
> (possibly the suppression functionality) to indicate that a pointer at some
> positive offset to the allocated memory address should also be tracked. This
> would be of benefit to those that implement their own memory management.
I suspect some way of coping with this will be needed - I know that our
code base at work has some widely used data structures where only an
interior pointer is kept around.
Tom
--
Tom Hughes (to...@co...)
http://www.compton.nu/
|
|
From: Bryan M. <om...@br...> - 2005-10-30 17:08:49
|
On Sunday 30 Oct 2005 16:50, Tom Hughes wrote: > In message <200...@br...> > Bryan Meredith <om...@br...> wrote: > > > A Simple Example > > ---------------- > > The program under test calls malloc (or one of the other heap allocation > > functions). We generate an internal record to track the address and size of > > the new block and set the 'V' bits of the shadow register to indicate 'valid > > (0)'. Note that as with MemCheck, the default state of all shadow memory is > > 'invalid (1)'. > > > > Immediately before the register is written out to memory, we create a new > > temporary register that contains the result of a bitwise AND between the 'V' > > bits of the register and the memory location to be written. We now call a > > helper function passing this register. If the results of the AND is '0', we > > know that there is a state change that we have to track and so process > > accordingly, adding and/or deleting new back references in the internal > > record and generating user diagnostic as required. > > This sounds like you are doing the tracking at bit level which seems a > bit wasteful - surely the most you need to track at is byte level (more > like memcheck's A bits) or even word level? Very true - it is memory intensive - the alternate is very processor intensive and you are forced to do lookups and caching in internal structures instead. Although this is bit level, we set all the bits one way or the other in one go - there is no mixing of defined and undefined values. As you can see further down, internally there is a mechanism for tracking pointers in 64k chunks of memory (each bit represents either 4 or 8 bytes). The downside is the expense of doing a lookup every time we read or write memory - thats why I went for the faster (but more memory intensive) 'V' bit technique. > > It also isn't clear to me that your scheme will cope with pointers that > never make it to memory - if I allocate memory with the pointer going > into a register and then that register is overwritten without the memory > being freed will you spot it? > > Presumably you could spot that because a write to a register that had > it's V bits set to zero could check that the block the register was > pointing to had other pointers? Yes - it can be tracked but I wasnt planning to. This is one of the ommisions / over simplifications that I mentioned. As I put in the example of what we dont track, this should be easilly spotted after a MemCheck run as the pointer wont be making it out of the function that calls the allocator. > > > As a later feature addition, it should be possible through some mechanism > > (possibly the suppression functionality) to indicate that a pointer at some > > positive offset to the allocated memory address should also be tracked. This > > would be of benefit to those that implement their own memory management. > > I suspect some way of coping with this will be needed - I know that our > code base at work has some widely used data structures where only an > interior pointer is kept around. Indeed - we have this issue as well so it is definitely desireable. > > Tom > > -- > Tom Hughes (to...@co...) > http://www.compton.nu/ > Thanks for the rapid feedback! Bryan "Brain Murders" Meredith |
|
From: Nicholas N. <nj...@cs...> - 2005-11-01 17:08:05
|
On Sun, 30 Oct 2005, Bryan Meredith wrote:
> Title
> =====
> Omega - A Valgrind tool for instant memory leak detection.
> Designed by Bryan "Brain Murders" Meredith with grateful thanks to the
> Valgrind team for making their tool and techniques available under the GPL.
Interesting :)
You should read this paper:
@InProceedings{Mae:precise2004,
author = {Jonas Maebe and Michiel Ronsse and Koen De Bosschere},
title = {Precise detection of memory leaks},
booktitle = {Proceedings of the Second International Workshop on Dynamic
Analysis (WODA 2004)},
pages = {},
address = {Edinburgh, Scotland},
month = may,
year = 2004,
summary = {Dynamic leak detector. Notable in that it detects the
leaks when they happen. Does this by doing reference
counting on each block, and tracking extra information about
each pointer. When a pointer is copied or clobbered, the
counts are updated. Per-pointer-info is stored in a tree,
with one part for stack pointers and the other half for the
rest. Some minor complications are used to avoid false
positives, eg. caused by advancing a pointer past the start
of the block and then later back. Memory values are
shadowed, registers not, but registers are all checked
linearly when necessary (eg. when deciding whether a block
has leaked). Implemented in DIOTA, not surprisingly the
current version runs programs 200--300 times slower.}
}
I haven't completely understood your scheme, but it's unclear to me how it
will work without reference counting.
> What Next?
> ==========
> I have to code it ;P
> I have produced this in order to obtain feedback as soon as possible on
> what will be a very useful tool both to my company and hopefully others.
> Any and all feedback gratefully received. I understand that some parts
> have probably been oversimplified and testing of the first
> implementation will reveal some shortcomings. The main thing is that the
> principle of using the 'V' bits to track pointers to allocated blocks is
> sound - implementation of the internal data structures for the allocated
> blocks, the reverse lookups and handling can be optimised over time as
> required and thus is not detailed here. This is not to say that I don't
> have a solution, it's just that it is one of many possible so not as
> interesting.
It sounds ok. In my experience the journey between the initial idea for a
tool and the resulting implementation can be tortuous (but hopefully not
torturous). The paper above should give you ideas about problems you
haven't thought of, but you won't really know if it's feasible until you
implement it. It looks like you've thought it through and have a good
idea of how Valgrind works which is an excellent start. Good luck :)
Feel free to ask more questions as you go.
Nick
|
|
From: Bryan M. <om...@br...> - 2005-11-01 18:52:25
|
Nick,
thanks for the input.
My system does indeed keep track of all the pointers that point to an
allocated block - there really is no other way. For convenience a count is
kept but it could quite easily be done by checking the pointer list for
entries when removing a pointer.
The special handling for the stack might be interesting so I will have a read
of that - I was planning to treat the stack as normal memory for simplicity
but optimising this at a later stage may well be required.
I want to get something working as soon as possible so that I can get it in
front of other programmers. Only this way can the benefits of open source be
reaped and on a project like this, that can only be a good thing. So, once it
passes a small number of tests that cover most of the standard issues, I will
post it up so that together we can cover the other cases and make it go
faster.
Bryan
On Tuesday 01 Nov 2005 17:07, Nicholas Nethercote wrote:
> On Sun, 30 Oct 2005, Bryan Meredith wrote:
>
> > Title
> > =====
> > Omega - A Valgrind tool for instant memory leak detection.
> > Designed by Bryan "Brain Murders" Meredith with grateful thanks to the
> > Valgrind team for making their tool and techniques available under the
GPL.
>
> Interesting :)
>
> You should read this paper:
>
> @InProceedings{Mae:precise2004,
> author = {Jonas Maebe and Michiel Ronsse and Koen De Bosschere},
> title = {Precise detection of memory leaks},
> booktitle = {Proceedings of the Second International Workshop on
Dynamic
> Analysis (WODA 2004)},
> pages = {},
> address = {Edinburgh, Scotland},
> month = may,
> year = 2004,
> summary = {Dynamic leak detector. Notable in that it detects the
> leaks when they happen. Does this by doing reference
> counting on each block, and tracking extra information
about
> each pointer. When a pointer is copied or clobbered,
the
> counts are updated. Per-pointer-info is stored in a
tree,
> with one part for stack pointers and the other half for
the
> rest. Some minor complications are used to avoid false
> positives, eg. caused by advancing a pointer past the
start
> of the block and then later back. Memory values are
> shadowed, registers not, but registers are all checked
> linearly when necessary (eg. when deciding whether a
block
> has leaked). Implemented in DIOTA, not surprisingly the
> current version runs programs 200--300 times slower.}
> }
>
> I haven't completely understood your scheme, but it's unclear to me how it
> will work without reference counting.
>
>
> > What Next?
> > ==========
> > I have to code it ;P
> > I have produced this in order to obtain feedback as soon as possible on
> > what will be a very useful tool both to my company and hopefully others.
> > Any and all feedback gratefully received. I understand that some parts
> > have probably been oversimplified and testing of the first
> > implementation will reveal some shortcomings. The main thing is that the
> > principle of using the 'V' bits to track pointers to allocated blocks is
> > sound - implementation of the internal data structures for the allocated
> > blocks, the reverse lookups and handling can be optimised over time as
> > required and thus is not detailed here. This is not to say that I don't
> > have a solution, it's just that it is one of many possible so not as
> > interesting.
>
> It sounds ok. In my experience the journey between the initial idea for a
> tool and the resulting implementation can be tortuous (but hopefully not
> torturous). The paper above should give you ideas about problems you
> haven't thought of, but you won't really know if it's feasible until you
> implement it. It looks like you've thought it through and have a good
> idea of how Valgrind works which is an excellent start. Good luck :)
> Feel free to ask more questions as you go.
>
> Nick
>
>
> -------------------------------------------------------
> SF.Net email is sponsored by:
> Tame your development challenges with Apache's Geronimo App Server. Download
> it for free - -and be entered to win a 42" plasma tv or your very own
> Sony(tm)PSP. Click here to play: http://sourceforge.net/geronimo.php
> _______________________________________________
> Valgrind-developers mailing list
> Val...@li...
> https://lists.sourceforge.net/lists/listinfo/valgrind-developers
>
|
|
From: Maurice v. d. P. <gri...@ge...> - 2005-11-06 16:05:13
|
Were you planning on having your tool report the place of allocation as well? It would help in those cases where the last reference to memory is not the one that was erroneously forgotten/overwritten. I'm not sure how common this case is though. Regards, Maurice. --=20 Maurice van der Pot Gentoo Linux Developer gri...@ge... http://www.gentoo.org Creator of BiteMe! gri...@kf... http://www.kfk4ever.com |
|
From: Bryan M. <om...@br...> - 2005-11-06 16:23:18
|
On Sunday 06 Nov 2005 16:04, Maurice van der Pot wrote: > Were you planning on having your tool report the place of allocation as > well? It would help in those cases where the last reference to memory is > not the one that was erroneously forgotten/overwritten. I'm not sure how > common this case is though. > > Regards, > Maurice. > > -- > Maurice van der Pot > > Gentoo Linux Developer gri...@ge... http://www.gentoo.org > Creator of BiteMe! gri...@kf... http://www.kfk4ever.com > > Hi Maurice, The place that the block was allocated (call to malloc() et. al) is reported along with the place that the leak occurred. If you mean the place that the pointer was allocated, that is a possibility as I do record this information (my current version has moved on somewhat from the alpha I posted :D). Would you like it to be reported as part of the leak message or is there something else you were thinking of? I am thinking of a detail trace option that will report all tracked pointers for a single object / objects created by a single line of code. I'm not sure how to select this yet but once I get the rest of the detection going, it would probably be a nice addition. I hope the suppression api might come to my rescue but I will quite happilly do it the open source way (ie. someone that knows what they are on about looks at the code and suggests a good solution). Bryan "Brain Murders" Meredith |
|
From: Maurice v. d. P. <gri...@ge...> - 2005-11-06 21:04:10
|
On Sun, Nov 06, 2005 at 04:23:06PM +0000, Bryan Meredith wrote: > The place that the block was allocated (call to malloc() et. al) is repor= ted=20 > along with the place that the leak occurred. That's what I meant. > Would you like it to be reported as part of the leak message or is there= =20 > something else you were thinking of? Never mind, I mixed things up. A while back I was thinking about tracking= =20 undefined values to report where they came from, but that is more of a prob= lem=20 because there are so many of them. Regards, Maurice. --=20 Maurice van der Pot Gentoo Linux Developer gri...@ge... http://www.gentoo.org Creator of BiteMe! gri...@kf... http://www.kfk4ever.com |