|
From: Paul F. <pj...@wa...> - 2023-03-30 20:36:08
|
Hi
I'm getting towards the end of adding checks for aligned and sized
allocation / deallocation checks (valid alignment values and that size /
aligment matches new and delete).
In C++ these checks are not a big deal as the size and alignment used
with new and delete are normally all dealt with by the compiler. The
upcoming C23 functions will be more dangerous as the arguments will be
user-provided.
One thing that is a bit thorny is what to do with MC_Chunk and the
alignment value that needs storing. At the moment I have
struct _MC_Chunk {
struct _MC_Chunk* next;
Addr data; // Address of the actual block.
SizeT szB : (sizeof(SizeT)*8)-2; // size 30 or 62 bits
MC_AllocKind allockind : 2; // 2 bits
// Added alignment here for now
SizeT alignB;
ExeContext* where[0]; // VLA 0 to 2 pointers
}
As it is that will add 1 word overhead per heap allocation.
Wee need at least 6 more bits for the alignment (if we store it as a
log2). Valgrind imposes a 16Mbyte limit on the alignment, so we need 24
bits if we store the raw value.
The choices that I see are
1. Take it on the chin and accept the extra overhead.
2. Make this 64-bit only. As far as I know the largest physical memory
space implemented in hardware is 52 bits (4 million gigabytes), meaning
we have 12 bits to spare. That's enough for the log2 value of the
alignment. 32-bit can't really afford to have more bits of size stolen.
That would mean some ugly code with lots of ifdefs.It would also mean
not being able to store the original alignment value (which in some
cases is not necessarily a power of 2).
3. Make this optional.
That would mean something like making a new union
union MC_Extra
SizeT alignB;
ExeContext* where;
};
And then replacing the last element
MC_extra extra[0];
'extra' would contain 0 to 3 elements, the first the alignment (if
enabled) and the next 0 to 2 elements as before for where[].
I would use --show-mismatched-frees to control adding the alignment and
turn it off by default.
I don't have a strong personal preference. 1 is the cleanest to
implement. 2 is hackey but somewhat pragmatic. 3 is a bit of a compromise.
A+
Paul
|
|
From: Nicholas N. <n.n...@gm...> - 2023-03-30 23:55:58
|
I'd do the simple thing.
BTW, will this actually cause increased memory consumption? I can't
remember the details of Valgrind allocator and whether it involves size
classes. Does an allocation of, say, 40 bytes get rounded up to 48 or
something like that? If so, then you might get this for free. (It probably
depends on how frequently the VLA at the end takes on different sizes.)
Also, if space really is tight here, I wonder if the `next` pointer could
be avoided somehow.
Nick
On Fri, 31 Mar 2023 at 07:37, Paul Floyd <pj...@wa...> wrote:
> Hi
>
> I'm getting towards the end of adding checks for aligned and sized
> allocation / deallocation checks (valid alignment values and that size /
> aligment matches new and delete).
>
> In C++ these checks are not a big deal as the size and alignment used
> with new and delete are normally all dealt with by the compiler. The
> upcoming C23 functions will be more dangerous as the arguments will be
> user-provided.
>
> One thing that is a bit thorny is what to do with MC_Chunk and the
> alignment value that needs storing. At the moment I have
>
> struct _MC_Chunk {
> struct _MC_Chunk* next;
> Addr data; // Address of the actual block.
> SizeT szB : (sizeof(SizeT)*8)-2; // size 30 or 62 bits
> MC_AllocKind allockind : 2; // 2 bits
>
>
> // Added alignment here for now
>
> SizeT alignB;
>
>
> ExeContext* where[0]; // VLA 0 to 2 pointers
> }
>
>
> As it is that will add 1 word overhead per heap allocation.
>
> Wee need at least 6 more bits for the alignment (if we store it as a
> log2). Valgrind imposes a 16Mbyte limit on the alignment, so we need 24
> bits if we store the raw value.
>
>
> The choices that I see are
>
> 1. Take it on the chin and accept the extra overhead.
>
> 2. Make this 64-bit only. As far as I know the largest physical memory
> space implemented in hardware is 52 bits (4 million gigabytes), meaning
> we have 12 bits to spare. That's enough for the log2 value of the
> alignment. 32-bit can't really afford to have more bits of size stolen.
>
> That would mean some ugly code with lots of ifdefs.It would also mean
> not being able to store the original alignment value (which in some
> cases is not necessarily a power of 2).
>
>
> 3. Make this optional.
>
> That would mean something like making a new union
>
> union MC_Extra
> SizeT alignB;
> ExeContext* where;
> };
>
> And then replacing the last element
>
> MC_extra extra[0];
>
> 'extra' would contain 0 to 3 elements, the first the alignment (if
> enabled) and the next 0 to 2 elements as before for where[].
>
> I would use --show-mismatched-frees to control adding the alignment and
> turn it off by default.
>
>
> I don't have a strong personal preference. 1 is the cleanest to
> implement. 2 is hackey but somewhat pragmatic. 3 is a bit of a compromise.
>
> A+
> Paul
>
>
> _______________________________________________
> Valgrind-developers mailing list
> Val...@li...
> https://lists.sourceforge.net/lists/listinfo/valgrind-developers
>
|
|
From: Julian S. <jse...@gm...> - 2023-03-31 07:32:32
|
On 31/03/2023 01:55, Nicholas Nethercote wrote: > I'd do the simple thing. I agree. We already use so much memory that I imagine the extra overhead is close to being in the noise. If an application has 1 million blocks on the go, this is only going to add 8 MB to the overall space use -- perhaps a bit more or bit less, per Nick's comments about 40 vs 48 etc. But this is insignificant considering that (eg) the default translated code cache size is already several hundred megabytes. J |
|
From: Paul F. <pj...@wa...> - 2023-05-17 06:45:33
|
On 31-03-23 09:32, Julian Seward wrote: > On 31/03/2023 01:55, Nicholas Nethercote wrote: >> I'd do the simple thing. > > I agree. We already use so much memory that I imagine the extra overhead > is close to being in the noise. If an application has 1 million blocks on > the go, this is only going to add 8 MB to the overall space use -- perhaps > a bit more or bit less, per Nick's comments about 40 vs 48 etc. But > this is > insignificant considering that (eg) the default translated code cache size > is already several hundred megabytes. > > J > Hi I did a quick test, 10 million ints allocated. Without this patch 1636M, with this patch 1708M, or about 4.4% more. A bit less than I expected. A+ Paul |