|
From: Florian K. <fl...@ei...> - 2015-02-02 18:16:42
|
Consider this snippet from m_addrinfo.c:
const NSegment *seg = VG_(am_find_nsegment) (a);
...
if (seg->kind == SkFileC)
ai->Addr.SegmentKind.filename
= VG_(strdup)("mc.da.skfname", VG_(am_get_filename)(seg));
ai->Addr.SegmentKind.hasR = seg->hasR;
ai->Addr.SegmentKind.hasW = seg->hasW;
ai->Addr.SegmentKind.hasX = seg->hasX;
There is nothing wrong here. It is clear, though, that the code
assumes that strdup does not modify the memory pointed to by 'seg'.
But that is not necessarily true. The problem occurs when there is
no memory available to store the string. In that case we may get the
following call chain:
- strdup
- arena_malloc
- newSuperblock
- am_mmap_anon_float_valgrind
- add_segment
- split_nsegments_lo_and_hi
- split_nsegment_at
The last function in this chain may do this:
for (j = nsegments_used-1; j > i; j--)
nsegments[j+1] = nsegments[j];
So... with 'seg' being a pointer into the nsegments array it means that
after the strdup the contents of the memory pointed to by seg
may have been changed by above loop.
Not so good.
The pointers to NSegment we return and deal with in the various
externally visible aspacemgr functions cannot be pointers into the
sorted array of nsegments if its contents gets moved around.
I'm currently thinking of adding
NSegment *sorted_nsegments[VG_N_SEGMENTS]
as a fix. Basically, introducing another level of indirection. But I
haven't given it too much thought yet and an open for ideas.
I won't be getting to fixing this until sometime next week.
Florian
|
|
From: Julian S. <js...@ac...> - 2015-02-03 13:46:38
|
On 02/02/15 19:16, Florian Krohm wrote:
> Consider this snippet from m_addrinfo.c:
>
> const NSegment *seg = VG_(am_find_nsegment) (a);
> ...
> if (seg->kind == SkFileC)
> ai->Addr.SegmentKind.filename
> = VG_(strdup)("mc.da.skfname", VG_(am_get_filename)(seg));
> ai->Addr.SegmentKind.hasR = seg->hasR;
> ai->Addr.SegmentKind.hasW = seg->hasW;
> ai->Addr.SegmentKind.hasX = seg->hasX;
The basic problem is that any dynamic memory allocation potentially
invalidates any NSegment* that you might have. I don't know how we
can mechanically (compile time, or run-time assertions etc) avoid that.
> I'm currently thinking of adding
> NSegment *sorted_nsegments[VG_N_SEGMENTS]
> as a fix. Basically, introducing another level of indirection.
Can you say how that would help?
In this case -- considering that we are trying to describe a file segment
that belongs to the client (a SkFileC) it can't be the case that calling
VG_(malloc) makes the segment become invalid, because at worst, dynamic
memory allocation will cause expansion or merging of anonymous segments
that belong to V (SkAnonV's). So we're OK. But I agree, it's pretty
fragile. Maybe we should split NSegment* into two different types, one for
client segments, one for V segments? Or something like that?
J
|
|
From: Florian K. <fl...@ei...> - 2015-02-03 17:46:22
|
On 03.02.2015 14:46, Julian Seward wrote:
>
>> I'm currently thinking of adding
>> NSegment *sorted_nsegments[VG_N_SEGMENTS]
>> as a fix. Basically, introducing another level of indirection.
>
> Can you say how that would help?
sorted_nsegments is an array of pointers to NSegment. When iterated over
the segments appear sorted by address (like nsegments today).
Moving elements in sorted_nsegments around is OK, as it is just pointers
to NSegment. We still would have the nsegments array but its elements
would never get moved around.
But this would not help with merging segments. So forget about it.
> Maybe we should split NSegment* into two different types, one for
> client segments, one for V segments? Or something like that?
I'd say, let's make the definition of NSegment private to m_aspacemgr.
If we can do this without introducing new issues, when we would have
eliminated the problem.
I have not looked at all instances where NSegment occurs in the source
code but for those I have looked at all that is needed is some kind of
wrapper function. For instance in m_redir.c
static Bool is_plausible_guest_addr(Addr a)
{
NSegment const* seg = VG_(am_find_nsegment)(a);
return seg != NULL
&& (seg->kind == SkAnonC || seg->kind == SkFileC)
&& (seg->hasX || seg->hasR);
}
could be replaced with (sketch):
return VG_(am_find_segment_with_properties)(a, AnonC | FileC,
READ | EXE);
I'm sure there are spots where it is more complex.
Florian
|
|
From: Julian S. <js...@ac...> - 2015-02-04 11:08:59
|
On 03/02/15 18:46, Florian Krohm wrote:
> static Bool is_plausible_guest_addr(Addr a)
> {
> NSegment const* seg = VG_(am_find_nsegment)(a);
> return seg != NULL
> && (seg->kind == SkAnonC || seg->kind == SkFileC)
> && (seg->hasX || seg->hasR);
> }
>
> could be replaced with (sketch):
>
> return VG_(am_find_segment_with_properties)(a, AnonC | FileC,
> READ | EXE);
You could do that, but this example isn't one of the problematic ones.
The problematic ones have the form
seg = VG_(am_find_nsegment)(...)
do dynamic memory allocation
assume seg is still valid
so your proposal would tidy up the NSegment interface a bit, but I don't
think it addresses the underlying issue. But I could be misunderstanding
what you intend.
J
|
|
From: Philippe W. <phi...@sk...> - 2015-02-04 20:42:40
|
On Wed, 2015-02-04 at 12:08 +0100, Julian Seward wrote:
> On 03/02/15 18:46, Florian Krohm wrote:
>
> > static Bool is_plausible_guest_addr(Addr a)
> > {
> > NSegment const* seg = VG_(am_find_nsegment)(a);
> > return seg != NULL
> > && (seg->kind == SkAnonC || seg->kind == SkFileC)
> > && (seg->hasX || seg->hasR);
> > }
> >
> > could be replaced with (sketch):
> >
> > return VG_(am_find_segment_with_properties)(a, AnonC | FileC,
> > READ | EXE);
>
> You could do that, but this example isn't one of the problematic ones.
> The problematic ones have the form
>
> seg = VG_(am_find_nsegment)(...)
> do dynamic memory allocation
> assume seg is still valid
>
> so your proposal would tidy up the NSegment interface a bit, but I don't
> think it addresses the underlying issue. But I could be misunderstanding
> what you intend.
In fact, anything such as:
1. query the aspace mgr to get some information
2. do anything that potentially changes the segments (such as VG_(malloc))
3. assume the information is still valid
is potentially wrong.
Getting a NSegment in 1. is one case, but if the interface does not
return NSegment, it might be wrong, e.g.
1. check address is not mapped: mapped = VG_(am_find_segment)(address) == NULL;
2. VG_(malloc)
3. do anything based on the assumption that address is not mapped
A possible approach might be:
* tidy up the aspacemgr interface, as proposed by Florian.
i.e. have more functions that do 'all the needed work/conditions'
such as VG_(am_find_segment_with_properties)
(which should rather be named: VG_(is_address_in_a_segment_with_properties)
* for algorithms/places that really need a 'stable' aspacemgr state,
ensure that the aspacemgr is 'frozen' : while it is frozen, any attempt
to change the list of segments will assert, so as to detect the bug(s).
This can eitehr be done explicitely, with
void VG_(am_freeze)(); /// +1 in the freezing level
void VG_(am_unfreeze)() /// -1 in the freezing level
Bool VG_(am_assert_if_frozen)() /// asserts if freezing level != 0
or implictely:
any function returning an NSegment increases the freezing level
the caller must 'give back' all NSegment* received
(this obliges all places that are getting an NSegment to carefully
'return' each NSegment* to the aspacemgr
With the above, if a segment is merged or split while am is frozen,
an assert will automatically fail in aspacemgr.
At relevant places, we can insert calls to VG_(am_assert_if_frozen)
e.g. at the entry of VG_(malloc) : we will detect a bug even if
no segment is really split.
Philippe
|
|
From: Florian K. <fl...@ei...> - 2015-02-04 20:45:00
|
On 04.02.2015 12:08, Julian Seward wrote: > The problematic ones have the form > > seg = VG_(am_find_nsegment)(...) > do dynamic memory allocation > assume seg is still valid > > so your proposal would tidy up the NSegment interface a bit, but I don't > think it addresses the underlying issue. But I could be misunderstanding > what you intend. Let me quote what I said: ...let's make the definition of NSegment private to m_aspacemgr. If we can do this without introducing new issues, when we would have eliminated the problem. In other words, there will be no pointers to NSegment available outside m_aspacemgr. This will require rewriting some code like the snippet above, introducing new functions, moving stuff around, etc.. Florian |
|
From: Philippe W. <phi...@sk...> - 2015-02-04 21:03:43
|
On Wed, 2015-02-04 at 21:44 +0100, Florian Krohm wrote: > On 04.02.2015 12:08, Julian Seward wrote: > Let me quote what I said: > > ...let's make the definition of NSegment private to m_aspacemgr. > If we can do this without introducing new issues, when we would have > eliminated the problem. Having NSegment* being returned is for sure not helping. But if what is needed is a 'transaction' on the aspacemgr data, then either this transaction is provided by a single call to an aspacemgr function, or alternatively, the 'frozen/unfrozen" technique will allow to detect the not expected aspacemgr state change between calls that are supposed to be in a transaction Philippe |
|
From: Julian S. <js...@ac...> - 2015-02-05 09:33:16
|
On 04/02/15 21:44, Florian Krohm wrote:
> This will require rewriting some code like the snippet above,
> introducing new functions, moving stuff around, etc..
To stick with the original example, which I think was "get me, in malloc'd
memory, a copy of the file name associated with the segment in which this
address 'a' falls", the possibilities are
>> seg = VG_(am_find_nsegment)(...)
>> do dynamic memory allocation
>> assume seg is still valid
in which case we have the danger that seg becomes invalid due to the dynamic
memory allocation,
or you could try to push it in-line, by creating a function exported by
m_aspacemgr to do all of the above. But then that can't work because
m_aspacemgr can't do dynamic memory allocation.
<a bit of background, apologies if you know this already>
This difficulty comes from ancient history of the project (around version 1)
in which it would sometimes, apparently randomly depending on exact details
of the memory layout, segfault at startup, because m_aspacemgr and
m_mallocfree were mutually dependent. Eventually (after much effort)
m_aspacemgr was implemented so as to be almost entirely standalone, so as to
guarantee that the cycle was broken, but the downside is that you can't use
dynamic memory allocation inside it.
m_aspacemgr is essentially responsible for maintaining a data structure
that describes, as accurately as possible, the kernel's view of memory
layout of the process. And to some extent controls it by inspecting
mmap calls. The difficulty comes if m_aspacemgr's internal actions cause
the mapping itself to change.
Hence the comments at aspacemgr-linux.c:37 and priv_aspacemgr.h:35 and :63.
</end background>
This is why I was thinking along the lines of splitting NSegment* into two
different but related types, one of which has the special property that it
remains valid across dynamic memory allocation, and the other that is the
same as the current NSegment*. I say that because this sequence
seg = VG_(am_find_nsegment)(addr)
if (seg is 'file belonging to the client' (SkAnonC))
str = VG_(strdup)( seg->filename )
is in fact harmless, because doing dynamic memory allocation can only
increase or merge anonymous segments belonging to V. But it can't possibly
cause the client's files to be unmapped -- that would be insane -- so 'seg'
here is effectively one of the special not-volatile-across-dynamic-mem-alloc
variants. Hence I am suggesting to distinguish these by giving them a
different type in the sources (or something like that.)
J
|