Thread: [uml-devel] reading uninit memory in do_io (ubd_kern.c)



[uml-devel] reading uninit memory in do_io (ubd_kern.c)

From: John R. <jreiser@BitWagon.com> - 2007-12-04 23:31:47

Function do_io() in arch/um/drivers/ubd_kern.c can read uninitialized memory
when scanning the .sector_mask.  During startup the maximum .length
is 64K (and has been observed), so 128 bits are needed (one per 512-byte sector).
Initialized .sector_mask has 32 bits, and initialized .cow_offset has 64,
so 32 bits must come from .bitmap_words[0]; but .bitmap_words is not
initialized by prepare_request().
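
The bit count above can be checked with a small C sketch.  The widths
(32-bit .sector_mask, 64-bit .cow_offset) are the ones stated in this
report for a 32-bit build and are hard-coded here rather than taken from
sizeof; the constant and function names are illustrative only and do not
come from ubd_kern.c.

```c
#include <assert.h>

enum {
    SECTOR_BYTES     = 512,        /* sector size used in the report */
    MAX_LENGTH_BYTES = 64 * 1024,  /* largest .length observed at startup */
    SECTOR_MASK_BITS = 32,         /* initialized .sector_mask (per report) */
    COW_OFFSET_BITS  = 64          /* initialized .cow_offset (per report) */
};

/* One bit is scanned per sector of the transfer. */
unsigned bits_needed(unsigned long length_bytes)
{
    return (unsigned)((length_bytes + SECTOR_BYTES - 1) / SECTOR_BYTES);
}

/* Bits that must come from memory beyond the two initialized fields. */
unsigned bits_from_uninitialized(unsigned long length_bytes)
{
    unsigned need = bits_needed(length_bytes);
    unsigned have = SECTOR_MASK_BITS + COW_OFFSET_BITS;

    return need > have ? need - have : 0;
}

/* Reproduces the numbers quoted in the report. */
void check_against_report(void)
{
    assert(bits_needed(MAX_LENGTH_BYTES) == 128);
    assert(bits_from_uninitialized(MAX_LENGTH_BYTES) == 32);
}
```

A transfer of at most 96 sectors stays within the 96 initialized bits,
so bits_from_uninitialized(96 * 512) is 0.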

If .fds[0]==.fds[1], as it is in early startup, then the net effect
of scanning the uninit .bitmap_words[0] is "merely" randomness and a
possible slowdown in I/O operations, which is better avoided anyway.

Either call blk_queue_max_sectors() much earlier (and always) in order
to restrict all transfers to at most 32 [or 96] sectors, or else apply
the attached patch to clear .bitmap_words[0].

-- 
John Reiser, jreiser@BitWagon.com

Re: [uml-devel] reading uninit memory in do_io (ubd_kern.c)

From: Jeff D. <jd...@ad...> - 2007-12-05 00:22:30

Quite aside from the actual bug (which I haven't looked at yet), I note this:

> + * 2007-12-04 jreiser (valgrind/memcheck)

Have you actually got UML running under valgrind?

				Jeff

-- 
Work email - jdike at linux dot intel dot com

Re: [uml-devel] reading uninit memory in do_io (ubd_kern.c)

From: John R. <jreiser@BitWagon.com> - 2007-12-05 00:46:36

> Have you actually got UML running under valgrind?

It's up to the point of user mode for process 1; see log below.
That means much of initialization, annotating the slab allocator
(kmalloc) enough to "hide" the allocation structures from outside access
but allow the allocator to reference them itself, and enough changes
to valgrind to tolerate more forms of clone() and to handle switching
stacks via longjmp().  I sent my changes for valgrind-3.2.3 to
valgrind-developers last weekend.

----- console transcript running UML under memcheck
  [snip]
TCP cubic registered
NET: Registered protocol family 1
NET: Registered protocol family 17
Initialized stdio console driver
Console initialized on /dev/tty0
Initializing software serial port version 1
**24577** new_thread_handler  fn=0x807FD8C  arg=0x822AF48
  [snip: panic due to SIGSEGV seen from wait_stub_done()]
-----

-- 
John Reiser, jreiser@BitWagon.com

Re: [uml-devel] reading uninit memory in do_io (ubd_kern.c)

From: Jeff D. <jd...@ad...> - 2007-12-05 02:17:23

On Tue, Dec 04, 2007 at 04:46:36PM -0800, John Reiser wrote:
> It's up to the point of user mode for process 1; see log below.
> That means much of initialization, annotating the slab allocator
> (kmalloc) enough to "hide" the allocation structures from outside access
> but allow the allocator to reference them itself, and enough changes
> to valgrind to tolerate more forms of clone() and to handle switching
> stacks via longjmp().  I sent my changes for valgrind-3.2.3 to
> valgrind-developers last weekend.

Neato, that's further than I ever managed.

				Jeff

-- 
Work email - jdike at linux dot intel dot com

Re: [uml-devel] reading uninit memory in do_io (ubd_kern.c)

From: Jeff D. <jd...@ad...> - 2007-12-06 02:10:25

On Tue, Dec 04, 2007 at 04:46:36PM -0800, John Reiser wrote:
> and enough changes
> to valgrind to tolerate more forms of clone() 

I was going to ask whether you were treating the clone that creates
the userspace process differently from other clones, but I guess the
bit below answers that question:

> **24577** new_thread_handler  fn=0x807FD8C  arg=0x822AF48
>   [snip: panic due to SIGSEGV seen from wait_stub_done()]

And I guess it's not treated differently and valgrind continues
grinding the child, and it craps out when it unmaps everything in the
child address space except the stubs.

Whether you want the userspace process to escape valgrind or not
depends on whether you want full-system analysis (which some people
do, and cachegrind in particular would be very interesting on a full
system) or whether you want to smoke out kernel bugs (which is my main
interest here).

It'll be easier to let the userspace child escape - invent an
annotation for "don't follow this clone".

Longer-term, following this clone and allowing full-system analysis
would require some thought.  The first problem, of course, is that
about the first thing the stub does in the new process is unmap
everything but itself, including valgrind.

				Jeff

-- 
Work email - jdike at linux dot intel dot com

Re: [uml-devel] reading uninit memory in do_io (ubd_kern.c)

From: John R. <jreiser@BitWagon.com> - 2007-12-06 04:37:31

Jeff Dike wrote:
> It'll be easier to let the userspace child escape - invent an
> annotation for "don't follow this clone".

Thanks for the tip.  I was getting stuck trying to figure out the
ptrace() shenanigans involved with the stub [skas0 mode] that turns
SIGSEGV into SIGUSR1.  Valgrind has its own ideas about what
should happen with signals, and the ptracing gets complicated.

> Longer-term, following this clone and allowing full-system analysis
> would require some thought.  The first problem, of course, is that
> about the first thing the stub does in the new process is unmap everything
> but itself, including valgrind.

The very first thing that the skas0 stub does is field SIGSEGV because
the page which contains the entry point of the PT_INTERP of /sbin/init
is not present.  The signal handler in the stub forwards this SIGSEGV
to uml as SIGUSR1, and the ptrace()ing code inside uml "understands"
what the child is doing.  Not so when valgrind (or any other
virtualization) is involved.  Perhaps valgrind could recognize the
skas0 stub (SIGSEGV.sa_handler >= 0xbffe0000, etc.) and adapt,
or perhaps both valgrind and uml should cooperate here.  I'm still
puzzling over this one.

-- 
John Reiser, jreiser@BitWagon.com

Re: [uml-devel] reading uninit memory in do_io (ubd_kern.c)

From: Jeff D. <jd...@ad...> - 2007-12-06 05:01:30

On Wed, Dec 05, 2007 at 08:37:30PM -0800, John Reiser wrote:
> Thanks for the tip.  I was getting stuck trying to figure out the
> ptrace() shenanigans involved with the stub [skas0 mode] that turns
> SIGSEGV into SIGUSR1.  Valgrind has its own ideas about what
> should happen with signals, and the ptracing gets complicated.

I think if you just don't follow that clone, things will be better.
However, since you ask, what happens is this:
	1. process (at startup, in userspace_tramp) sets the handler in
	   the stub as its SIGSEGV handler
	2. process accesses memory that hasn't been mapped and segfaults
	3. UML kernel sees SIGSEGV and allows it to be delivered
	4. stub SIGSEGV handler reads page fault information out of its
	   sigcontext struct and puts it someplace the UML kernel can find
	5. handler sends itself a SIGUSR1
	6. SIGUSR1 is masked in this handler, so it's delivered right
	   after the sigreturn
	7. UML kernel sees SIGUSR1 and knows the page fault data is
	   available
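
The ordering in that sequence (SIGSEGV handler runs, queues a SIGUSR1
that is blocked for the duration of the handler, and the SIGUSR1 arrives
right after the sigreturn) can be tried out in an ordinary single process.
What follows is only a sketch and not the skas0 stub: there is no ptrace
and no separate UML kernel, so the SIGUSR1 handler here stands in for the
kernel side and repairs the mapping itself with mprotect().  Calling
mprotect() from a signal handler is not guaranteed async-signal-safe by
POSIX; it is assumed to be acceptable on Linux for a demonstration.  All
names (on_segv, on_usr1, fault_addr, demo_fault_roundtrip) are invented
for this example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static void *volatile fault_addr;           /* where the handler leaves fault info */
static volatile sig_atomic_t segv_count;
static volatile sig_atomic_t usr1_after_segv;
static long page_size;

/* Record the fault address, then signal ourselves.  SIGUSR1 is in
 * sa_mask for this handler, so it stays pending until we return. */
static void on_segv(int sig, siginfo_t *info, void *ctx)
{
    (void)sig; (void)ctx;
    if (segv_count++ > 0)
        _exit(2);                           /* fault was not repaired: bail out */
    fault_addr = info->si_addr;
    kill(getpid(), SIGUSR1);
}

/* Stand-in for the UML kernel: the fault data is available, fix the page. */
static void on_usr1(int sig)
{
    uintptr_t page;

    (void)sig;
    if (segv_count != 1)
        return;
    usr1_after_segv = 1;
    page = (uintptr_t)fault_addr & ~((uintptr_t)page_size - 1);
    mprotect((void *)page, (size_t)page_size, PROT_READ | PROT_WRITE);
}

/* Returns 1 if the SIGSEGV -> SIGUSR1 -> retry round trip worked. */
int demo_fault_roundtrip(void)
{
    struct sigaction segv, usr1, old_segv, old_usr1;
    volatile char *p;
    int ok;

    segv_count = 0;
    usr1_after_segv = 0;
    fault_addr = NULL;
    page_size = sysconf(_SC_PAGESIZE);
    assert(page_size > 0);

    /* A page with no access rights: the first touch will fault. */
    p = mmap(NULL, (size_t)page_size, PROT_NONE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return 0;

    memset(&segv, 0, sizeof segv);
    segv.sa_sigaction = on_segv;
    segv.sa_flags = SA_SIGINFO;
    sigemptyset(&segv.sa_mask);
    sigaddset(&segv.sa_mask, SIGUSR1);      /* keep SIGUSR1 blocked in on_segv */

    memset(&usr1, 0, sizeof usr1);
    usr1.sa_handler = on_usr1;
    sigemptyset(&usr1.sa_mask);

    if (sigaction(SIGSEGV, &segv, &old_segv) != 0 ||
        sigaction(SIGUSR1, &usr1, &old_usr1) != 0)
        return 0;

    *p = 42;                                /* faults once, then is retried */

    ok = usr1_after_segv && fault_addr == (void *)p && *p == 42;

    sigaction(SIGSEGV, &old_segv, NULL);    /* restore previous handlers */
    sigaction(SIGUSR1, &old_usr1, NULL);
    munmap((void *)p, (size_t)page_size);
    return ok;
}
```

In the real skas0 arrangement the process is ptraced, so the UML kernel
observes both signals from outside; that is the part a tool like valgrind
interposes on, and it is not modelled here.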

> The very first thing that the skas0 stub does is field SIGSEGV because
> the page which contains the entry point of the PT_INTERP of /sbin/init
> is not present.

Right, that's the first page fault.

> The signal handler in the stub forwards this SIGSEGV
> to uml as SIGUSR1, and the ptrace()ing code inside uml "understands"
> what the child is doing.

See above, it's not really describable as forwarding.

> Perhaps valgrind could recognize the
> skas0 stub (SIGSEGV.sa_handler >= 0xbffe0000, etc.) and adapt,
> or perhaps both valgrind and uml should cooperate here.  I'm still
> puzzling over this one.

I would just let this clone escape - things should just work then.

				Jeff

-- 
Work email - jdike at linux dot intel dot com