From: John R. <jreiser@BitWagon.com> - 2007-12-04 23:31:47
Attachments:
ubd_kern.c.patch
Function do_io() in arch/um/drivers/ubd_kern.c can read uninitialized memory when scanning the .sector_mask. During startup the maximum .length is 64K (and has been observed), so 128 bits are needed [512-byte sectors]. The initialized .sector_mask has 32 bits and the initialized .cow_offset has 64, so the remaining 32 bits must come from .bitmap_words[0]; but .bitmap_words is not initialized by prepare_request().

If .fds[0]==.fds[1], as it is in early startup, then the net effect of scanning the uninitialized .bitmap_words[0] is "merely" randomness and a possible slowdown in I/O operations, which is better to avoid anyway.

Either call blk_queue_max_sectors() much earlier (and always) in order to restrict all transfers to at most 32 [or 96] sectors, or else apply the attached patch to clear .bitmap_words[0].

--
John Reiser, jreiser@BitWagon.com
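The bit counting in the report above can be checked with a few lines of C. This is only a sketch of the arithmetic, not code from ubd_kern.c: the helper names sectors_in() and uninit_bits_scanned() are made up for illustration, and the 32-bit and 64-bit field widths are the ones stated in the report.

```c
#include <stddef.h>

enum { SECTOR_SIZE = 512 };   /* 512-byte sectors, as in the report */

/* Hypothetical helper: number of sectors, and so of mask bits, that a
 * transfer of length_bytes covers (rounded up). */
static size_t sectors_in(size_t length_bytes)
{
    return (length_bytes + SECTOR_SIZE - 1) / SECTOR_SIZE;
}

/* Hypothetical helper: how many of the scanned bits fall beyond the
 * init_bits bits that are known to be initialized. */
static size_t uninit_bits_scanned(size_t length_bytes, size_t init_bits)
{
    size_t need = sectors_in(length_bytes);

    return need > init_bits ? need - init_bits : 0;
}
```

With a 64K transfer, sectors_in(64 * 1024) gives 128, and uninit_bits_scanned(64 * 1024, 32 + 64) gives 32, which is the 32 bits the report says must come from .bitmap_words[0]. Capping transfers at 96 sectors (or 32, if only .sector_mask is counted) brings the shortfall to zero, which is the blk_queue_max_sectors() alternative.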
From: Jeff D. <jd...@ad...> - 2007-12-05 00:22:30
Quite aside from the actual bug (which I haven't looked at yet), I note this:

> + * 2007-12-04 jreiser (valgrind/memcheck)

Have you actually got UML running under valgrind?

Jeff

--
Work email - jdike at linux dot intel dot com
From: John R. <jreiser@BitWagon.com> - 2007-12-05 00:46:36
> Have you actually got UML running under valgrind?

It's up to the point of user mode for process 1; see log below. That means much of initialization, annotating the slab allocator (kmalloc) enough to "hide" the allocation structures from outside access but allow the allocator to reference them itself, and enough changes to valgrind to tolerate more forms of clone() and to handle switching stacks via longjmp(). I sent my changes for valgrind-3.2.3 to valgrind-developers last weekend.

----- console transcript running UML under memcheck
[snip]
TCP cubic registered
NET: Registered protocol family 1
NET: Registered protocol family 17
Initialized stdio console driver
Console initialized on /dev/tty0
Initializing software serial port version 1
**24577** new_thread_handler fn=0x807FD8C arg=0x822AF48
[snip: panic due to SIGSEGV seen from wait_stub_done()]
-----

--
John Reiser, jreiser@BitWagon.com
From: Jeff D. <jd...@ad...> - 2007-12-05 02:17:23
On Tue, Dec 04, 2007 at 04:46:36PM -0800, John Reiser wrote:
> It's up to the point of user mode for process 1; see log below.
> That means much of initialization, annotating the slab allocator
> (kmalloc) enough to "hide" the allocation structures from outside access
> but allow the allocator to reference them itself, and enough changes
> to valgrind to tolerate more forms of clone() and to handle switching
> stacks via longjmp(). I sent my changes for valgrind-3.2.3 to
> valgrind-developers last weekend.

Neato, that's further than I ever managed.

Jeff

--
Work email - jdike at linux dot intel dot com
From: Jeff D. <jd...@ad...> - 2007-12-06 02:10:25
On Tue, Dec 04, 2007 at 04:46:36PM -0800, John Reiser wrote:
> and enough changes
> to valgrind to tolerate more forms of clone()

I was going to ask whether you were treating the clone that creates the userspace process differently from other clones, but I guess the bit below answers that question:

> **24577** new_thread_handler fn=0x807FD8C arg=0x822AF48
> [snip: panic due to SIGSEGV seen from wait_stub_done()]

And I guess it's not treated differently and valgrind continues grinding the child, and it craps out when it unmaps everything in the child address space except the stubs.

Whether you want the userspace process to escape valgrind or not depends on whether you want full-system analysis (which some people do, and cachegrind in particular would be very interesting on a full system) or whether you want to smoke out kernel bugs (which is my main interest here).

It'll be easier to let the userspace child escape - invent an annotation for "don't follow this clone".

Longer-term, following this clone and allowing full-system analysis would require some thought. The first problem, of course, is that about the first thing the stub does in the new process is unmap everything but itself, including valgrind.

Jeff

--
Work email - jdike at linux dot intel dot com
From: John R. <jreiser@BitWagon.com> - 2007-12-06 04:37:31
Jeff Dike wrote:
> It'll be easier to let the userspace child escape - invent an
> annotation for "don't follow this clone".

Thanks for the tip. I was getting stuck trying to figure out the ptrace() shenanigans involved with the stub [skas0 mode] that turns SIGSEGV into SIGUSR1. Valgrind has its own ideas about what should happen with signals, and the ptracing gets complicated.

> Longer-term, following this clone and allowing full-system analysis
> would require some thought. The first problem, of course, is that
> about the first the stub does in the new process is unmap everything
> but itself, including valgrind.

The very first thing that the skas0 stub does is field SIGSEGV because the page which contains the entry point of the PT_INTERP of /sbin/init is not present. The signal handler in the stub forwards this SIGSEGV to uml as SIGUSR1, and the ptrace()ing code inside uml "understands" what the child is doing. Not so when valgrind (or any other virtualization) is involved. Perhaps valgrind could recognize the skas0 stub (SIGSEGV.sa_handler >= 0xbffe0000, etc.) and adapt, or perhaps both valgrind and uml should cooperate here. I'm still puzzling over this one.

--
John Reiser, jreiser@BitWagon.com
From: Jeff D. <jd...@ad...> - 2007-12-06 05:01:30
On Wed, Dec 05, 2007 at 08:37:30PM -0800, John Reiser wrote:
> Thanks for the tip. I was getting stuck trying to figure out the
> ptrace() shenanigans involved with the stub [skas0 mode] that turns
> SIGSEGV into SIGUSR1. Valgrind has its own ideas about what
> should happen with signals, and the ptracing gets complicated.

I think if you just don't follow that clone, things will be better. However, since you ask, what happens is this:

	process (at startup, in userspace_tramp) sets the handler in the stub as its SIGSEGV handler
	process accesses memory that hasn't been mapped and segfaults
	UML kernel sees SIGSEGV and allows it to be delivered
	stub SIGSEGV handler reads page fault information out of its sigcontext struct and puts it someplace the UML kernel can find
	handler sends itself a SIGUSR1
	SIGUSR1 is masked in this handler, so it's delivered right after the sigreturn
	UML kernel sees SIGUSR1 and knows the page fault data is available

> The very first thing that the skas0 stub does is field SIGSEGV because
> the page which contains the entry point of the PT_INTERP of /sbin/init
> is not present.

Right, that's the first page fault.

> The signal handler in the stub forwards this SIGSEGV
> to uml as SIGUSR1, and the ptrace()ing code inside uml "understands"
> what the child is doing.

See above, it's not really describable as forwarding.

> Perhaps valgrind could recognize the
> skas0 stub (SIGSEGV.sa_handler >= 0xbffe0000, etc.) and adapt,
> or perhaps both valgrind and uml should cooperate here. I'm still
> puzzling over this one.

I would just let this clone escape - things should just work then.

Jeff

--
Work email - jdike at linux dot intel dot com
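The ordering described in the message above (SIGUSR1 raised inside the SIGSEGV handler, but delivered only after that handler returns) can be reproduced in an ordinary process. The following is a rough sketch under stated assumptions, not the UML stub: there is no ptrace() and no page fault data, raise(SIGSEGV) stands in for a real fault, and the names run_demo(), log_event(), on_segv() and on_usr1() are invented for the illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Event log: 'S' = SIGSEGV handler entered, 's' = SIGSEGV handler about
 * to return, 'U' = SIGUSR1 handler ran. */
static volatile sig_atomic_t events[4];
static volatile sig_atomic_t n_events;

static void log_event(int c)
{
    if (n_events < 4)
        events[n_events++] = c;
}

static void on_usr1(int sig)
{
    (void)sig;
    log_event('U');
}

static void on_segv(int sig)
{
    (void)sig;
    log_event('S');
    /* The real stub would save page fault data from its sigcontext here. */
    raise(SIGUSR1);   /* blocked by sa_mask, so it only becomes pending */
    log_event('s');
}

/* Installs both handlers, triggers the sequence once, and stores the
 * observed order in out[] as a string. Returns 0 on success. */
static int run_demo(char out[4])
{
    struct sigaction sa;
    sigset_t unblock;

    /* Make sure neither signal is blocked on entry. */
    sigemptyset(&unblock);
    sigaddset(&unblock, SIGUSR1);
    sigaddset(&unblock, SIGSEGV);
    if (sigprocmask(SIG_UNBLOCK, &unblock, NULL) != 0)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    sa.sa_handler = on_segv;
    sigemptyset(&sa.sa_mask);
    sigaddset(&sa.sa_mask, SIGUSR1);   /* mask SIGUSR1 inside on_segv() */
    if (sigaction(SIGSEGV, &sa, NULL) != 0)
        return -1;

    n_events = 0;
    raise(SIGSEGV);   /* stand-in for touching an unmapped page */

    for (int i = 0; i < 3; i++)
        out[i] = (char)events[i];
    out[3] = '\0';
    return n_events == 3 ? 0 : -1;
}
```

Calling run_demo() should leave the order "SsU" in out[]: the SIGUSR1 handler runs only once the SIGSEGV handler has returned, which is the point at which the UML kernel, watching via ptrace(), would see SIGUSR1 and know the fault data is ready.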