From: Michael <le...@nt...> - 2002-05-08 18:36:54
Hello,

First a bit of unashamed gushing about how good uml is, I've just got around to playing with it today and "it rocks" :o) That should +1 the counter.

I'm having a problem running the current debian unstable version (looks pretty recent, if not the latest) with jail getting an error I've seen in the list archives, so apologies if this is really done and dusted.

Kernel panic: protect_vm_page : protect failed, errno = -12

I traced this to devfs_alloc_unique_number, which calls vmalloc and gets back 0xa3000000 as an address, when it does the bits memset in that function, it gets a sig 11 (and the signal processing then fails unprotecting the memory as well, giving the panic) gdb says it can't access 0xa3000000 as well.

Just to confuse things, there has been a bug in devfs in the version of the kernel I'm using in that function caused by the memset, but I believe I've patched that - besides it seems more that the vmalloc return value is b0rked rather than the amount of data being set.

Kernel version of uml
Linux usermode 2.4.18-22um #5 Wed May 8 17:39:52 BST 2002 i686 unknown

#0 panic (fmt=0xa0125f40 "protect_vm_page : protect failed, errno = %d\n") at panic.c:52
#1 0xa00bdde6 in protect_vm_page (addr=2734686208, w=1, must_succeed=0) at tlb.c:145
#2 0xa00bde44 in mprotect_kernel_vm (w=1) at tlb.c:162
#3 0xa00c13c1 in mprotect_kernel_mem (w=1, delay_signals=0) at process_kern.c:652
#4 0xa00c140f in unprotect_kernel_mem (delay_signals=0) at process_kern.c:663
#5 0xa00bf2cf in sig_handler (sig=11, sc= {gs = 0, __gsh = 0, fs = 0, __fsh = 0, es = 43, __esh = 0, ds = 43, __dsh = 0, edi = 2734686208, esi = 0, ebp = 2693347564, esp = 2693347524, ebx = 128, edx = 16, ecx = 4, eax = 0, trapno = 14, err = 6, eip = 2684746593, cs = 35, __csh = 0, eflags = 66066, esp_at_signal = 2693347524, ss = 43, __ssh = 0, fpstate = 0x0, oldmask = 134283264, cr2 = 2734686208}) at trap_user.c:462
#6 <signal handler called>
#7 0xa005fb61 in devfs_alloc_unique_number (space=0xa014af80)

(gdb) info line *2684746593
Line 411 of "/home/michael/user-mode-linux-2.4.18.22um/kernel-source-2.4.18/include/asm/arch/string.h" starts at address 0xa005fb59 <devfs_alloc_unique_number+257> and ends at 0xa005fb70 <devfs_alloc_unique_number+280>.
(gdb) info sym 2684746593
devfs_alloc_unique_number + 265 in section .text

TIA, If you need anything else, just shout...

--
Michael.
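[Editorial aside] gdb prints the sigcontext fields in the backtrace above in decimal, which hides how they relate to the addresses in the message. The sketch below converts them to hex and decodes `err = 6` using the standard x86 page-fault error-code bits. The function names `describe_pf_err` and `print_trace_values` are made up for this illustration and are not part of UML or gdb.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* x86 page-fault error code bits (reported alongside trapno 14). */
#define PF_PRESENT 0x1u /* 0 = page not present, 1 = protection violation */
#define PF_WRITE   0x2u /* 0 = read access,      1 = write access */
#define PF_USER    0x4u /* 0 = supervisor mode,  1 = user mode */

/* Hypothetical helper: describe a page-fault error code in words. */
static const char *describe_pf_err(uint32_t err, char *buf, size_t len)
{
    snprintf(buf, len, "%s, %s, %s",
             (err & PF_PRESENT) ? "protection violation" : "page not present",
             (err & PF_WRITE) ? "write" : "read",
             (err & PF_USER) ? "user mode" : "supervisor mode");
    return buf;
}

/* Hypothetical helper: print the decimal values from the trace as hex. */
static void print_trace_values(void)
{
    char buf[64];

    /* Sanity-check the conversions against the addresses in the message. */
    assert(2734686208u == 0xa3000000u);
    assert(2684746593u == 0xa005fb61u);

    printf("cr2/edi = 0x%08x\n", 2734686208u);
    printf("eip     = 0x%08x\n", 2684746593u);
    printf("err = 6 : %s\n", describe_pf_err(6, buf, sizeof buf));
}
```

Calling `print_trace_values()` from a `main` shows that `cr2` and `edi` are both 0xa3000000 (the vmalloc return value), that `eip` is 0xa005fb61 (frame #7), and that the fault was a user-mode write to a page that was not present.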
From: Matt Z. <md...@de...> - 2002-05-08 18:58:12
On Wed, May 08, 2002 at 07:36:48PM +0100, Michael wrote:
> I'm having a problem running the current debian unstable version (looks
> pretty recent, if not the latest)[...]

FYI, I uploaded 2.4.18-23um last night, and it should be in unstable later today. I don't know whether it will address your bug or not. Here are Jeff's changes from the announcement:

  a couple of small iomem bug fixes
  cleanups of the mconsole and hostaudio drivers
  compilation fixes in smp.c
  better handling of /proc/cpuinfo

--
 - mdz
From: Jeff D. <jd...@ka...> - 2002-05-08 20:33:07
le...@nt... said:
> Kernel panic: protect_vm_page : protect failed, errno = -12

This has been reported before, but I don't understand what is happening.

-12 == -ENOMEM, so somehow the host thinks it doesn't have enough memory to do the map (which is different from actually allocating that page, which happens later, when it's touched).

Basically, UML seems to be doing things right, but the host is returning -ENOMEM and messing things up.

The return from vmalloc is fine, BTW. What happens with virtual mappings is that vmalloc returns a pointer to an empty region of address space and the seg fault handler will put a page behind it when it is touched.

This was seen by David Coulson:
http://marc.theaimsgroup.com/?l=user-mode-linux-devel&m=101792519627117&w=2
and I never did get a good idea what was going on there.

From reading mmap, it seemed that it could fail this way only if the process already had an insane number (>65536 iirc) of maps, or the host was short of memory and a kmalloc failed. But neither seemed to be true for David.

Jeff
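[Editorial aside] The "pointer to an empty region, backed on first touch" behaviour Jeff describes can be imitated in an ordinary Linux process. The sketch below is only a user-space analogy and not UML's code: it reserves address space with `PROT_NONE`, and a SIGSEGV handler makes the faulting page accessible before the access is retried. The names `on_segv` and `demo_lazy_backing` are invented here. Calling `mprotect` from a signal handler works on Linux but is not on the POSIX async-signal-safe list, so treat this as a demonstration only.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS on glibc */
#endif
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static volatile sig_atomic_t fault_count; /* faults serviced so far */
static long page_size;

/* SIGSEGV handler: make the faulting page accessible, then return so the
 * faulting instruction is retried. */
static void on_segv(int sig, siginfo_t *info, void *ctx)
{
    uintptr_t page = (uintptr_t)info->si_addr & ~((uintptr_t)page_size - 1);

    (void)sig;
    (void)ctx;
    if (mprotect((void *)page, (size_t)page_size, PROT_READ | PROT_WRITE) != 0)
        _exit(1); /* cannot recover; avoid looping on the same fault */
    fault_count++;
}

/* Reserve npages of inaccessible address space, touch the first page twice,
 * and return how many faults were serviced (-1 on setup failure). */
static int demo_lazy_backing(size_t npages)
{
    struct sigaction sa, old;
    volatile char *region;
    size_t len;
    int faults;

    assert(npages > 0);
    page_size = sysconf(_SC_PAGESIZE);
    len = npages * (size_t)page_size;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) != 0)
        return -1;

    region = mmap(NULL, len, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (region == MAP_FAILED) {
        sigaction(SIGSEGV, &old, NULL);
        return -1;
    }

    fault_count = 0;
    region[0] = 1; /* first touch: faults, handler backs the page */
    region[1] = 2; /* same page again: no further fault expected */

    faults = (int)fault_count;
    munmap((void *)region, len);
    sigaction(SIGSEGV, &old, NULL); /* restore the previous handler */
    return faults;
}
```

On a Linux host, `demo_lazy_backing(4)` should report a single serviced fault. That is the same pattern as the sig 11 in the original report: the fault itself is expected, and the failure is in what the handler does next.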
From: Michael <le...@nt...> - 2002-05-08 21:15:08
On Wed, May 08, 2002 at 04:34:10PM -0400, Jeff Dike wrote:
> The return from vmalloc is fine, BTW. What happens with virtual mappings
> is that vmalloc returns a pointer to an empty region of address space
> and the seg fault handler will put a page behind it when it is touched.

Ah, light-bulb over head. The sig11 is supposed to happen and that's really the place for me to start debugging from...thanks, I'll try that.

> From reading mmap, it seemed that it could fail this way only if the process
> already had an insane number (>65536 iirc) of maps, or the host was short
> of memory and a kmalloc failed. But neither seemed to be true for David.

Nod, this is during boot, before much has happened at all, so I can't see it being a valid oom condition.

--
Michael.
From: Michael <le...@nt...> - 2002-05-08 21:51:33
On Wed, May 08, 2002 at 10:15:02PM +0100, Michael wrote:
> On Wed, May 08, 2002 at 04:34:10PM -0400, Jeff Dike wrote:
> > The return from vmalloc is fine, BTW. What happens with virtual mappings
> > is that vmalloc returns a pointer to an empty region of address space
> > and the seg fault handler will put a page behind it when it is touched.
>
> Ah, light-bulb over head. The sig11 is supposed to happen and that's really the place for
> me to start debugging from...thanks, I'll try that.

Sorry to reply to myself..it looks like it's something in 2.4.19-pre7-ac2 as I've just booted the host into 2.4.18 and it worked fine.

--
Michael.
From: Michael <le...@nt...> - 2002-05-08 22:26:56
On Wed, May 08, 2002 at 04:34:10PM -0400, Jeff Dike wrote:
> le...@nt... said:
> > Kernel panic: protect_vm_page : protect failed, errno = -12
>
> This has been reported before, but I don't understand what is happening.
>
> -12 == -ENOMEM, so somehow the host thinks it doesn't have enough memory
> to do the map (which is different from actually allocating that page, which
> happens later, when it's touched).

It looks straightforward now, a couple of places in mprotect.c return ENOMEM not EFAULT in 2.4.19-pre, I guess these are triggered in 2.4.18 but you handle the EFAULT case.

--
Michael.
From: Jeff D. <jd...@ka...> - 2002-05-09 04:17:54
le...@nt... said:
> It looks straightforward now, a couple of places in mprotect.c return
> ENOMEM not EFAULT in 2.4.19-pre, I guess these are triggered in 2.4.18
> but you handle the EFAULT case.

Can you tell what exactly is triggering them?

Jeff
From: Michael <le...@nt...> - 2002-05-09 09:15:36
On Thu, May 09, 2002 at 12:20:22AM -0500, Jeff Dike wrote:
> le...@nt... said:
> > It looks straightforward now, a couple of places in mprotect.c return
> > ENOMEM not EFAULT in 2.4.19-pre, I guess these are triggered in 2.4.18
> > but you handle the EFAULT case.
>
> Can you tell what exactly is triggering them?

http://uwsg.iu.edu/hypermail/linux/kernel/0203.2/1097.html

<hc...@ca...> (02/03/15 1.197.1.2)
[PATCH] Error return fixes

Hi Marcelo,

it looks like SuSE did some audit of the syscall error return values. (Maybe for LSB?), the attached patch, which I extracted from their tree, contains following fixes:

o msync/mprotect are not supposed to return EFAULT, return ENOMEM instead.

This, I believe, first appeared in 2.4.19-pre4. Looks wrong to me (at least from the man page)

--
Michael.
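[Editorial aside] Which errno a given host returns for `mprotect` on an unmapped range can be checked directly. The sketch below is an illustration under stated assumptions, not the fix that went into UML: `probe_unmapped_mprotect_errno` and `is_unmapped_error` are hypothetical names, and the probe assumes nothing else maps the page between the `munmap` and the `mprotect` (true for a single-threaded program).

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS on glibc */
#endif
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Return the errno the host gives for mprotect() on an unmapped page,
 * 0 if the call unexpectedly succeeds, or -1 if the probe cannot be set up. */
static int probe_unmapped_mprotect_errno(void)
{
    long page_l = sysconf(_SC_PAGESIZE);
    size_t page;
    void *p;

    assert(page_l > 0);
    page = (size_t)page_l;

    /* Map one page to obtain a valid page-aligned address, then unmap it
     * so that the range becomes a hole in the address space. */
    p = mmap(NULL, page, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    if (munmap(p, page) != 0)
        return -1;

    if (mprotect(p, page, PROT_READ | PROT_WRITE) == 0)
        return 0;
    return errno;
}

/* Hypothetical helper: accept both the older (EFAULT) and newer (ENOMEM)
 * codes as "nothing is mapped here" instead of treating one as fatal. */
static int is_unmapped_error(int err)
{
    return err == ENOMEM || err == EFAULT;
}
```

A caller that accepts either code, as `is_unmapped_error` does, keeps working on hosts from both sides of the 2.4.19-pre change described in the quoted patch.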