From: thomas g. <lis...@sp...> - 2001-09-27 06:51:10
|
i just tried to get uml and the xfs tree from sgi to work together
but it hangs as soon as the xfs filesystem gets loaded (either on boot
if statically compiled in or when loaded as a module) ... is there any
idea where this might come from before looking deeper?

so far the uml patch went cleanly into the xfs tree (top of the
cvs tree) with only one very easy to fix reject ... to get it to
compile i used the following small patch:

--- ./arch/um/kernel/ksyms.c.org Wed Sep 26 14:47:01 2001
+++ ./arch/um/kernel/ksyms.c Wed Sep 26 14:47:13 2001
@@ -21,3 +21,4 @@
 EXPORT_SYMBOL(__do_strncpy_from_user);
 EXPORT_SYMBOL(flush_tlb_range);
 EXPORT_SYMBOL(__do_clear_user);
+EXPORT_SYMBOL(__do_strnlen_user);
--- ./fs/xfs/linux/xfs_globals.c.org Wed Sep 26 17:31:15 2001
+++ ./fs/xfs/linux/xfs_globals.c Wed Sep 26 17:31:30 2001
@@ -45,5 +45,5 @@
 int restricted_chown = 0;
 int scache_linemask = 0x1f; /* second level cache line size mask */
 prid_t dfltprid;
-unsigned long physmem;
+extern unsigned long physmem;
 int ndquot;
--- ./include/asm-um/uaccess.h.org Wed Sep 26 14:45:03 2001
+++ ./include/asm-um/uaccess.h Wed Sep 26 14:45:19 2001
@@ -153,6 +153,14 @@
 			&current->thread.fault_catcher) : len);
 }

+static inline int __clear_user(void *mem, int len)
+{
+	return(access_ok(VERIFY_WRITE, mem, len) ?
+		__do_clear_user(mem, len,
+			&current->thread.fault_addr,
+			&current->thread.fault_catcher) : len);
+}
+
 extern int __do_strnlen_user(const char *str, unsigned long n,
 		void **fault_addr, void **fault_catcher);

maybe something is wrong with it too (but i don't think so) ...
any ideas? - thanks in advance

t

--
thomas graichen <tg...@sp...> ... perfection is reached, not
when there is no longer anything to add, but when there is no
longer anything to take away. --- antoine de saint-exupery
From: thomas g. <lis...@sp...> - 2001-09-27 12:36:14
|
maybe some more details about this: 2.4.10-xfs plus uml-2.4.10-3, and
when loading the xfs module i get the following backtrace running in
debug (gdb) mode:

(gdb) bt
#0  panic (fmt=0xa01ba1a0 "Kernel mode fault at addr 0x%lx, ip 0x%lx")
    at panic.c:52
#1  0xa011844b in segv (address=12, ip=2684487610, is_write=0, is_user=0)
    at trap_kern.c:74
#2  0xa0118ffc in segv_handler (sig=11, sc=0xa46ffb78, usermode=0)
    at trap_user.c:309
#3  0xa0119124 in sig_handler (sig=11, sc=
      {gs = 0, __gsh = 0, fs = 0, __fsh = 0, es = 43, __esh = 0, ds = 43,
      __dsh = 0, edi = 3, esi = 0, ebp = 2758803044, esp = 2758803024,
      ebx = 2694052000, edx = 0, ecx = 0, eax = 2692755456, trapno = 14,
      err = 4, eip = 2684487610, cs = 35, __csh = 0, eflags = 534,
      esp_at_signal = 2758803024, ss = 43, __ssh = 0, fpstate = 0xa46ffbd0,
      oldmask = 268435456, cr2 = 12}) at trap_user.c:354
#4  <signal handler called>
#5  kfree (objp=0xa093f8a0) at slab.c:1430
#6  0xa000c76e in sys_init_module (name_user=0x80d9e30 "xfs",
    mod_user=0x403bf008) at module.c:574
#7  0xa0117445 in execute_syscall (regs=
      {regs = {135110192, 1077669896, 25, 134519728, 135195456, 2684342216,
      4294967258, 43, 43, 0, 0, 128, 134765230, 35, 646, 2684342160, 43}})
    at syscall_kern.c:325
#8  0xa011756b in syscall_handler (unused=0x0) at syscall_user.c:82
(gdb)

maybe this helps - also it might be of interest that the xfs.o module
compiled with debugging is 20+ mb in size :-)

t

thomas graichen <lis...@sp...> wrote:
> i just tried to get uml and the xfs tree from sgi to work together
> but it hangs as soon as the xfs filesystem gets loaded (either on boot
> if statically compiled in or when loaded as a module) ... is there any
> idea where this might come from before looking deeper?

> so far the uml patch went cleanly into the xfs tree (top of the
> cvs tree) with only one very easy to fix reject ... to get it to
> compile i used the following small patch:

> --- ./arch/um/kernel/ksyms.c.org Wed Sep 26 14:47:01 2001
> +++ ./arch/um/kernel/ksyms.c Wed Sep 26 14:47:13 2001
> @@ -21,3 +21,4 @@
>  EXPORT_SYMBOL(__do_strncpy_from_user);
>  EXPORT_SYMBOL(flush_tlb_range);
>  EXPORT_SYMBOL(__do_clear_user);
> +EXPORT_SYMBOL(__do_strnlen_user);
> --- ./fs/xfs/linux/xfs_globals.c.org Wed Sep 26 17:31:15 2001
> +++ ./fs/xfs/linux/xfs_globals.c Wed Sep 26 17:31:30 2001
> @@ -45,5 +45,5 @@
>  int restricted_chown = 0;
>  int scache_linemask = 0x1f; /* second level cache line size mask */
>  prid_t dfltprid;
> -unsigned long physmem;
> +extern unsigned long physmem;
>  int ndquot;
> --- ./include/asm-um/uaccess.h.org Wed Sep 26 14:45:03 2001
> +++ ./include/asm-um/uaccess.h Wed Sep 26 14:45:19 2001
> @@ -153,6 +153,14 @@
>  			&current->thread.fault_catcher) : len);
>  }
>
> +static inline int __clear_user(void *mem, int len)
> +{
> +	return(access_ok(VERIFY_WRITE, mem, len) ?
> +		__do_clear_user(mem, len,
> +			&current->thread.fault_addr,
> +			&current->thread.fault_catcher) : len);
> +}
> +
>  extern int __do_strnlen_user(const char *str, unsigned long n,
>  		void **fault_addr, void **fault_catcher);

> maybe something is wrong with it too (but i don't think so) ...
> any ideas? - thanks in advance

> t

> --
> thomas graichen <tg...@sp...> ... perfection is reached, not
> when there is no longer anything to add, but when there is no
> longer anything to take away. --- antoine de saint-exupery

> _______________________________________________
> User-mode-linux-devel mailing list
> Use...@li...
> https://lists.sourceforge.net/lists/listinfo/user-mode-linux-devel

--
thomas graichen <tg...@sp...> ... perfection is reached, not
when there is no longer anything to add, but when there is no
longer anything to take away. --- antoine de saint-exupery
From: Jeff D. <jd...@ka...> - 2001-09-27 13:24:50
|
lis...@sp... said:
> #5 kfree (objp=0xa093f8a0) at slab.c:1430
> #6 0xa000c76e in sys_init_module (name_user=0x80d9e30 "xfs",
> mod_user=0x403bf008) at module.c:574

Hmmm, since xfs seems to be the problem, and the crash is happening when it's
being initialized, it ought to be straightforward (for a corruption bug,
anyway) to figure out what the bug is. Assuming that this particular crash
is reproducible, of course.

Figure out exactly what the corruption is and where that buffer is allocated.
Then step through the xfs init and see when the corruption happens.

Jeff
From: thomas g. <lis...@sp...> - 2001-09-30 13:59:16
|
Jeff Dike <jd...@ka...> wrote:
> lis...@sp... said:
>> #5 kfree (objp=0xa093f8a0) at slab.c:1430
>> #6 0xa000c76e in sys_init_module (name_user=0x80d9e30 "xfs",
>> mod_user=0x403bf008) at module.c:574

> Hmmm, since xfs seems to be the problem, and the crash is happening when it's
> being initialized, it ought to be straightforward (for a corruption bug,
> anyway) to figure out what the bug is. Assuming that this particular crash
> is reproducible, of course.

> Figure out exactly what the corruption is and where that buffer is allocated.
> Then step through the xfs init and see when the corruption happens.

ok - looks like this problem came from the fact that the xfs module
is too big with debugging enabled to be usable with a default uml
kernel (i.e. 32mb "ram") - using mem=48M solved it

but now i run into another problem which looks different - maybe
someone has an idea where to start here or has seen something similar ...

Initializing stdio console driver
Initializing software serial port version 1
mconsole initialized on /tmp/uml/n17wu4/mconsole
VFS: Mounted root (ext2 filesystem) readonly.

... here (i.e. on starting init) it panics with

Breakpoint 1, panic (fmt=0xa01d7fe9 "protect failed, errno = %d")
    at panic.c:52
52              bust_spinlocks(1);
(gdb) bt
#0  panic (fmt=0xa01d7fe9 "protect failed, errno = %d") at panic.c:52
#1  0xa013c687 in protect (addr=12093, len=2743062723, r=1, w=1, x=0)
    at user_util.c:203
#2  0xa013673a in flush_thread () at exec_kern.c:62
#3  0xa0031808 in flush_old_exec (bprm=0xa08f3e58) at exec.c:566
#4  0xa003f4fa in load_elf_binary (bprm=0xa08f3e58, regs=0x0)
    at binfmt_elf.c:585
#5  0xa0031ca9 in search_binary_handler (bprm=0xa08f3e58, regs=0x0)
    at exec.c:809
#6  0xa0031f18 in do_execve (filename=0xa01a168d "/sbin/init",
    argv=0xa01f0160, envp=0xa01f01a0, regs=0x0) at exec.c:902
#7  0xa01367ff in execve1 (file=0xa01a168d "/sbin/init", argv=0xa01f0160,
    env=0xa01f01a0) at exec_kern.c:86
#8  0xa013683c in um_execve (file=0xa01a168d "/sbin/init", argv=0xa01f0160,
    env=0xa01f01a0) at exec_kern.c:96
#9  0xa0008826 in init (unused=0x0)
    at /usr/src/uml/linux/include/asm/unistd.h:54
#10 0xa013cad9 in new_thread_proc (t=0xa08f0000) at process_kern.c:120
(gdb)

a lot of thanks in advance

t

--
thomas graichen <tg...@sp...> ... perfection is reached, not
when there is no longer anything to add, but when there is no
longer anything to take away. --- antoine de saint-exupery
From: Jeff D. <jd...@ka...> - 2001-10-01 02:37:16
|
lis...@sp... said:
> #1 0xa013c687 in protect (addr=12093, len=2743062723, r=1, w=1, x=0)
> at user_util.c:203

That's a very bogus address. That's coming from this line:

	protect(physmem, high_physmem - physmem, 1, 1, 0);

so physmem is apparently very confused, or gdb is.

If physmem is 0x2f3d and high_physmem - physmem is 0xa37fd0c3, then
high_physmem is 0xa3800000, which is completely reasonable. This says that
you booted this UML with 'mem=48M', which is confirmed by

> using mem=48M solved it

So, it looks like gdb is sane and the value of physmem isn't.

How reproducible is this? If it is reproducible, then this should be easy
to track down.

It's happening during the first exec, which is done by process 1. So, I would
check the value at the first call to kernel_thread(). This will tell you
if it was corrupted by process 0 or 1. Then step from the start of the guilty
thread, checking the value, and see where it is when physmem gets messed up.

Jeff
From: thomas g. <lis...@sp...> - 2001-10-02 00:15:10
|
Jeff Dike <jd...@ka...> wrote:
> So, it looks like gdb is sane and the value of physmem isn't.

that one pointed me in the right direction: physmem really was wrong
due to my own change - i assumed that the physmem in uml and the one
from xfs are the same and thus declared one of them as external to
avoid the conflicting symbols while linking the kernel - but that
assumption was wrong - the physmem in xfs has nothing directly to do
with the physmem in uml - so we have a real name collision between
those two projects here which has to be resolved somehow ... even
more: the eicon isdn driver in the kernel also seems to use physmem
for itself - i'll mail this posting to the three maintainers of the
respective projects to coordinate how to solve this in a clean way
(cc'ed to alan cox - maybe it would be a good idea to put this
somewhere central into the kernel so that anyone who needs it may
use it if all those physmems mean the same - did not look very
closely at it - because i think the name really calls for further
trouble :-)

my solution so far was to rename the physmem of uml to uml_physmem
and a kernel built that way booted without any problems ...

# cat /proc/version
Linux version 2.4.9-xfs-8um (ro...@ca...) (gcc version egcs-2.91.66 19990314/Linux (egcs-1.1.2 release / Linux-Mandrake 8.0)) #16 Mon Oct 1 22:02:24 CEST 2001
# cat /proc/filesystems
nodev   proc
nodev   sockfs
nodev   tmpfs
nodev   pipefs
        ext2
nodev   devfs
nodev   devpts
        xfs
#

... and i assume it to work without problems now too - it's too late
now to try it out

ok here are the patches which made xfs and uml coexist happily
together - maybe someone else likes to have them and it would be nice
if they might in some way go into further uml releases (at least the
__clear_user addition to uaccess.h which is required for xfs) and
the conflicts regarding physmem get solved ...

t

--- ./arch/um/include/user_util.h.physmem Mon Oct 1 21:06:01 2001
+++ ./arch/um/include/user_util.h Mon Oct 1 21:06:36 2001
@@ -32,7 +32,7 @@
 extern unsigned long low_physmem;
 extern unsigned long high_physmem;
-extern unsigned long physmem;
+extern unsigned long uml_physmem;
 extern unsigned long end_vm;
 extern unsigned long start_vm;
--- ./arch/um/kernel/mem.c.physmem Mon Oct 1 21:12:39 2001
+++ ./arch/um/kernel/mem.c Mon Oct 1 21:12:53 2001
@@ -66,7 +66,7 @@
 	for(i=0;i<sizeof(zones_size)/sizeof(zones_size[0]);i++)
 		zones_size[i] = 0;
 	zones_size[1] = (high_physmem >> PAGE_SHIFT) -
-		(physmem >> PAGE_SHIFT) - zones_size[0];
+		(uml_physmem >> PAGE_SHIFT) - zones_size[0];
 	free_area_init(zones_size);
 }
--- ./arch/um/kernel/exec_kern.c.physmem Mon Oct 1 21:11:49 2001
+++ ./arch/um/kernel/exec_kern.c Mon Oct 1 21:55:42 2001
@@ -59,7 +59,7 @@
 	current->thread.extern_pid = new_pid;
 	free_page(stack);
-	protect(physmem, high_physmem - physmem, 1, 1, 0);
+	protect(uml_physmem, high_physmem - uml_physmem, 1, 1, 0);
 	task_protections((unsigned long) current);
 	force_flush_all();
 	unblock_signals();
--- ./arch/um/kernel/um_arch.c.physmem Mon Oct 1 21:15:13 2001
+++ ./arch/um/kernel/um_arch.c Mon Oct 1 21:15:35 2001
@@ -115,7 +115,7 @@
 #define START 0xa0000000
 #endif
-unsigned long physmem;
+unsigned long uml_physmem;
 unsigned long start_vm;
 unsigned long end_vm;
@@ -234,7 +234,7 @@
 	remap_data(ROUND_DOWN(&__bss_start), ROUND_UP(brk_start));
 	/* Start physical memory at least 4M after the current brk */
-	physmem = ROUND_4M(brk_start) + (1 << 22);
+	uml_physmem = ROUND_4M(brk_start) + (1 << 22);
 	/* Create fake command line from argv[]. */
 	have_root = 0;
@@ -299,7 +299,7 @@
 	 * of physical memory or the remaining space left in the kernel
 	 * area of the address space, whichever is smaller.
 	 */
-	start_vm = physmem + physmem_size + VMALLOC_OFFSET;
+	start_vm = uml_physmem + physmem_size + VMALLOC_OFFSET;
 	if(start_vm >= get_kmem_end())
 		panic("Physical memory too large to allow any kernel "
 		      "virtual memory");
@@ -313,16 +313,16 @@
 	printk(KERN_INFO "Kernel virtual memory size shrunk to %ld "
 	       "bytes\n", virtmem_size);
-	setup_range(-1, NULL, physmem, physmem_size,
+	setup_range(-1, NULL, uml_physmem, physmem_size,
 		    physmem_size + VMALLOC_OFFSET + virtmem_size);
 	setup_memory();
-	high_physmem = physmem + physmem_size;
+	high_physmem = uml_physmem + physmem_size;
-	start_pfn = PFN_UP(__pa(physmem));
+	start_pfn = PFN_UP(__pa(uml_physmem));
 	end_pfn = PFN_DOWN(__pa(high_physmem));
 	bootmap_size = init_bootmem(start_pfn, end_pfn - start_pfn);
-	free_bootmem(__pa(physmem) + bootmap_size,
-		     high_physmem - physmem - bootmap_size);
+	free_bootmem(__pa(uml_physmem) + bootmap_size,
+		     high_physmem - uml_physmem - bootmap_size);
 #ifdef CONFIG_BLK_DEV_INITRD
 	if(initrd != NULL) read_initrd(initrd);
 #endif
--- ./arch/um/kernel/ksyms.c.physmem Mon Oct 1 21:12:09 2001
+++ ./arch/um/kernel/ksyms.c Mon Oct 1 21:12:18 2001
@@ -10,7 +10,7 @@
 EXPORT_SYMBOL(stop);
 EXPORT_SYMBOL(strtok);
-EXPORT_SYMBOL(physmem);
+EXPORT_SYMBOL(uml_physmem);
 EXPORT_SYMBOL(current_task);
 EXPORT_SYMBOL(set_signals);
 EXPORT_SYMBOL(kernel_thread);
--- ./arch/um/kernel/process_kern.c.physmem Mon Oct 1 21:14:10 2001
+++ ./arch/um/kernel/process_kern.c Mon Oct 1 21:14:14 2001
@@ -521,7 +521,7 @@
 {
 	force_flush_all();
 	if(current->mm != current->p_pptr->mm)
-		protect(physmem, high_physmem - physmem, 1, 1, 0);
+		protect(uml_physmem, high_physmem - uml_physmem, 1, 1, 0);
 	task_protections((unsigned long) current);
 	if(current->thread.request.u.fork_finish.from)
 		schedule_tail(current->thread.request.u.fork_finish.from);
@@ -748,7 +748,7 @@
 	start_stack = (unsigned long) current;
 	end_stack = start_stack + PAGE_SIZE * 4;
-	protect(physmem, start_stack - physmem, 1, 1, 1);
+	protect(uml_physmem, start_stack - uml_physmem, 1, 1, 1);
 	protect(end_stack, high_physmem - end_stack, 1, 1, 1);
 }
@@ -758,7 +758,7 @@
 	start_stack = (unsigned long) current;
 	end_stack = start_stack + PAGE_SIZE * 4;
-	protect(physmem, start_stack - physmem, 1, 1, 1);
+	protect(uml_physmem, start_stack - uml_physmem, 1, 1, 1);
 	protect(end_stack, high_physmem - end_stack, 1, 1, 1);
 }
--- ./include/asm-um/dma.h.physmem Mon Oct 1 21:07:05 2001
+++ ./include/asm-um/dma.h Mon Oct 1 21:30:45 2001
@@ -5,6 +5,6 @@
 #undef MAX_DMA_ADDRESS
-#define MAX_DMA_ADDRESS (physmem)
+#define MAX_DMA_ADDRESS (uml_physmem)
 #endif
--- ./include/asm-um/page.h.physmem Mon Oct 1 21:08:01 2001
+++ ./include/asm-um/page.h Mon Oct 1 21:08:37 2001
@@ -28,14 +28,14 @@
 #endif /* __ASSEMBLY__ */
-extern unsigned long physmem;
+extern unsigned long uml_physmem;
-#define PAGE_OFFSET (physmem)
+#define PAGE_OFFSET (uml_physmem)
 #define __va_space (8*1024*1024)
-#define __pa(x) ((unsigned long) (x) - (physmem))
-#define __va(x) ((void *) ((unsigned long) (x) + (physmem)))
+#define __pa(x) ((unsigned long) (x) - (uml_physmem))
+#define __va(x) ((void *) ((unsigned long) (x) + (uml_physmem)))
 #define virt_to_page(kaddr) (mem_map + (__pa(kaddr) >> PAGE_SHIFT))
 #define VALID_PAGE(page) ((page - mem_map) < max_mapnr)
--- ./include/asm-um/uaccess.h.physmem Tue Oct 2 01:15:50 2001
+++ ./include/asm-um/uaccess.h Mon Oct 1 21:30:43 2001
@@ -35,7 +35,7 @@
 #define set_fs(x) (current->addr_limit = (x))
 extern unsigned long end_vm;
-extern unsigned long physmem;
+extern unsigned long uml_physmem;
 #define under_task_size(addr, size) \
 	(((unsigned long) (addr) < TASK_SIZE) && \
@@ -146,6 +146,14 @@
 		void **fault_catcher);
 static inline int clear_user(void *mem, int len)
+{
+	return(access_ok(VERIFY_WRITE, mem, len) ?
+		__do_clear_user(mem, len,
+			&current->thread.fault_addr,
+			&current->thread.fault_catcher) : len);
+}
+
+static inline int __clear_user(void *mem, int len)
 {
 	return(access_ok(VERIFY_WRITE, mem, len) ?
 		__do_clear_user(mem, len,

--
thomas graichen <tg...@sp...> ... perfection is reached, not
when there is no longer anything to add, but when there is no
longer anything to take away. --- antoine de saint-exupery
From: thomas g. <lis...@sp...> - 2001-10-02 06:27:26
|
thomas graichen <lis...@sp...> wrote:
> ... and i assume it to work without problems now too - it's too late
> now to try it out

final test: it works :-)

# cat /proc/version
Linux version 2.4.9-xfs-8um (ro...@ca...) (gcc version egcs-2.91.66 19990314/Linux (egcs-1.1.2 release / Linux-Mandrake 8.0)) #16 Mon Oct 1 22:02:24 CEST 2001
# cat /proc/mounts
/dev/ubd/0 / xfs rw 0 0
none /proc proc rw 0 0
#

t

--
thomas graichen <tg...@sp...> ... perfection is reached, not
when there is no longer anything to add, but when there is no
longer anything to take away. --- antoine de saint-exupery
From: thomas g. <lis...@sp...> - 2001-09-30 13:59:17
|
Jeff Dike <jd...@ka...> wrote:
> lis...@sp... said:
>> #5 kfree (objp=0xa093f8a0) at slab.c:1430
>> #6 0xa000c76e in sys_init_module (name_user=0x80d9e30 "xfs",
>> mod_user=0x403bf008) at module.c:574

> Hmmm, since xfs seems to be the problem, and the crash is happening when it's
> being initialized, it ought to be straightforward (for a corruption bug,
> anyway) to figure out what the bug is. Assuming that this particular crash
> is reproducible, of course.

> Figure out exactly what the corruption is and where that buffer is allocated.
> Then step through the xfs init and see when the corruption happens.

please ignore my last post - the problem is still there - i just got
a bit confused about how the problem looks with xfs being a module or
not - but either way it does not work - i will try to look a bit
deeper into this

t

--
thomas graichen <tg...@sp...> ... perfection is reached, not
when there is no longer anything to add, but when there is no
longer anything to take away. --- antoine de saint-exupery