From: Blaisorblade <bla...@ya...> - 2007-01-17 16:16:22
|
I've now found the time to run a test. On Ubuntu I couldn't, until now, build a working 64-bit UML (I suspected some regression that I didn't have the time to debug), but compiling the same code with gcc 3.4 gives a fully working UML. The UML used is release 2.6.18.6. Ubuntu's gcc is:

$ gcc -v
Using built-in specs.
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --enable-languages=c,c++,java,f95,objc,ada,treelang --prefix=/usr --enable-shared --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --enable-nls --program-suffix=-4.0 --enable-__cxa_atexit --enable-clocale=gnu --enable-libstdcxx-debug --enable-java-awt=gtk-default --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.4.2-gcj-4.0-1.4.2.0/jre --enable-mpfr --disable-werror --enable-checking=release x86_64-linux-gnu
Thread model: posix
gcc version 4.0.3 (Ubuntu 4.0.3-1ubuntu5)

Without any errors, after these messages:

kjournald starting. Commit interval 5 seconds
EXT3-fs: mounted filesystem with ordered data mode.
VFS: Mounted root (ext3 filesystem) readonly.

it hangs, giving the following result under strace -p (I've printed two consecutive iterations of the same messages to show that they are the same):

--- SIGCHLD (Child exited) @ 0 (0) ---
wait4(31586, [{WIFSTOPPED(s) && WSTOPSIG(s) == SIGSEGV}], WSTOPPED, NULL) = 31586
ptrace(PTRACE_GETREGS, 31586, 0, 0x6096dac8) = 0
ptrace(PTRACE_GETFPREGS, 31586, 0, 0x6096dba0) = 0
ptrace(PTRACE_CONT, 31586, 0, SIGSEGV) = 0
--- SIGCHLD (Child exited) @ 0 (0) ---
wait4(31586, [{WIFSTOPPED(s) && WSTOPSIG(s) == SIGUSR1}], WSTOPPED, NULL) = 31586
ptrace(PTRACE_SETREGS, 31586, 0, 0x6096dac8) = 0
ptrace(PTRACE_SETFPREGS, 31586, 0, 0x6096dba0) = 0
ptrace(PTRACE_SYSCALL, 31586, 0, SIG_0) = 0

--- SIGCHLD (Child exited) @ 0 (0) ---
wait4(31586, [{WIFSTOPPED(s) && WSTOPSIG(s) == SIGSEGV}], WSTOPPED, NULL) = 31586
ptrace(PTRACE_GETREGS, 31586, 0, 0x6096dac8) = 0
ptrace(PTRACE_GETFPREGS, 31586, 0, 0x6096dba0) = 0
ptrace(PTRACE_CONT, 31586, 0, SIGSEGV) = 0
--- SIGCHLD (Child exited) @ 0 (0) ---
wait4(31586, [{WIFSTOPPED(s) && WSTOPSIG(s) == SIGUSR1}], WSTOPPED, NULL) = 31586
ptrace(PTRACE_SETREGS, 31586, 0, 0x6096dac8) = 0
ptrace(PTRACE_SETFPREGS, 31586, 0, 0x6096dba0) = 0
ptrace(PTRACE_SYSCALL, 31586, 0, SIG_0) = 0

I'll have to verify whether some code in the stubs is miscompiled. But not until... well, I dunno when I'll be back...

Anybody else with the same problem?
--
Inform me of my mistakes, so I can add them to my list!
Paolo Giarrusso, aka Blaisorblade
http://www.user-mode-linux.org/~blaisorblade
|
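For anyone wanting to check mechanically that a captured strace log really is one block repeating, here is a minimal sketch. The function name `shortest_period` is made up for illustration, and the sample lines are the ones from the trace in this message with the SIGCHLD markers left out; nothing here is part of UML or strace.

```python
def shortest_period(lines):
    """Return the length of the shortest block that, repeated, reproduces `lines`.

    Returns len(lines) when no shorter repetition exists.
    """
    n = len(lines)
    for period in range(1, n + 1):
        # A valid period must divide the length and every line must equal
        # the line at the same offset inside the first block.
        if n % period == 0 and all(lines[i] == lines[i % period] for i in range(n)):
            return period
    return n


# One iteration of the loop reported in the message (SIGCHLD markers omitted).
iteration = [
    "wait4(31586, [{WIFSTOPPED(s) && WSTOPSIG(s) == SIGSEGV}], WSTOPPED, NULL) = 31586",
    "ptrace(PTRACE_GETREGS, 31586, 0, 0x6096dac8) = 0",
    "ptrace(PTRACE_GETFPREGS, 31586, 0, 0x6096dba0) = 0",
    "ptrace(PTRACE_CONT, 31586, 0, SIGSEGV) = 0",
    "wait4(31586, [{WIFSTOPPED(s) && WSTOPSIG(s) == SIGUSR1}], WSTOPPED, NULL) = 31586",
    "ptrace(PTRACE_SETREGS, 31586, 0, 0x6096dac8) = 0",
    "ptrace(PTRACE_SETFPREGS, 31586, 0, 0x6096dba0) = 0",
    "ptrace(PTRACE_SYSCALL, 31586, 0, SIG_0) = 0",
]
trace = iteration * 2

print(shortest_period(trace))  # 8: the same eight calls repeat
```

A period much shorter than the capture suggests the tracer is stuck re-delivering the same SIGSEGV/SIGUSR1 pair rather than making progress.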
From: Christopher S. A. <ca...@th...> - 2007-01-17 18:17:38
|
Blaisorblade wrote:
[snip]
> Without any errors, after these messages:
>
> kjournald starting. Commit interval 5 seconds
> EXT3-fs: mounted filesystem with ordered data mode.
> VFS: Mounted root (ext3 filesystem) readonly.
>
> it hangs giving the following result at strace -p (I've printed two
> consecutive iterations of the same messages to show that they are the same):
>
> --- SIGCHLD (Child exited) @ 0 (0) ---
> wait4(31586, [{WIFSTOPPED(s) && WSTOPSIG(s) == SIGSEGV}], WSTOPPED, NULL) = 31586
> ptrace(PTRACE_GETREGS, 31586, 0, 0x6096dac8) = 0
> ptrace(PTRACE_GETFPREGS, 31586, 0, 0x6096dba0) = 0
> ptrace(PTRACE_CONT, 31586, 0, SIGSEGV) = 0
> --- SIGCHLD (Child exited) @ 0 (0) ---
> wait4(31586, [{WIFSTOPPED(s) && WSTOPSIG(s) == SIGUSR1}], WSTOPPED, NULL) = 31586
> ptrace(PTRACE_SETREGS, 31586, 0, 0x6096dac8) = 0
> ptrace(PTRACE_SETFPREGS, 31586, 0, 0x6096dba0) = 0
> ptrace(PTRACE_SYSCALL, 31586, 0, SIG_0) = 0
>
> --- SIGCHLD (Child exited) @ 0 (0) ---
> wait4(31586, [{WIFSTOPPED(s) && WSTOPSIG(s) == SIGSEGV}], WSTOPPED, NULL) = 31586
> ptrace(PTRACE_GETREGS, 31586, 0, 0x6096dac8) = 0
> ptrace(PTRACE_GETFPREGS, 31586, 0, 0x6096dba0) = 0
> ptrace(PTRACE_CONT, 31586, 0, SIGSEGV) = 0
> --- SIGCHLD (Child exited) @ 0 (0) ---
> wait4(31586, [{WIFSTOPPED(s) && WSTOPSIG(s) == SIGUSR1}], WSTOPPED, NULL) = 31586
> ptrace(PTRACE_SETREGS, 31586, 0, 0x6096dac8) = 0
> ptrace(PTRACE_SETFPREGS, 31586, 0, 0x6096dba0) = 0
> ptrace(PTRACE_SYSCALL, 31586, 0, SIG_0) = 0
>
> I'll have to verify whether some code in the stubs is miscompiled. But not
> until... well, I dunno when I'll be back...
>
> Anybody else with the same problem?

Isn't this the same problem we discussed a few months back? What I've discovered since then is that this is a regression introduced after 2.6.16 on the _host_. Any post-2.6.16 host kernel, with either skas v8.2 or v9.pre, causes Ubuntu (and other distros, I believe) to hang after "VFS mounting root".

http://marc.theaimsgroup.com/?l=user-mode-linux-devel&m=116008558424368&w=2
http://marc.theaimsgroup.com/?l=user-mode-linux-devel&m=116711159414443&w=2

-Chris
|
From: Blaisorblade <bla...@ya...> - 2007-01-19 23:31:59
|
On Wednesday 17 January 2007 19:17, Christopher S. Aker wrote: > Blaisorblade wrote: > [snip] > > > Without any errors, after these messages: > > > > kjournald starting. Commit interval 5 seconds > > EXT3-fs: mounted filesystem with ordered data mode. > > VFS: Mounted root (ext3 filesystem) readonly. > > > > it hangs giving the following result at strace -p (I've printed two > > consecutive iterations of the same messages to show that they are the > > same): > > > > --- SIGCHLD (Child exited) @ 0 (0) --- > > wait4(31586, [{WIFSTOPPED(s) && WSTOPSIG(s) == SIGSEGV}], WSTOPPED, NULL) > > = 31586 > > ptrace(PTRACE_GETREGS, 31586, 0, 0x6096dac8) = 0 > > ptrace(PTRACE_GETFPREGS, 31586, 0, 0x6096dba0) = 0 > > ptrace(PTRACE_CONT, 31586, 0, SIGSEGV) = 0 > > --- SIGCHLD (Child exited) @ 0 (0) --- > > wait4(31586, [{WIFSTOPPED(s) && WSTOPSIG(s) == SIGUSR1}], WSTOPPED, NULL) > > = 31586 > > ptrace(PTRACE_SETREGS, 31586, 0, 0x6096dac8) = 0 > > ptrace(PTRACE_SETFPREGS, 31586, 0, 0x6096dba0) = 0 > > ptrace(PTRACE_SYSCALL, 31586, 0, SIG_0) = 0 > > > > --- SIGCHLD (Child exited) @ 0 (0) --- > > wait4(31586, [{WIFSTOPPED(s) && WSTOPSIG(s) == SIGSEGV}], WSTOPPED, NULL) > > = 31586 > > ptrace(PTRACE_GETREGS, 31586, 0, 0x6096dac8) = 0 > > ptrace(PTRACE_GETFPREGS, 31586, 0, 0x6096dba0) = 0 > > ptrace(PTRACE_CONT, 31586, 0, SIGSEGV) = 0 > > --- SIGCHLD (Child exited) @ 0 (0) --- > > wait4(31586, [{WIFSTOPPED(s) && WSTOPSIG(s) == SIGUSR1}], WSTOPPED, NULL) > > = 31586 > > ptrace(PTRACE_SETREGS, 31586, 0, 0x6096dac8) = 0 > > ptrace(PTRACE_SETFPREGS, 31586, 0, 0x6096dba0) = 0 > > ptrace(PTRACE_SYSCALL, 31586, 0, SIG_0) = 0 > > > > I'll have to verify whether some code in the stubs is miscompiled. But > > not until... well, I dunno when I'll be back... > > > > Anybody else with the same problem? > > Isn't this is the same problem we discussed a few months back? What > I've discovered since then is this is a regression introduced after > 2.6.16 on the _host_. 
Any post-2.6.16 host kernel, either skas v8.2 or v9.pre causes Ubuntu (and other distros, I believe) to hang after "VFS mounting root".

Not at all. I'm running:

64-bit host
- 2.6.18 on the host, Ubuntu 6.06 LTS. It would use 2.6.15 as its default kernel, however (which implies their glibc is compiled against it).
- Without SKAS.

Guest
- 32-bit is all fine with Debian (my usual test case).
- 64-bit is fine only with gcc 3.4; with gcc 4.0 I get trouble.

Also, since I no longer have the time to read everything frequently, I'm starting to forget things (I used to have a mental db of such problems, when I worked on them a lot). We should use http://bugzilla.kernel.org/.

Since both you and Antoine are hitting that same bug, I'm cc'ing both of you.
--
Inform me of my mistakes, so I can add them to my list!
Paolo Giarrusso, aka Blaisorblade
http://www.user-mode-linux.org/~blaisorblade
|
From: Boaz H. <bha...@pa...> - 2007-01-21 09:17:10
|
Antoine Martin wrote: > Antoine Martin wrote: >>>>> I have downgraded the x86 boxes to 2.6.15.7 and these are up and >>>>> running again. But I can't do that for all of them, and this is just >>>>> not an option for some of the amd64 boxes. >>>> My setup is: >>> Thanks for that. That is very similar to mine. >>> I don't think this has anything to do with the guest... So I'll try to >>> remove the skas3 patch from the host and see how it goes. >>> >> I did, and no improvement... x86 guests still hang. >> Could you post a binary guest kernel somewhere so I can try that? >> (even if it isn't static - glibc should be similar since we're using >> Gentoo amd64) >> If that still does not work then I can be certain that it is something >> to do with the host. > I've just tried on 3 more hosts, all AMD64 Gentoo fully up to date, > kernel 2.6.19.2. No skas, no exec shield, no selinux, plain kernel.org: > None of them work with any of the 32-bit kernels! > It prints nothing, just sits there spinning at 100% cpu. > So I am now totally convinced that i haven't got a weird setup. > Something else is broken in UML. > > On fully up to date Fedora Core 6 x86_64, the kernel does display > something before crashing: > # uname -a > Linux localhost.localdomain 2.6.18-1.2869.fc6 #1 SMP Wed Dec 20 14:51:34 > EST 2006 x86_64 x86_64 x86_64 GNU/Linux > # ./kernel32-2.6.19.2 > Checking that ptrace can change system call numbers...OK > Checking syscall emulation patch for ptrace...missing > Checking for tmpfs mount on /dev/shm...OK > Checking PROT_EXEC mmap in /dev/shm/...OK > Checking for the skas3 patch in the host: > - /proc/mm...not found > - PTRACE_FAULTINFO...not found > - PTRACE_LDT...not found > UML running in SKAS0 mode > > [root@localhost home]# > > This is 100% repeatable. Plain Fedora. > Many users will have a similar setup and will just give up on UML. > So I as I said before, UML is currently unusable for most people out > there running fairly recent systems. 
> > Antoine

My host is a vanilla 2.6.19 kernel on top of redhat4, and my guests are any x86_64 kernel between 2.6.17 and 2.6.20-rc5. For my iSCSI tests they all run well.

At the very beginning I had bad crashes, and I solved them with plain gdb. The point of using UML for us, from the beginning, was to use gdb for our own kernel development. In the end it was very easy to find that I had misconfigured the UML kernel; when that was fixed, everything ran well.

For me, I just do:

#gdb um/vmlinux (was compiled with -O um)
gdb>run ubd0=/var/opt/FedoraCore5-AMD64-root_fs eth0=tuntap,,,192.168.0.117

After the first trap in gdb:

gdb>handle SIGUSR1 pass nostop noprint

From then on it's straight gdb. Tell me if you need a script for loading symbols of kernel modules.

Free Life
Boaz
|
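The gdb session above can also be scripted. The following is only a sketch: it assumes gdb's `-ex` and `--args` command-line options, and it sets the SIGUSR1 handling before `run` instead of after the first trap, which is a variation on what Boaz describes. The helper name `gdb_command` is hypothetical; the paths are the ones from his message.

```python
import shlex


def gdb_command(vmlinux, uml_args):
    """Build an argv list that starts gdb on a UML binary with SIGUSR1 passed through."""
    return [
        "gdb",
        # Let SIGUSR1 reach UML without stopping or printing in gdb.
        "-ex", "handle SIGUSR1 pass nostop noprint",
        # Everything after --args is the program and its own arguments.
        "--args", vmlinux, *uml_args,
    ]


cmd = gdb_command(
    "um/vmlinux",
    ["ubd0=/var/opt/FedoraCore5-AMD64-root_fs", "eth0=tuntap,,,192.168.0.117"],
)
print(shlex.join(cmd))
```

The printed line can be pasted into a shell; typing `run` at the gdb prompt then boots the guest with the arguments already in place.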
From: Antoine M. <an...@na...> - 2007-01-21 13:04:37
|
>> On fully up to date Fedora Core 6 x86_64, the kernel does display
>> something before crashing:
>> # uname -a
>> Linux localhost.localdomain 2.6.18-1.2869.fc6 #1 SMP Wed Dec 20 14:51:34 EST 2006 x86_64 x86_64 x86_64 GNU/Linux
>> # ./kernel32-2.6.19.2
>> Checking that ptrace can change system call numbers...OK
>> Checking syscall emulation patch for ptrace...missing
>> Checking for tmpfs mount on /dev/shm...OK
>> Checking PROT_EXEC mmap in /dev/shm/...OK
>> Checking for the skas3 patch in the host:
>> - /proc/mm...not found
>> - PTRACE_FAULTINFO...not found
>> - PTRACE_LDT...not found
>> UML running in SKAS0 mode
>>
>> [root@localhost home]#
>>
>> This is 100% repeatable. Plain Fedora.
>> Many users will have a similar setup and will just give up on UML.
>> So I as I said before, UML is currently unusable for most people out
>> there running fairly recent systems.
> I have a 2.6.19, vanila kernel on top of redhat4, host.

There is one major difference in this case: redhat4 uses glibc 2.3, whereas fc6 (and gentoo-current) use glibc 2.4. A lot of the problems that I am reporting started when glibc was upgraded (sorry, I forgot to mention that).

> And any kernel between 2.6.17 - 2.6.20-rc5 x86_64 for guests.
> For my iSCSI tests they all run well.
> At the very beginning, I had bad crashes and I solved it with plain gdb.
> The point for us to use UML, from the begging, was to use gdb on our own kernel development.
> At the end it was very easy to find that I have miss-configured the uml kernel, when that was fixed every thing
> ran well.
>
> for me I just do
> #gdb um/vmlinux (was compiled with -O um)
> gdb>run ubd0=/var/opt/FedoraCore5-AMD64-root_fs eth0=tuntap,,,192.168.0.117
> after first trap in gdb
> gdb>handle SIGUSR1 pass nostop noprint
> from than on its strait gdb, tell me if you need a script for loading symbols of kernel modules.

That would be great, it could go on the wiki too.

Thanks
Antoine
|
From: Boaz H. <bha...@pa...> - 2007-01-23 15:20:24
Attachments:
add-symbol-file
|
Antoine Martin wrote:
>> for me I just do
>> #gdb um/vmlinux (was compiled with -O um)
>> gdb>run ubd0=/var/opt/FedoraCore5-AMD64-root_fs eth0=tuntap,,,192.168.0.117
>> after first trap in gdb
>> gdb>handle SIGUSR1 pass nostop noprint
>> from than on its strait gdb, tell me if you need a script for loading symbols of kernel modules.
>
> That would be great, it could go on the wiki too.
>
> Thanks
> Antoine

Attached is a script that works both for UML running in gdb and for kgdb.

Usage: add-symbol-file module

where module can be just the mod_name, or mod_name.ko, or a full path.

What you do is ssh into a running UML, after the module has been loaded by the UML, and run the script. The printed message is what you need to paste at the gdb command line (at the first chance you get).

Note that the path echoed is relative to the UML root. You might need to adjust it for gdb, since gdb will see the host path. If the module is inside the root_fs file, then you will need to point gdb to a copy of the module file on the host.

I hope that helps
Boaz Harrosh
|
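The attachment itself is not reproduced in this archive. As a rough idea of what such a helper can look like, here is a sketch (not Boaz's script) that assumes the guest exposes section load addresses under /sys/module/<name>/sections/ and that gdb accepts `add-symbol-file FILE ADDR -s SECTION ADDR`. The function name and its `sysfs_root` and `module_path` parameters are hypothetical, added so the logic can be tried against a fake directory tree.

```python
from pathlib import Path


def add_symbol_file_command(module, module_path=None, sysfs_root="/sys/module"):
    """Return the gdb command that loads symbols for an already-loaded module.

    `module` may be a bare name, name.ko, or a full path, as in the usage above.
    `module_path` overrides the file gdb should read (for example a host-side copy).
    """
    name = Path(module).name
    if name.endswith(".ko"):
        name = name[:-3]
    # Loaded modules are listed with underscores even if the file name has dashes.
    name = name.replace("-", "_")

    sections = Path(sysfs_root) / name / "sections"
    text_addr = (sections / ".text").read_text().strip()

    command = f"add-symbol-file {module_path or module} {text_addr}"
    # Extra sections are optional; add them only when the guest exposes them.
    for section in (".data", ".bss"):
        addr_file = sections / section
        if addr_file.exists():
            command += f" -s {section} {addr_file.read_text().strip()}"
    return command
```

Run inside the guest as root, for example `print(add_symbol_file_command("iscsi_tcp.ko"))` (the module name is only an example), and paste the result at the gdb prompt. Pass `module_path` when gdb needs a host-side path to the .ko file, as the message notes.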
From: Antoine M. <an...@na...> - 2007-01-17 18:39:03
|
>> ptrace(PTRACE_SETREGS, 31586, 0, 0x6096dac8) = 0
>> ptrace(PTRACE_SETFPREGS, 31586, 0, 0x6096dba0) = 0
>> ptrace(PTRACE_SYSCALL, 31586, 0, SIG_0) = 0
>>
>> I'll have to verify whether some code in the stubs is miscompiled. But not
>> until... well, I dunno when I'll be back...
>>
>> Anybody else with the same problem?
>
> Isn't this is the same problem we discussed a few months back? What
> I've discovered since then is this is a regression introduced after
> 2.6.16 on the _host_. Any post-2.6.16 host kernel, either skas v8.2 or
> v9.pre causes Ubuntu (and other distros, I believe) to hang after "VFS
> mounting root".
>
> http://marc.theaimsgroup.com/?l=user-mode-linux-devel&m=116008558424368&w=2
> http://marc.theaimsgroup.com/?l=user-mode-linux-devel&m=116711159414443&w=2

Indeed it is.

A fix would be more than welcome (2.6.16 is a bit dated). I believe a lot of users are hitting this bug now too (and it is a regression), as a lot of distros ship recent kernels.

Another regression in 2.6.16 (or later?) is the inability to run 32-bit guests on 64-bit hosts (the same statically compiled binaries that used to work before). The boot stops very early, during the "checking ptrace" step.

Antoine
|
From: Blaisorblade <bla...@ya...> - 2007-02-15 03:44:01
|
On Wednesday 17 January 2007 19:15, Antoine Martin wrote:
> >> ptrace(PTRACE_SETREGS, 31586, 0, 0x6096dac8) = 0
> >> ptrace(PTRACE_SETFPREGS, 31586, 0, 0x6096dba0) = 0
> >> ptrace(PTRACE_SYSCALL, 31586, 0, SIG_0) = 0
> >>
> >> I'll have to verify whether some code in the stubs is miscompiled. But
> >> not until... well, I dunno when I'll be back...
> >>
> >> Anybody else with the same problem?
> >
> > Isn't this is the same problem we discussed a few months back? What
> > I've discovered since then is this is a regression introduced after
> > 2.6.16 on the _host_. Any post-2.6.16 host kernel, either skas v8.2 or
> > v9.pre causes Ubuntu (and other distros, I believe) to hang after "VFS
> > mounting root".
> >
> > http://marc.theaimsgroup.com/?l=user-mode-linux-devel&m=116008558424368&w=2
> > http://marc.theaimsgroup.com/?l=user-mode-linux-devel&m=116711159414443&w=2
>
> Indeed it is.
>
> A fix would be more than welcome. (2.6.16 is a bit dated)
> I believe a lot of users are hitting this bug now too (and it is a
> regression) as a lot of distros ship recent kernels.
> Another regression in 2.6.16 (or later?) is the inability to run 32-bit
> guests on 64-bit hosts (the same binaries that used to work before -
> statically compiled). The boot stops very early during the "checking
> ptrace" step.

I've just fixed such a regression in 2.6.18; the patch is 'x86_64: fix 2.6.18 regression - PTRACE_OLDSETOPTIONS should be accepted'. Actually, Jeff discovered the same bug today, like me! Please test it to make sure there are no further bugs.

For the other bug, could you, Antoine and Christopher, open an entry in bugzilla? Going through emails is becoming confusing...

Bye
--
Inform me of my mistakes, so I can add them to my list!
Paolo Giarrusso, aka Blaisorblade
http://www.user-mode-linux.org/~blaisorblade
|
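To help recognise this particular regression in an strace log, here is a small decoding sketch. The numeric values are assumptions taken from the Linux x86 ptrace headers as commonly published (PTRACE_OLDSETOPTIONS = 21, PTRACE_SETOPTIONS = 0x4200, PTRACE_O_TRACESYSGOOD = 1); check them against your own headers. The function name `describe_ptrace` is made up for illustration.

```python
# Request numbers (assumed values, see the note above).
PTRACE_REQUESTS = {
    21: "PTRACE_OLDSETOPTIONS",      # 0x15, the legacy request number
    0x4200: "PTRACE_SETOPTIONS",     # the current request number
}

# Option bits that can be passed in the data argument.
PTRACE_OPTIONS = {
    0x1: "PTRACE_O_TRACESYSGOOD",
}


def describe_ptrace(request, data=0):
    """Turn a raw (request, data) pair into a readable name and option list."""
    name = PTRACE_REQUESTS.get(request, f"unknown request 0x{request:x}")
    options = [label for bit, label in PTRACE_OPTIONS.items() if data & bit]
    return f"{name}({' | '.join(options) or '0'})"


# A call that strace prints as: ptrace(0x15 /* PTRACE_??? */, pid, 0, 0x1)
print(describe_ptrace(0x15, 0x1))  # PTRACE_OLDSETOPTIONS(PTRACE_O_TRACESYSGOOD)
```

If a 32-bit guest on a 64-bit host stops with such a call returning EINVAL, that would be consistent with the host rejecting the old request number, which is what the patch title above describes.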
From: Antoine M. <an...@na...> - 2007-01-17 20:13:53
|
Don't know if that helps, but here it is: (sysrq t on 2.6.20-rc5): [42949400.260000] SysRq : Show State [42949400.260000] [42949400.260000] free sibling [42949400.260000] task PC stack pid father child younger older [42949400.260000] ksoftirqd/0 S B7FE2400 0 2 1 3 (L-TLB) [42949400.260000] 00000001 00000000 08bcb880 08bcb280 080900ee 08bd3e94 0805cc37 08bcb784 [42949400.260000] 08bcbd84 08059d68 08bcb880 08bcb880 08bd0000 08bd3ec8 08059dba 08bcb280 [42949400.260000] 08bcb880 08bd0000 08bd0000 08bd0000 08bcb280 00000000 08bd3ecc 08bd0000 Call Trace: [42949400.260000] 08bd3e78: [<0805cc37>] switch_to_skas+0x3b/0x85 [42949400.260000] 08bd3e98: [<08059dba>] _switch_to+0x39/0xbe [42949400.260000] 08bd3ecc: [<082babd6>] schedule+0x44c/0x4c3 [42949400.260000] 08bd3f3c: [<08090141>] ksoftirqd+0x53/0xb5 [42949400.260000] 08bd3f5c: [<0809d77e>] kthread+0x8f/0xb8 [42949400.260000] 08bd3f90: [<0807c421>] run_kernel_thread+0x45/0x50 [42949400.260000] 08bd3fcc: [<0805cd0d>] new_thread_handler+0x8c/0xb9 [42949400.260000] 08bd3ffc: [<a55a5a5a>] 0xa55a5a5a [42949400.260000] [42949400.260000] watchdog/0 S B7FE2400 0 3 1 4 2 (L-TLB) [42949400.260000] 00000001 00000000 08bcb880 08bcac80 080ab1d9 08bd7e9c 0805cc37 08bcb184 [42949400.260000] 08bcbd84 08059d68 08bcb880 08bcb880 08bd4000 08bd7ed0 08059dba 08bcac80 [42949400.260000] 08bcb880 08bd4000 08bd4000 08bd4000 08bcac80 00000000 08bd7ed4 08bd4000 Call Trace: [42949400.260000] 08bd7e80: [<0805cc37>] switch_to_skas+0x3b/0x85 [42949400.260000] 08bd7ea0: [<08059dba>] _switch_to+0x39/0xbe [42949400.260000] 08bd7ed4: [<082babd6>] schedule+0x44c/0x4c3 [42949400.260000] 08bd7f44: [<080ab235>] watchdog+0x5c/0x6b [42949400.260000] 08bd7f5c: [<0809d77e>] kthread+0x8f/0xb8 [42949400.260000] 08bd7f90: [<0807c421>] run_kernel_thread+0x45/0x50 [42949400.260000] 08bd7fcc: [<0805cd0d>] new_thread_handler+0x8c/0xb9 [42949400.260000] 08bd7ffc: [<a55a5a5a>] 0xa55a5a5a [42949400.260000] [42949400.260000] events/0 S B7FE2400 0 4 1 5 3 (L-TLB) 
[42949400.260000] 00000001 00000000 08bcb880 08bca680 08bdbf40 08bdbe38 0805cc37 08bcab84 [42949400.260000] 08bcbd84 08059d68 08bcb880 08bcb880 08bd8000 08bdbe6c 08059dba 08bca680 [42949400.260000] 08bcb880 0807cf44 083dd690 00000000 08bca680 08bdbf20 08bdbe70 08bd8000 Call Trace: [42949400.260000] 08bdbe1c: [<0805cc37>] switch_to_skas+0x3b/0x85 [42949400.260000] 08bdbe3c: [<08059dba>] _switch_to+0x39/0xbe [42949400.260000] 08bdbe70: [<082babd6>] schedule+0x44c/0x4c3 [42949400.260000] 08bdbee0: [<0809ad99>] worker_thread+0xfa/0x162 [42949400.260000] 08bdbf5c: [<0809d77e>] kthread+0x8f/0xb8 [42949400.260000] 08bdbf90: [<0807c421>] run_kernel_thread+0x45/0x50 [42949400.260000] 08bdbfcc: [<0805cd0d>] new_thread_handler+0x8c/0xb9 [42949400.260000] 08bdbffc: [<a55a5a5a>] 0xa55a5a5a [42949400.260000] [42949400.260000] khelper S B7FE2400 0 5 1 6 4 (L-TLB) [42949400.260000] 00000001 00000000 08bcb880 08bca080 08bdff40 08bdfe38 0805cc37 08bca584 [42949400.260000] 08bcbd84 08059d68 08bcb880 08bcb880 08bdc000 08bdfe6c 08059dba 08bca080 [42949400.260000] 08bcb880 08bdfe70 080880ca 08bcfc64 08bca080 08bdff20 08bdfe70 08bdc000 Call Trace: [42949400.260000] 08bdfe1c: [<0805cc37>] switch_to_skas+0x3b/0x85 [42949400.260000] 08bdfe3c: [<08059dba>] _switch_to+0x39/0xbe [42949400.260000] 08bdfe70: [<082babd6>] schedule+0x44c/0x4c3 [42949400.260000] 08bdfee0: [<0809ad99>] worker_thread+0xfa/0x162 [42949400.260000] 08bdff5c: [<0809d77e>] kthread+0x8f/0xb8 [42949400.260000] 08bdff90: [<0807c421>] run_kernel_thread+0x45/0x50 [42949400.260000] 08bdffcc: [<0805cd0d>] new_thread_handler+0x8c/0xb9 [42949400.260000] 08bdfffc: [<a55a5a5a>] 0xa55a5a5a [42949400.260000] [42949400.260000] kthread S B7FE2400 0 6 1 42 5 (L-TLB) [42949400.260000] 00000001 00000000 08bcb880 08be1900 08be7f40 08be7e38 0805cc37 08be1e04 [42949400.260000] 08bcbd84 08059d68 08bcb880 08bcb880 08be4000 08be7e6c 08059dba 08be1900 [42949400.260000] 08bcb880 080880ca 08bcfbe8 00000003 08be1900 08be7f20 08be7e70 08be4000 Call 
Trace: [42949400.260000] 08be7e1c: [<0805cc37>] switch_to_skas+0x3b/0x85 [42949400.260000] 08be7e3c: [<08059dba>] _switch_to+0x39/0xbe [42949400.260000] 08be7e70: [<082babd6>] schedule+0x44c/0x4c3 [42949400.260000] 08be7ee0: [<0809ad99>] worker_thread+0xfa/0x162 [42949400.260000] 08be7f5c: [<0809d77e>] kthread+0x8f/0xb8 [42949400.260000] 08be7f90: [<0807c421>] run_kernel_thread+0x45/0x50 [42949400.260000] 08be7fcc: [<0805cd0d>] new_thread_handler+0x8c/0xb9 [42949400.260000] 08be7ffc: [<a55a5a5a>] 0xa55a5a5a [42949400.260000] [42949400.260000] kblockd/0 S B7FE2400 0 42 6 43 (L-TLB) [42949400.260000] 00000001 00000000 08be1900 08bbd380 00000000 08a63e38 0805cc37 08bbd884 [42949400.260000] 08be1e04 08059d68 08be1900 08be1900 08a60000 08a63e6c 08059dba 08bbd380 [42949400.260000] 08be1900 5a5a5a5a 5a5a5a5a 5a5a5a5a 08be1900 08a63e6c 080877e0 08a60000 Call Trace: [42949400.260000] 08a63e1c: [<0805cc37>] switch_to_skas+0x3b/0x85 [42949400.260000] 08a63e3c: [<08059dba>] _switch_to+0x39/0xbe [42949400.260000] 08a63e70: [<082babd6>] schedule+0x44c/0x4c3 [42949400.260000] 08a63ee0: [<0809ad99>] worker_thread+0xfa/0x162 [42949400.260000] 08a63f5c: [<0809d77e>] kthread+0x8f/0xb8 [42949400.260000] 08a63f90: [<0807c421>] run_kernel_thread+0x45/0x50 [42949400.260000] 08a63fcc: [<0805cd0d>] new_thread_handler+0x8c/0xb9 [42949400.260000] 08a63ffc: [<a55a5a5a>] 0xa55a5a5a [42949400.260000] [42949400.260000] cqueue/0 S B7FE2400 0 43 6 55 42 (L-TLB) [42949400.260000] 00000001 00000000 08bca080 08bbcd80 00000000 08a67e38 0805cc37 08bbd284 [42949400.260000] 08bca584 08059d68 08bca080 08bca080 08a64000 08a67e6c 08059dba 08bbcd80 [42949400.260000] 08bca080 5a5a5a5a 5a5a5a5a 5a5a5a5a 08bca080 08a67e6c 080877e0 08a64000 Call Trace: [42949400.260000] 08a67e1c: [<0805cc37>] switch_to_skas+0x3b/0x85 [42949400.260000] 08a67e3c: [<08059dba>] _switch_to+0x39/0xbe [42949400.260000] 08a67e70: [<082babd6>] schedule+0x44c/0x4c3 [42949400.260000] 08a67ee0: [<0809ad99>] worker_thread+0xfa/0x162 
[42949400.260000] 08a67f5c: [<0809d77e>] kthread+0x8f/0xb8 [42949400.260000] 08a67f90: [<0807c421>] run_kernel_thread+0x45/0x50 [42949400.260000] 08a67fcc: [<0805cd0d>] new_thread_handler+0x8c/0xb9 [42949400.260000] 08a67ffc: [<a55a5a5a>] 0xa55a5a5a [42949400.260000] [42949400.260000] pdflush S B7FE2400 0 55 6 56 43 (L-TLB) [42949400.260000] 00000001 00000000 08a5f280 08a5ec80 080b1dc7 08a7be5c 0805cc37 08a5f184 [42949400.260000] 08a5f784 08059d68 08a5f280 08a5f280 08a78000 08a7be90 08059dba 08a5ec80 [42949400.260000] 08a5f280 5a5a5a5a 0807cf25 083aed14 08a5ec80 08a7bf38 08a7be94 08a78000 Call Trace: [42949400.260000] 08a7be40: [<0805cc37>] switch_to_skas+0x3b/0x85 [42949400.260000] 08a7be60: [<08059dba>] _switch_to+0x39/0xbe [42949400.260000] 08a7be94: [<082babd6>] schedule+0x44c/0x4c3 [42949400.260000] 08a7bf04: [<080b1d0c>] __pdflush+0x94/0x14f [42949400.260000] 08a7bf20: [<080b1e0d>] pdflush+0x46/0x48 [42949400.260000] 08a7bf5c: [<0809d77e>] kthread+0x8f/0xb8 [42949400.260000] 08a7bf90: [<0807c421>] run_kernel_thread+0x45/0x50 [42949400.260000] 08a7bfcc: [<0805cd0d>] new_thread_handler+0x8c/0xb9 [42949400.260000] 08a7bffc: [<a55a5a5a>] 0xa55a5a5a [42949400.260000] [42949400.260000] pdflush S B7FE2400 0 56 6 57 55 (L-TLB) [42949400.260000] 00000001 00000000 08bcb880 08a5f280 080b1dc7 08a77e5c 0805cc37 08a5f784 [42949400.260000] 08bcbd84 08059d68 08bcb880 08bcb880 08a74000 08a77e90 08059dba 08a5f280 [42949400.260000] 08bcb880 0807ce7c 083aed14 00000000 08a5f280 08a77f38 08a77e94 08a74000 Call Trace: [42949400.260000] 08a77e40: [<0805cc37>] switch_to_skas+0x3b/0x85 [42949400.260000] 08a77e60: [<08059dba>] _switch_to+0x39/0xbe [42949400.260000] 08a77e94: [<082babd6>] schedule+0x44c/0x4c3 [42949400.260000] 08a77f04: [<080b1d0c>] __pdflush+0x94/0x14f [42949400.260000] 08a77f20: [<080b1e0d>] pdflush+0x46/0x48 [42949400.260000] 08a77f5c: [<0809d77e>] kthread+0x8f/0xb8 [42949400.260000] 08a77f90: [<0807c421>] run_kernel_thread+0x45/0x50 [42949400.260000] 08a77fcc: 
[<0805cd0d>] new_thread_handler+0x8c/0xb9 [42949400.260000] 08a77ffc: [<a55a5a5a>] 0xa55a5a5a [42949400.260000] [42949400.260000] kswapd0 S B7FE2400 0 57 6 58 56 (L-TLB) [42949400.260000] 00000001 00000000 08bbc180 08a5f880 00000001 08a73e44 0805cc37 08a5fd84 [42949400.260000] 08bbc684 08059d68 08bbc180 08bbc180 08a70000 08a73e78 08059dba 08a5f880 [42949400.260000] 08bbc180 5a5a5a5a 5a5a5a5a 5a5a5a5a 08a5f880 00000000 08a73e7c 08a70000 Call Trace: [42949400.260000] 08a73e28: [<0805cc37>] switch_to_skas+0x3b/0x85 [42949400.260000] 08a73e48: [<08059dba>] _switch_to+0x39/0xbe [42949400.260000] 08a73e7c: [<082babd6>] schedule+0x44c/0x4c3 [42949400.260000] 08a73eec: [<080b45cc>] kswapd+0xd4/0xf3 [42949400.260000] 08a73f5c: [<0809d77e>] kthread+0x8f/0xb8 [42949400.260000] 08a73f90: [<0807c421>] run_kernel_thread+0x45/0x50 [42949400.260000] 08a73fcc: [<0805cd0d>] new_thread_handler+0x8c/0xb9 [42949400.260000] 08a73ffc: [<a55a5a5a>] 0xa55a5a5a [42949400.260000] [42949400.260000] aio/0 S B7FE2400 0 58 6 692 57 (L-TLB) [42949400.260000] 00000001 00000000 08bca080 08bbc180 00000000 08a6fe38 0805cc37 08bbc684 [42949400.260000] 08bca584 08059d68 08bca080 08bca080 08a6c000 08a6fe6c 08059dba 08bbc180 [42949400.260000] 08bca080 5a5a5a5a 5a5a5a5a 5a5a5a5a 08bca080 08a6fe6c 080877e0 08a6c000 Call Trace: [42949400.260000] 08a6fe1c: [<0805cc37>] switch_to_skas+0x3b/0x85 [42949400.260000] 08a6fe3c: [<08059dba>] _switch_to+0x39/0xbe [42949400.260000] 08a6fe70: [<082babd6>] schedule+0x44c/0x4c3 [42949400.260000] 08a6fee0: [<0809ad99>] worker_thread+0xfa/0x162 [42949400.260000] 08a6ff5c: [<0809d77e>] kthread+0x8f/0xb8 [42949400.260000] 08a6ff90: [<0807c421>] run_kernel_thread+0x45/0x50 [42949400.260000] 08a6ffcc: [<0805cd0d>] new_thread_handler+0x8c/0xb9 [42949400.260000] 08a6fffc: [<a55a5a5a>] 0xa55a5a5a [42949400.260000] [42949400.260000] kjournald S B7FE2400 0 692 6 58 (L-TLB) [42949400.260000] 00000001 00000000 08bcb880 089de100 08ae13cc 088abe48 0805cc37 089de604 [42949400.260000] 
08bcbd84 08059d68 08bcb880 08bcb880 088a8000 088abe7c 08059dba 089de100 [42949400.260000] 08bcb880 0807cf44 088abe99 00000015 089de100 00000001 088abe80 088a8000 Call Trace: [42949400.260000] 088abe2c: [<0805cc37>] switch_to_skas+0x3b/0x85 [42949400.260000] 088abe4c: [<08059dba>] _switch_to+0x39/0xbe [42949400.260000] 088abe80: [<082babd6>] schedule+0x44c/0x4c3 [42949400.260000] 088abef0: [<081370c8>] kjournald+0x14b/0x1b0 [42949400.260000] 088abf5c: [<0809d77e>] kthread+0x8f/0xb8 [42949400.260000] 088abf90: [<0807c421>] run_kernel_thread+0x45/0x50 [42949400.260000] 088abfcc: [<0805cd0d>] new_thread_handler+0x8c/0xb9 [42949400.260000] 088abffc: [<a55a5a5a>] 0xa55a5a5a [42949400.260000] |
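A dump like this is easier to scan once it is reduced to one line per task. The sketch below assumes the dump is available with one log entry per line, in the layout shown above (timestamp, task name, state letter, PC, stack, pid, then a call trace). The regular expressions and the function names `parse_tasks` and `blocked_in` are mine, written against this output only, so other kernels may print a different layout.

```python
import re

# Task header: optional "[timestamp]", name, one-letter state, PC, stack, pid.
TASK_RE = re.compile(
    r"^(?:\[\s*\d+\.\d+\]\s+)?(\S+)\s+([RSDTZX])\s+[0-9A-Fa-f]{8}\s+\d+\s+(\d+)\s"
)
# Call-trace frame: "[<address>] symbol+0xoff/0xlen".
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\]\s+(\w+)\+0x")


def parse_tasks(dump):
    """Return a list of dicts with name, state, pid and call-trace symbols."""
    tasks = []
    for line in dump.splitlines():
        header = TASK_RE.match(line)
        if header:
            tasks.append({"name": header.group(1), "state": header.group(2),
                          "pid": int(header.group(3)), "frames": []})
            continue
        frame = FRAME_RE.search(line)
        if frame and tasks:
            tasks[-1]["frames"].append(frame.group(1))
    return tasks


def blocked_in(task):
    """Name the function that called schedule(), i.e. where the task sleeps."""
    frames = task["frames"]
    if "schedule" in frames and frames.index("schedule") + 1 < len(frames):
        return frames[frames.index("schedule") + 1]
    return frames[0] if frames else None


# Two entries copied from the dump above (register words omitted).
sample = """\
[42949400.260000] ksoftirqd/0 S B7FE2400 0 2 1 3 (L-TLB)
[42949400.260000] 08bd3e78: [<0805cc37>] switch_to_skas+0x3b/0x85
[42949400.260000] 08bd3e98: [<08059dba>] _switch_to+0x39/0xbe
[42949400.260000] 08bd3ecc: [<082babd6>] schedule+0x44c/0x4c3
[42949400.260000] 08bd3f3c: [<08090141>] ksoftirqd+0x53/0xb5
[42949400.260000] kjournald S B7FE2400 0 692 6 58 (L-TLB)
[42949400.260000] 088abe80: [<082babd6>] schedule+0x44c/0x4c3
[42949400.260000] 088abef0: [<081370c8>] kjournald+0x14b/0x1b0
"""

for task in parse_tasks(sample):
    print(task["name"], task["pid"], task["state"], blocked_in(task))
```

On the dump above, every listed kernel thread is in state S and sleeping in its normal idle loop, so the summary mainly helps confirm that none of them is the one spinning.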
From: Antoine M. <an...@na...> - 2007-01-19 11:00:21
|
Is anyone else having stability problems on AMD64? Or am I the only one using it?

Not only is x86 UML broken on AMD64, I am also getting (seemingly) random crashes running 64-bit UML. The Gentoo guests in particular randomly fail to compile code, failing in completely random places. When you try again, it might work, or just fail later when you run the binary... These used to work reliably before, so I am not just doing it wrong (as is often the case ;)

To summarize, that leaves us with:
* x86 unable to run 2.6 kernels due to a bug in host >2.6.16 (and most x86 guests require 2.6 to use the latest glibc)
* amd64 unable to run x86 guests
* amd64 unable to run amd64 guests reliably
...

Antoine
|
From: Antoine M. <an...@na...> - 2007-01-21 14:49:55
|
Joel Palmius wrote: > Confirmed on my athlon64 gentoo setup. I've been running 2.6.14.3 as > host kernel for ages (since I was too wimpy to try to upgrade a host > kernel remote on a machine that required binary proprietary drivers). > > On 2.6.14.3 x86_64 all my 32bit UMLs run fine with various guest kernels > compiled in various circumstances. on x86 I was able to upgrade as far as 2.6.15.x, 2.6.16 and later broke. So you may be able to upgrade by one minor release (not much - and 2.6.15 is no longer maintained, whereas 2.6.16 is... damn) > On 2.6.18-gentoo-r6 x86_64 (genkernel), all guest UMLs spin up to 100% > and does nothing, no output whatsoever. Finally someone confirms what I have been seeing for ages! Maybe the devs can find out what is going on now... It does look like a host kernel issue, but since other people are reporting no errors with the same kernels, I wouldn't completely discard the possibility that glibc has something to do with this. Also, there was a report today that *_FS_SECURITY is causing problems. Do you happen to have *_FS_SECURITY options switched on (on the host)? Antoine > Strace says: > > execve("./vmlinux", ["./vmlinux"], [/* 28 vars */]) = 0 > [ Process PID=13658 runs in 32 bit mode. 
] > uname({sys="Linux", node="master", ...}) = 0 > brk(0) = 0xffffffffa0314000 > brk(0xa0314844) = 0xffffffffa0314844 > set_thread_area(0xffdb51d0) = 0 > brk(0xa0335844) = 0xffffffffa0335844 > brk(0xa0336000) = 0xffffffffa0336000 > getrlimit(RLIMIT_STACK, {rlim_cur=-4286578688, rlim_max=4292563436}) = 0 > rt_sigaction(SIGINT, {0xc0000000a001cad8, [], 0}, NULL, 8) = 0 > rt_sigprocmask(SIG_UNBLOCK, [INT], NULL, 8) = 0 > rt_sigaction(SIGTERM, {0xc0000000a001cad8, [], > SA_INTERRUPT|SA_ONESHOT|0x161e48}, NULL, 8) = 0 > rt_sigprocmask(SIG_UNBLOCK, [TERM], NULL, 8) = 0 > rt_sigaction(SIGHUP, {0xc0000000a001cad8, [], 0}, NULL, 8) = 0 > rt_sigprocmask(SIG_UNBLOCK, [HUP], NULL, 8) = 0 > fstat64(0x1, 0xffdb4ad8) = 0 > mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, > 0x1000) = 0xfffffffff7fe8000 > mmap2(NULL, 4096, PROT_READ|PROT_WRITE|PROT_EXEC, > MAP_PRIVATE|MAP_ANONYMOUS, -1, 0x1000) = 0xfffffffff7fe7000 > clone(child_stack=0xf7fe7fd4, flags=|SIGCHLD) = 13659 > --- SIGCHLD (Child exited) @ 0 (0) --- > waitpid(13659, [{WIFSTOPPED(s) && WSTOPSIG(s) == SIGSTOP}], WSTOPPED) = > 13659 > ptrace(0x15 /* PTRACE_??? */, 13659, 0, 0x1) = -1 EINVAL (Invalid > argument) > > > Pity... I had finally decided to upgrade the host kernel... :-) > > // Joel > > > On Fri, 19 Jan 2007, Antoine Martin wrote: > >> Antoine Martin wrote: >>>>>> I have downgraded the x86 boxes to 2.6.15.7 and these are up and >>>>>> running again. But I can't do that for all of them, and this is just >>>>>> not an option for some of the amd64 boxes. >>>>> >>>>> My setup is: >>>> Thanks for that. That is very similar to mine. >>>> I don't think this has anything to do with the guest... So I'll try to >>>> remove the skas3 patch from the host and see how it goes. >>>> >>> I did, and no improvement... x86 guests still hang. >>> Could you post a binary guest kernel somewhere so I can try that? 
>>> (even if it isn't static - glibc should be similar since we're using >>> Gentoo amd64) >>> If that still does not work then I can be certain that it is something >>> to do with the host. >> I've just tried on 3 more hosts, all AMD64 Gentoo fully up to date, >> kernel 2.6.19.2. No skas, no exec shield, no selinux, plain kernel.org: >> None of them work with any of the 32-bit kernels! >> It prints nothing, just sits there spinning at 100% cpu. >> So I am now totally convinced that i haven't got a weird setup. >> Something else is broken in UML. >> >> On fully up to date Fedora Core 6 x86_64, the kernel does display >> something before crashing: >> # uname -a >> Linux localhost.localdomain 2.6.18-1.2869.fc6 #1 SMP Wed Dec 20 14:51:34 >> EST 2006 x86_64 x86_64 x86_64 GNU/Linux >> # ./kernel32-2.6.19.2 >> Checking that ptrace can change system call numbers...OK >> Checking syscall emulation patch for ptrace...missing >> Checking for tmpfs mount on /dev/shm...OK >> Checking PROT_EXEC mmap in /dev/shm/...OK >> Checking for the skas3 patch in the host: >> - /proc/mm...not found >> - PTRACE_FAULTINFO...not found >> - PTRACE_LDT...not found >> UML running in SKAS0 mode >> >> [root@localhost home]# >> >> This is 100% repeatable. Plain Fedora. >> Many users will have a similar setup and will just give up on UML. >> So I as I said before, UML is currently unusable for most people out >> there running fairly recent systems. >> >> Antoine >> >> ------------------------------------------------------------------------- >> Take Surveys. Earn Cash. Influence the Future of IT >> Join SourceForge.net's Techsay panel and you'll get the chance to >> share your >> opinions on IT & business topics through brief surveys - and earn cash >> http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV >> _______________________________________________ >> User-mode-linux-devel mailing list >> Use...@li... 
>> https://lists.sourceforge.net/lists/listinfo/user-mode-linux-devel >> > |
From: Joel P. <joe...@mi...> - 2007-01-21 14:54:31
|
master linux-2.6.18-gentoo-r6 # grep -i security .config
CONFIG_EXT2_FS_SECURITY=y
CONFIG_EXT3_FS_SECURITY=y
CONFIG_REISERFS_FS_SECURITY=y
CONFIG_JFS_SECURITY=y
CONFIG_XFS_SECURITY=y
# Security options
# CONFIG_SECURITY is not set
master linux-2.6.18-gentoo-r6 # cd ..
master src # cd linux-2.6.14.3/
master linux-2.6.14.3 # grep -i security .config
# CONFIG_EXT2_FS_SECURITY is not set
# CONFIG_EXT3_FS_SECURITY is not set
# CONFIG_REISERFS_FS_SECURITY is not set
# CONFIG_JFS_SECURITY is not set
# CONFIG_XFS_SECURITY is not set
# Security options
# CONFIG_SECURITY is not set
master linux-2.6.14.3 #

Hrm.. Interesting.. I'll tinker with this and see what happens.

// Joel

On Sun, 21 Jan 2007, Antoine Martin wrote:

> Joel Palmius wrote:
>> Confirmed on my athlon64 gentoo setup. I've been running 2.6.14.3 as
>> host kernel for ages (since I was too wimpy to try to upgrade a host
>> kernel remote on a machine that required binary proprietary drivers).
>>
>> On 2.6.14.3 x86_64 all my 32bit UMLs run fine with various guest kernels
>> compiled in various circumstances.
> on x86 I was able to upgrade as far as 2.6.15.x, 2.6.16 and later broke.
> So you may be able to upgrade by one minor release (not much - and
> 2.6.15 is no longer maintained, whereas 2.6.16 is... damn)
>
>> On 2.6.18-gentoo-r6 x86_64 (genkernel), all guest UMLs spin up to 100%
>> and does nothing, no output whatsoever.
> Finally someone confirms what I have been seeing for ages!
> Maybe the devs can find out what is going on now...
>
> It does look like a host kernel issue, but since other people are
> reporting no errors with the same kernels, I wouldn't completely discard
> the possibility that glibc has something to do with this.
> Also, there was a report today that *_FS_SECURITY is causing problems.
> Do you happen to have *_FS_SECURITY options switched on (on the host)?
>
> Antoine
|
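The two grep runs above compare the security-related options of the old and new host kernels by eye. The same comparison can be sketched in Python; this is only an illustration, the helper names `parse_kconfig` and `diff_options` are made up here, and the sample text is copied from the grep output in the message rather than read from a real `.config`.

```python
import re

# Option lines as shown by "grep -i security .config" in the message above.
OLD_CONFIG = """\
# CONFIG_EXT2_FS_SECURITY is not set
# CONFIG_EXT3_FS_SECURITY is not set
# CONFIG_REISERFS_FS_SECURITY is not set
# CONFIG_JFS_SECURITY is not set
# CONFIG_XFS_SECURITY is not set
# Security options
# CONFIG_SECURITY is not set
"""

NEW_CONFIG = """\
CONFIG_EXT2_FS_SECURITY=y
CONFIG_EXT3_FS_SECURITY=y
CONFIG_REISERFS_FS_SECURITY=y
CONFIG_JFS_SECURITY=y
CONFIG_XFS_SECURITY=y
# Security options
# CONFIG_SECURITY is not set
"""


def parse_kconfig(text):
    """Return {option: value}; options marked 'is not set' map to 'n'."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        unset = re.fullmatch(r"# (CONFIG_\w+) is not set", line)
        if unset:
            options[unset.group(1)] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def diff_options(old_text, new_text, pattern):
    """Return {option: (old, new)} for options matching pattern that differ."""
    old, new = parse_kconfig(old_text), parse_kconfig(new_text)
    names = sorted(n for n in set(old) | set(new) if re.search(pattern, n))
    return {n: (old.get(n), new.get(n)) for n in names if old.get(n) != new.get(n)}


if __name__ == "__main__":
    for name, (before, after) in diff_options(OLD_CONFIG, NEW_CONFIG, "SECURITY").items():
        print(f"{name}: {before} -> {after}")
```

On real trees the two strings would be replaced by the contents of each kernel's `.config` file.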
From: Joel P. <joe...@mi...> - 2007-01-21 16:10:40
|
I tested a (host) kernel build without the *_FS_SECURITY thingies, but with same result. I guess the problem is something else.

// Joel
|
From: Jeff D. <jd...@ad...> - 2007-01-22 22:16:11
|
On Sun, Jan 21, 2007 at 02:46:12PM +0000, Antoine Martin wrote: > Finally someone confirms what I have been seeing for ages! > Maybe the devs can find out what is going on now... OK, can someone give me access to a box where this is happening? Jeff -- Work email - jdike at linux dot intel dot com |
From: Antoine M. <an...@na...> - 2007-01-23 00:59:55
|
Jeff Dike wrote: > On Sun, Jan 21, 2007 at 02:46:12PM +0000, Antoine Martin wrote: >> Finally someone confirms what I have been seeing for ages! >> Maybe the devs can find out what is going on now... > > OK, can someone give me access to a box where this is happening? Sure thing. Just let me know when you want to access it and I'll make sure it is available. Also send me your public key and I'll give you ssh root. Antoine |
From: Jeff D. <jd...@ad...> - 2007-01-17 23:32:48
|
On Wed, Jan 17, 2007 at 05:15:35PM +0100, Blaisorblade wrote: > kjournald starting. Commit interval 5 seconds > EXT3-fs: mounted filesystem with ordered data mode. > VFS: Mounted root (ext3 filesystem) readonly. > > it hangs giving the following result at strace -p (I've printed two > consecutive iterations of the same messages to show that they are the same): I'm chasing something on i386 with the same symptoms, but I think it's a different problem. What I'm seeing is init segfaulting on some hosts, but not others. Figure out what the segfault is, and where it's happening. Jeff -- Work email - jdike at linux dot intel dot com |
From: Daniel G. <da...@ge...> - 2007-01-19 16:08:17
|
On Fri, 2007-01-19 at 11:00 +0000, Antoine Martin wrote:
> Is anyone else having stability problems on AMD64? Or am I the only one
> using it?
> Not only is x86 UML broken on AMD64, I am also getting (seemingly)
> random crashes running 64-bit UML.
> The Gentoo guests in particular randomly fail to compile code - failing
> in completely random places. When you try again, it might work or just
> fail later when you run the binary...
> These used to work reliably before so I am not just doing it wrong (as
> is often the case ;)
> To summarize, that leaves us with:
> * x86 unable to run kernels 2.6 due to a bug in host >2.6.16
> (and most x86 guests require 2.6 to use latest glibc)
> * amd64 unable to run x86 guests
> * amd64 unable to run amd64 guests reliably
> ...
>

I'm the gentoo usermode-sources maintainer, and I run UML almost
exclusively on amd64. I have, at any given time, from 4 to 20 UMLs
running on my main dev box, most of them 32-bit, and I haven't had any
stability problems.

Is your RAM OK? Which kernels do you have inside and outside? How are
you trying to build the 32-bit UMLs?

Daniel
|
From: Antoine M. <an...@na...> - 2007-01-19 16:14:53
|
Daniel Gryniewicz wrote: > On Fri, 2007-01-19 at 11:00 +0000, Antoine Martin wrote: >> Is anyone else having stability problems on AMD64? Or am I the only one >> using it? >> Not only is x86 UML broken on AMD64, I am also getting (seemingly) >> random crashes running 64-bit UML. >> The Gentoo guests in particular randomly fail to compile code - failing >> in completely random places. When you try again, it might work or just >> fail later when you run the binary... >> These used to work reliably before so I am not just doing it wrong (as >> is often the case ;) >> To summarize, that leaves us with: >> * x86 unable to run kernels 2.6 due to a bug in host >2.6.16 >> (and most x86 guests require 2.6 to use latest glibc) >> * amd64 unable to run x86 guests >> * amd64 unable to run amd64 guests reliably >> ... >> > > I'm the gentoo usermode-sources maintainer, and I run UML almost > exclusively on amd64. I have, at any given time, from 4 to 20 UMLs > running on my main dev box, most of them 32-bit, and I haven't had any > stability problems. Can you send me more details of your setup as I am quite experienced with UML and now completely stuck with AMD64. > Is your RAM OK? Oh yes, this has nothing to do with the hardware. > Which kernels do you have inside and outside? Host (outside): 2.6.19.2-skas3-v8.2 Inside: pick any of those (none of them work with >=2.6.16): http://uml.nagafix.co.uk/ > How are > you trying to build the 32-bit UMLs? These are the same as on the site above (which I maintain) and these have worked flawlessly for a long long time. I have downgraded the x86 boxes to 2.6.15.7 and these are up and running again. But I can't do that for all of them, and this is just not an option for some of the amd64 boxes. Thanks Antoine |
From: Antoine M. <an...@na...> - 2007-01-19 16:38:57
|
>> Which kernels do you have inside and outside? > Host (outside): > 2.6.19.2-skas3-v8.2 Note: skas3 v9 has some problems, but I tried that too. And I wouldn't blame the guest builds either. These guest kernels have worked for many people in the past, including me. And the image has not changed. I've tried all the versions you can find on: http://uml.nagafix.co.uk/kernels/ And some of Blaisorblade's kernels too (>2.6.15 as I need TLS) Any ideas? Anyone? Antoine |
From: Daniel G. <da...@ge...> - 2007-01-19 16:38:42
|
On Fri, 2007-01-19 at 16:14 +0000, Antoine Martin wrote:
> Daniel Gryniewicz wrote:
> > I'm the gentoo usermode-sources maintainer, and I run UML almost
> > exclusively on amd64. I have, at any given time, from 4 to 20 UMLs
> > running on my main dev box, most of them 32-bit, and I haven't had any
> > stability problems.
> Can you send me more details of your setup as I am quite experienced
> with UML and now completely stuck with AMD64.
>
> > Is your RAM OK?
> Oh yes, this has nothing to do with the hardware.
>
> > Which kernels do you have inside and outside?
> Host (outside):
> 2.6.19.2-skas3-v8.2

Okay, that's one big difference. I have a stock kernel (well,
gentoo-sources) on the outside, not a SKAS3 patched kernel. Maybe that
makes a difference on amd64?

> Inside: pick any of those (none of them work with >=2.6.16):
> http://uml.nagafix.co.uk/
>
> > How are
> > you trying to build the 32-bit UMLs?
> These are the same as on the site above (which I maintain) and these
> have worked flawlessly for a long long time.
>
> I have downgraded the x86 boxes to 2.6.15.7 and these are up and running
> again. But I can't do that for all of them, and this is just not an
> option for some of the amd64 boxes.

My setup is:

Host boxes:
- Dual dual-core opteron 265 with 6GB of RAM running 2.6.17-gentoo-r8
- Turion laptop with 2GB of RAM running currently 2.6.20-rc5 (stock) but
  most recently 2.6.19-ck1 (stock).
- Athlon64 3000+ with 1GB of RAM running ck-sources-2.6.19_p2-r3

Notice, no UML related patches on the host boxes.

UML Boxes:
- usermode-sources-2.6.16-r6 (2.6.16 with genpatches 15 and -bs2)
- usermode-sources-2.6.18-r1 (2.6.18 with genpatches 8 and -bb2)
- Modified usermode-sources-2.6.14-r6 (2.6.14 with -bs3 and networking
  patches for work) (this one also runs on 32-bit hosts running various
  gentoo kernels on workstations at work)

All of these run some form of Gentoo inside; not necessarily recent.
Each of them runs in both 32 and 64 bit versions. I have also briefly run a UML with a fedora core 2 image, running a usermode-sources-2.6.18 kernel, on the opteron box, but that was only very briefly. That's a 32-bit only UML. Daniel |
From: Antoine M. <an...@na...> - 2007-01-19 16:43:10
|
>>> Which kernels do you have inside and outside? >> Host (outside): >> 2.6.19.2-skas3-v8.2 > > Okay, that's one big difference. I have a stock kernel (well, > gentoo-sources) on the outside, not a SKAS3 patched kernel. Maybe that > makes a difference on amd64? I wouldn't think so, as it is always possible to disable skas3 (and in fact you have to disable it as /proc/mm mm64 isn't fully working on amd64) on the command line using skas0 or noprocmm, etc.. (Which I did) >> I have downgraded the x86 boxes to 2.6.15.7 and these are up and running >> again. But I can't do that for all of them, and this is just not an >> option for some of the amd64 boxes. > > My setup is: Thanks for that. That is very similar to mine. I don't think this has anything to do with the guest... So I'll try to remove the skas3 patch from the host and see how it goes. Antoine |
From: Antoine M. <an...@na...> - 2007-01-19 17:34:09
|
>>> I have downgraded the x86 boxes to 2.6.15.7 and these are up and >>> running again. But I can't do that for all of them, and this is just >>> not an option for some of the amd64 boxes. >> >> My setup is: > Thanks for that. That is very similar to mine. > I don't think this has anything to do with the guest... So I'll try to > remove the skas3 patch from the host and see how it goes. > I did, and no improvement... x86 guests still hang. Could you post a binary guest kernel somewhere so I can try that? (even if it isn't static - glibc should be similar since we're using Gentoo amd64) If that still does not work then I can be certain that it is something to do with the host. Thanks Antoine BTW, anyone know if UML guests are compatible with exec-shield? |
From: Antoine M. <an...@na...> - 2007-01-19 19:56:34
|
Antoine Martin wrote:
>>>> I have downgraded the x86 boxes to 2.6.15.7 and these are up and
>>>> running again. But I can't do that for all of them, and this is just
>>>> not an option for some of the amd64 boxes.
>>>
>>> My setup is:
>> Thanks for that. That is very similar to mine.
>> I don't think this has anything to do with the guest... So I'll try to
>> remove the skas3 patch from the host and see how it goes.
>>
> I did, and no improvement... x86 guests still hang.
> Could you post a binary guest kernel somewhere so I can try that?
> (even if it isn't static - glibc should be similar since we're using
> Gentoo amd64)
> If that still does not work then I can be certain that it is something
> to do with the host.

I've just tried on 3 more hosts, all AMD64 Gentoo fully up to date,
kernel 2.6.19.2. No skas, no exec shield, no selinux, plain kernel.org:
None of them work with any of the 32-bit kernels!
It prints nothing, just sits there spinning at 100% cpu.
So I am now totally convinced that I haven't got a weird setup.
Something else is broken in UML.

On fully up to date Fedora Core 6 x86_64, the kernel does display
something before crashing:

# uname -a
Linux localhost.localdomain 2.6.18-1.2869.fc6 #1 SMP Wed Dec 20 14:51:34 EST 2006 x86_64 x86_64 x86_64 GNU/Linux
# ./kernel32-2.6.19.2
Checking that ptrace can change system call numbers...OK
Checking syscall emulation patch for ptrace...missing
Checking for tmpfs mount on /dev/shm...OK
Checking PROT_EXEC mmap in /dev/shm/...OK
Checking for the skas3 patch in the host:
  - /proc/mm...not found
  - PTRACE_FAULTINFO...not found
  - PTRACE_LDT...not found
UML running in SKAS0 mode

[root@localhost home]#

This is 100% repeatable. Plain Fedora.
Many users will have a similar setup and will just give up on UML.
So, as I said before, UML is currently unusable for most people out
there running fairly recent systems.

Antoine
|
From: Joel P. <joe...@mi...> - 2007-01-21 14:58:53
|
Confirmed on my athlon64 gentoo setup. I've been running 2.6.14.3 as
host kernel for ages (since I was too wimpy to try to upgrade a host
kernel remotely on a machine that required binary proprietary drivers).

On 2.6.14.3 x86_64 all my 32bit UMLs run fine with various guest kernels
compiled in various circumstances.

On 2.6.18-gentoo-r6 x86_64 (genkernel), all guest UMLs spin up to 100%
CPU and do nothing, no output whatsoever.

Strace says:

execve("./vmlinux", ["./vmlinux"], [/* 28 vars */]) = 0
[ Process PID=13658 runs in 32 bit mode. ]
uname({sys="Linux", node="master", ...}) = 0
brk(0) = 0xffffffffa0314000
brk(0xa0314844) = 0xffffffffa0314844
set_thread_area(0xffdb51d0) = 0
brk(0xa0335844) = 0xffffffffa0335844
brk(0xa0336000) = 0xffffffffa0336000
getrlimit(RLIMIT_STACK, {rlim_cur=-4286578688, rlim_max=4292563436}) = 0
rt_sigaction(SIGINT, {0xc0000000a001cad8, [], 0}, NULL, 8) = 0
rt_sigprocmask(SIG_UNBLOCK, [INT], NULL, 8) = 0
rt_sigaction(SIGTERM, {0xc0000000a001cad8, [], SA_INTERRUPT|SA_ONESHOT|0x161e48}, NULL, 8) = 0
rt_sigprocmask(SIG_UNBLOCK, [TERM], NULL, 8) = 0
rt_sigaction(SIGHUP, {0xc0000000a001cad8, [], 0}, NULL, 8) = 0
rt_sigprocmask(SIG_UNBLOCK, [HUP], NULL, 8) = 0
fstat64(0x1, 0xffdb4ad8) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0x1000) = 0xfffffffff7fe8000
mmap2(NULL, 4096, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0x1000) = 0xfffffffff7fe7000
clone(child_stack=0xf7fe7fd4, flags=|SIGCHLD) = 13659
--- SIGCHLD (Child exited) @ 0 (0) ---
waitpid(13659, [{WIFSTOPPED(s) && WSTOPSIG(s) == SIGSTOP}], WSTOPPED) = 13659
ptrace(0x15 /* PTRACE_??? */, 13659, 0, 0x1) = -1 EINVAL (Invalid argument)

Pity... I had finally decided to upgrade the host kernel... :-)

// Joel
|
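The last strace line shows a request that strace could not name (`0x15 /* PTRACE_??? */`). A small lookup can put a name on it. The table below is a partial list of request numbers as they are defined for x86 in the Linux ptrace headers, written from memory of those headers; the function name `decode_ptrace_request` is made up for this sketch. It only reads the numbers and is not a diagnosis of why the host returned EINVAL.

```python
# Partial table of x86 Linux ptrace request numbers (assumed from the
# kernel/glibc ptrace headers; not exhaustive).
PTRACE_REQUESTS = {
    0: "PTRACE_TRACEME",
    1: "PTRACE_PEEKTEXT",
    2: "PTRACE_PEEKDATA",
    3: "PTRACE_PEEKUSR",
    4: "PTRACE_POKETEXT",
    5: "PTRACE_POKEDATA",
    6: "PTRACE_POKEUSR",
    7: "PTRACE_CONT",
    8: "PTRACE_KILL",
    9: "PTRACE_SINGLESTEP",
    12: "PTRACE_GETREGS",
    13: "PTRACE_SETREGS",
    14: "PTRACE_GETFPREGS",
    15: "PTRACE_SETFPREGS",
    16: "PTRACE_ATTACH",
    17: "PTRACE_DETACH",
    21: "PTRACE_OLDSETOPTIONS",
    24: "PTRACE_SYSCALL",
    0x4200: "PTRACE_SETOPTIONS",
}

# Option bit passed as the data argument in the strace line (0x1).
PTRACE_O_TRACESYSGOOD = 0x1


def decode_ptrace_request(request):
    """Return the symbolic name for a ptrace request number, if known."""
    return PTRACE_REQUESTS.get(request, "unknown (0x%x)" % request)


if __name__ == "__main__":
    # The failing call from the strace above: ptrace(0x15, 13659, 0, 0x1)
    print(decode_ptrace_request(0x15))
```

Read this way, the failing call would be the old-style set-options request with the TRACESYSGOOD bit, issued by a 32-bit process on a 64-bit host. Whether that is the actual cause of the hang is not established in this thread.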
From: Blaisorblade <bla...@ya...> - 2007-01-19 23:19:11
|
On Thursday 18 January 2007 00:26, Jeff Dike wrote:
> On Wed, Jan 17, 2007 at 05:15:35PM +0100, Blaisorblade wrote:
> > kjournald starting. Commit interval 5 seconds
> > EXT3-fs: mounted filesystem with ordered data mode.
> > VFS: Mounted root (ext3 filesystem) readonly.
> >
> > it hangs giving the following result at strace -p (I've printed two
> > consecutive iterations of the same messages to show that they are the
> > same):
>
> I'm chasing something on i386 with the same symptoms, but I think it's
> a different problem. What I'm seeing is init segfaulting on some
> hosts, but not others.
>
> Figure out what the segfault is, and where it's happening.

I've started an attempt. I'm also compiling uml 2.6.16-bs3 - with gcc 3.4
there is no problem, we'll see now with gcc 4.0.

All these tests are run on a custom 2.6.18.6 64-bit kernel, without SKAS (and
with my RFP patches, but this won't make a difference).

Ok, I hope I remembered correctly how to debug such faults (I'm posting the
full procedure so you can have a look).

(gdb) where
#0  userspace (regs=0x60a5cac8) at /home/paolo/Admin/kernel/6/VCS/linux-2.6.18/arch/um/os-Linux/skas/process.c:275
#1  0x0000000060010192 in new_thread_handler (sig=<value optimized out>) at /home/paolo/Admin/kernel/6/VCS/linux-2.6.18/arch/um/kernel/skas/process_kern.c:68
#2  <signal handler called>
#3  0x000000006017a829 in kill () at swab.h:135
#4  0x000000006001d7c9 in set_signals (enable=12139) at /home/paolo/Admin/kernel/6/VCS/linux-2.6.18/arch/um/os-Linux/signal.c:228
#5  0x00000000602a3330 in init_thread_union ()
#6  0x00000000602a34e0 in init_thread_union ()
#7  0x00000000600204a4 in new_thread (stack=Cannot access memory at address 0xfffffffffffffe38
) at /home/paolo/Admin/kernel/6/VCS/linux-2.6.18/arch/um/os-Linux/skas/process.c:457
Previous frame inner to this frame (corrupt stack?)
(gdb) print/x regs->skas.regs[16] # HOST_IP
$22 = 0x4042f92f # Always this one
(gdb) print pid
$25 = 12191

bash $ grep 4042f000 /proc/12191/maps
4042f000-40430000 r-xs 019a5000 00:13 72548 /tmp/vm_file-eTomUL (deleted)

Finally:

(gdb) print/x uml_physmem + 0x019a5000 + 0x92f
$24 = 0x619a592f

(that's uml_physmem, plus the mmap offset from /proc/<child>/maps, plus the
offset inside the vma).

With disassemble I got:

0x00000000619a590f: nop
0x00000000619a5910: mov $0x15,%rax # 21 = __NR_access on x86_64.
0x00000000619a5917: syscall
0x00000000619a5919: cmp $0xfffffffffffff001,%rax # that's -4095, -MAX_ERRNO
0x00000000619a591f: jae 0x619a5922
0x00000000619a5921: retq
0x00000000619a5922: mov 1549599(%rip),%rcx # 0x61b1fe48
0x00000000619a5929: xor %rdx,%rdx
0x00000000619a592c: sub %rax,%rdx
0x00000000619a592f: mov %edx,%fs:(%rcx) #faulting instruction.
0x00000000619a5932: or $0xffffffffffffffff,%rax
0x00000000619a5936: jmp 0x619a5921

And there is also a caller:

0x00000000619a5940: push %rbx
0x00000000619a5941: mov %esi,%ebx
0x00000000619a5943: sub $0x90,%rsp
0x00000000619a594a: mov 1549783(%rip),%rax # 0x61b1ff28
0x00000000619a5951: mov (%rax),%edx
0x00000000619a5953: test %edx,%edx
0x00000000619a5955: jne 0x619a5969
0x00000000619a5957: callq 0x619a5910
0x00000000619a595c: mov %eax,%edx
0x00000000619a595e: add $0x90,%rsp
0x00000000619a5965: mov %edx,%eax
0x00000000619a5967: pop %rbx

This looks like part of the code emitted for __syscall_return, with the store
to %fs:(%rcx) seeming like a move into errno (i.e. that's actually glibc code).
This is from a Sarge-64 root_fs (the one from Antoine Martin).

RCX there is (long)regs->skas.regs[11] = -64, and for FS, since HOST_FS = 25,
I get:

print/x regs->skas.regs[25]
$45 = 0x63

--
Inform me of my mistakes, so I can add them to my list!
Paolo Giarrusso, aka Blaisorblade
http://www.user-mode-linux.org/~blaisorblade
|
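The arithmetic in the message above (child IP to UML address, the -4095 error check, the errno store, and the FS selector value) can be replayed in a few lines of Python. This is a sketch only: the function names are made up here, and the value 0x60000000 for uml_physmem is not printed in the message; it is inferred by subtracting the two offsets from the gdb result 0x619a592f.

```python
MAX_ERRNO = 4095
MASK64 = (1 << 64) - 1


def child_ip_to_uml_address(ip, vma_start, file_offset, physmem_base):
    """uml_physmem + mmap offset from /proc/<child>/maps + offset inside the vma."""
    return physmem_base + file_offset + (ip - vma_start)


def syscall_failed(rax):
    """The 'cmp $0xfffffffffffff001,%rax; jae' test: unsigned rax >= -4095."""
    return (rax & MASK64) >= ((-MAX_ERRNO) & MASK64)


def errno_from_rax(rax):
    """The 'xor %rdx,%rdx; sub %rax,%rdx' step: -rax, kept as 32 bits (%edx)."""
    return (-rax) & 0xFFFFFFFF


def decode_selector(selector):
    """Split an x86 segment selector into table index, table indicator and RPL."""
    return {
        "index": selector >> 3,
        "table": "LDT" if selector & 0x4 else "GDT",
        "rpl": selector & 0x3,
    }


if __name__ == "__main__":
    # Values from the gdb session; 0x60000000 is the inferred uml_physmem base.
    addr = child_ip_to_uml_address(0x4042F92F, 0x4042F000, 0x019A5000, 0x60000000)
    print(hex(addr))
    print(decode_selector(0x63))
```

The selector decode places FS = 0x63 at GDT index 12 with RPL 3. As far as I recall that index is the first TLS slot on x86_64 Linux, which would fit the errno-through-%fs reading above, but treat that as an assumption to verify against the host kernel headers.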