From: Blaisorblade <bla...@ya...> - 2004-12-03 15:54:34
On Friday 03 December 2004 08:22, Anthony Brock wrote:
> I'm attempting to build a 2.4 guest that will run and function on our 2.6.5
> SKAS enabled host (stock SuSE kernel). I moved the compile to a SuSE 9.0
> installation when I experienced problems (redefinition errors)

Something about igmp.c? I remember something like that on SuSE 9.1 (never
diagnosed - host headers problem, probably).

> compiling
> the kernel on the SuSE 9.1 Professional distribution. Both the kernel and
> image worked fine on the older development machine (once I changed the
> compiler optimization to '-O1').

> However, I just copied the kernel and image back over to our SuSE 9.1
> machine. While the kernel will "launch", it appears to freeze or hang
> immediately. I see the following:

> $ /bin/linux-2.4.27-bs1-32bit umid=test-1 mem=128M con=pts con0=fd:0,fd:1
> ubd0=/opt/images/instances/test_root.img
> ubd1=/opt/images/instances/test_swap.img
> eth0=tuntap,test-1-0,fe:fd:c6:6b:03:47
> Checking for the skas3 patch in the host...found
> Checking for /proc/mm...found
> Checking PROT_EXEC mmap in /tmp...OK
>
> [1]+ Stopped /bin/linux-2.4.27-bs1-32bit umid=test-1
> mem=128M con=pts con0=fd:0,fd:1 ubd0=/opt/images/instances/test_root.img
> ubd1=/opt/images/instances/test_swap.img
> eth0=tuntap,test-1-0,fe:fd:c6:6b:03:47
> $
>
> I was quite shocked that the process would "stop" by itself. I then
> attempted to attach with gdb using:

Yes, this is a long-known problem which we fixed very recently in 2.6 (I'll
backport the fix to the next 2.4 release, *when* I find time).

It is due to a glibc bug in NPTL (see below for the explanation). There are
various ways to work around it, all of which result in UML not being run
with the libraries from /lib/tls (the NPTL version) but with the ones in
/lib/ (using LinuxThreads, which does not show this bug). NPTL evidently is
not in SuSE 9.0 (which runs by default on a 2.4 kernel, so it could not
support NPTL), so you don't see the problem there.

1) Run "export LD_ASSUME_KERNEL=2.4.1" before launching UML - the dynamic
linker will then load libraries compatible with a 2.4.1 kernel (i.e., not
NPTL).

2) Compile UML with CONFIG_MODE_TT also enabled (you can still run it in
SKAS mode). This implies that it's statically linked, and static linking on
normal distros happens against non-NPTL libs (Gentoo breaks this rule, but
that's not your case). This is why this bug is not seen often...

3) The same result is obtained by enabling CONFIG_STATIC_LINK during
compilation.

4) Move /lib/tls to /lib/tls.off on the host - this will however slow your
host down for heavily threaded programs, so it's just the last resort.

Now the explanation. At that point, UML is going to send a SIGSTOP to a
child it created through the clone(2) syscall, but it gets the pid to pass
to kill() from a getpid() call executed by the child. Now, getpid() reads
the pid from a cache which is not flushed - so it returns the pid of the
parent, i.e. the pid of the main UML thread. The SIGSTOP is therefore sent
to the main thread, after which everything breaks (it will later segfault,
because the flow of execution is completely different from the expected
one).

> Attaching to program: /bin/linux-2.4.27-bs1-32bit, process 25303
> At this point, gdb "freezes" as well.

It's waiting for the child to proceed and execute a new instruction...

> I can't do anything until I close the
> shell in which I originally launched the 2.4.27 kernel. Even "kill -9
> 25303" does not affect the process.

Probably because it's ptraced...

> The only way I've been able to produce
> a backtrace was:

Thanks for the description.

Do you mind posting what I explained on http://uml.harlowhill.com/ in the
troubleshooting section? It's a Wiki, i.e. freely editable - it's the modern
way to manage docs, and it's a pity that the UML one is so small. I'd really
like everybody here to know about it!

Bye
-- 
Paolo Giarrusso, aka Blaisorblade
Linux registered user n. 292729
http://www.user-mode-linux.org/~blaisorblade