From: Brian J. M. <0f6...@in...> - 2002-10-08 17:40:00
|
I have asked the second part of this question on the user list but did not get much response from it. It is perhaps an issue that really only developers would know the answer to. But first my first part.

I built the 2.4.19-9 UML release with gcc 3.2 and am trying to debug it with gdb 5.2.1. Here are the version strings:

gcc (GCC) 3.2 (Mandrake Linux 9.0 3.2-1mdk)
Copyright (C) 2002 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

GNU gdb 5.2.1-2mdk (Mandrake Linux)
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i586-mandrake-linux-gnu".

When I try to use the umlgdb tool, umlgdb seems to start OK and tells me to start "linux" with "debug gdb-pid=31968" which I do, and then I go back to the gdb and hit return at the "att 1" prompt and then type "cont" and hit return. The UML continues to run but dies very shortly with:

tracing thread pid = 32006
Linux version 2.4.19-16mdkcustom-9um (brian@pc) (gcc version 3.2 (Mandrake Linux 9.0 3.2-1mdk)) #7 Mon Oct 7 14:57:03 EDT 2002
On node 0 totalpages: 8192
zone(0): 8192 pages.
zone(1): 0 pages.
zone(2): 0 pages.
Kernel command line: root=/dev/ubd0
Calibrating delay loop... 610.98 BogoMIPS
Memory: 30496k available
Dentry cache hash table entries: 4096 (order: 3, 32768 bytes)
Inode cache hash table entries: 2048 (order: 2, 16384 bytes)
Mount-cache hash table entries: 512 (order: 0, 4096 bytes)
Buffer-cache hash table entries: 1024 (order: 0, 4096 bytes)
Page-cache hash table entries: 8192 (order: 3, 32768 bytes)
Checking for host processor cmov support...Yes
Checking for host processor xmm support...No
Checking that ptrace can change system call numbers...OK
Checking that host ptys support output SIGIO...Yes
Checking that host ptys support SIGIO on close...No, enabling workaround
POSIX conformance testing by UNIFIX
Linux NET4.0 for Linux 2.4
Based upon Swansea University Computer Society NET3.039
Initializing RT netlink socket
Kernel panic: Segfault with no mm

The second part is that I tried to simply get a gdb by signalling UML with USR1. When I do that, the gdb seems to be running fine but any time I try to get a stack trace I only ever get frame #0. It does not seem to show any of the prior frames.

Does anyone have _any_ ideas at all why this would be? It seems like it is something to do with gcc 3.2, but any further ideas would be welcome. I am happy to go bug the gcc guys if I can get some idea of what to bug them about.

Thanx,
b.

--
Brian J. Murrell |
From: Jan H. <bu...@uc...> - 2002-10-08 18:02:14
|
On Tue, Oct 08, 2002 at 01:39:40PM -0400, Brian J. Murrell wrote:
> When I try to use the umlgdb tool, umlgdb seems to start OK and tells
> me to start "linux" with "debug gdb-pid=31968" which I do, and then I
> go back to the gdb and hit return at the "att 1" prompt and then type
> "cont" and hit return. The UML continues to run but dies very shortly
> with:
>
> tracing thread pid = 32006
> Linux version 2.4.19-16mdkcustom-9um (brian@pc) (gcc version 3.2 (Mandrake Linux 9.0 3.2-1mdk)) #7 Mon Oct 7 14:57:03 EDT 2002
> On node 0 totalpages: 8192
> zone(0): 8192 pages.
> zone(1): 0 pages.
> zone(2): 0 pages.
> Kernel command line: root=/dev/ubd0
> Calibrating delay loop... 610.98 BogoMIPS
> Memory: 30496k available
> Dentry cache hash table entries: 4096 (order: 3, 32768 bytes)
> Inode cache hash table entries: 2048 (order: 2, 16384 bytes)
> Mount-cache hash table entries: 512 (order: 0, 4096 bytes)
> Buffer-cache hash table entries: 1024 (order: 0, 4096 bytes)
> Page-cache hash table entries: 8192 (order: 3, 32768 bytes)
> Checking for host processor cmov support...Yes
> Checking for host processor xmm support...No
> Checking that ptrace can change system call numbers...OK
> Checking that host ptys support output SIGIO...Yes
> Checking that host ptys support SIGIO on close...No, enabling workaround
> POSIX conformance testing by UNIFIX
> Linux NET4.0 for Linux 2.4
> Based upon Swansea University Computer Society NET3.039
> Initializing RT netlink socket
> Kernel panic: Segfault with no mm

This looks like gcc-3.2 uncovered some hidden bug. First, apply the latest patch, i.e. -12um. Then, if the problem is still there, try to get a backtrace and other things from the debugger. Let's see if it reveals something.

Segfault with no mm is a standard oops (unable to handle kernel paging request) from a kernel thread (or in this case before init is started).

Note that I just tested 2.4.19-12um from umlgdb and it came up, loaded and unloaded my module and shut down OK.

Another note: I tried the signal method too and got a "Segfault in signals" message in the xterm where the debugger should have come up, and the thing froze.

> The second part is that I tried to simply get a gdb by signalling UML
> with USR1. When I do that, the gdb seems to be running fine but any
> time I try to get a stack trace I only ever get frame #0. It does not
> seem to show any of the prior frames.
>
> Does anyone have _any_ ideas at all why this would be? It seems like it
> is something to do with gcc 3.2, but any further ideas would be welcome.
> I am happy to go bug the gcc guys if I can get some idea of what to bug
> them about.

No idea about this. But I just switched to gcc-3.2, so if I hit it too (well, I am not using this way to get a debugger, since I need the module-tracking stuff from umlgdb).

-------------------------------------------------------------------------------
Jan 'Bulb' Hudec <bu...@uc...> |
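The "Segfault with no mm" explanation above can be pictured with a toy model: a kernel thread has no user address space (its mm pointer is NULL), so a fault taken in one cannot be resolved as an ordinary page fault and ends in a panic. The sketch below is a simplified illustration in C, not the UML source. `struct mm`, `struct task`, and `classify_segfault` are hypothetical names invented for this example.

```c
#include <stddef.h>

/* Hypothetical stand-in for a user address space descriptor. */
struct mm {
    int unused;
};

/* Hypothetical stand-in for a task; mm is NULL for kernel threads. */
struct task {
    struct mm *mm;
};

enum fault_result {
    FAULT_TRY_PAGE_FAULT, /* task has an mm, so normal fault handling can be tried */
    FAULT_PANIC_NO_MM     /* no mm: the "Segfault with no mm" case */
};

/* Decide what a fault taken while running task t would lead to. */
enum fault_result classify_segfault(const struct task *t)
{
    if (t->mm == NULL)
        return FAULT_PANIC_NO_MM;
    return FAULT_TRY_PAGE_FAULT;
}
```

In this model a fault during early boot, before init has been started, falls into the same branch as a fault in a kernel thread, which matches the point in the log where the panic appears.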
From: Brian J. M. <5f7...@in...> - 2002-10-09 10:42:23
|
On Tue, Oct 08, 2002 at 08:01:53PM +0200, Jan Hudec wrote:
>
> This looks like gcc-3.2 uncovered some hidden bug.

Maybe.

> First, apply the latest
> patch, i.e. -12um.

Done.

> Then, if the problem is still there, try to get a
> backtrace and other things from the debugger. Let's see if it reveals
> something.

Well, see, the problem is that this only happens when I try to start the UML with the umlgdb tool hooked in. If I start the UML without a debugger hooked in (i.e. no "debug gdb-pid=nnn"), then it starts fine. It winds up crashing during other operations but that is why I wanted to use umlgdb. The debugger also sees this failure as a normal exit:

This GDB was configured as "i586-mandrake-linux-gnu"...
(gdb) b sys_init_module
Breakpoint 1 at 0xa00107b7: file /usr/src/linux-2.4.18-8.1mdk-pom-clean/include/linux/sched.h, line 771.
(gdb) b panic
Breakpoint 2 at 0xa000f37a: file panic.c, line 52.
(gdb) att 1
Attaching to program: /localdata/uml/linux, process 1
0xa0105e81 in xstat64_conv ()
(gdb) c
Continuing.

Program exited normally.

So there is nothing to collect in the debugger.

> Segfault with no mm is a standard oops (unable to handle kernel paging
> request) from a kernel thread (or in this case before init is started).

OK.

> Note that I just tested 2.4.19-12um from umlgdb and it came up, loaded
> and unloaded my module and shut down OK.

Built with which compiler? I have the same 2.4.19-9 UML that was failing (before I updated to 12) that I rebuilt with gcc 3.0.4 and it starts and runs fine with umlgdb. Well runs fine until I

It has crashed with a whole bunch of:

Kernel panic: Segfault with no mm
Kernel panic: Segfault with no mm
Kernel panic: Segfault with no mm

and then a:

I'm tracing myself and I can't get out

And also with a:

Kernel panic: Kernel mode fault at addr 0x0, ip 0x0
In interrupt handler - not syncing

while running with umlgdb and umlgdb just said:

Program exited normally.

even though I had a breakpoint installed in "panic".

> Another note: I tried the signal method too and got a "Segfault in
> signals" message in the xterm where the debugger should have come up,
> and the thing froze.

Interesting.

> No idea about this. But I just switched to gcc-3.2, so if I hit it too
> (well, I am not using this way to get a debugger, since I need the
> module-tracking stuff from umlgdb).

I would ultimately like to use umlgdb as well, but alas, I don't get very far with it. :-(

b.

--
Brian J. Murrell |
From: Jan H. <bu...@uc...> - 2002-10-09 10:48:47
|
On Wed, Oct 09, 2002 at 06:42:02AM -0400, Brian J. Murrell wrote:
> > Note that I just tested 2.4.19-12um from umlgdb and it came up, loaded
> > and unloaded my module and shut down OK.
>
> Built with which compiler? [...]

My compiler version:

bulb@vagabond:~$ gcc -v
Reading specs from /usr/lib/gcc-lib/i386-linux/3.2.1/specs
Configured with: /mnt/data/gcc-3.1/gcc-3.2-3.2.1ds1/src/configure -v --enable-languages=c,c++,java,f77,proto,objc,ada --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-gxx-include-dir=/usr/include/c++/3.2 --enable-shared --with-system-zlib --enable-nls --without-included-gettext --enable-java-gc=boehm --enable-objc-gc i386-linux
Thread model: posix
gcc version 3.2.1 20020912 (Debian prerelease)

My gdb version:

bulb@vagabond:~$ gdb -v
GNU gdb 2002-08-18-cvs
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-linux".

-------------------------------------------------------------------------------
Jan 'Bulb' Hudec <bu...@uc...> |
From: Brian J. M. <2d4...@in...> - 2002-10-21 16:08:13
|
The latest build error I am getting with UML 2.5.44-1:

ld -r -o net/rxrpc/rxrpc.o net/rxrpc/call.o net/rxrpc/connection.o net/rxrpc/krxiod.o net/rxrpc/krxsecd.o net/rxrpc/krxtimod.o net/rxrpc/main.o net/rxrpc/peer.o net/rxrpc/rxrpc_syms.o net/rxrpc/transport.o net/rxrpc/proc.o net/rxrpc/sysctl.o
make -f net/sched/Makefile
gcc -Wp,-MD,net/sched/.sch_ingress.o.d -D__KERNEL__ -Iinclude -Wall -Wstrict-prototypes -Wno-trigraphs -O2 -fno-strict-aliasing -fno-common -g -D__arch_um__ -DSUBARCH=\"i386\" -D_LARGEFILE64_SOURCE -I/usr/src/linux-2.5.44/arch/um/include -Derrno=kernel_errno -U__i386__ -Ui386 -nostdinc -iwithprefix include -DMODULE -DKBUILD_BASENAME=sch_ingress -c -o net/sched/sch_ingress.o net/sched/sch_ingress.c
In file included from net/sched/sch_ingress.c:21:
include/asm/smp.h:4: parse error before numeric constant
make[2]: *** [net/sched/sch_ingress.o] Error 1
make[1]: *** [net/sched] Error 2
make: *** [net] Error 2

I wonder if the #define at include/linux/smp.h:93 is causing a problem:

#define cpu_online_map 1

For the record, I have the following .config settings:

# CONFIG_UML_SMP is not set
# CONFIG_SMP is not set

b.

--
Brian J. Murrell |
From: Jeff D. <jd...@ka...> - 2002-10-21 19:18:44
|
2d4...@in... said:
> include/asm/smp.h:4: parse error before numeric constant

I have this fixed. Just move the declaration inside the #ifdef CONFIG_SMP.

Jeff |
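A minimal sketch of why the build broke and why the guard helps. The thread does not quote line 4 of include/asm/smp.h, so the declaration below (and its type) is an assumption made for illustration; only the `#define cpu_online_map 1` line comes from the report. The helper `online_cpus` is a hypothetical name added so the effect can be observed.

```c
/* Mimics the uniprocessor macro quoted from include/linux/smp.h. */
#ifndef CONFIG_SMP
#define cpu_online_map 1
#endif

/*
 * Assumed shape of the arch header declaration. Without the guard, a
 * non-SMP build would preprocess this line to "extern unsigned long 1;",
 * which the compiler rejects with a parse error at the numeric constant.
 * With the guard, the declaration is only seen when CONFIG_SMP is set
 * (an SMP build would also need a real definition elsewhere).
 */
#ifdef CONFIG_SMP
extern unsigned long cpu_online_map;
#endif

/* Hypothetical helper: the online-CPU bitmap as seen by this file. */
unsigned long online_cpus(void)
{
    return cpu_online_map;
}
```

Compiled without -DCONFIG_SMP, the macro turns the return statement into `return 1;`. Removing the `#ifdef CONFIG_SMP` around the `extern` line should reproduce a parse error of the kind shown in the build log, although the exact wording depends on the compiler version.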
From: Brian J. M. <62c...@in...> - 2002-10-21 20:29:06
|
On Mon, Oct 21, 2002 at 03:22:41PM -0500, Jeff Dike wrote:
>
> I have this fixed. Just move the declaration inside the #ifdef CONFIG_SMP.

I half had an idea that it should go inside the #ifdef. Seems to have built successfully now.

Thanks for the help Jeff.

b.

--
Brian J. Murrell |
From: Brian J. M. <dd9...@in...> - 2002-10-28 15:09:03
|
On Tue, Oct 08, 2002 at 08:01:53PM +0200, Jan Hudec wrote:
>
> No idea about this. But I just switched to gcc-3.2, so if I hit it too
> (well, I am not using this way to get a debugger, since I need the
> module-tracking stuff from umlgdb).

How are you making out with building UMLs with gcc-3.2? Are you getting stable UMLs? I think I have yet to build a stable UML, but I am usually starting with my vendor (Mandrake)'s heavily patched kernel. I am trying their latest with UML 2.4.19-22 right now (with gcc-3.2), and if it's still unstable, I will try vanilla 2.4.19 with the 2.4.19-22 UML patch.

Just wondering how you were making out with gcc-3.2 and UML.

Thanx,
b.

--
Brian J. Murrell |
From: David C. <da...@da...> - 2002-10-28 15:30:42
|
Brian J. Murrell wrote:
> How are you making out with building UMLs with gcc-3.2? Are you
> getting stable UMLs?

I build all the UMLs on one box with 3.2, and they're just as stable as with 2.95.4.

> I think I have yet to build a stable UML, but I
> am usually starting with my vendor (Mandrake)'s heavily patched
> kernel.

Well, that would be your problem.

> I am trying their latest with UML 2.4.19-22 right now (with
> gcc-3.2), and if it's still unstable, I will try vanilla 2.4.19 with
> the 2.4.19-22 UML patch.

Elaborate on 'unstable'? It doesn't compile, or the UML doesn't work?

Try an earlier 2.4.19 patch, say 2.4.19-16 or something. More recent UML patches contain the skas merge which breaks lots of things, so unless you know what you're doing, make sure you use something which is known to actually work.

David

--
David Coulson http://davidcoulson.net/
d...@vi... http://journal.davidcoulson.net/ |
From: Brian J. M. <844...@in...> - 2002-10-28 15:43:22
|
On Mon, Oct 28, 2002 at 03:30:25PM +0000, David Coulson wrote:
>
> I build all the UMLs on one box with 3.2, and they're just as stable as
> with 2.95.4.

Excellent! Just hearing success stories is half the battle. How about gdbing them? Can you get good stack traces? All I ever seem to get is the topmost frame. If you do get good stack traces, what gdb are you using?

> Well, that would be your problem.

Very well could be, but if the native kernel is stable (and it is indeed!) why would "UMLing" it make it unstable?

> Elaborate on 'unstable'?

Crashes (panics) and hangs frequently or even repeatably.

> It doesn't compile, or the UML doesn't work?

Well, they can and will work for a while, and then inevitably I get a crash or hang. In the latter case I have to go back to the host and start killing off processes. Most times just killing the tracing thread is enough, but sometimes I have to kill the whole process stack for the UML with -9.

> Try an earlier 2.4.19 patch, say 2.4.19-16 or something.

OK. The last response I got to trying to get this to work was "use the latest patch". I can certainly understand why the newer patches would be more unstable, but don't newer patches also fix problems with older ones? I suppose somewhere in there there was a "this is a good release, now let's do some (unstable) development on it". I take it 2.4.19-16 was one of these points?

> More recent UML
> patches contain the skas merge which breaks lots of things, so unless
> you know what you're doing, make sure you use something which is known
> to actually work.

I will take your recommendation and try -16.

Thanx!
b.

--
Brian J. Murrell |
From: Rainer E. <ra...@el...> - 2002-10-28 15:56:00
|
Brian J. Murrell wrote:
> I will take your recommendation and try -16.

I think -17 should be fine. This version only needs a small fix (grab it from the list archive) if you want to compile with SMP and build modules. I'm also using gcc 3.2.1 and a vanilla kernel.

--
ra...@el... |
From: David C. <da...@da...> - 2002-10-28 16:21:27
|
Brian J. Murrell wrote:
> Excellent! Just hearing success stories is half the battle. How
> about gdbing them? Can you get good stack traces? All I ever seem to
> get is the topmost frame. If you do get good stack traces, what gdb
> are you using?

Sure - My main UML devel box runs gcc 3.2, and I do debugging with it.

> Very well could be, but if the native kernel is stable (and it is
> indeed!) why would "UMLing" it make it unstable?

Er, because the UML patch expects a vanilla kernel tree - Functions change, things are different, etc.

> Crashes (panics) and hangs frequently or even repeatably.

Can you get any debugging information?

> OK. The last response I got to trying to get this to work was "use
> the latest patch". I can certainly understand why the newer patches
> would be more unstable, but don't newer patches also fix problems with
> older ones? I suppose somewhere in there there was a "this is a good
> release, now let's do some (unstable) development on it". I take it
> 2.4.19-16 was one of these points?

Well, newer patches fix problems, but if new features are introduced then they will make new problems. skas is a huge change, so there are a great many issues when using it.

> I will take your recommendation and try -16.

With a regular kernel tree :-)

David

--
David Coulson http://davidcoulson.net/
d...@vi... http://journal.davidcoulson.net/ |
From: Steven P. <st...@si...> - 2002-10-28 20:17:56
|
On Mon, Oct 28, 2002 at 10:43:07AM -0500, Brian J. Murrell wrote:
> I suppose somewhere in there there was a "this is a good release, now
> let's do some (unstable) development on it". I take it 2.4.19-16 was
> one of these points?

Looks like it (more or less). If you try -16, be sure to apply the patch in this message:

http://sourceforge.net/mailarchive/message.php?msg_id=2301892

Steve
--
st...@si... | Southern Illinois Linux Users Group
(618)398-7360 | See web site for meeting details.
Steven Pritchard | http://www.silug.org/ |