Re: [Kgdb-bugreport] kgdb 2.3 hangs
From: Ming Z. <mi...@el...> - 2006-01-31 13:52:39
On Tue, 2006-01-31 at 19:01 +0530, Amit Kale wrote:
> KGDB panicked right here ->
>
> Sending packet: $mc0566328,4#9e...Packet instead of Ack, ignoring it
>
> Any packets after this are unreliable.
> Looks like access to memory location 0xc0566328 caused the kernel to panic.
> This address should be in direct mapped region of the kernel. It couldn't
> have segfaulted here, besides we have tested kgdb 2.3 with different memory
> accesses, to ensure that it doesn't panic this way.

i will do this.

>
> Seems like the other processor tried to send some printk related packets (O
> packets). Could you confirm whether all processors are in debugger? Issue "p
> procindebug" It'll print the procindebug array. You should see as many 1s as
> the number of processors. If there are fewer 1s there, other processors are
> still not inside KGDB, in which case you'll see behavior of this kind.
>
> Is this an Opteron system? We have seen too many nmi problems on Opterons.

yes, it is a dual opteron. though i compile 32bit kernel for it. tried
both up and smp kernel.

> -Amit
>
> On Monday 30 Jan 2006 10:28 pm, Ming Zhang wrote:
> > On Mon, 2006-01-30 at 12:26 +0530, Amit Kale wrote:
> > > Tom, Ming,
> > >
> > > module debugging patch uses read memory packets, so it won't have any smp
> > > issues.
> > >
> > > Have you passed "nmi_watchdog=1" to kernel through grub.conf file? Please
> > > also check the output of dmesg, to confirm that NMI watchdog indeed
> > > works. If it doesn't you'll see a message in dmesg output. If
> > > nmi_watchdog=1 doesn't work try nmi_watchdog=2. (Yeah, seems like a joke,
> > > but that's exactly what it is :-)
> >
> > yes, i knew this nmi trick, it is funny.
> >
> > > I see two different issues here.
> > >
> > > 1. You could go ahead after turning off console messages over kgdb. This
> > > should be required. I would like to see a dump of packets with this
> > > option turned on. We'll get some idea about why this option is causing a
> > > problem.
> >
> > i selected it, compiled, and loaded the new kernel. got this.
> >
> > ----------------------------
> > This GDB was configured as "i686-pc-linux-gnu"...
> > (gdb) set remotebaud 57600
> > (gdb) set debug remote 1
> > (gdb) set solib-search-path /root/kgdb/module26
> > (gdb) target remote /dev/ttyS0
> > Remote debugging using /dev/ttyS0
> > Sending packet: $Hc-1#09...Ack
> > Packet received: OK
> > Sending packet: $qC#b4...Ack
> > Packet received: QC0000000000008000
> > Sending packet: $qOffsets#4b...Ack
> > Packet received:
> > Sending packet: $?#3f...Ack
> > Packet received: S05
> > Sending packet: $Hg8000#77...Ack
> > Packet received: OK
> > Sending packet: $g#67...Ack
> > Packet received:
> > 01000000c01a64c000000000728362c0209f60c0249f60c07b8362c042f762c
> > 08bbf14c08600000060000000680000007b0000007b000000ffff0000ffff0000
> > breakpoint () at kernel/kgdb.c:1776
> > 1776 atomic_set(&kgdb_setting_breakpoint, 0);
> > Sending packet: $mc0566328,4#9e...Packet instead of Ack, ignoring it
> > Packet instead of Ack, ignoring it
> > Packet instead of Ack, ignoring it
> > Packet instead of Ack, ignoring it
> > Packet instead of Ack, ignoring it
> > Bad checksum, sentsum=0xd1, csum=0x9b, buf=28356c0
> > Packet instead of Ack, ignoring it
> > Packet instead of Ack, ignoring it
> > Bad checksum, sentsum=0xd1, csum=0x9b, buf=28356c0
> > Packet instead of Ack, ignoring it
> > Packet instead of Ack, ignoring it
> > Bad checksum, sentsum=0xd1, csum=0x9b, buf=28356c0
> > Packet instead of Ack, ignoring it
> > Packet instead of Ack, ignoring it
> > Bad checksum, sentsum=0xd1, csum=0x9b, buf=28356c0
> > Packet instead of Ack, ignoring it
> > putpkt: Junk: 286356c0#d1
> > ----------------------------
> >
> > then after a while, saw this. seems that serial link is broken
> >
> >
> > Sending packet: $mc0566328,4#9e...putpkt: write failed: Input/output
> > error.
> >
> > (gdb) c
> > Continuing.
> > Sending packet: $Z0,c014bc80,1#38...putpkt: write failed: Input/output
> > error.
> > (gdb) c
> > Continuing.
> > Sending packet: $Z0,c014bc80,1#38...putpkt: write failed: Input/output
> > error.
> >
> > > I have seen a bug report earlier where console message packets got mixed
> > > up with regular gdb packets, don't know how. Couldn't reproduce it later.
> > > I think we should print a message from kgdb indicating that smp isn't
> > > working. Perhaps halt kernel at that point.
> > >
> > > 2. The output you have sent is rather limited. Could you send the _whole_
> > > output. That is right from the point where gdb prints its initial
> > > version no. etc. to the point where debugging fails.
> > >
> > > -Amit
> > >
> > > On Friday 27 Jan 2006 9:12 pm, Ming Zhang wrote:
> > > > On Fri, 2006-01-27 at 08:31 -0700, Tom Rini wrote:
> > > > > On Fri, Jan 27, 2006 at 10:12:46AM -0500, Ming Zhang wrote:
> > > > > > thanks for the hint.
> > > > > >
> > > > > > now i deselect the "console message over kgdb" and use this 'set
> > > > > > debug remote 1', the boot process is good.
> > > > > >
> > > > > >
> > > > > > now is another problem.
> > > > > >
> > > > > > it hangs when loading module.
> > > > >
> > > > > You need to have a custom compiled GDB with the patches found in CVS,
> > > > > if you've also got the 'module.patch' patch applied.
> > > >
> > > > i run the gdb_mod 2.3 from download page. i assumed that is the one you
> > > > mentioned.
> > > >
> > > > > But I'll admit I haven't tested module autoloading stuff on SMP, but
> > > > > it should work.
> > > >
> > > > though i am using a dual opteron, i already disabled SMP and chose P4
> > > > for a 32bit compilation.
> > > >
> > > > ming
> > > >
> > > > -------------------------------------------------------
> > > > This SF.net email is sponsored by: Splunk Inc. Do you grep through log
> > > > files for problems? Stop! Download the new AJAX search engine that
> > > > makes searching your log files as easy as surfing the web. DOWNLOAD
> > > > SPLUNK!
> > > > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642
> > > > _______________________________________________
> > > > Kgdb-bugreport mailing list
> > > > Kgd...@li...
> > > > https://lists.sourceforge.net/lists/listinfo/kgdb-bugreport
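
A note on reading the packet dump quoted above: in the gdb remote serial protocol a packet is framed as $<payload>#<checksum>, where the checksum is the sum of the payload bytes modulo 256, written as two hex digits. The sketch below recomputes the checksums that appear in the log. The helper names rsp_checksum and frame are made up for this illustration and are not part of gdb or kgdb. The reading of the result, that one character of the reply was lost on the serial line, is only an inference from the logged values; nobody in the thread confirms it.

```python
def rsp_checksum(payload: str) -> int:
    """Sum of the payload bytes modulo 256 (the value that follows '#')."""
    return sum(payload.encode("ascii")) % 256


def frame(payload: str) -> str:
    """Frame a payload as $<payload>#<two lowercase hex digits>."""
    return f"${payload}#{rsp_checksum(payload):02x}"


# Requests gdb sent, copied from the log: both checksums are consistent.
assert frame("mc0566328,4") == "$mc0566328,4#9e"
assert frame("Z0,c014bc80,1") == "$Z0,c014bc80,1#38"

# gdb reported: Bad checksum, sentsum=0xd1, csum=0x9b, buf=28356c0
garbled = "28356c0"    # what gdb says it received
junk = "286356c0"      # payload from the later "putpkt: Junk: 286356c0#d1" line

print(hex(rsp_checksum(garbled)))  # checksum gdb computed over what arrived
print(hex(rsp_checksum(junk)))     # checksum that travelled with the packet

# The eight-digit payload is a plausible 4-byte reply to "m c0566328,4".
# Decoding it as little-endian (an assumption based on the i686 target):
word = int.from_bytes(bytes.fromhex(junk), "little")
print(hex(word))
```

If this reading is right, the eight-digit payload carries a checksum that matches the one the target sent, while the seven-digit buffer gdb saw does not. That would mean the stub did answer the memory read and the reply was damaged in transit, which fits the later "write failed: Input/output error" messages better than a panic on the memory access itself.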