From: Mark H. <ma...@os...> - 2004-06-08 15:37:05
|
I ran my 4-node test yesterday with a lock around access to the quarantine_head in buf_safe_discard. It didn't hang this time, but after about 14 hours or so two of the machines got something like this:

net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0):
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0):
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0):
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0):
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0):
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0):
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0):
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0):
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0):
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0):
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0):
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0):
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0):
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0):
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0):
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0):
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0):
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0):
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0):
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0):
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0):
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0):
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0):
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0):
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0):
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0):
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0):
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0):
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0):
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0):
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0):
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0):
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0):
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0):
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0):
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0):
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0):
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0):
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0):
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001011):ORIG(1001011:1642938376)::DEST(1001013:0):
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001012):ORIG(1001012:937762824)::DEST(1001013:0):
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001012):ORIG(1001012:937762824)::DEST(1001013:0):
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001012):ORIG(1001012:937762824)::DEST(1001013:0):
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001012):ORIG(1001012:937762824)::DEST(1001013:0):
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001012):ORIG(1001012:937762824)::DEST(1001013:0):
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001012):ORIG(1001012:937762824)::DEST(1001013:0):
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001012):ORIG(1001012:937762824)::DEST(1001013:0):
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001012):ORIG(1001012:937762824)::DEST(1001013:0):
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001012):ORIG(1001012:937762824)::DEST(1001013:0):
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001012):ORIG(1001012:937762824)::DEST(1001013:0):
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001012):ORIG(1001012:937762824)::DEST(1001013:0):
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001012):ORIG(1001012:937762824)::DEST(1001013:0):
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001012):ORIG(1001012:937762824)::DEST(1001013:0):
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001012):ORIG(1001012:937762824)::DEST(1001013:0):
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001012):ORIG(1001012:937762824)::DEST(1001013:0):
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001012):ORIG(1001012:937762824)::DEST(1001013:0):
net->drop_nam:DAT0:MCST:REROUTED(1):HZ(44):SZ(713):SQNO(0):ACK(0):BACK(0):PRND(1001012):ORIG(1001012:937762824)::DEST(1001013:0):
TIPC: Lost Link <1.1.19:eth1-1.1.17:eth1> on Network Plane A
TIPC: Lost contact with <1.1.17>
bad: scheduling while atomic!
TIPC: Established Link <1.1.19:eth1-1.1.17:eth1> on Network Plane A
 [<c010618e>] dump_stack+0x1e/0x30
 [<c03f8d84>] schedule+0x6b4/0x6c0
 [<c010538a>] work_resched+0x5/0x16
Debug: sleeping function called from invalid context at mm/slab.c:1994
in_atomic():1, irqs_disabled():0
 [<c010618e>] dump_stack+0x1e/0x30
 [<c011e0c9>] __might_sleep+0x99/0xb0
 [<c014bcdf>] kmem_cache_alloc+0x21f/0x230
 [<c03786a3>] alloc_skb+0x23/0xf0
 [<c037795e>] sock_alloc_send_pskb+0xce/0x1f0
 [<c0377aae>] sock_alloc_send_skb+0x2e/0x40
 [<c03dfe69>] unix_stream_sendmsg+0x199/0x3f0
 [<c0374a3d>] sock_aio_write+0xbd/0xe0
 [<c0165cd7>] do_sync_write+0x87/0xc0
 [<c0165df9>] vfs_write+0xe9/0x120
 [<c0165ecf>] sys_write+0x3f/0x60
 [<c0105363>] syscall_call+0x7/0xb
bad: scheduling while atomic!
 [<c010618e>] dump_stack+0x1e/0x30
 [<c03f8d84>] schedule+0x6b4/0x6c0
 [<c010538a>] work_resched+0x5/0x16
bad: scheduling while atomic!
 [<c010618e>] dump_stack+0x1e/0x30
 [<c03f8d84>] schedule+0x6b4/0x6c0
 [<c010538a>] work_resched+0x5/0x16
bad: scheduling while atomic!
 [<c010618e>] dump_stack+0x1e/0x30
 [<c03f8d84>] schedule+0x6b4/0x6c0
 [<c010538a>] work_resched+0x5/0x16
bad: scheduling while atomic!
 [<c010618e>] dump_stack+0x1e/0x30
 [<c03f8d84>] schedule+0x6b4/0x6c0
 [<c010538a>] work_resched+0x5/0x16
bad: scheduling while atomic!
 [<c010618e>] dump_stack+0x1e/0x30
 [<c03f8d84>] schedule+0x6b4/0x6c0
 [<c010538a>] work_resched+0x5/0x16
bad: scheduling while atomic!
 [<c010618e>] dump_stack+0x1e/0x30
 [<c03f8d84>] schedule+0x6b4/0x6c0
 [<c010538a>] work_resched+0x5/0x16
bad: scheduling while atomic!
 [<c010618e>] dump_stack+0x1e/0x30
 [<c03f8d84>] schedule+0x6b4/0x6c0
 [<c03f95ce>] schedule_timeout+0x6e/0xc0
 [<c01941c5>] ep_poll+0x135/0x1b0
 [<c0192e8b>] sys_epoll_wait+0xab/0xb0
 [<c0105363>] syscall_call+0x7/0xb
bad: scheduling while atomic!
 [<c010618e>] dump_stack+0x1e/0x30
 [<c03f8d84>] schedule+0x6b4/0x6c0
 [<c011d0cd>] sys_sched_yield+0x5d/0x90
 [<c01741c3>] coredump_wait+0x43/0xb0
 [<c0174398>] do_coredump+0x168/0x271
 [<c012e1a7>] get_signal_to_deliver+0x287/0x510
 [<c0105126>] do_signal+0xb6/0xf0
 [<c01051bb>] do_notify_resume+0x5b/0x5d
 [<c01053ae>] work_notifysig+0x13/0x15
Kernel panic: Aiee, killing interrupt handler!
In interrupt handler - not syncing

I'm not sure what to make of this. I don't see TIPC on the stack, but who knows. I'll try page alloc debug to see whether freed memory is being reused somewhere.

Mark

--
Mark Haverkamp <ma...@os...> |
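
For anyone following along, the change under test (a lock around access to quarantine_head in buf_safe_discard) can be pictured roughly as below. This is only a userspace sketch, not the TIPC source: a pthread mutex stands in for whatever kernel lock was actually used, and struct buf, quarantine_lock and quarantine_take_all are hypothetical names made up for illustration. Only quarantine_head and buf_safe_discard come from the message above, and their real definitions may differ.

```c
/* Userspace analogue only; NOT the TIPC implementation.
 * Hypothetical names: struct buf, quarantine_lock, quarantine_take_all. */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct buf {
    struct buf *next;               /* link to the next quarantined buffer */
};

static struct buf *quarantine_head; /* shared list head, starts out NULL */
static pthread_mutex_t quarantine_lock = PTHREAD_MUTEX_INITIALIZER;

/* Push a buffer onto the quarantine list. The lock keeps the
 * read-modify-write of quarantine_head from interleaving with
 * another thread doing the same thing. */
void buf_safe_discard(struct buf *b)
{
    assert(b != NULL);
    pthread_mutex_lock(&quarantine_lock);
    b->next = quarantine_head;
    quarantine_head = b;
    pthread_mutex_unlock(&quarantine_lock);
}

/* Detach the whole list under the lock and hand it to the caller,
 * so buffers can be released later without holding the lock. */
struct buf *quarantine_take_all(void)
{
    struct buf *list;

    pthread_mutex_lock(&quarantine_lock);
    list = quarantine_head;
    quarantine_head = NULL;
    pthread_mutex_unlock(&quarantine_lock);
    return list;
}
```

In this sketch nothing can sleep while the lock is held, since the critical sections only swap pointers. Whether the real code path has that property is not something the log above settles.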