From: Jon M. <jon...@er...> - 2004-05-12 19:10:00
Yeah, that's a classical one. I also think your solution is ok; the
pending ports will be awakened at next message reception, so no harm done.

Thanks /jon

Mark Haverkamp wrote:

>Here is what looks to be happening with the spin lock deadlock. I
>replaced all the spin_lock_bh calls with a wrapper that tries to get the
>lock for a while then prints out a debug message if it can't get the
>lock.
>
>As an experiment, I changed the spin_lock_bh in link_wakeup_ports to a
>trylock and exited if it couldn't get the lock. I am now not able to
>get the deadlock.
>
>
>CPU 0:
>release --
>   tipc_delete_port (get port lock) --
>   port_abort_peer --
>   port_send_proto_msg --
>   net_route_msg --
>   link_send (get node lock) -- (hung spinning)
>
>CPU 1:
>common_interrupt --
>   do_softirq --
>   net_rx_action --
>   netif_receive_skb --
>   recv_msg (tipc eth) --
>   tipc_recv_msg (get node lock) --
>   link_wakeup_ports (get port lock) -- (hung spinning)
>
>Stack dumps:
>
>port lock timeout
>Call Trace:
> [<f8a837ab>] link_wakeup_ports+0x9b/0x230 [tipc]
> [<f8a87c2e>] tipc_recv_msg+0x7fe/0x8c0 [tipc]
> [<c014949d>] __kmalloc+0x19d/0x250
> [<f8aa5db9>] recv_msg+0x39/0x50 [tipc]
> [<c0375af2>] netif_receive_skb+0x172/0x1b0
> [<c0375bb4>] process_backlog+0x84/0x120
> [<c0375cd0>] net_rx_action+0x80/0x120
> [<c0124bc8>] __do_softirq+0xb8/0xc0
> [<c0124c05>] do_softirq+0x35/0x40
> [<c0107ce5>] do_IRQ+0x175/0x230
> [<c0105ce0>] common_interrupt+0x18/0x20
> [<c0221c91>] copy_from_user+0x1/0x80
> [<f8a866bf>] link_send_sections_long+0x30f/0xb30 [tipc]
> [<c0221694>] __delay+0x14/0x20
> [<f8a8366f>] link_schedule_port+0x13f/0x1e0 [tipc]
> [<f8a860f5>] link_send_sections_fast+0x5b5/0x870 [tipc]
> [<c011b12a>] __wake_up_common+0x3a/0x60
> [<f8a97bf2>] tipc_send+0x92/0x9d0 [tipc]
> [<c011d736>] __mmdrop+0x36/0x50
> [<c03f15b7>] schedule+0x467/0x7a0
> [<f8aa33e6>] recv_msg+0x2b6/0x560 [tipc]
> [<f8aa2d90>] send_packet+0x90/0x180 [tipc]
> [<c011b0d0>] default_wake_function+0x0/0x20
> [<c036c83e>] sock_sendmsg+0x8e/0xb0
> [<f8aa5db9>] recv_msg+0x39/0x50 [tipc]
> [<c01435ba>] buffered_rmqueue+0xfa/0x220
> [<c036c61a>] sockfd_lookup+0x1a/0x80
> [<c036dd61>] sys_sendto+0xe1/0x100
> [<c0128f62>] del_timer_sync+0x42/0x140
> [<c036d109>] sock_poll+0x29/0x30
> [<c017884b>] do_pollfd+0x5b/0xa0
> [<c036ddb6>] sys_send+0x36/0x40
> [<c036e60e>] sys_socketcall+0x12e/0x240
> [<c0105373>] syscall_call+0x7/0xb
>
>&node->lock lock timeout
>Call Trace:
> [<f8a8549a>] link_send+0xda/0x2a0 [tipc]
> [<f8a92cee>] net_route_msg+0x41e/0x43d [tipc]
> [<f8a949c2>] port_send_proto_msg+0x1a2/0x2a0 [tipc]
> [<f8a95983>] port_abort_peer+0x83/0x90 [tipc]
> [<f8a9458f>] tipc_deleteport+0x19f/0x280 [tipc]
> [<f8aa25b2>] release+0x72/0x130 [tipc]
> [<c036c76b>] sock_release+0x7b/0xc0
> [<c036d176>] sock_close+0x36/0x50
> [<c016315a>] __fput+0x10a/0x120
> [<c0161597>] filp_close+0x57/0x90
> [<c0121dbc>] put_files_struct+0x7c/0xf0
> [<c0122d5a>] do_exit+0x23a/0x5a0
> [<c012aa35>] __dequeue_signal+0xf5/0x1b0
> [<c0123240>] do_group_exit+0xe0/0x150
> [<c012ab1d>] dequeue_signal+0x2d/0x90
> [<c012cbef>] get_signal_to_deliver+0x26f/0x510
> [<c0105136>] do_signal+0xb6/0xf0
> [<c036ddb6>] sys_send+0x36/0x40
> [<c036e60e>] sys_socketcall+0x12e/0x240
> [<c01051cb>] do_notify_resume+0x5b/0x5d
> [<c01053be>] work_notifysig+0x13/0x15
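
The lock-order inversion in the quoted traces (port lock then node lock on CPU 0, node lock then port lock on CPU 1) and the trylock workaround can be sketched in user space. This is only an illustration under stated assumptions: POSIX mutexes stand in for the kernel spinlocks, and `struct node`, `struct port`, `wakeup_port` and `simulate_contention` are hypothetical names, not TIPC code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the TIPC node and port; not the real structs. */
struct node { pthread_mutex_t lock; };
struct port { pthread_mutex_t lock; bool congested; };

/*
 * Receive-path analogue of link_wakeup_ports: the node lock is already
 * held here, so the port lock is only *tried*. If another context owns
 * it, we back off and leave the port pending for the next reception
 * instead of spinning while holding the node lock.
 */
bool wakeup_port(struct node *n, struct port *p)
{
    bool woken = false;

    pthread_mutex_lock(&n->lock);
    if (pthread_mutex_trylock(&p->lock) == 0) {
        p->congested = false;
        woken = true;
        pthread_mutex_unlock(&p->lock);
    }
    pthread_mutex_unlock(&n->lock);
    return woken;
}

/*
 * Single-threaded walk-through: the first wakeup attempt runs while the
 * "delete path" still owns the port lock, the second after it has been
 * released. Returns the number of successful wakeups.
 */
int simulate_contention(void)
{
    struct node n;
    struct port p;
    int wakeups = 0;

    pthread_mutex_init(&n.lock, NULL);
    pthread_mutex_init(&p.lock, NULL);
    p.congested = true;

    pthread_mutex_lock(&p.lock);        /* delete path owns the port */
    wakeups += wakeup_port(&n, &p);     /* contended: backs off */
    assert(p.congested);                /* port is still pending */
    pthread_mutex_unlock(&p.lock);

    wakeups += wakeup_port(&n, &p);     /* uncontended: port is woken */

    pthread_mutex_destroy(&p.lock);
    pthread_mutex_destroy(&n.lock);
    return wakeups;
}
```

The trade-off is the one noted in the reply: a contended wakeup is skipped rather than retried, which is acceptable only because a later reception repeats the attempt.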