From: A.R K. <a.r...@gm...> - 2011-03-21 23:14:19
|
On Mon, Mar 21, 2011 at 1:55 PM, Stephens, Allan < all...@wi...> wrote: > Hi A.R.: > > > > It appears that the post-TIPC 1.7.7-rc4 patch “Ignore broadcast > acknowledgements that are out-of-range” contains a bug that prevents > tipc_bclink_acknowledge() from properly cleaning up unsent messages when > contact with TIPC’s only neighboring node is lost. The lack of full clean up > causes the broadcast link to improperly attempt to resolve its link > congestion resolution logic when there is nowhere to send these previously > unsent messages, and triggers the assertion you identified. > > > > I’ve now created a supplementary patch that revises the checks introduced > by the earlier patch, which should resolve the problem you encountered. > Would you be in a position to try out this new patch to see if it actually > does completely address your problem? > > > Sure. Thanks for the patch. The customer who reproduced this consistently with our product is running with a fail-safe patch that I had mailed him as explained earlier. Will ask him to revert my patch and apply yours that you had mailed inline in your next mail. Regards, -Karthick > > > *From:* A.R Karthick [mailto:a.r...@gm...] > *Sent:* Friday, March 18, 2011 1:58 PM > *To:* tipc-discussion; Stephens, Allan > *Subject:* TIPC kernel panic in tipc_bcbearer_send (1.7.7-rc4) > > > > Hi, > I had got this panic report yesterday since our product relies on TIPC > heavily. But my RCA enclosed below was based on an incomplete panic report > which just had: tipc_bcbearer_send + 0x30 as the EIP and an OOPs hex dump > which was good enough to figure out that we hit an assert in > tipc_bcbearer_send based on the tipc.ko disassembly. > > Got the entire panic backtrace today: > > [ 3300.287486] SBX NRS module unloaded. > [ 3301.715321] SBX TCP Redundancy module unloaded > [ 3304.404419] TIPC: Disabling bearer <eth:bond0> > [ 3304.409757] TIPC: Lost link <1.1.2:bond0-1.1.1:bond0> on network plane > A > [ 3304.412512] TIPC: Lost contact with <1.1.1> > [ 3304.412867] ------------[ cut here ]------------ > [ 3304.414979] kernel BUG at > /software/src/common/debian/sbs100/linux-2.6.26/net/tipc/tipc_bcast.c:636! > [ 3304.424351] invalid opcode: 0000 [1] SMP > [ 3304.424351] CPU 10 > [ 3304.424351] Modules linked in: 8021q(F) sbx_pnps(PF) genworkq(F) > octeon_drv(PF) drbd pkp_drv(P) deflate zlib_deflate zlib_inflate ctr twofish > twofish_common camellia serpent blowfish des_generic cbc aes_x86_64 > aes_generic xcbc sha256_generic crypto_null crypto_blkcipher af_key > xfrm_user sha1_generic cn sctp libcrc32c tipc bonding nls_utf8 nls_cp437 > vfat fat nls_base ipmi_si ipmi_devintf ipmi_poweroff ipmi_watchdog > ipmi_msghandler loop snd_pcsp snd_pcm snd_timer snd soundcore snd_page_alloc > joydev i2c_i801 i2c_core shpchp igb pci_hotplug button evdev ext3 jbd > mbcache dm_mirror dm_log dm_snapshot dm_mod usbhid hid ff_memless sd_mod > ahci libata scsi_mod dock ehci_hcd uhci_hcd thermal processor fan > thermal_sys [last unloaded: sbx_tcpred] > [ 3304.424351] Pid: 14952, comm: tipc-config Tainted: PF 2.6.26 #1 > [ 3304.424351] RIP: 0010:[<ffffffffa0240904>] [<ffffffffa0240904>] > :tipc:tipc_bcbearer_send+0x30/0x17e > [ 3304.424351] RSP: 0018:ffff81031d9959b8 EFLAGS: 00010246 > [ 3304.424351] RAX: 0000000000000000 RBX: ffff81033f10e000 RCX: > 0000000067000000 > [ 3304.424351] RDX: ffff8101bdcf2880 RSI: ffff81033cc3f000 RDI: > ffff81018e0d5500 > [ 3304.424351] RBP: ffff8101bdcf2880 R08: ffff81018e0d5500 R09: > ffff8101bf3b5dc0 > [ 3304.424351] R10: ffff81033fc004c0 R11: ffff81033f1ce6c0 R12: > ffff81018e0d5500 > [ 3304.424351] R13: 0000000000000001 R14: ffff8101bc96f180 R15: > ffff8101a07525c0 > [ 3304.424351] FS: 00007fd5dd22e6e0(0000) GS:ffff8101bf27bbc0(0000) > knlGS:0000000000000000 > [ 3304.424351] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b > [ 3304.424351] CR2: 00000000009bc168 CR3: 00000001ba0a2000 CR4: > 00000000000006e0 > [ 3304.424351] DR0: 0000000000000000 DR1: 0000000000000000 DR2: > 0000000000000000 > [ 3304.424351] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: > 0000000000000400 > [ 3304.424351] Process tipc-config (pid: 14952, threadinfo > ffff81031d994000, task ffff81032c21f8f0) > [ 3304.424351] Stack: ffff8101bf0599c0 ffff81033f10e000 ffff8101bdcf2880 > ffff81018e0d5500 > [ 3304.424351] 0000000000000001 ffffffffa0243910 ffff81033f10e000 > 0000000000000066 > [ 3304.424351] ffff81018e0d5500 ffffffffa02442f8 ffff81018e0d5500 > ffffffffa0240878 > [ 3304.536587] Call Trace: > [ 3304.536587] [<ffffffffa0243910>] ? > :tipc:tipc_link_push_packet+0x25f/0x2a4 > [ 3304.536587] [<ffffffffa02442f8>] ? > :tipc:tipc_link_push_queue+0x1f/0x39 > [ 3304.536587] [<ffffffffa0240878>] ? > :tipc:tipc_bclink_acknowledge+0x142/0x19e > [ 3304.536587] [<ffffffffa024c695>] ? > :tipc:tipc_node_link_down+0x174/0x20b > [ 3304.536587] [<ffffffffa02447da>] ? :tipc:tipc_link_reset+0x92/0x1f1 > [ 3304.536587] [<ffffffffa0245851>] ? :tipc:tipc_link_delete+0x40/0x9a > [ 3304.536587] [<ffffffffa02413af>] ? :tipc:bearer_disable+0x4e/0x92 > [ 3304.536587] [<ffffffffa0241609>] ? :tipc:tipc_disable_bearer+0x60/0x72 > [ 3304.536587] [<ffffffffa024268e>] ? :tipc:tipc_cfg_do_cmd+0x467/0x9a8 > [ 3304.536587] [<ffffffff803d16c9>] ? netlink_sendskb+0x23/0x3c > [ 3304.536587] [<ffffffffa024bff5>] ? :tipc:handle_cmd+0x4b/0x9a > [ 3304.536587] [<ffffffff803d4c67>] ? genl_rcv_msg+0x150/0x171 > [ 3304.536587] [<ffffffff803d4b17>] ? genl_rcv_msg+0x0/0x171 > [ 3304.536587] [<ffffffff803d22c5>] ? netlink_rcv_skb+0x34/0x7c > [ 3304.536587] [<ffffffff803d417e>] ? genl_rcv+0x1f/0x2c > [ 3304.536587] [<ffffffff803d20b0>] ? netlink_unicast+0x1e5/0x25b > [ 3304.536587] [<ffffffff803b6411>] ? __alloc_skb+0x80/0x12e > [ 3304.536587] [<ffffffff803d2869>] ? netlink_sendmsg+0x274/0x287 > [ 3304.536587] [<ffffffff803afc74>] ? sock_aio_write+0xf7/0x106 > [ 3304.536587] [<ffffffff8029b1ec>] ? do_sync_write+0xd1/0x118 > [ 3304.536587] [<ffffffff80246116>] ? autoremove_wake_function+0x0/0x2e > [ 3304.536587] [<ffffffff804560f5>] ? _spin_lock_bh+0x9/0x1f > [ 3304.536587] [<ffffffff803b23cb>] ? release_sock+0x13/0x96 > [ 3304.536587] [<ffffffff803b3e90>] ? sock_setsockopt+0x493/0x4a5 > [ 3304.536587] [<ffffffff8029b9b9>] ? vfs_write+0xc0/0x156 > [ 3304.536587] [<ffffffff8029bf40>] ? sys_write+0x45/0x6e > [ 3304.536587] [<ffffffff8020bf6a>] ? system_call_after_swapgs+0x8a/0x8f > [ 3304.536587] > [ 3304.536587] > [ 3304.536587] Code: 89 fc 55 53 48 83 ec 08 48 8b 97 d8 00 00 00 8b 02 0f > c8 a9 00 00 10 00 75 3d 48 8b 05 9e dc 01 00 8b 80 68 03 00 00 85 c0 75 04 > <0f> 0b eb fe 89 c0 48 89 47 38 8b 02 80 e4 ef 80 cc 10 89 02 8b > [ 3304.536587] RIP [<ffffffffa0240904>] > :tipc:tipc_bcbearer_send+0x30/0x17e > [ 3304.536587] RSP <ffff81031d9959b8> > [ 3304.683415] ---[ end trace b4fa417924722506 ]--- > [ 3304.687414] Kernel panic - not syncing: Aiee, killing interrupt > handler! > [ 3304.687424] Rebooting in 1 seconds..gStatus Code Available > > ================================================================= > > Still feel that the RCA enclosed below based on just the EIP that I had > received is valid and the fix to remove the assert(bclink->bcast_nodes.count > > 0) with: > if(unlikely( !bclink->bcast_nodes.count) ) return 0; is still safe. > > Another spin could be to do this check in tipc_bclink_acknowledge for bcast > node count before retransmitting pending packets in the link outqueue > (congestion resolve step): > > /* Try resolving broadcast link congestion, if necessary */ > > > > if (unlikely(bcl->next_out) && likely(bcl->bcast_nodes.count) ) { > > tipc_link_push_queue(bcl); > > bclink_set_last_sent(); > > } > > > Since we have a pending link_release_outqueue anyway in the above > backtrace, it should be all safe. > > And even if we just remove the assert from tipc_bcbearer_send and return 0 > if bcast_nodes.count is 0, the link retransmit would just empty the > retransmit queue without sending anything as we would be faking the success > anyway. Should be still safe though a bit hackish than the check in > tipc_bclink_acknowledge but future changes in bcast path wouldn't impact > tipc_bcbearer_send if the assert is removed. (more safer) > > Here is the RCA enclosed below which was done on an incomplete backtrace > received yesterday but still valid: (hope you guys fix it officially in the > next TIPC release): > > --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- > > Hi ..., > > Do the following change in : > > net/tipc/tipc_bcast.c: > > > > Function: > > tipc_bcbearer_send: > > REMOVE the: > > > > assert( bclink->bcast_nodes.count != 0); line > > > > and REPLACE with > > > > if( unlikely(!bclink->bcast_nodes.count) ) > > return 0; > > > > Analysis: > > > > The kernel panic is a result of a broadcast message transmit through the > tipc bcast link when the peer or bcast node count for connected broadcast > links became 0. > > > > Now there was already a bug earlier (1.6 time frame) where a tipc bcast > link packet from the socket was sneaking through in the window when the peer > nodes or link was getting reset. > > > > But this was addressed by immediately discarding the bcast link packet in > tipc_bclink_send_msg routine in tipc_bcast.c that checked for: > > bclink->bcast_nodes.count to be 0 and immediately discarded/dropped the > skbuff or packet by failing it as expected. > > > > This was done with the bc_lock spinlock held. So there wasn't any race with > a parallel link remove that decrements the bcast_nodes.count for the bcast > link. > > > > This is still safe. However there appears to be a path where a bcast packet > is trying to be sent within the tipc internal code without checking for the > presence of the bcast_nodes.count to be non-zero. > > > > There is no way a user space (asp related) bcast packet can sneak through > considering that guy is protected by the above check mentioned in > tipc_bclink_send_msg. > > > > And also any internal bcast protocol packet aren't marked for reliability > and there skbuff msg sequence number would force it to skip the check in: > > tipc_bcbearer_send: > > > > if (likely(!msg_non_seq(buf_msg(buf)))) { -> this won't be hit for BCAST > protocol tipc internal packets that aren't acked or reliable > > struct tipc_msg *msg; > > > > assert(bclink->bcast_nodes.count != 0); -> WE ARE ASSERTING OR > PANICKING HERE > > bcbuf_set_acks(buf, bclink->bcast_nodes.count); > > msg = buf_msg(buf); > > msg_set_non_seq(msg, 1); > > msg_set_mc_netid(msg, tipc_net_id); > > bcl->stats.sent_info++; > > } > > > > So the fix to remove the assert and return an implicit 0 or success for > bcbearer_send when bcast_nodes.count == 0 instead of assert would ensure > implicitly that such packets would be ignored or marked as a success. (like > tipc_bclink_send_msg does without any bcast links connected) > > > > Because it couldn't have been the path from tipc_bclink_send_msg as thats > protected for the bcast node count 0 check, it also could not have been a > BCAST protocol retransmit as those aren't marked or won't enter the "if" > block that triggers the assert check. > > > > I can only think of a tipc_bclink_update_link_state call as a result of a > tipc link STATE_MSG. Don't see any other path albeit a COMPLETE panic > backtrace would have made things clear. > > > > But I am 100% sure that with the above fix/change, you wouldn't see the > panic again. Atleast in the same path for sure :) > > And also I don't see any side effects for the above change considering the > main bc link sendmsg path is already protected in tipc. So I am guessing > that this was a tipc bcast link state update path that triggered the panic. > Either way, I feel the change should be safe. > > > > The disassembly conclusively proved the panic EIP as the assert: > > The EIP was pointing to : > > tipc_bcbearer_send + 0x30 along with the faulting instruction at: <0f> > (angular bracketed) in the hex dump of the Oops. that was immediately > followed by 0b . 0f0b is intel ud2a or undefined instruction trigger used by > the linux kernel to trigger a BUG or an Assert. as mentioned above. > > > > > > The below snippet corresponds to the above "C" code in tipc_bcbearer_send. > > > > And tipc_bcbearer_send + 0x904 is the place where we faulted. See how the > faulting instruction code is :0f 0b > > And see how the hex dump matches all the instructions in the buffer. Its > dead simple :) > > > > And EIP was tipc_bcbearer_send + 0x30 > > which is nothing but: > > > > tipc_bcbearer_send + 0x904 - tipc_bcbearer_send + 0x8d4 (start of the > text segment below in the disassembly for the function) > > > > 00000000000008d4 <tipc_bcbearer_send>: > > 8d4: 41 55 push %r13 > > 8d6: 41 54 push %r12 > > 8d8: 49 89 fc mov %rdi,%r12 > > 8db: 55 push %rbp > > 8dc: 53 push %rbx > > 8dd: 48 83 ec 08 sub $0x8,%rsp > > 8e1: 48 8b 97 d8 00 00 00 mov 0xd8(%rdi),%rdx > > 8e8: 8b 02 mov (%rdx),%eax > > 8ea: 0f c8 bswap %eax > > 8ec: a9 00 00 10 00 test $0x100000,%eax -> CHECK FOR > msg_non_seq which is the 20th bit right shifted or bcast protocol message > > 8f1: 75 3d jne 930 <tipc_bcbearer_send+0x5c> -> > we were hit for a normal bcast link packet. > > 8f3: 48 8b 05 00 00 00 00 mov 0(%rip),%rax # 8fa > <tipc_bcbearer_send+0x26> > > 8fa: 8b 80 68 03 00 00 mov 0x368(%rax),%eax > > 900: 85 c0 test %eax,%eax > > > > /* annotation for the C code for the above assembly */ > > if (likely(!msg_non_seq(buf_msg(buf)))) > > > > 902: 75 04 jne 908 <tipc_bcbearer_send+0x34> > > 904: 0f 0b ud2a -> KABOOM as this is: "0f 0b" or > the kernel BUG macro that forces a panic through an undefined instr. > exception > > > > Regards, > > -Karthick > > > > > -- > Software is like sex: it's better when it's free. > -Linus Torvalds > > A.R.Karthick > http://github.com/karthick18 > -- Software is like sex: it's better when it's free. -Linus Torvalds A.R.Karthick http://github.com/karthick18 |