From: Daniel S. C. <ds...@te...> - 2014-05-08 17:29:30

Hello,

I have a pair of Dell PowerEdge R720s running Fedora 20 and a 3.13.11 kernel built from kernel.org. The system has 7 dual port x520 cards, and becomes unstable when I enable forwarding.

system details:

# ethtool -i eth0
driver: ixgbe
version: 3.15.1-k
firmware-version: 0x80000656
bus-info: 0000:03:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: no

# lspci | grep Ethernet
03:00.0 Ethernet controller: Intel Corporation Ethernet 10G 2P X520 Adapter (rev 01)
03:00.1 Ethernet controller: Intel Corporation Ethernet 10G 2P X520 Adapter (rev 01)
04:00.0 Ethernet controller: Intel Corporation Ethernet 10G 2P X520 Adapter (rev 01)
04:00.1 Ethernet controller: Intel Corporation Ethernet 10G 2P X520 Adapter (rev 01)
05:00.0 Ethernet controller: Intel Corporation Ethernet 10G 2P X520 Adapter (rev 01)
05:00.1 Ethernet controller: Intel Corporation Ethernet 10G 2P X520 Adapter (rev 01)
41:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
41:00.1 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
42:00.0 Ethernet controller: Intel Corporation Ethernet 10G 2P X520 Adapter (rev 01)
42:00.1 Ethernet controller: Intel Corporation Ethernet 10G 2P X520 Adapter (rev 01)
43:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
43:00.1 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
44:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
44:00.1 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)

LRO is not enabled:

# ethtool -S eth0 | grep lro

Using Intel optics on the x520 side, after enabling forwarding, I get the below error looping:

May 6 14:32:08 njdc2 kernel: ixgbe 0000:03:00.0 eth0: detected SFP+: 5
May 6 14:32:08 njdc2 kernel: ixgbe 0000:03:00.0 eth0: detected SFP+: 5
May 6 14:32:09 njdc2 kernel: ixgbe 0000:03:00.0 eth0: initiating reset to clear Tx work after link loss
May 6 14:32:09 njdc2 kernel: ixgbe 0000:03:00.0 eth0: initiating reset to clear Tx work after link loss
May 6 14:32:09 njdc2 kernel: ixgbe 0000:03:00.0 eth0: Reset adapter
May 6 14:32:09 njdc2 kernel: ixgbe 0000:03:00.0 eth0: Reset adapter

When I try using a direct attach cable, I get the below error, but the system seems to recover:

May 7 17:12:13 njdc2 kernel: ixgbe 0000:03:00.0 eth0: detected SFP+: 3
May 7 17:12:13 njdc2 kernel: ixgbe 0000:03:00.0 eth0: detected SFP+: 3
May 7 17:12:13 njdc2 kernel: ixgbe 0000:03:00.0 eth0: initiating reset to clear Tx work after link loss
May 7 17:12:13 njdc2 kernel: ixgbe 0000:03:00.0 eth0: initiating reset to clear Tx work after link loss
May 7 17:12:13 njdc2 kernel: ixgbe 0000:03:00.0 eth0: Reset adapter
May 7 17:12:13 njdc2 kernel: ixgbe 0000:03:00.0 eth0: Reset adapter
May 7 17:12:13 njdc2 kernel: ixgbe 0000:03:00.1 eth1: detected SFP+: 6
May 7 17:12:13 njdc2 kernel: ------------[ cut here ]------------
May 7 17:12:13 njdc2 kernel: WARNING: CPU: 4 PID: 60 at kernel/softirq.c:156 local_bh_enable_ip+0x5a/0x80()
May 7 17:12:13 njdc2 kernel: Modules linked in: netconsole 8021q igb ipmi_devintf iTCO_wdt ixgbe i2c_algo_bit i2c_core ipmi_si lpc_ich mdio ipmi_msghandler mfd_core ptp pps_core hid_generic usbhid hid ahci libahci megaraid_sas
May 7 17:12:13 njdc2 kernel: CPU: 4 PID: 60 Comm: kworker/4:1 Not tainted 3.13.11-CBS.1.fc20.x86_64 #1
May 7 17:12:13 njdc2 kernel: Hardware name: Dell Inc. PowerEdge R720/0X3D66, BIOS 2.2.2 01/16/2014
May 7 17:12:13 njdc2 kernel: Workqueue: events ixgbe_service_task [ixgbe]
May 7 17:12:13 njdc2 kernel: 0000000000000009 ffffffff81324f33 0000000000000000 ffffffff8103327d
May 7 17:12:13 njdc2 kernel: ffff8800b8d80440 ffff8800b8d804f0 0000000000000100 ffff88007d348000
May 7 17:12:13 njdc2 kernel: 0000000000000010 ffffffff81036caa ffffffffa01ec6c6 ffff880000000000
May 7 17:12:13 njdc2 kernel: Call Trace:
May 7 17:12:13 njdc2 kernel: [<ffffffff81324f33>] ? dump_stack+0x41/0x51
May 7 17:12:13 njdc2 kernel: [<ffffffff8103327d>] ? warn_slowpath_common+0x6d/0x90
May 7 17:12:13 njdc2 kernel: [<ffffffff81036caa>] ? local_bh_enable_ip+0x5a/0x80
May 7 17:12:13 njdc2 kernel: [<ffffffffa01ec6c6>] ? ixgbe_poll+0x3a6/0x800 [ixgbe]
May 7 17:12:13 njdc2 kernel: [<ffffffff812b2ed4>] ? netpoll_poll_dev+0xd4/0x580
May 7 17:12:13 njdc2 kernel: [<ffffffff812b3631>] ? netpoll_send_skb_on_dev+0x2b1/0x3d0
May 7 17:12:13 njdc2 kernel: [<ffffffff812b3a58>] ? netpoll_send_udp+0x268/0x380
May 7 17:12:13 njdc2 kernel: [<ffffffffa004723b>] ? write_msg+0xbb/0x108 [netconsole]
May 7 17:12:13 njdc2 kernel: [<ffffffff81062fc3>] ? call_console_drivers.constprop.25+0x83/0xa0
May 7 17:12:13 njdc2 kernel: [<ffffffff810634f8>] ? console_unlock+0x398/0x3d0
May 7 17:12:13 njdc2 kernel: [<ffffffff81063775>] ? vprintk_emit+0x245/0x480
May 7 17:12:13 njdc2 kernel: [<ffffffff81219d9d>] ? dev_vprintk_emit+0x3d/0x50
May 7 17:12:13 njdc2 kernel: [<ffffffff8105d5f3>] ? idle_balance+0x173/0x180
May 7 17:12:13 njdc2 kernel: [<ffffffff81325fc2>] ? __schedule+0x242/0x5c0
May 7 17:12:13 njdc2 kernel: [<ffffffff81219de9>] ? dev_printk_emit+0x39/0x40
May 7 17:12:13 njdc2 kernel: [<ffffffff8129a8c4>] ? __netdev_printk+0x74/0xf0
May 7 17:12:13 njdc2 kernel: [<ffffffff8129ac27>] ? netdev_info+0x57/0x60
May 7 17:12:13 njdc2 kernel: [<ffffffffa01f043e>] ? ixgbe_service_task+0xee/0x1150 [ixgbe]
May 7 17:12:13 njdc2 kernel: [<ffffffff8104664c>] ? process_one_work+0x13c/0x3c0
May 7 17:12:13 njdc2 kernel: [<ffffffff810471b6>] ? worker_thread+0x116/0x3a0
May 7 17:12:13 njdc2 kernel: [<ffffffff810470a0>] ? manage_workers.isra.27+0x280/0x280
May 7 17:12:13 njdc2 kernel: [<ffffffff8104c9e1>] ? kthread+0xc1/0xe0
May 7 17:12:13 njdc2 kernel: [<ffffffff8104c920>] ? kthread_create_on_node+0x180/0x180
May 7 17:12:13 njdc2 kernel: [<ffffffff813299bc>] ? ret_from_fork+0x7c/0xb0
May 7 17:12:13 njdc2 kernel: [<ffffffff8104c920>] ? kthread_create_on_node+0x180/0x180
May 7 17:12:13 njdc2 kernel: ---[ end trace 327c5ae308408b67 ]---
May 7 17:12:13 njdc2 kernel: ixgbe 0000:03:00.0 eth0: detected SFP+: 3
May 7 17:12:13 njdc2 kernel: ixgbe 0000:03:00.1 eth1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
May 7 17:12:13 njdc2 kernel: ixgbe 0000:03:00.1 eth1: detected SFP+: 6
May 7 17:12:13 njdc2 kernel: ------------[ cut here ]------------
May 7 17:12:13 njdc2 kernel: WARNING: CPU: 4 PID: 60 at kernel/softirq.c:156 local_bh_enable_ip+0x5a/0x80()
May 7 17:12:13 njdc2 kernel: Modules linked in: netconsole 8021q igb ipmi_devintf iTCO_wdt ixgbe i2c_algo_bit i2c_core ipmi_si lpc_ich mdio ipmi_msghandler mfd_core ptp pps_core hid_generic usbhid hid ahci libahci megaraid_sas
May 7 17:12:13 njdc2 kernel: CPU: 4 PID: 60 Comm: kworker/4:1 Not tainted 3.13.11-CBS.1.fc20.x86_64 #1
May 7 17:12:13 njdc2 kernel: Hardware name: Dell Inc. PowerEdge R720/0X3D66, BIOS 2.2.2 01/16/2014
May 7 17:12:13 njdc2 kernel: Workqueue: events ixgbe_service_task [ixgbe]
May 7 17:12:13 njdc2 kernel: 0000000000000009 ffffffff81324f33 0000000000000000 ffffffff8103327d
May 7 17:12:13 njdc2 kernel: ffff8800b8d80440 ffff8800b8d804f0 0000000000000100 ffff88007d348000
May 7 17:12:13 njdc2 kernel: 0000000000000010 ffffffff81036caa ffffffffa01ec6c6 ffff880000000000
May 7 17:12:13 njdc2 kernel: Call Trace:
May 7 17:12:13 njdc2 kernel: [<ffffffff81324f33>] ? dump_stack+0x41/0x51
May 7 17:12:13 njdc2 kernel: [<ffffffff8103327d>] ? warn_slowpath_common+0x6d/0x90
May 7 17:12:13 njdc2 kernel: [<ffffffff81036caa>] ? local_bh_enable_ip+0x5a/0x80
May 7 17:12:13 njdc2 kernel: [<ffffffffa01ec6c6>] ? ixgbe_poll+0x3a6/0x800 [ixgbe]
May 7 17:12:13 njdc2 kernel: [<ffffffff812b2ed4>] ? netpoll_poll_dev+0xd4/0x580
May 7 17:12:13 njdc2 kernel: [<ffffffff812b3631>] ? netpoll_send_skb_on_dev+0x2b1/0x3d0
May 7 17:12:13 njdc2 kernel: [<ffffffff812b3a58>] ? netpoll_send_udp+0x268/0x380
May 7 17:12:13 njdc2 kernel: [<ffffffffa004723b>] ? write_msg+0xbb/0x108 [netconsole]
May 7 17:12:13 njdc2 kernel: [<ffffffff81062fc3>] ? call_console_drivers.constprop.25+0x83/0xa0
May 7 17:12:13 njdc2 kernel: [<ffffffff810634f8>] ? console_unlock+0x398/0x3d0
May 7 17:12:13 njdc2 kernel: [<ffffffff81063775>] ? vprintk_emit+0x245/0x480
May 7 17:12:13 njdc2 kernel: [<ffffffff81219d9d>] ? dev_vprintk_emit+0x3d/0x50
May 7 17:12:13 njdc2 kernel: [<ffffffff8105d5f3>] ? idle_balance+0x173/0x180
May 7 17:12:13 njdc2 kernel: [<ffffffff81325fc2>] ? __schedule+0x242/0x5c0
May 7 17:12:13 njdc2 kernel: [<ffffffff81219de9>] ? dev_printk_emit+0x39/0x40
May 7 17:12:13 njdc2 kernel: [<ffffffff8129a8c4>] ? __netdev_printk+0x74/0xf0
May 7 17:12:13 njdc2 kernel: [<ffffffff8129ac27>] ? netdev_info+0x57/0x60
May 7 17:12:13 njdc2 kernel: [<ffffffffa01f043e>] ? ixgbe_service_task+0xee/0x1150 [ixgbe]
May 7 17:12:13 njdc2 kernel: [<ffffffffa01f043e>] ? ixgbe_service_task+0xee/0x1150 [ixgbe]
May 7 17:12:13 njdc2 kernel: [<ffffffff8104664c>] ? process_one_work+0x13c/0x3c0
May 7 17:12:13 njdc2 kernel: [<ffffffff810471b6>] ? worker_thread+0x116/0x3a0
May 7 17:12:13 njdc2 kernel: [<ffffffff810470a0>] ? manage_workers.isra.27+0x280/0x280
May 7 17:12:13 njdc2 kernel: [<ffffffff8104c9e1>] ? kthread+0xc1/0xe0
May 7 17:12:13 njdc2 kernel: [<ffffffff8104c920>] ? kthread_create_on_node+0x180/0x180
May 7 17:12:13 njdc2 kernel: [<ffffffff813299bc>] ? ret_from_fork+0x7c/0xb0
May 7 17:12:13 njdc2 kernel: [<ffffffff8104c920>] ? kthread_create_on_node+0x180/0x180
May 7 17:12:13 njdc2 kernel: ---[ end trace 327c5ae308408b67 ]---
May 7 17:12:13 njdc2 kernel: ixgbe 0000:03:00.0 eth0: detected SFP+: 3
May 7 17:12:13 njdc2 kernel: ixgbe 0000:03:00.1 eth1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
May 7 17:12:14 njdc2 kernel: ixgbe 0000:04:00.0 eth2: detected SFP+: 5
May 7 17:12:14 njdc2 kernel: ixgbe 0000:04:00.0 eth2: detected SFP+: 5
May 7 17:12:14 njdc2 kernel: ixgbe 0000:04:00.1 eth3: detected SFP+: 6
May 7 17:12:14 njdc2 kernel: ixgbe 0000:04:00.1 eth3: detected SFP+: 6
May 7 17:12:14 njdc2 kernel: ixgbe 0000:05:00.0 eth4: detected SFP+: 5
May 7 17:12:14 njdc2 kernel: ixgbe 0000:05:00.0 eth4: detected SFP+: 5
May 7 17:12:15 njdc2 kernel: ixgbe 0000:04:00.0 eth2: NIC Link is Up 10 Gbps, Flow Control: RX/TX
May 7 17:12:15 njdc2 kernel: ixgbe 0000:04:00.0 eth2: NIC Link is Up 10 Gbps, Flow Control: RX/TX
May 7 17:12:15 njdc2 kernel: ixgbe 0000:04:00.1 eth3: NIC Link is Up 10 Gbps, Flow Control: RX/TX
May 7 17:12:15 njdc2 kernel: ixgbe 0000:04:00.1 eth3: NIC Link is Up 10 Gbps, Flow Control: RX/TX
May 7 17:12:15 njdc2 kernel: ixgbe 0000:04:00.1 eth3: NIC Link is Down
May 7 17:12:15 njdc2 kernel: ixgbe 0000:04:00.1 eth3: NIC Link is Down
May 7 17:12:15 njdc2 kernel: ixgbe 0000:04:00.1 eth3: NIC Link is Up 10 Gbps, Flow Control: RX/TX
May 7 17:12:15 njdc2 kernel: ixgbe 0000:04:00.1 eth3: NIC Link is Up 10 Gbps, Flow Control: RX/TX
May 7 17:12:15 njdc2 kernel: ixgbe 0000:05:00.0 eth4: NIC Link is Up 10 Gbps, Flow Control: RX/TX
May 7 17:12:15 njdc2 kernel: ixgbe 0000:05:00.0 eth4: NIC Link is Up 10 Gbps, Flow Control: RX/TX
May 7 17:12:18 njdc2 kernel: ixgbe 0000:03:00.0 eth0: NIC Link is Up 10 Gbps, Flow Control: RX/TX
May 7 17:12:18 njdc2 kernel: ixgbe 0000:03:00.0 eth0: NIC Link is Up 10 Gbps, Flow Control: RX/TX

I'm connecting to a Cisco Nexus 5K, Cisco Nexus 7K and a Cisco ASR 9K.

Any assistance would be greatly appreciated. I have other PowerEdge variations running Intel X520 NICs without any issues, so I'm a bit perplexed.

Thank you in advance,
Dan
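
A note on the LRO check in the message above: `ethtool -S` prints statistics counters, while the offload setting itself is reported by `ethtool -k` as a `large-receive-offload:` line. The sketch below reads that line; the helper name `lro_state` is hypothetical, and `eth0` is simply the interface name used in the message.

```shell
#!/bin/sh
# Hypothetical helper (not part of ethtool): read `ethtool -k <iface>` output
# on stdin and print the LRO state, "on" or "off".
lro_state() {
    awk -F': *' '$1 ~ /large-receive-offload/ {
        split($2, w, " ")   # drop a trailing "[fixed]" marker if present
        print w[1]
        exit
    }'
}

# Usage on a live system (requires ethtool and an existing interface):
#   ethtool -k eth0 | lro_state
#   sysctl -n net.ipv4.ip_forward    # prints 1 when IPv4 forwarding is enabled
```

LRO is generally expected to be off on a host that forwards packets, which is why the two checks are usually run together.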
From: Tantilov, E. S <emi...@in...> - 2014-05-08 17:55:29

>-----Original Message-----
>From: Daniel S. Cohen [mailto:ds...@te...]
>Sent: Thursday, May 08, 2014 10:10 AM
>To: e10...@li...
>Subject: [E1000-devel] Issues with x520/ixgbe when enabling forwarding
>
>Hello,
>
>I have a pair of Dell PowerEdge R720s running Fedora 20 and a 3.13.11
>kernel built from kernel.org. The system has 7 dual port x520 cards,
>and becomes unstable when I enable forwarding.
>
>system details:
>
># ethtool -i eth0
>driver: ixgbe
>version: 3.15.1-k
>firmware-version: 0x80000656
>bus-info: 0000:03:00.0
>supports-statistics: yes
>supports-test: yes
>supports-eeprom-access: yes
>supports-register-dump: yes
>supports-priv-flags: no

This is a known bug in ixgbe where ndo_start_xmit can be called while the driver is resetting. This is pretty easy to see when using netconsole.

This is the upstream patch:
https://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git/commit/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c?id=cdc04dcce0598fead6029a2f95e95a4d2ea419c2

It is also resolved in the current ixgbe version on SF.

Thanks,
Emil
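
Since the reply ties the problem to netconsole, and the posted call trace passes through write_msg [netconsole] into netpoll and ixgbe_poll, it may help to confirm whether netconsole is loaded on a given host. This is a minimal sketch only; the helper name `has_module` is hypothetical and it assumes the usual /proc/modules layout, where the module name is the first field on each line.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) if the module named in $1 appears
# as the first field of any line of /proc/modules-style input on stdin.
has_module() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }'
}

# Usage on a live system:
#   has_module netconsole < /proc/modules && echo "netconsole is loaded"
```

A loaded netconsole does not by itself prove this bug is being hit; it only shows that kernel messages can be sent through the network driver's poll path.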
From: Daniel S. C. <da...@d-...> - 2014-05-08 18:40:58

Emil,

Thanks for your response. After rebuilding the 3.13.11 kernel with the patch, the system freezes when enabling forwarding. Have you seen this behavior with the patch applied? I'll try the updated driver on the SF next.

Dan

On Thu, May 8, 2014 at 1:55 PM, Tantilov, Emil S <emi...@in...> wrote:
>>-----Original Message-----
>>From: Daniel S. Cohen [mailto:ds...@te...]
>>Sent: Thursday, May 08, 2014 10:10 AM
>>To: e10...@li...
>>Subject: [E1000-devel] Issues with x520/ixgbe when enabling forwarding
>>
>>Hello,
>>
>>I have a pair of Dell PowerEdge R720s running Fedora 20 and a 3.13.11
>>kernel built from kernel.org. The system has 7 dual port x520 cards,
>>and becomes unstable when I enable forwarding.
>>
>>system details:
>>
>># ethtool -i eth0
>>driver: ixgbe
>>version: 3.15.1-k
>>firmware-version: 0x80000656
>>bus-info: 0000:03:00.0
>>supports-statistics: yes
>>supports-test: yes
>>supports-eeprom-access: yes
>>supports-register-dump: yes
>>supports-priv-flags: no
>
> This is a known bug in ixgbe where ndo_start_xmit can be called while the driver is resetting. This is pretty easy to see when using netconsole.
>
> This is the upstream patch:
> https://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git/commit/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c?id=cdc04dcce0598fead6029a2f95e95a4d2ea419c2
>
> It is also resolved in the current ixgbe version on SF.
>
> Thanks,
> Emil
>
From: Tantilov, E. S <emi...@in...> - 2014-05-08 19:10:57

>-----Original Message-----
>From: da...@ra... [mailto:da...@ra...] On Behalf Of Daniel S. Cohen
>Sent: Thursday, May 08, 2014 11:41 AM
>To: Tantilov, Emil S
>Cc: e10...@li...
>Subject: Re: [E1000-devel] Issues with x520/ixgbe when enabling forwarding
>
>Emil,
>
>Thanks for your response. After rebuilding the 3.13.11
>kernel with the patch, the system freezes when enabling forwarding.
>Have you seen this behavior with the patch applied? I'll try the updated
>driver on the SF next.

Yeah, there is also an issue with netpoll and busy_poll which is what you may be running into. The SF driver has a fix for this, so let me know if you still see it with ixgbe 3.21.2 driver. I can't say for certain without seeing some kind of a trace though.

Thanks,
Emil

>
>Dan
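
To tell whether a host is still on the in-kernel 3.15.1-k driver or already on 3.21.2 or later, the version line of `ethtool -i` can be compared against the version named in the reply. This is a rough sketch under stated assumptions: the helper names `driver_version` and `version_ge` are hypothetical, `sort -V` is a GNU coreutils extension, and any suffix after the first "-" (such as "-k") is ignored in the comparison.

```shell
#!/bin/sh
# Hypothetical helper: read `ethtool -i <iface>` output on stdin and print
# the value of the "version:" line (not "firmware-version:").
driver_version() {
    awk -F': *' '$1 == "version" { print $2; exit }'
}

# Hypothetical helper: succeed if version $1 >= version $2.
# Assumes GNU sort with -V (version sort); strips suffixes such as "-k".
version_ge() {
    _a=${1%%-*}
    _b=${2%%-*}
    [ "$(printf '%s\n%s\n' "$_a" "$_b" | sort -V | head -n 1)" = "$_b" ]
}

# Usage on a live system:
#   v=$(ethtool -i eth0 | driver_version)
#   version_ge "$v" 3.21.2 && echo "ixgbe $v is at least 3.21.2"
```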
From: Daniel S. C. <da...@d-...> - 2014-05-09 03:16:02

Emil,

I tested the 3.21.2 driver, and all appears to be well! Thanks for your assistance.

Dan

On Thu, May 8, 2014 at 3:10 PM, Tantilov, Emil S <emi...@in...> wrote:
>>-----Original Message-----
>>From: da...@ra... [mailto:da...@ra...] On Behalf Of Daniel S. Cohen
>>Sent: Thursday, May 08, 2014 11:41 AM
>>To: Tantilov, Emil S
>>Cc: e10...@li...
>>Subject: Re: [E1000-devel] Issues with x520/ixgbe when enabling forwarding
>>
>>Emil,
>>
>>Thanks for your response. After rebuilding the 3.13.11
>>kernel with the patch, the system freezes when enabling forwarding.
>>Have you seen this behavior with the patch applied? I'll try the updated
>>driver on the SF next.
>
> Yeah, there is also an issue with netpoll and busy_poll which is what you may be running into. The SF driver has a fix for this, so let me know if you still see it with ixgbe 3.21.2 driver. I can't say for certain without seeing some kind of a trace though.
>
> Thanks,
> Emil
>
>>
>>Dan
>