#416 Transmit Queue Hang - ixgbe


Ixgbe driver version 3.13.10 on CentOS 6.4 (2.6.32-358.2.1). lspci output is attached. Two ixgbe interfaces were bonded together with flow directors for udp turned on. They were connected to a 1Gig switch at this point of time. In all likelihood, this is not reproducible but we are still trying to reproduce in a similar setting.

The tx queue hung with this log:
[16562714.670055] ------------[ cut here ]------------
[16562714.735951] WARNING: at net/sched/sch_generic.c:261 dev_watchdog+0x26d/0x280() (Tainted: G C --------------- )
[16562714.885246] Hardware name: PowerEdge M620
[16562714.944757] NETDEV WATCHDOG: eth0 (ixgbe): transmit queue 7 timed out
[16562714.944758] Modules linked in: drbd(U) ip6_tables dm_round_robin scsi_dh_rdac dm_multipath be2iscsi iscsi_boot_sysfs bnx2i cnic uio cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi kvm_intel kvm fnic libfcoe libfc scsi_transport_fc tun nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 iptable_filter ip_tables bridge bonding ipv6 8021q garp stp mlx4_en mlx4_core bnx2x(U) mdio bnx2(U) be2net(U) zram(C) lzo_decompress lzo_compress dm_mod power_meter hwmon sb_edac edac_core iTCO_wdt iTCO_vendor_support shpchp ixgbe(U) dca [last unloaded: drbd]
[16562715.656195] Pid: 0, comm: swapper Tainted: G C --------------- 2.6.32-358.2.1.el6GUAVUSsmp guavus
[16562715.802761] Call Trace:
[16562715.848612] <IRQ> [<ffffffff81070af7>] ? warn_slowpath_common+0x87/0xc0
[16562715.947082] [<ffffffff81070be6>] ? warn_slowpath_fmt+0x46/0x50
[16562716.035165] [<ffffffff8162607d>] ? dev_watchdog+0x26d/0x280
[16562716.120362] [<ffffffff81012b19>] ? sched_clock+0x9/0x10
[16562716.201622] [<ffffffff81083435>] ? internal_add_timer+0xb5/0x110
[16562716.292597] [<ffffffff81625e10>] ? dev_watchdog+0x0/0x280
[16562716.376455] [<ffffffff81084047>] ? run_timer_softirq+0x197/0x340
[16562716.467865] [<ffffffff810aa800>] ? tick_sched_timer+0x0/0xc0
[16562716.555439] [<ffffffff8102e94d>] ? lapic_next_event+0x1d/0x30
[16562716.644321] [<ffffffff810797c1>] ? __do_softirq+0xc1/0x1e0
[16562716.730330] [<ffffffff8109df8b>] ? hrtimer_interrupt+0x14b/0x260
[16562716.823151] [<ffffffff8100c1cc>] ? call_softirq+0x1c/0x30
[16562716.908769] [<ffffffff8100de05>] ? do_softirq+0x65/0xa0
[16562716.992214] [<ffffffff810795a5>] ? irq_exit+0x85/0x90
[16562717.073467] [<ffffffff816e7380>] ? smp_apic_timer_interrupt+0x70/0x9b
[16562717.171480] [<ffffffff8100bb93>] ? apic_timer_interrupt+0x13/0x20
[16562717.265309] <EOI> [<ffffffff81014807>] ? mwait_idle+0x77/0xd0
[16562717.356399] [<ffffffff816e383a>] ? atomic_notifier_call_chain+0x1a/0x20
[16562717.456673] [<ffffffff81009fc6>] ? cpu_idle+0xb6/0x110
[16562717.539067] [<ffffffff816d7062>] ? start_secondary+0x2ac/0x2ef
[16562717.629732] ---[ end trace 8152a568759c57d5 ]---

After this, things recovered on their own and their was no problem but we saw these logs quite frequently:
[16562717.704734] ixgbe 0000:01:00.0: eth0: Fake Tx hang detected with timeout of 5 seconds
[16562727.642220] ixgbe 0000:01:00.0: eth0: Fake Tx hang detected with timeout of 10 seconds
[16562747.598988] ixgbe 0000:01:00.0: eth0: Fake Tx hang detected with timeout of 20 seconds
[16562767.556063] ixgbe 0000:01:00.0: eth0: Fake Tx hang detected with timeout of 20 seconds
[16562787.512989] ixgbe 0000:01:00.0: eth0: Fake Tx hang detected with timeout of 20 seconds
[16562807.470219] ixgbe 0000:01:00.0: eth0: Fake Tx hang detected with timeout of 20 seconds

ethtool stats show no packets in fdir_overflow or rx_no_dma_resources.
Output of ethtool -a eth0
Pause parameters for eth0:
Autonegotiate: off
RX: on
TX: on

3 Attachments


  • Emil Tantilov

    Emil Tantilov - 2014-06-19

    The fake Tx hang message indicates that the hang was in SW not HW (as in the HW was able to push packets, but it didn't get up the stack for some reason). This can happen easily if the interface was overwhelmed with flow control packets. Do you have the output of ethtool -S from the time the Tx hang occurred?

    In the case of a HW hang the driver will dump the contents of the descriptors.

  • Abhishek Singh

    Abhishek Singh - 2014-06-19

    The ethtool -S output attached in the bug (eth_stats file - eth0 is the interface) was taken a few days after the hang occurred. However, the system was not rebooted so it should probably contain enough information.

    While the "fake tx hang" logs do not seem to cause much of an issue (they are repeated several times but no operation impact observed till now), the one hang reported by watchdog (shown in bug description - dmesg is attached) is the one which is problematic as it caused actual network outage. I did not see any descriptors being dumped either in dmesg or anywhere else though.

  • Todd Fujinaka

    Todd Fujinaka - 2014-06-20

    Are you trying to pump 10G of traffic through the 1G link? That would certainly lead to congestoin due to lack of resources and lots of flow control that would lead to the Tx being paused. The pause would, in turn, lead to the timeouts.

    • Abhishek Singh

      Abhishek Singh - 2014-06-23

      It doesn't look like we are pumping too much traffic.
      [admin@server ~]# ethtool eth0
      Settings for eth0:
      Supported ports: [ FIBRE ]
      Supported link modes: 10000baseT/Full
      Supported pause frame use: No
      Supports auto-negotiation: Yes
      Advertised link modes: 10000baseT/Full
      Advertised pause frame use: No
      Advertised auto-negotiation: Yes
      Speed: 1000Mb/s
      Duplex: Full
      Port: Other
      PHYAD: 0
      Transceiver: external
      Auto-negotiation: on
      Supports Wake-on: umbg
      Wake-on: g
      Current message level: 0x00000007 (7)
      drv probe link
      Link detected: yes

      Ethtool reports "pause frame use" as no. Can it still be because of the flow control?

  • Todd Fujinaka

    Todd Fujinaka - 2014-06-20
    • assigned_to: Todd Fujinaka
  • Emil Tantilov

    Emil Tantilov - 2014-06-23

    The trace as result of the tx hang will only show once after each reboot. If you look at net/sched/sch_generic.c:261 you will notice that it is done via WARN_ONCE. The fact that you see the "Fake Tx hang" messages however indicates that you are hitting the same code path quite often. The driver is preventing an actual reset (tx_timeout_count: 0) by allowing more time for the packets to pass. You need to figure out what is causing the delays.

    As for flow control - this is just one of the factors that may contribute to a "fake" Tx hang. The eth0 interface is receiving steady amount of xoff packets:

     tx_flow_control_xon: 5
     rx_flow_control_xon: 6321931
     tx_flow_control_xoff: 18
     rx_flow_control_xoff: 6711447

    so even if it's not the main cause of the delayed packets it's certainly not helping. On this interface flow control negotiation is not supported, but it is enabled, as you can see from the ethtool -a output (and the counters above):

    Pause parameters for eth0:
    Autonegotiate: off
    RX: on
    TX: on

    You can try and disable flow control and see if that changes anything - you will most likely see packets being dropped instead.

    You can try and get a perf log to see if it will show where the bottleneck is. Also if you could provide the parameters used to load the driver and any subsequent configurations done to the interface via ethtool we may have a better idea of what is going on.

    Aside from the xoff packets I didn't notice anything else that could be contributing to this, one thing of note is the fdir counts:

     fdir_match: 27162
     fdir_miss: 118287733666

    perhaps because your traffic is mostly unidirectional, but you're not getting much out of flow director, you may be better of disabling it and defaulting to RSS instead.

  • Abhishek Singh

    Abhishek Singh - 2014-06-24

    We actually configure the flow director on all interfaces through our application as we don't know which interface will actually receive the intended traffic. And in this case, the interface which hung has flow director configured but almost no traffic which could match the flow directors.
    The commands we give are like this :
    ethtool -K eth0 ntuple on
    ethtool -N eth0 rx-flow-hash udp4 sdfn
    ethtool -U eth0 flow-type udp4 dst-port <port> action <flow>
    echo <affinity> > /proc/irq/<intr_num>/smp_affinity

    We are trying to observe the effects of disabling and enbling flow control on this interface in a similar setup. Will update here.

    ixgbe driver is inserted with "MQ=1 DCA=0" parameters in this scenario.

  • Todd Fujinaka

    Todd Fujinaka - 2015-03-11
    • status: open --> closed

Get latest updates about Open Source Projects, Conferences and News.

Sign up for the SourceForge newsletter:

No, thanks