#502 igb Tx Unit Hang - Intel I210 NIC

wont-fix
nobody
None
standalone_driver
1
2016-02-29
2015-11-29
Thomas Jepp
No

Hi,

I'm using a Supermicro X11SBA-LN4F board (Intel Pentium N3700 based with 4x I210 NICs).

This board is very new - there are no available BIOS updates from Supermicro. As such, I'm running the BIOS it shipped with.

I'm getting Tx Unit Hangs on the NICs, most notably the ones at 05:00.0 (eth1) and 06:00.0 (eth2), which carry most of the traffic. The hangs seem to occur most often under higher load.

02:00.0 - eth0: management network interface (virtually no traffic).
05:00.0 - eth1 is using an MTU of 1508 and PPPoE.
06:00.0 - eth2 is carrying several VLANs to a switch.
07:00.0 - unused at present.

Currently running Debian Jessie (Linux router 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt11-1+deb8u6 (2015-11-09) x86_64 GNU/Linux) with the out-of-kernel igb driver from igb-5.3.3.2.tar.gz. The same issue occurs with the standard debian kernel module. This is a home system, so I'm happy to use whatever OS/kernel/driver combination is necessary to debug.

I've attached a dmesg log showing the Tx hangs on one NIC (full log from boot), and lspci -vvv.
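For anyone comparing logs from other boards, here is a minimal sketch of how the hangs could be counted per NIC from saved dmesg output. It assumes each report appears as a line of the form `igb 0000:05:00.0: Detected Tx Unit Hang` (the driver's message with the kernel's device prefix); the attached log may differ in detail, so treat the pattern as an assumption. The function names are hypothetical, and the 07:00.0 to eth3 mapping is inferred from later comments rather than stated in the list above.

```python
import re
from collections import Counter

# Assumed message shape (not copied from the attached log):
#   "[  123.456789] igb 0000:05:00.0: Detected Tx Unit Hang"
# The PCI domain ("0000:") is treated as optional and is not captured.
HANG_RE = re.compile(
    r"igb (?:[0-9a-f]{4}:)?([0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"Detected Tx Unit Hang",
    re.IGNORECASE,
)

# PCI address -> interface name, as listed in this report.
# The eth3 entry is an inference: 07:00.0 is only described as unused.
INTERFACES = {
    "02:00.0": "eth0",
    "05:00.0": "eth1",
    "06:00.0": "eth2",
    "07:00.0": "eth3",
}


def count_tx_hangs(dmesg_text):
    """Return a Counter mapping PCI address -> number of hang reports."""
    counts = Counter()
    for line in dmesg_text.splitlines():
        match = HANG_RE.search(line)
        if match:
            counts[match.group(1).lower()] += 1
    return counts


def format_report(counts):
    """Return one summary line per device, most affected device first."""
    lines = []
    for address, hangs in counts.most_common():
        name = INTERFACES.get(address, "unknown")
        lines.append(f"{address} ({name}): {hangs} hang(s)")
    return lines
```

To use it, read the saved dmesg output into a string, pass it to `count_tx_hangs`, and print the lines returned by `format_report`. Counting per PCI address rather than per interface name keeps the result stable if the interfaces are renamed between boots.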

2 Attachments

Discussion

  • Thomas Jepp

    Thomas Jepp - 2015-11-29
    • Description has changed:

    Diff:

    --- old
    +++ new
    @@ -11,6 +11,6 @@
     06:00.0 - eth2 is carrying several VLANs to a switch.
     07:00.0 - unused at present.
    
    -Currently running Debian Jessie (Linux router 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt11-1+deb8u6 (2015-11-09) x86_64 GNU/Linux) with the out-of-kernel igb driver from igb-5.3.3.2.tar.gz. The same issue occurs with the standard debian kernel. This is a home system, so I'm happy to use whatever OS/kernel/driver combination is necessary to debug.
    +Currently running Debian Jessie (Linux router 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt11-1+deb8u6 (2015-11-09) x86_64 GNU/Linux) with the out-of-kernel igb driver from igb-5.3.3.2.tar.gz. The same issue occurs with the standard debian kernel module. This is a home system, so I'm happy to use whatever OS/kernel/driver combination is necessary to debug.
    
     I've attached a dmesg log showing the Tx hangs on one NIC (full log from boot), and lspci -vvv.
    
     
  • Thomas Jepp

    Thomas Jepp - 2015-12-01

    I contacted Supermicro and they advised me to update the LAN EEPROM firmware and provided an updater.

    I am currently checking to see if it resolves the issue.

     
  • Thomas Jepp

    Thomas Jepp - 2015-12-01

    I've worked on this further and, together with another X11SBA-LN4F owner, discovered the following:

    eth0 seems to actually work fine, and eth1, eth2 and eth3 are the problem cards.

    I don't think it's a driver issue, and I am talking to Supermicro (SM) support to try to get it resolved.

     
  • Thomas Jepp

    Thomas Jepp - 2015-12-07

    SM have confirmed that this is a hardware issue - I have been advised to RMA my board.

    It is apparently fixed in board revision 1.02.

    I'll close this ticket - not a driver issue!

     
  • Thomas Jepp

    Thomas Jepp - 2015-12-07
    • status: open --> wont-fix
     
  • Thomas Jepp

    Thomas Jepp - 2016-02-29

    Just to update this: Supermicro replaced the board with one they claimed to have repaired. That lessened the frequency of the fault, but it didn't go away entirely. The fault appears to be caused by the PCI Express switch chip they have placed the NICs behind. The fault seems to occur on all X11SBA-LN4F boards, and is not limited to just Linux - it happens on Windows and FreeBSD as well! I've had contact with multiple other customers with the same issues.

    I ended up returning the board to my retailer for a full refund as the board seems to have an inherent fault and SM have failed to resolve it in a reasonable timeframe.

     
    Last edit: Thomas Jepp 2016-02-29
