#372 SuperMicro X9SCM-F. Ethernet 82579LM. Forwarding user traffic = Detected Hardware Unit Hang

open
dertman
standalone_driver
1
5 days ago
2013-03-31
Izvekov Antonio
No

Hello!
Sorry for my English!

I am an administrator at an ISP.
We bought two servers, each built on a SuperMicro X9SCM-F motherboard.

The board has two 1G Ethernet ports on two different Intel chips:
# lspci | grep Eth
00:19.0 Ethernet controller: Intel Corporation 82579LM Gigabit Network Connection (rev 05)
02:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection

Server is used to route traffic to customers. The scheme:
Client <-> "eth1 Server eth0" <-> Internet

The server normally does NAT and shaping, but at the moment both are turned off and it only forwards traffic.

Here is the problem: after a couple of minutes in this mode, the server logs the error: eth0: Detected Hardware Unit Hang

The bug is reproducible with different driver versions and different kernels.
Combinations tried:
Kernel = 2.6.30, 3.7.10
Driver = 2.1, 2.2, 2.3
Workarounds tried (found on various web pages):
pcie_aspm=off
BIOS: ASPM enabled / disabled
ethtool -A eth0 autoneg off rx off
modprobe e1000e InterruptThrottleRate=0,0
modprobe e1000e SmartPowerDownEnable=1 KumeranLockLoss=0 IntMode=1 EEE=0
make CFLAGS_EXTRA=-DDISABLE_PCI_MSI install

eth0 is on the 82579LM chip.

Below is all the relevant information: the errors, the command output, and the driver's debug-mode output including packet dumps. I look forward to your help. We could of course buy two PCI-E cards based on a different chip, but that does not seem right, because the server already has the two 1G ports we need.

We also tried other setups using VLANs, with both sides on a single interface:

Client <-> "eth1 Server eth1" <-> Internet = FAIL :(
Client <-> "eth0 Server eth0" <-> Internet = Good work... :)

lspci -vvv

00:19.0 Ethernet controller: Intel Corporation 82579LM Gigabit Network Connection (rev 05)
Subsystem: Super Micro Computer Inc Device 1502
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 41
Region 0: Memory at dfa00000 (32-bit, non-prefetchable) [size=128K]
Region 1: Memory at dfa25000 (32-bit, non-prefetchable) [size=4K]
Region 2: I/O ports at f020 [size=32]
Capabilities: [c8] Power Management version 2
Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
Address: 00000000fee0300c Data: 41d3
Capabilities: [e0] PCI Advanced Features
AFCap: TP+ FLR+
AFCtrl: FLR-
AFStatus: TP-
Kernel driver in use: e1000e

02:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
Subsystem: Super Micro Computer Inc Device 0000
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 16
Region 0: Memory at df900000 (32-bit, non-prefetchable) [size=128K]
Region 2: I/O ports at e000 [size=32]
Region 3: Memory at df920000 (32-bit, non-prefetchable) [size=16K]
Capabilities: [c8] Power Management version 2
Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
Capabilities: [d0] MSI: Enable- Count=1/1 Maskable- 64bit+
Address: 0000000000000000 Data: 0000
Capabilities: [e0] Express (v1) Endpoint, MSI 00
DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop+
MaxPayload 128 bytes, MaxReadReq 128 bytes
DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr+ TransPend+
LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Latency L0 <128ns, L1 <64us
ClockPM- Surprise- LLActRep- BwNot-
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
Capabilities: [a0] MSI-X: Enable+ Count=5 Masked-
Vector table: BAR=3 offset=00000000
PBA: BAR=3 offset=00002000
Capabilities: [100 v1] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
AERCap: First Error Pointer: 00, GenCap- CGenEn- ChkCap- ChkEn-
Capabilities: [140 v1] Device Serial Number 00-25-90-ff-ff-7f-71-56
Kernel driver in use: e1000e

modinfo e1000e

filename: /lib/modules/3.7.10-gentoo/kernel/drivers/net/ethernet/intel/e1000e/e1000e.ko
version: 2.3.2-NAPI
license: GPL
description: Intel(R) PRO/1000 Network Driver
author: Intel Corporation, linux.nics@intel.com
srcversion: 281F7CB2F8EDEC01E32F57F
alias: pci:v00008086d00001559sv*sd*bc*sc*i*
...
depends:
vermagic: 3.7.10-gentoo SMP mod_unload
...

ethtool -e eth0

Offset Values
------ ------
0x0000 00 25 90 7f 71 57 00 08 ff ff d4 00 ff ff ff ff
0x0010 ff ff ff ff c3 10 02 15 d9 15 02 15 00 00 00 00
0x0020 02 07 00 00 00 00 05 a5 28 30 00 1a 00 00 00 0c
0x0030 f4 18 40 2b 43 08 13 01 02 15 ad ba 02 15 03 15
0x0040 ad ba ad ba ad ba 02 15 00 80 90 80 00 4e 86 08
0x0050 00 00 00 00 07 00 00 00 00 00 00 00 00 00 ff ff
0x0060 00 01 00 40 51 13 07 40 ff ff ff ff ff ff ff ff
0x0070 ff ff ff ff ff ff ff ff ff ff 00 01 ff ff 6a ea
0x0080 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0090 00 00 00 00 00 00 ff ff ff ff ff ff ff ff ff ff
0x00a0 02 34 30 00 14 02 31 00 36 38 30 00 0f 00 31 00
....
....

ethtool -d eth0

MAC Registers

0x00000: CTRL (Device control register) 0x40100240
Endian mode (buffers): little
Link reset: normal
Set link up: 1
Invert Loss-Of-Signal: no
Receive flow control: disabled
Transmit flow control: disabled
VLAN mode: enabled
Auto speed detect: disabled
Speed select: 1000Mb/s
Force speed: no
Force duplex: no
0x00008: STATUS (Device status register) 0x40080083
Duplex: full
Link up: link config
TBI mode: disabled
Link speed: 1000Mb/s
Bus type: PCI
Bus speed: 33MHz
Bus width: 32-bit
0x00100: RCTL (Receive control register) 0x04008002
Receiver: enabled
Store bad packets: disabled
Unicast promiscuous: disabled
Multicast promiscuous: disabled
Long packet: disabled
Descriptor minimum threshold size: 1/2
Broadcast accept mode: accept
VLAN filter: disabled
Canonical form indicator: disabled
Discard pause frames: filtered
Pass MAC control frames: don't pass
Receive buffer size: 2048
0x02808: RDLEN (Receive desc length) 0x00001000
0x02810: RDH (Receive desc head) 0x000000EF
0x02818: RDT (Receive desc tail) 0x000000E0
0x02820: RDTR (Receive delay timer) 0x00000000
0x00400: TCTL (Transmit ctrl register) 0x3003F0FA
Transmitter: enabled
Pad short packets: enabled
Software XOFF Transmission: disabled
Re-transmit on late collision: disabled
0x03808: TDLEN (Transmit desc length) 0x00001000
0x03810: TDH (Transmit desc head) 0x0000004B
0x03818: TDT (Transmit desc tail) 0x0000004B
0x03820: TIDV (Transmit delay timer) 0x00000008
PHY type: unknown

ethtool -k eth0

Offload parameters for eth0:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp-segmentation-offload: on
udp-fragmentation-offload: off
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off
rx-vlan-offload: on
tx-vlan-offload: on
ntuple-filters: off
receive-hashing: on
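
A comment later in this thread reports that turning TSO off with `ethtool -K eth0 tso off` works around the hang, and the listing above shows tcp-segmentation-offload still on. As a small sketch (not from the original report), the Python below parses `ethtool -k` text into a dictionary and builds that command only when TSO is enabled. The function names are made up for the example, and the command is returned rather than executed.

```python
def parse_offloads(text):
    """Parse `ethtool -k` output into {feature name: True if on, False if off}."""
    features = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        tokens = value.split()
        # Skip the header line and anything without an explicit on/off state.
        if sep and tokens and tokens[0] in ("on", "off"):
            features[name.strip()] = tokens[0] == "on"
    return features


def tso_off_command(iface, features):
    """Return the ethtool command that disables TSO, or None if it is already off."""
    if features.get("tcp-segmentation-offload"):
        return ["ethtool", "-K", iface, "tso", "off"]
    return None


# Listing copied from the `ethtool -k eth0` output in this report.
SAMPLE = """\
Offload parameters for eth0:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp-segmentation-offload: on
udp-fragmentation-offload: off
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off
rx-vlan-offload: on
tx-vlan-offload: on
ntuple-filters: off
receive-hashing: on
"""

features = parse_offloads(SAMPLE)
command = tso_off_command("eth0", features)
print(command)
```

The returned list could be passed to `subprocess.run` as root; the setting does not survive a reboot or driver reload unless it is reapplied.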

ethtool -a eth0

Pause parameters for eth0:
Autonegotiate: on
RX: off
TX: off

ifconfig

eth0: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
inet 10.0.255.2 netmask 255.255.255.248 broadcast 10.0.255.7
inet6 fe80::225:90ff:fe7f:7157 prefixlen 64 scopeid 0x20<link>
ether 00:25:90:7f:71:57 txqueuelen 1000 (Ethernet)
RX packets 801494901 bytes 1171390459590 (1.0 TiB)
RX errors 0 dropped 12467 overruns 0 frame 0
TX packets 241488293 bytes 20926087334 (19.4 GiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
device interrupt 20 memory 0xdfa00000-dfa20000

eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 10.0.0.9 netmask 255.255.255.0 broadcast 10.0.0.255
inet6 fe80::225:90ff:fe7f:7156 prefixlen 64 scopeid 0x20<link>
ether 00:25:90:7f:71:56 txqueuelen 1000 (Ethernet)
RX packets 244447160 bytes 20963508112 (19.5 GiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 807765932 bytes 1172779804184 (1.0 TiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
device interrupt 16 memory 0xdf900000-df920000

error:

Mar 30 12:55:33 KEN-TEST kernel: [ 2804.368979] e1000e 0000:00:19.0 eth0: Detected Hardware Unit Hang:
Mar 30 12:55:33 KEN-TEST kernel: [ 2804.368979] TDH <51a>
Mar 30 12:55:33 KEN-TEST kernel: [ 2804.368979] TDT <568>
Mar 30 12:55:33 KEN-TEST kernel: [ 2804.368979] next_to_use <568>
Mar 30 12:55:33 KEN-TEST kernel: [ 2804.368979] next_to_clean <518>
Mar 30 12:55:33 KEN-TEST kernel: [ 2804.368979] buffer_info[next_to_clean]:
Mar 30 12:55:33 KEN-TEST kernel: [ 2804.368979] time_stamp <100263693>
Mar 30 12:55:33 KEN-TEST kernel: [ 2804.368979] next_to_watch <51a>
Mar 30 12:55:33 KEN-TEST kernel: [ 2804.368979] jiffies <100263fd4>
Mar 30 12:55:33 KEN-TEST kernel: [ 2804.368979] next_to_watch.status <0>
Mar 30 12:55:33 KEN-TEST kernel: [ 2804.368979] MAC Status <40080083>
Mar 30 12:55:33 KEN-TEST kernel: [ 2804.368979] PHY Status <796d>
Mar 30 12:55:33 KEN-TEST kernel: [ 2804.368979] PHY 1000BASE-T Status <7c00>
Mar 30 12:55:33 KEN-TEST kernel: [ 2804.368979] PHY Extended Status <3000>
Mar 30 12:55:33 KEN-TEST kernel: [ 2804.368979] PCI Status <10>
Mar 30 12:55:35 KEN-TEST kernel: [ 2806.367055] e1000e 0000:00:19.0 eth0: Detected Hardware Unit Hang:
Mar 30 12:55:35 KEN-TEST kernel: [ 2806.367055] TDH <51a>
Mar 30 12:55:35 KEN-TEST kernel: [ 2806.367055] TDT <568>
Mar 30 12:55:35 KEN-TEST kernel: [ 2806.367055] next_to_use <568>
Mar 30 12:55:35 KEN-TEST kernel: [ 2806.367055] next_to_clean <518>
Mar 30 12:55:35 KEN-TEST kernel: [ 2806.367055] buffer_info[next_to_clean]:
Mar 30 12:55:35 KEN-TEST kernel: [ 2806.367055] time_stamp <100263693>
Mar 30 12:55:35 KEN-TEST kernel: [ 2806.367055] next_to_watch <51a>
Mar 30 12:55:35 KEN-TEST kernel: [ 2806.367055] jiffies <1002647a4>
Mar 30 12:55:35 KEN-TEST kernel: [ 2806.367055] next_to_watch.status <0>
Mar 30 12:55:35 KEN-TEST kernel: [ 2806.367055] MAC Status <40080083>
Mar 30 12:55:35 KEN-TEST kernel: [ 2806.367055] PHY Status <796d>
Mar 30 12:55:35 KEN-TEST kernel: [ 2806.367055] PHY 1000BASE-T Status <7c00>
Mar 30 12:55:35 KEN-TEST kernel: [ 2806.367055] PHY Extended Status <3000>
Mar 30 12:55:35 KEN-TEST kernel: [ 2806.367055] PCI Status <10>
Mar 30 12:55:37 KEN-TEST kernel: [ 2808.365095] e1000e 0000:00:19.0 eth0: Detected Hardware Unit Hang:
Mar 30 12:55:37 KEN-TEST kernel: [ 2808.365095] TDH <51a>
Mar 30 12:55:37 KEN-TEST kernel: [ 2808.365095] TDT <568>
Mar 30 12:55:37 KEN-TEST kernel: [ 2808.365095] next_to_use <568>
Mar 30 12:55:37 KEN-TEST kernel: [ 2808.365095] next_to_clean <518>
Mar 30 12:55:37 KEN-TEST kernel: [ 2808.365095] buffer_info[next_to_clean]:
Mar 30 12:55:37 KEN-TEST kernel: [ 2808.365095] time_stamp <100263693>
Mar 30 12:55:37 KEN-TEST kernel: [ 2808.365095] next_to_watch <51a>
Mar 30 12:55:37 KEN-TEST kernel: [ 2808.365095] jiffies <100264f74>
Mar 30 12:55:37 KEN-TEST kernel: [ 2808.365095] next_to_watch.status <0>
Mar 30 12:55:37 KEN-TEST kernel: [ 2808.365095] MAC Status <40080083>
Mar 30 12:55:37 KEN-TEST kernel: [ 2808.365095] PHY Status <796d>
Mar 30 12:55:37 KEN-TEST kernel: [ 2808.365095] PHY 1000BASE-T Status <7c00>
Mar 30 12:55:37 KEN-TEST kernel: [ 2808.365095] PHY Extended Status <3000>
Mar 30 12:55:37 KEN-TEST kernel: [ 2808.365095] PCI Status <10>
Mar 30 12:55:37 KEN-TEST kernel: [ 2808.376404] ------------[ cut here ]------------
Mar 30 12:55:37 KEN-TEST kernel: [ 2808.376412] WARNING: at net/sched/sch_generic.c:255 dev_watchdog+0xf4/0x154()
Mar 30 12:55:37 KEN-TEST kernel: [ 2808.376414] Hardware name: X9SCL/X9SCM
Mar 30 12:55:37 KEN-TEST kernel: [ 2808.376416] NETDEV WATCHDOG: eth0 (e1000e): transmit queue 0 timed out
Mar 30 12:55:37 KEN-TEST kernel: [ 0.874535] ACPI: Invalid Power Resource to register!
Mar 30 12:55:37 KEN-TEST kernel: [ 2808.376418] Modules linked in: xt_CLASSIFY sch_htb xt_CT iptable_raw xt_REDIRECT xt_nat iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 ipt_REJECT iptable_filter xt_mark xt_set xt_length iptable_mangle ip_tables ip_set_bitmap_port ip_set_hash_net ip_set_bitmap_ip ip_set_hash_ip nf_nat_pptp nf_nat_proto_gre nf_nat_proto_sctp libcrc32c nf_nat_ftp nf_nat_irc nf_nat_tftp nf_nat_h323 nf_nat_proto_dccp nf_conntrack_pptp nf_conntrack_proto_gre nf_conntrack_proto_sctp nf_conntrack_h323 nf_conntrack_netlink nf_conntrack_tftp nf_conntrack_ftp nf_conntrack_irc nf_conntrack_proto_dccp ip_set nf_nat nf_conntrack ipv6 e1000e unix
Mar 30 12:55:37 KEN-TEST kernel: [ 2808.376453] Pid: 0, comm: swapper/0 Not tainted 3.7.10-gentoo #4
Mar 30 12:55:37 KEN-TEST kernel: [ 2808.376455] Call Trace:
Mar 30 12:55:37 KEN-TEST kernel: [ 2808.376457] <IRQ> [<ffffffff81458c00>] ? dev_watchdog+0xa2/0x154
Mar 30 12:55:37 KEN-TEST kernel: [ 2808.376467] [<ffffffff8102f497>] warn_slowpath_common+0x7e/0x96
Mar 30 12:55:37 KEN-TEST kernel: [ 2808.376472] [<ffffffff81458b5e>] ? netif_tx_unlock+0x52/0x52
Mar 30 12:55:37 KEN-TEST kernel: [ 2808.376475] [<ffffffff81458b5e>] ? netif_tx_unlock+0x52/0x52
Mar 30 12:55:37 KEN-TEST kernel: [ 2808.376479] [<ffffffff8102f543>] warn_slowpath_fmt+0x41/0x43
Mar 30 12:55:37 KEN-TEST kernel: [ 2808.376483] [<ffffffff81458ad7>] ? netif_tx_lock+0x45/0x7a
Mar 30 12:55:37 KEN-TEST kernel: [ 2808.376488] [<ffffffff81458c52>] dev_watchdog+0xf4/0x154
Mar 30 12:55:37 KEN-TEST kernel: [ 2808.376492] [<ffffffff8105c623>] ? trigger_load_balance+0x58/0x1e2
Mar 30 12:55:37 KEN-TEST kernel: [ 2808.376496] [<ffffffff8103ac10>] call_timer_fn+0x56/0xe3
Mar 30 12:55:37 KEN-TEST kernel: [ 2808.376500] [<ffffffff81458b5e>] ? netif_tx_unlock+0x52/0x52
Mar 30 12:55:37 KEN-TEST kernel: [ 2808.376512] [<ffffffff8103c142>] run_timer_softirq+0x199/0x1e1
Mar 30 12:55:37 KEN-TEST kernel: [ 2808.376514] [<ffffffff8101cfd1>] ? apic_write+0x11/0x13
Mar 30 12:55:37 KEN-TEST kernel: [ 2808.376517] [<ffffffff81036038>] __do_softirq+0xd4/0x1ac
Mar 30 12:55:37 KEN-TEST kernel: [ 2808.376519] [<ffffffff8106c417>] ? tick_program_event+0x1f/0x21
Mar 30 12:55:37 KEN-TEST kernel: [ 2808.376523] [<ffffffff814f7e8c>] call_softirq+0x1c/0x30
Mar 30 12:55:37 KEN-TEST kernel: [ 2808.376526] [<ffffffff81003c81>] do_softirq+0x33/0x6a
Mar 30 12:55:37 KEN-TEST kernel: [ 2808.376527] [<ffffffff810361e3>] irq_exit+0x3f/0x9a
Mar 30 12:55:37 KEN-TEST kernel: [ 2808.376529] [<ffffffff8101d508>] smp_apic_timer_interrupt+0x77/0x85
Mar 30 12:55:37 KEN-TEST kernel: [ 2808.376531] [<ffffffff814f788a>] apic_timer_interrupt+0x6a/0x70
Mar 30 12:55:37 KEN-TEST kernel: [ 2808.376532] <EOI> [<ffffffff8104dbc1>] ? hrtimer_start+0x13/0x15
Mar 30 12:55:37 KEN-TEST kernel: [ 2808.376536] [<ffffffff81008cec>] ? mwait_idle+0x82/0xa7
Mar 30 12:55:37 KEN-TEST kernel: [ 2808.376538] [<ffffffff81008cdf>] ? mwait_idle+0x75/0xa7
Mar 30 12:55:37 KEN-TEST kernel: [ 2808.376540] [<ffffffff81009354>] cpu_idle+0x5d/0x9a
Mar 30 12:55:37 KEN-TEST kernel: [ 2808.376543] [<ffffffff814d40d9>] rest_init+0x6d/0x6f
Mar 30 12:55:37 KEN-TEST kernel: [ 2808.376546] [<ffffffff81aa8ad3>] start_kernel+0x345/0x352
Mar 30 12:55:37 KEN-TEST kernel: [ 2808.376548] [<ffffffff81aa8597>] ? repair_env_string+0x56/0x56
Mar 30 12:55:37 KEN-TEST kernel: [ 2808.376550] [<ffffffff81aa82aa>] x86_64_start_reservations+0xae/0xb2
Mar 30 12:55:37 KEN-TEST kernel: [ 2808.376552] [<ffffffff81aa839e>] x86_64_start_kernel+0xf0/0xf7
Mar 30 12:55:37 KEN-TEST kernel: [ 2808.376553] ---[ end trace 4e71432354cb5f18 ]---
Mar 30 12:55:37 KEN-TEST kernel: [ 2808.376559] e1000e 0000:00:19.0 eth0: Reset adapter
Mar 30 12:55:40 KEN-TEST kernel: [ 2811.841633] e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
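
The three dumps above repeat the same TDH, TDT and time_stamp values; only jiffies advances. As a rough illustration (not part of the original report), the Python sketch below pulls those fields out of one dump and computes how many descriptors are outstanding and how long the oldest one has been waiting. The helper names and the ring_size value of 4096 are assumptions made for the example; the ring size only matters if the indices wrap.

```python
import re

# Fields of interest in an e1000e "Detected Hardware Unit Hang" dump.
FIELD_RE = re.compile(
    r"\b(TDH|TDT|next_to_use|next_to_clean|time_stamp|jiffies)\s+<([0-9a-fA-F]+)>"
)


def parse_hang_dump(lines):
    """Collect the hexadecimal fields of one hang dump into a dict of ints."""
    fields = {}
    for line in lines:
        match = FIELD_RE.search(line)
        if match:
            fields[match.group(1)] = int(match.group(2), 16)
    return fields


def summarize(fields, ring_size):
    """Derive ring occupancy and stall time (in jiffies) from the parsed fields."""
    return {
        # Descriptors between the hardware head and the tail written by the driver.
        "between_head_and_tail": (fields["TDT"] - fields["TDH"]) % ring_size,
        # Descriptors the driver has queued but not yet cleaned up.
        "not_cleaned": (fields["next_to_use"] - fields["next_to_clean"]) % ring_size,
        # Age of the oldest uncleaned buffer when the dump was printed.
        "stalled_jiffies": fields["jiffies"] - fields["time_stamp"],
    }


# First dump from the log above.
dump = [
    "Mar 30 12:55:33 KEN-TEST kernel: [ 2804.368979] TDH <51a>",
    "Mar 30 12:55:33 KEN-TEST kernel: [ 2804.368979] TDT <568>",
    "Mar 30 12:55:33 KEN-TEST kernel: [ 2804.368979] next_to_use <568>",
    "Mar 30 12:55:33 KEN-TEST kernel: [ 2804.368979] next_to_clean <518>",
    "Mar 30 12:55:33 KEN-TEST kernel: [ 2804.368979] buffer_info[next_to_clean]:",
    "Mar 30 12:55:33 KEN-TEST kernel: [ 2804.368979] time_stamp <100263693>",
    "Mar 30 12:55:33 KEN-TEST kernel: [ 2804.368979] next_to_watch <51a>",
    "Mar 30 12:55:33 KEN-TEST kernel: [ 2804.368979] jiffies <100263fd4>",
]

summary = summarize(parse_hang_dump(dump), ring_size=4096)  # ring_size is assumed
print(summary)
```

For the first dump this gives 78 descriptors between head and tail, 80 not yet cleaned, and a wait of 2369 jiffies. Consecutive dumps are 2000 jiffies and about 2 seconds of kernel timestamp apart, which suggests roughly 1000 jiffies per second on this kernel, so the oldest buffer had been waiting about 2.4 seconds at the first message.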

The same error, captured with the debug options of the e1000e driver enabled, is in the attached file (syslog messages).

1 Attachments

Discussion

    • group: in-kernel_driver --> standalone_driver
     
  • uname -a

    Linux KEN-TEST 3.7.10-gentoo #4 SMP Sat Mar 30 01:52:46 MSK 2013 x86_64 Intel(R)
    Core(TM) i3-3220 CPU @ 3.30GHz GenuineIntel GNU/Linux

    cat /proc/interrupts

           CPU0       CPU1
    

    0: 128 0 IO-APIC-edge timer
    1: 1083 4 IO-APIC-edge i8042
    8: 1 0 IO-APIC-edge rtc0
    9: 0 0 IO-APIC-fasteoi acpi
    12: 4 0 IO-APIC-edge i8042
    16: 11616606 907977 IO-APIC-fasteoi ehci_hcd:usb1
    20: 3153191 10213442 IO-APIC-fasteoi
    23: 85 45 IO-APIC-fasteoi ehci_hcd:usb2
    40: 105210 16064 PCI-MSI-edge ahci
    41: 367681129 41388397 PCI-MSI-edge eth0
    42: 124198914 38956605 PCI-MSI-edge eth1-rx-0
    43: 247966225 78650922 PCI-MSI-edge eth1-tx-0
    44: 5 1 PCI-MSI-edge eth1
    NMI: 14781 11512 Non-maskable interrupts
    LOC: 5690519 3582031 Local timer interrupts
    SPU: 0 0 Spurious interrupts
    PMI: 14781 11512 Performance monitoring interrupts
    IWI: 2 1 IRQ work interrupts
    RTR: 1 0 APIC ICR read retries
    RES: 182854 51362 Rescheduling interrupts
    CAL: 265 511 Function call interrupts
    TLB: 802125 807735 TLB shootdowns
    TRM: 0 0 Thermal event interrupts
    THR: 0 0 Threshold APIC interrupts
    MCE: 0 0 Machine check exceptions
    MCP: 276 276 Machine check polls
    ERR: 0
    MIS: 0

     
  • Todd Fujinaka
    2013-07-09

    Have you tried a newer kernel? I believe there are fixes for the 82579 upstream.

     
  • Todd Fujinaka
    2013-12-03

    • assigned_to: dertman
     
  • Oliver Wagner
    2014-01-15

    Issue happens for me with

    Linux gateway1 3.8.0-35-generic #50~precise1-Ubuntu SMP Wed Dec 4 17:25:51 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

    I've added a full report in #378, which IMHO describes the same issue.

     
  • Oliver Wagner
    2014-02-25

    Problem still happens with

    [ 0.781704] e1000e: Intel(R) PRO/1000 Network Driver - 3.0.4-NAPI

    on

    Linux gateway1 3.8.0-36-generic #52~precise1-Ubuntu SMP Mon Feb 3 21:54:46 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

    Disabling tso (ethtool -K eth0 tso off) works as a workaround.

    I suggest merging this issue and #378 as they IMHO describe the same bug.

     
    Last edit: Oliver Wagner 2014-02-25
  • Oliver Wagner
    2014-05-26

    Problem still happens with

    e1000e: Intel(R) PRO/1000 Network Driver - 3.0.4.1-NAPI

    on

    Linux gateway1 3.13.0-24-generic #47-Ubuntu SMP Fri May 2 23:30:00 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

     
  • Benjamin Bird
    2014-07-10

    Same issue here on 2.3.2-k. There are two different Intel chipsets in the machine. The hang happens on the I217-V (rev 04) chipset, not on the 82574L. The hanging NIC faces the Internet with NAT.

    Linux hostname 3.13.0-30-generic #55-Ubuntu SMP Fri Jul 4 21:40:53 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

    Jul 10 10:18:30 hostname kernel: [73909.819126] e1000e 0000:00:19.0 eth0: Detected Hardware Unit Hang:
    Jul 10 10:18:30 hostname kernel: [73909.819126] TDH <c9>
    Jul 10 10:18:30 hostname kernel: [73909.819126] TDT <d8>
    Jul 10 10:18:30 hostname kernel: [73909.819126] next_to_use <d8>
    Jul 10 10:18:30 hostname kernel: [73909.819126] next_to_clean <c8>
    Jul 10 10:18:30 hostname kernel: [73909.819126] buffer_info[next_to_clean]:
    Jul 10 10:18:30 hostname kernel: [73909.819126] time_stamp <10118883e>
    Jul 10 10:18:30 hostname kernel: [73909.819126] next_to_watch <ca>
    Jul 10 10:18:30 hostname kernel: [73909.819126] jiffies <101188b29>
    Jul 10 10:18:30 hostname kernel: [73909.819126] next_to_watch.status <0>
    Jul 10 10:18:30 hostname kernel: [73909.819126] MAC Status <40080083>
    Jul 10 10:18:30 hostname kernel: [73909.819126] PHY Status <796d>
    Jul 10 10:18:30 hostname kernel: [73909.819126] PHY 1000BASE-T Status <3800>
    Jul 10 10:18:30 hostname kernel: [73909.819126] PHY Extended Status <3000>
    Jul 10 10:18:30 hostname kernel: [73909.819126] PCI Status <10>

    00:19.0 Ethernet controller: Intel Corporation Ethernet Connection I217-V (rev 04)
    <-- eth0. one that hangs facing internet w/ nat.

    03:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
    <-- lan side

    details attached.

     