#369 e1000e driver hardware hang

Status: wont-fix
Owner: None
Labels: standalone_driver
Priority: 1
Updated: 2015-02-20
Created: 2012-12-28
Private: No

I have installed Debian squeeze on 2 cluster nodes.

    Product Name: IBM System x3655 -[798541G]-

Each cluster node has three Intel dual-port PCIe NIC cards. The NICs are combined into 2 bonding interfaces:

cat /proc/net/bonding/bond0:

Ethernet Channel Bonding Driver: v3.5.0 (November 4, 2008)

Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer2 (0)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 200

802.3ad info
LACP rate: slow
Aggregator selection policy (ad_select): stable
Active Aggregator Info:
Aggregator ID: 2
Number of ports: 3
Actor Key: 17
Partner Key: 17
Partner Mac Address: 00:17:08:7d:85:e5

Slave Interface: eth1
MII Status: up
Link Failure Count: 4
Permanent HW addr: 00:1b:78:5d:16:39
Aggregator ID: 2

Slave Interface: eth3
MII Status: up
Link Failure Count: 4
Permanent HW addr: 00:1f:29:5b:17:1b
Aggregator ID: 2

Slave Interface: eth5
MII Status: up
Link Failure Count: 4
Permanent HW addr: 00:26:55:df:3d:85
Aggregator ID: 2

cat /proc/net/bonding/bond1:

Ethernet Channel Bonding Driver: v3.5.0 (November 4, 2008)

Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer2 (0)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 200

802.3ad info
LACP rate: slow
Aggregator selection policy (ad_select): stable
Active Aggregator Info:
Aggregator ID: 2
Number of ports: 3
Actor Key: 17
Partner Key: 2
Partner Mac Address: b8:af:67:48:0e:06

Slave Interface: eth0
MII Status: up
Link Failure Count: 2
Permanent HW addr: 00:1b:78:5d:16:38
Aggregator ID: 2

Slave Interface: eth2
MII Status: up
Link Failure Count: 2
Permanent HW addr: 00:1f:29:5b:17:1a
Aggregator ID: 2

Slave Interface: eth4
MII Status: up
Link Failure Count: 1
Permanent HW addr: 00:26:55:df:3d:84
Aggregator ID: 2
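
To compare the slaves at a glance, the per-interface "Link Failure Count" values can be pulled out of the bonding status text. The following is only a minimal sketch in Python: the function name `link_failures` is made up for illustration, and the sample text is the slave section of bond1 copied from above (on a live system the text would come from /proc/net/bonding/bond1).

```python
# Sample: the slave section of bond1 as posted above.
SAMPLE = """\
Slave Interface: eth0
MII Status: up
Link Failure Count: 2
Permanent HW addr: 00:1b:78:5d:16:38
Aggregator ID: 2

Slave Interface: eth2
MII Status: up
Link Failure Count: 2
Permanent HW addr: 00:1f:29:5b:17:1a
Aggregator ID: 2

Slave Interface: eth4
MII Status: up
Link Failure Count: 1
Permanent HW addr: 00:26:55:df:3d:84
Aggregator ID: 2
"""


def link_failures(text):
    """Map each slave interface name to its Link Failure Count.

    Hypothetical helper; it only understands the "key: value" lines
    shown in the bonding status output above.
    """
    counts = {}
    current = None
    for line in text.splitlines():
        # Split on the first colon only, so MAC addresses stay intact.
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "Slave Interface":
            current = value
        elif key == "Link Failure Count" and current is not None:
            counts[current] = int(value)
    return counts


if __name__ == "__main__":
    for name, count in sorted(link_failures(SAMPLE).items()):
        print(name, count)
```

A slave whose count keeps rising between two readings is a candidate for the interface that is being reset.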

lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description: Debian GNU/Linux 6.0.6 (squeeze)
Release: 6.0.6
Codename: squeeze

Before updating the driver to version 2.1.4, I had a great many hardware hang kernel messages, which led to a complete failure of the external bonding interface bond1. Updating the driver to version 2.1.4 has helped a lot so far. But on the 26th of December I had another hardware hang on one of the interfaces, so I guess the driver is very good but not yet perfect. I asked someone about this problem: it may be a timing issue between the PCIe bus and the NICs, but at the very least it looks like a problem in the driver. So here is my bug report.

uname -a
Linux node1 2.6.32-5-amd64 #1 SMP Sun Sep 23 10:07:46 UTC 2012 x86_64 GNU/Linux

lspci
17:00.0 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (rev 06)
17:00.1 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (rev 06)
22:00.0 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (rev 06)
22:00.1 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (rev 06)
2d:00.0 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (rev 06)
2d:00.1 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (rev 06)

modinfo e1000e
filename: /lib/modules/2.6.32-5-amd64/kernel/drivers/net/ethernet/intel/e1000e/e1000e.ko
version: 2.1.4-NAPI
license: GPL
description: Intel(R) PRO/1000 Network Driver
author: Intel Corporation, linux.nics@intel.com

kern.log:

Dec 26 12:10:31 node1 kernel: [407057.793311] e1000e 0000:2d:00.0: eth2: Detected Hardware Unit Hang:
Dec 26 12:10:31 node1 kernel: [407057.793314] TDH <c0>
Dec 26 12:10:31 node1 kernel: [407057.793315] TDT <45>
Dec 26 12:10:31 node1 kernel: [407057.793316] next_to_use <45>
Dec 26 12:10:31 node1 kernel: [407057.793317] next_to_clean <bf>
Dec 26 12:10:31 node1 kernel: [407057.793318] buffer_info[next_to_clean]:
Dec 26 12:10:31 node1 kernel: [407057.793320] time_stamp <1060fa40d>
Dec 26 12:10:31 node1 kernel: [407057.793321] next_to_watch <c2>
Dec 26 12:10:31 node1 kernel: [407057.793322] jiffies <1060fa868>
Dec 26 12:10:31 node1 kernel: [407057.793323] next_to_watch.status <0>
Dec 26 12:10:31 node1 kernel: [407057.793324] MAC Status <80383>
Dec 26 12:10:31 node1 kernel: [407057.793326] PHY Status <792d>
Dec 26 12:10:31 node1 kernel: [407057.793327] PHY 1000BASE-T Status <3800>
Dec 26 12:10:31 node1 kernel: [407057.793328] PHY Extended Status <3000>
Dec 26 12:10:31 node1 kernel: [407057.793329] PCI Status <10>
Dec 26 12:10:37 node1 kernel: [407063.792288] e1000e 0000:2d:00.0: eth2: Detected Hardware Unit Hang:
Dec 26 12:10:37 node1 kernel: [407063.792290] TDH <c0>
Dec 26 12:10:37 node1 kernel: [407063.792291] TDT <45>
Dec 26 12:10:37 node1 kernel: [407063.792292] next_to_use <45>
Dec 26 12:10:37 node1 kernel: [407063.792293] next_to_clean <bf>
Dec 26 12:10:37 node1 kernel: [407063.792295] buffer_info[next_to_clean]:
Dec 26 12:10:37 node1 kernel: [407063.792296] time_stamp <1060fa40d>
Dec 26 12:10:37 node1 kernel: [407063.792297] next_to_watch <c2>
Dec 26 12:10:37 node1 kernel: [407063.792298] jiffies <1060fae44>
Dec 26 12:10:37 node1 kernel: [407063.792299] next_to_watch.status <0>
Dec 26 12:10:37 node1 kernel: [407063.792300] MAC Status <80383>
Dec 26 12:10:37 node1 kernel: [407063.792302] PHY Status <792d>
Dec 26 12:10:37 node1 kernel: [407063.792303] PHY 1000BASE-T Status <3800>
Dec 26 12:10:37 node1 kernel: [407063.792304] PHY Extended Status <3000>
Dec 26 12:10:37 node1 kernel: [407063.792305] PCI Status <10>
Dec 26 12:10:40 node1 kernel: [407066.792017] ------------[ cut here ]------------
Dec 26 12:10:40 node1 kernel: [407066.792027] WARNING: at /build/buildd-linux-2.6_2.6.32-46-amd64-_ApuPc/linux-2.6-2.6.32/debian/build/source_amd64_none/net/sched/sch_generic.c:261 dev_watchdog+0xe2/0x194()
Dec 26 12:10:40 node1 kernel: [407066.792032] Hardware name: IBM System x3655 -[798541G]-
Dec 26 12:10:40 node1 kernel: [407066.792035] NETDEV WATCHDOG: eth2 (e1000e): transmit queue 0 timed out
Dec 26 12:10:40 node1 kernel: [407066.792037] Modules linked in: drbd lru_cache cn xt_TCPMSS xt_connbytes act_mirred ip6table_filter ip6_tables ifb act_police cls_flow cls_fw cls_u32 sch_htb sch_hfsc sch_ingress sch_sfq xt_time xt_connlimit xt_realm iptable_raw xt_comment xt_recent xt_policy ipt_ULOG ipt_REJECT ipt_REDIRECT ipt_NETMAP ipt_MASQUERADE ipt_ECN ipt_ecn ipt_CLUSTERIP ipt_ah ipt_addrtype nf_nat_tftp nf_nat_snmp_basic nf_nat_sip nf_nat_pptp nf_nat_proto_gre nf_nat_irc nf_nat_h323 nf_nat_ftp nf_nat_amanda ts_kmp nf_conntrack_amanda nf_conntrack_sane nf_conntrack_tftp nf_conntrack_sip nf_conntrack_proto_sctp nf_conntrack_pptp nf_conntrack_proto_gre nf_conntrack_netlink nf_conntrack_netbios_ns nf_conntrack_irc nf_conntrack_h323 nf_conntrack_ftp xt_TPROXY nf_tproxy_core xt_tcpmss xt_pkttype xt_physdev xt_owner xt_NFQUEUE xt_NFLOG nfnetlink_log xt_multiport xt_MARK xt_mark xt_mac xt_limit xt_length xt_iprange xt_helper xt_hashlimit xt_DSCP xt_dscp xt_dccp xt_conntrack xt_CONNMARK xt_connmark xt_CLASSIFY ip
nfnetlink iptable_filter ip_tables x_tables bonding pppoe pppox ppp_generic slhc 8021q garp stp loop radeon ttm snd_pcm drm_kms_helper amd64_edac_mod snd_timer edac_core drm snd edac_mce_amd i2c_algo_bit ibmpex k8temp soundcore ibmaem ipmi_msghandler i2c_piix4 shpchp pci_hotplug i2c_core snd_page_alloc evdev joydev pcspkr psmouse serio_raw button processor ext4 mbcache jbd2 crc16 sg ses sr_mod enclosure cdrom sd_mod crc_t10dif ata_generic usbhid hid ohci_hcd pata_serverworks sata_svw aacraid thermal thermal_sys ehci_hcd libata e1000e scsi_mod usbcore nls_base [last unloaded: scsi_wait_scan]
Dec 26 12:10:40 node1 kernel: [407066.792147] Pid: 0, comm: swapper Not tainted 2.6.32-5-amd64 #1
Dec 26 12:10:40 node1 kernel: [407066.792150] Call Trace:
Dec 26 12:10:40 node1 kernel: [407066.792152] <IRQ> [<ffffffff812632da>] ? dev_watchdog+0xe2/0x194
Dec 26 12:10:40 node1 kernel: [407066.792158] [<ffffffff812632da>] ? dev_watchdog+0xe2/0x194
Dec 26 12:10:40 node1 kernel: [407066.792163] [<ffffffff8104df38>] ? warn_slowpath_common+0x77/0xa3
Dec 26 12:10:40 node1 kernel: [407066.792167] [<ffffffff812631f8>] ? dev_watchdog+0x0/0x194
Dec 26 12:10:40 node1 kernel: [407066.792170] [<ffffffff8104dfc0>] ? warn_slowpath_fmt+0x51/0x59
Dec 26 12:10:40 node1 kernel: [407066.792175] [<ffffffff810168c1>] ? sched_clock+0x5/0x8
Dec 26 12:10:40 node1 kernel: [407066.792180] [<ffffffff8105a918>] ? lock_timer_base+0x26/0x4b
Dec 26 12:10:40 node1 kernel: [407066.792184] [<ffffffff8105aeba>] ? mod_timer+0x141/0x153
Dec 26 12:10:40 node1 kernel: [407066.792187] [<ffffffff812631cc>] ? netif_tx_lock+0x3d/0x69
Dec 26 12:10:40 node1 kernel: [407066.792192] [<ffffffff8124dff8>] ? netdev_drivername+0x3b/0x40
Dec 26 12:10:40 node1 kernel: [407066.792195] [<ffffffff812632da>] ? dev_watchdog+0xe2/0x194
Dec 26 12:10:40 node1 kernel: [407066.792199] [<ffffffff8103f9ca>] ? wake_up+0x30/0x44
Dec 26 12:10:40 node1 kernel: [407066.792202] [<ffffffff8105a6c7>] ? run_timer_softirq+0x1c9/0x268
Dec 26 12:10:40 node1 kernel: [407066.792207] [<ffffffff81069397>] ? sched_clock_local+0x13/0x74
Dec 26 12:10:40 node1 kernel: [407066.792211] [<ffffffff81053d73>] ? __do_softirq+0xdd/0x1a6
Dec 26 12:10:40 node1 kernel: [407066.792215] [<ffffffff8102462a>] ? lapic_next_event+0x18/0x1d
Dec 26 12:10:40 node1 kernel: [407066.792219] [<ffffffff81011cac>] ? call_softirq+0x1c/0x30
Dec 26 12:10:40 node1 kernel: [407066.792222] [<ffffffff8101322b>] ? do_softirq+0x3f/0x7c
Dec 26 12:10:40 node1 kernel: [407066.792226] [<ffffffff81053be3>] ? irq_exit+0x36/0x76
Dec 26 12:10:40 node1 kernel: [407066.792229] [<ffffffff810250f8>] ? smp_apic_timer_interrupt+0x87/0x95
Dec 26 12:10:40 node1 kernel: [407066.792233] [<ffffffff81011673>] ? apic_timer_interrupt+0x13/0x20
Dec 26 12:10:40 node1 kernel: [407066.792235] <EOI> [<ffffffff8102c584>] ? native_safe_halt+0x2/0x3
Dec 26 12:10:40 node1 kernel: [407066.792241] [<ffffffff8101758d>] ? default_idle+0x34/0x51
Dec 26 12:10:40 node1 kernel: [407066.792245] [<ffffffff81017919>] ? c1e_idle+0xf5/0xfb
Dec 26 12:10:40 node1 kernel: [407066.792249] [<ffffffff8100fe97>] ? cpu_idle+0xa2/0xda
Dec 26 12:10:40 node1 kernel: [407066.792252] ---[ end trace 31425aa57b3e6a60 ]---
Dec 26 12:10:40 node1 kernel: [407066.792267] e1000e 0000:2d:00.0: eth2: Reset adapter
Dec 26 12:10:40 node1 kernel: [407066.833020] bonding: bond1: link status down for interface eth2, disabling it in 200 ms.
Dec 26 12:10:40 node1 kernel: [407067.032027] bonding: bond1: link status definitely down for interface eth2, disabling it
Dec 26 12:10:43 node1 kernel: [407069.913892] e1000e: e1000e: eth2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Dec 26 12:10:43 node1 kernel: [407069.941027] bonding: bond1: link status definitely up for interface eth2.
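
One rough way to read the "Detected Hardware Unit Hang" dumps above: TDH and TDT are the head and tail of the transmit descriptor ring, and time_stamp and jiffies are both tick counters, so their difference shows how long the oldest buffer has been waiting. The Python sketch below parses a single report with the syslog prefix already stripped. The helper names are illustrative, and the ring size of 256 descriptors is an assumption (the usual e1000e default; `ethtool -g` shows the real value).

```python
import re

# First report from kern.log above, syslog prefix removed.
REPORT = """\
TDH <c0>
TDT <45>
next_to_use <45>
next_to_clean <bf>
buffer_info[next_to_clean]:
time_stamp <1060fa40d>
next_to_watch <c2>
jiffies <1060fa868>
next_to_watch.status <0>
"""

# "name <hexvalue>" with nothing else on the line.
FIELD = re.compile(r"\s*(.+?)\s*<([0-9a-fA-F]+)>\s*")


def parse_hang(text):
    """Return {field name: integer value} for every 'name <hex>' line."""
    fields = {}
    for line in text.splitlines():
        match = FIELD.fullmatch(line)
        if match:
            fields[match.group(1)] = int(match.group(2), 16)
    return fields


def stalled_jiffies(fields):
    """Ticks between queuing the oldest buffer and the report."""
    return fields["jiffies"] - fields["time_stamp"]


def outstanding(fields, ring_size=256):
    """Descriptors handed to the NIC but not yet consumed.

    ring_size=256 is an assumption, not taken from the report.
    """
    return (fields["TDT"] - fields["TDH"]) % ring_size


if __name__ == "__main__":
    report = parse_hang(REPORT)
    print("stalled for", stalled_jiffies(report), "jiffies")
    print("outstanding descriptors:", outstanding(report))
```

For the first report this gives 1115 jiffies and, under the 256-entry assumption, 133 outstanding descriptors. TDH does not move between the two reports, which is what the driver is complaining about.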

Discussion

  • Tushar Dave

    Tushar Dave - 2013-01-07

    Does the issue occur without bonding?
    Please send me the full dmesg log after issue occurs.

     
  • Tushar Dave

    Tushar Dave - 2013-01-07
    • assigned_to: Tushar Dave
     
  • Martin

    Martin - 2013-03-30

    Hi,

    it seems that I have this problem too.
    If I download a big file over the bonded device the "hardware hangs" problem occurs.
    If I do the download directly over my unbonded wan interface (e1000e) everything works fine.

    For testing purposes I used the latest stable release: e1000e: Intel(R) PRO/1000 Network Driver - 2.3.2-NAPI

    Here is my dmesg output:

    [ 1439.846938] e1000e 0000:00:19.0 en0: Detected Hardware Unit Hang:
    TDH <49>
    TDT <56>
    next_to_use <56>
    next_to_clean <45>
    buffer_info[next_to_clean]:
    time_stamp <1001163ad>
    next_to_watch <49>
    jiffies <1001167e5>
    next_to_watch.status <0>
    MAC Status <40080083>
    PHY Status <796d>
    PHY 1000BASE-T Status <3800>
    PHY Extended Status <3000>
    PCI Status <10>
    [ 1441.845523] e1000e 0000:00:19.0 en0: Detected Hardware Unit Hang:
    TDH <49>
    TDT <56>
    next_to_use <56>
    next_to_clean <45>
    buffer_info[next_to_clean]:
    time_stamp <1001163ad>
    next_to_watch <49>
    jiffies <100116fb5>
    next_to_watch.status <0>
    MAC Status <40080083>
    PHY Status <796d>
    PHY 1000BASE-T Status <3800>
    PHY Extended Status <3000>
    PCI Status <10>
    [ 1443.844366] e1000e 0000:00:19.0 en0: Detected Hardware Unit Hang:
    TDH <49>
    TDT <56>
    next_to_use <56>
    next_to_clean <45>
    buffer_info[next_to_clean]:
    time_stamp <1001163ad>
    next_to_watch <49>
    jiffies <100117785>
    next_to_watch.status <0>
    MAC Status <40080083>
    PHY Status <796d>
    PHY 1000BASE-T Status <3800>
    PHY Extended Status <3000>
    PCI Status <10>
    [ 1444.846526] ------------[ cut here ]------------
    [ 1444.846537] WARNING: at net/sched/sch_generic.c:254 dev_watchdog+0xe5/0x156()
    [ 1444.846539] Hardware name: System Product Name
    [ 1444.846542] NETDEV WATCHDOG: en0 (e1000e): transmit queue 0 timed out
    [ 1444.846543] Modules linked in: vboxnetflt(O) vboxdrv(O) nfsd exportfs w83627ehf hwmon_vid bridge stp llc tun xt_LOG ipt_MASQUERADE xt_nat iptable_nat nf_nat_ipv4 nf_nat cpufreq_stats cpufreq_ondemand nvidia(PO) hid_logitech_dj e1000e(O) osst st firewire_ohci coretemp firewire_core crc32c_intel crc_itu_t acpi_cpufreq ghash_clmulni_intel mperf freq_table lpc_ich mei thermal video mfd_core fan button processor aesni_intel ablk_helper cryptd lrw xts gf128mul aes_x86_64 sha256_generic fuse linear hid_generic xhci_hcd mpt2sas raid_class arcmsr sata_mv
    [ 1444.846591] Pid: 0, comm: swapper/0 Tainted: P O 3.8.4-gentoo #1
    [ 1444.846593] Call Trace:
    [ 1444.846595] <IRQ> [<ffffffff8103026e>] warn_slowpath_common+0x7e/0x97
    [ 1444.846604] [<ffffffff814bbb63>] ? netif_tx_lock+0x86/0x86
    [ 1444.846607] [<ffffffff8103031b>] warn_slowpath_fmt+0x41/0x43
    [ 1444.846611] [<ffffffff814bbc48>] dev_watchdog+0xe5/0x156
    [ 1444.846616] [<ffffffff8103b97b>] call_timer_fn+0x3b/0x102
    [ 1444.846618] [<ffffffff814bbb63>] ? netif_tx_lock+0x86/0x86
    [ 1444.846622] [<ffffffff8103cf82>] run_timer_softirq+0x192/0x1d6
    [ 1444.846627] [<ffffffff81036c01>] __do_softirq+0xb6/0x1db
    [ 1444.846632] [<ffffffff8106d0d6>] ? clockevents_program_event+0x9c/0xba
    [ 1444.846636] [<ffffffff8160f3cc>] call_softirq+0x1c/0x30
    [ 1444.846640] [<ffffffff81003ba7>] do_softirq+0x32/0x6b
    [ 1444.846644] [<ffffffff81036e12>] irq_exit+0x4b/0xa3
    [ 1444.846649] [<ffffffff8101d619>] smp_apic_timer_interrupt+0x77/0x85
    [ 1444.846652] [<ffffffff8160edca>] apic_timer_interrupt+0x6a/0x70
    [ 1444.846654] <EOI> [<ffffffff8106d0d6>] ? clockevents_program_event+0x9c/0xba
    [ 1444.846661] [<ffffffff8141897b>] ? cpuidle_wrap_enter+0x3b/0x6e
    [ 1444.846664] [<ffffffff81418977>] ? cpuidle_wrap_enter+0x37/0x6e
    [ 1444.846668] [<ffffffff814189be>] cpuidle_enter_tk+0x10/0x12
    [ 1444.846671] [<ffffffff814185d8>] cpuidle_enter_state+0x10/0x39
    [ 1444.846674] [<ffffffff814186f5>] cpuidle_idle_call+0xf4/0x1b1
    [ 1444.846678] [<ffffffff81009115>] cpu_idle+0x51/0x9b
    [ 1444.846682] [<ffffffff815f3289>] rest_init+0x6d/0x6f
    [ 1444.846685] [<ffffffff81ccaac2>] start_kernel+0x345/0x352
    [ 1444.846689] [<ffffffff81cca585>] ? repair_env_string+0x5a/0x5a
    [ 1444.846693] [<ffffffff81cca2ad>] x86_64_start_reservations+0xb1/0xb5
    [ 1444.846696] [<ffffffff81cca389>] x86_64_start_kernel+0xd8/0xdc
    [ 1444.846698] ---[ end trace 19dd6f8534b46c01 ]---
    [ 1444.846718] e1000e 0000:00:19.0 en0: Reset adapter unexpectedly
    [ 1444.862329] br0: port 1(en0) entered disabled state
    [ 1449.150937] e1000e: en0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
    [ 1449.150987] br0: port 1(en0) entered forwarding state
    [ 1449.151003] br0: port 1(en0) entered forwarding state
    [ 1457.835229] e1000e 0000:00:19.0 en0: Detected Hardware Unit Hang:
    TDH <e3>
    TDT <f3>
    next_to_use <f3>
    next_to_clean <df>
    buffer_info[next_to_clean]:
    time_stamp <10011a7f9>
    next_to_watch <e3>
    jiffies <10011ae35>
    next_to_watch.status <0>
    MAC Status <40080083>
    PHY Status <796d>
    PHY 1000BASE-T Status <7800>
    PHY Extended Status <3000>
    PCI Status <10>
    [ 1459.833789] e1000e 0000:00:19.0 en0: Detected Hardware Unit Hang:
    TDH <e3>
    TDT <f3>
    next_to_use <f3>
    next_to_clean <df>
    buffer_info[next_to_clean]:
    time_stamp <10011a7f9>
    next_to_watch <e3>
    jiffies <10011b604>
    next_to_watch.status <0>
    MAC Status <40080083>
    PHY Status <796d>
    PHY 1000BASE-T Status <7800>
    PHY Extended Status <3000>
    PCI Status <10>
    [ 1461.832645] e1000e 0000:00:19.0 en0: Detected Hardware Unit Hang:
    TDH <e3>
    TDT <f3>
    next_to_use <f3>
    next_to_clean <df>
    buffer_info[next_to_clean]:
    time_stamp <10011a7f9>
    next_to_watch <e3>
    jiffies <10011bdd5>
    next_to_watch.status <0>
    MAC Status <40080083>
    PHY Status <796d>
    PHY 1000BASE-T Status <7800>
    PHY Extended Status <3000>
    PCI Status <10>
    [ 1463.831225] e1000e 0000:00:19.0 en0: Detected Hardware Unit Hang:
    TDH <e3>
    TDT <f3>
    next_to_use <f3>
    next_to_clean <df>
    buffer_info[next_to_clean]:
    time_stamp <10011a7f9>
    next_to_watch <e3>
    jiffies <10011c5a4>
    next_to_watch.status <0>
    MAC Status <40080083>
    PHY Status <796d>
    PHY 1000BASE-T Status <7800>
    PHY Extended Status <3000>
    PCI Status <10>
    [ 1464.178009] br0: port 1(en0) entered forwarding state
    [ 1464.833611] e1000e 0000:00:19.0 en0: Reset adapter unexpectedly
    [ 1464.852971] br0: port 1(en0) entered disabled state
    [ 1468.971956] e1000e: en0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
    [ 1468.972002] br0: port 1(en0) entered forwarding state
    [ 1468.972015] br0: port 1(en0) entered forwarding state
    [ 1484.005172] br0: port 1(en0) entered forwarding state
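
The jiffies values in these reports only become seconds once the kernel tick rate (HZ) is known, and that is not printed anywhere in the log. As a rough sketch, it can be estimated from two reports by dividing the jiffies difference by the difference of the dmesg timestamps. The Python below does this for the first three reports in the dmesg output above; the list of candidate HZ values and the function names are assumptions made for illustration.

```python
# (dmesg timestamp in seconds, jiffies) for the first three reports above.
SAMPLES = [
    (1439.846938, 0x1001167E5),
    (1441.845523, 0x100116FB5),
    (1443.844366, 0x100117785),
]
# time_stamp of the stuck buffer, identical in all three reports.
TIME_STAMP = 0x1001163AD

# Tick rates commonly selectable in kernel configs (an assumption;
# a custom kernel could use something else).
COMMON_HZ = (100, 250, 300, 1000)


def estimate_hz(samples):
    """Estimate HZ from the first and last sample, snapped to a common value."""
    (t0, j0), (t1, j1) = samples[0], samples[-1]
    raw = (j1 - j0) / (t1 - t0)
    return min(COMMON_HZ, key=lambda hz: abs(hz - raw))


def stall_seconds(jiffies, time_stamp, hz):
    """How long the oldest buffer had been waiting when the report was printed."""
    return (jiffies - time_stamp) / hz


if __name__ == "__main__":
    hz = estimate_hz(SAMPLES)
    print("estimated HZ:", hz)
    for stamp, jiffies in SAMPLES:
        print(stamp, round(stall_seconds(jiffies, TIME_STAMP, hz), 2), "s")
```

With these numbers the estimate comes out at 1000, so the buffer had been waiting roughly 1.08 s at the first report. The printk timestamp and the jiffies reading are not taken at exactly the same instant, so this is an approximation.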

     
  • Erik van Velzen

    Erik van Velzen - 2013-05-15

    I have this same issue.

    The issue occurs when using certain download managers like Firefox or Bittorrent. It also occurs when working with large files combined with certain software on a drive shared over the network. Simply moving a large file at full speed isn't enough; it only happens in specific scenarios where some kind of "streaming" is involved and the full available bandwidth usually isn't used.

    I think the issue started when I upgraded my kernel from 3.5 to 3.8 (Ubuntu). I've switched to Arch and the problem is present there too (kernel 3.9.2). I'm not 100% positive that it started after the upgrade from 3.5, but I have had the same hardware and use case for a long time without issues, and now I have this issue constantly.

    My hardware is an integrated Intel 82579V on Asus P9X79 Deluxe rev 1.03 motherboard.

    $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$

    I've tested without bonding or bridging, and the issue remains.

    I've used this interface connected to different hardware: a switch and a cable modem, and the issue occurs in both situations.

    I've used e1000e driver versions 2.1.4, 2.2.14 and 2.3.2. No difference.

    I've used a realtek adapter in the same scenarios, and there were no issues.

    What else I've done:
    - updated UEFI BIOS.
    - replaced physical cables
    - turned off auto-negotiation
    - disabled rx flow control
    - enabled arp filtering

    $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$

    I can supply more dmesg logs if you tell me what debug level I should set for the e1000e kernel module (the highest levels don't fit in the dmesg buffer).

    $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$

    May 15 13:51:26 PRIME kernel: e1000e 0000:00:19.0 eno1: Detected Hardware Unit Hang:
    TDH <a4>
    TDT <8>
    next_to_use <8>
    next_to_clean <a4>
    buffer_info[next_to_clean]:
    time_stamp <100133d12>
    next_to_watch <a4>
    jiffies <100133eea>
    next_to_watch.status <0>
    MAC Status <40080083>
    PHY Status <796d>
    PHY 1000BASE-T Status <3800>
    PHY Extended Status <3000>
    PCI Status <10>
    May 15 13:51:28 PRIME kernel: e1000e 0000:00:19.0 eno1: Detected Hardware Unit Hang:
    TDH <a4>
    TDT <8>
    next_to_use <8>
    next_to_clean <a4>
    buffer_info[next_to_clean]:
    time_stamp <100133d12>
    next_to_watch <a4>
    jiffies <100134142>
    next_to_watch.status <0>
    MAC Status <40080083>
    PHY Status <796d>
    PHY 1000BASE-T Status <3800>
    PHY Extended Status <3000>
    PCI Status <10>
    May 15 13:51:30 PRIME kernel: e1000e 0000:00:19.0 eno1: Detected Hardware Unit Hang:
    TDH <a4>
    TDT <8>
    next_to_use <8>
    next_to_clean <a4>
    buffer_info[next_to_clean]:
    time_stamp <100133d12>
    next_to_watch <a4>
    jiffies <10013439a>
    next_to_watch.status <0>
    MAC Status <40080083>
    PHY Status <796d>
    PHY 1000BASE-T Status <3800>
    PHY Extended Status <3000>
    PCI Status <10>
    May 15 13:51:32 PRIME kernel: e1000e 0000:00:19.0 eno1: Detected Hardware Unit Hang:
    TDH <a4>
    TDT <8>
    next_to_use <8>
    next_to_clean <a4>
    buffer_info[next_to_clean]:
    time_stamp <100133d12>
    next_to_watch <a4>
    jiffies <1001345f2>
    next_to_watch.status <0>
    MAC Status <40080083>
    PHY Status <796d>
    PHY 1000BASE-T Status <3800>
    PHY Extended Status <3000>
    PCI Status <10>
    May 15 13:51:34 PRIME kernel: e1000e 0000:00:19.0 eno1: Detected Hardware Unit Hang:
    TDH <a4>
    TDT <8>
    next_to_use <8>
    next_to_clean <a4>
    buffer_info[next_to_clean]:
    time_stamp <100133d12>
    next_to_watch <a4>
    jiffies <10013484a>
    next_to_watch.status <0>
    MAC Status <40080083>
    PHY Status <796d>
    PHY 1000BASE-T Status <3800>
    PHY Extended Status <3000>
    PCI Status <10>
    May 15 13:51:35 PRIME kernel: e1000e 0000:00:19.0 eno1: Reset adapter unexpectedly
    May 15 13:51:35 PRIME dhcpcd[23019]: eno1: carrier lost
    May 15 13:51:35 PRIME dhcpcd[23935]: eno1: eno1: MTU restored to 1500
    May 15 13:51:39 PRIME dhcpcd[23019]: eno1: carrier acquired
    May 15 13:51:39 PRIME dhcpcd[23019]: eno1: configured as a router, not a host
    May 15 13:51:39 PRIME kernel: e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
    May 15 13:51:39 PRIME dhcpcd[23019]: eno1: rebinding lease of 62.194.15.103
    May 15 13:51:39 PRIME dhcpcd[23019]: eno1: acknowledged 62.194.15.103 from 10.15.160.1
    May 15 13:51:39 PRIME dhcpcd[23019]: eno1: checking for 62.194.15.103
    May 15 13:51:44 PRIME dhcpcd[23019]: eno1: leased 62.194.15.103 for 257386 seconds
    May 15 13:51:44 PRIME dhcpcd[24759]: eno1: eno1: MTU set to 1500

    $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$

    lspci -vvv

    ....
    00:19.0 Ethernet controller: Intel Corporation 82579V Gigabit Network Connection (rev 05)
    Subsystem: ASUSTeK Computer Inc. P8P67 Deluxe Motherboard
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0
    Interrupt: pin A routed to IRQ 92
    Region 0: Memory at fbf00000 (32-bit, non-prefetchable) [size=128K]
    Region 1: Memory at fbf28000 (32-bit, non-prefetchable) [size=4K]
    Region 2: I/O ports at f040 [size=32]
    Capabilities: [c8] Power Management version 2
    Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
    Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
    Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
    Address: 00000000fee00000 Data: 4055
    Capabilities: [e0] PCI Advanced Features
    AFCap: TP+ FLR+
    AFCtrl: FLR-
    AFStatus: TP-
    Kernel driver in use: e1000e
    Kernel modules: e1000e
    ...

    $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$

    ethtool -i eno1

    driver: e1000e
    version: 2.3.2-NAPI
    firmware-version: 0.13-4
    bus-info: 0000:00:19.0
    supports-statistics: yes
    supports-test: yes
    supports-eeprom-access: yes
    supports-register-dump: yes
    supports-priv-flags: no

    $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$

    ethtool -k eno1

    Features for eno1:
    rx-checksumming: on
    tx-checksumming: on
    tx-checksum-ipv4: off [fixed]
    tx-checksum-ip-generic: on
    tx-checksum-ipv6: off [fixed]
    tx-checksum-fcoe-crc: off [fixed]
    tx-checksum-sctp: off [fixed]
    scatter-gather: on
    tx-scatter-gather: on
    tx-scatter-gather-fraglist: off [fixed]
    tcp-segmentation-offload: on
    tx-tcp-segmentation: on
    tx-tcp-ecn-segmentation: off [fixed]
    tx-tcp6-segmentation: on
    udp-fragmentation-offload: off [fixed]
    generic-segmentation-offload: on
    generic-receive-offload: on
    large-receive-offload: off [fixed]
    rx-vlan-offload: on
    tx-vlan-offload: on
    ntuple-filters: off [fixed]
    receive-hashing: on
    highdma: on [fixed]
    rx-vlan-filter: off [fixed]
    vlan-challenged: off [fixed]
    tx-lockless: off [fixed]
    netns-local: off [fixed]
    tx-gso-robust: off [fixed]
    tx-fcoe-segmentation: off [fixed]
    fcoe-mtu: off [fixed]
    tx-nocache-copy: on
    loopback: off [fixed]
    rx-fcs: off
    rx-all: off

    $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$

    ethtool eno1

    Settings for eno1:
    Supported ports: [ TP ]
    Supported link modes: 10baseT/Half 10baseT/Full
    100baseT/Half 100baseT/Full
    1000baseT/Full
    Supported pause frame use: No
    Supports auto-negotiation: Yes
    Advertised link modes: 10baseT/Half 10baseT/Full
    100baseT/Half 100baseT/Full
    1000baseT/Full
    Advertised pause frame use: No
    Advertised auto-negotiation: Yes
    Speed: 1000Mb/s
    Duplex: Full
    Port: Twisted Pair
    PHYAD: 2
    Transceiver: internal
    Auto-negotiation: on
    MDI-X: on (auto)
    Supports Wake-on: pumbg
    Wake-on: g
    Current message level: 0x0000ffff (65535)
    drv probe link timer ifdown ifup rx_err tx_err tx_queued intr tx_done rx_status pktdata hw wol 0x8000
    Link detected: yes
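
On the question of which debug level to set: the "Current message level" shown by ethtool is a bitmask, and the names printed under it correspond to individual bits. The Python sketch below decodes such a mask using the standard netif message bit values from the kernel's netdevice.h. The function name is made up for illustration, and picking a smaller mask such as 0x0007 is only a suggestion for keeping the log volume down, not something the driver maintainers asked for here.

```python
# netif message level bits and the names ethtool prints for them.
NETIF_MSG_BITS = [
    (0x0001, "drv"), (0x0002, "probe"), (0x0004, "link"),
    (0x0008, "timer"), (0x0010, "ifdown"), (0x0020, "ifup"),
    (0x0040, "rx_err"), (0x0080, "tx_err"), (0x0100, "tx_queued"),
    (0x0200, "intr"), (0x0400, "tx_done"), (0x0800, "rx_status"),
    (0x1000, "pktdata"), (0x2000, "hw"), (0x4000, "wol"),
]


def decode_msglvl(level):
    """Return the flag names set in a message level mask.

    Bits without a known name are reported as one hex value at the end,
    the same way the 0x8000 shows up in the ethtool output above.
    """
    names = [name for bit, name in NETIF_MSG_BITS if level & bit]
    known = sum(bit for bit, _ in NETIF_MSG_BITS)
    leftover = level & ~known
    if leftover:
        names.append(hex(leftover))
    return names


if __name__ == "__main__":
    print(" ".join(decode_msglvl(0xFFFF)))
    print(" ".join(decode_msglvl(0x0007)))
```

Decoding 0xffff reproduces the list in the ethtool output above, including the trailing 0x8000. Masks that leave out pktdata, tx_queued and tx_done are likely to be much quieter.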

     
    Last edit: Erik van Velzen 2013-05-15
  • Erik van Velzen

    Erik van Velzen - 2013-05-20

    [nvm]

     
    Last edit: Erik van Velzen 2013-05-22
  • Todd Fujinaka

    Todd Fujinaka - 2013-07-02
    • status: open --> wont-fix
     
  • Todd Fujinaka

    Todd Fujinaka - 2013-07-02

    Looks like there are two different issues attached to this bug. This started as an issue for the 82571EB and there were no replies for that so I'm closing this bug.

    Can you open a separate bug for the 82579?

    Thanks.

     
