#333 eth1: Detected Hardware Unit Hang

closed
nobody
e1000e (107)
in-kernel_driver
5
2015-01-02
2010-04-01
Anonymous
No

After updating the kernel from 2.6.29.1 to 2.6.33.1, I see the following in dmesg:
0000:05:00.0: eth1: Detected Hardware Unit Hang:
TDH <1e>
TDT
next_to_use
next_to_clean <1d>
buffer_info[next_to_clean]:
time_stamp <33bae15>
next_to_watch <20>
jiffies <33bafaf>
next_to_watch.status <0>
MAC Status <80080783>
PHY Status <796d>
PHY 1000BASE-T Status <3800>
PHY Extended Status <3000>
PCI Status <10>
0000:05:00.0: eth1: Detected Hardware Unit Hang:
TDH <1e>
TDT
next_to_use
next_to_clean <1d>
buffer_info[next_to_clean]:
time_stamp <33bae15>
next_to_watch <20>
jiffies <33bb1a3>
next_to_watch.status <0>
MAC Status <80080783>
PHY Status <796d>
PHY 1000BASE-T Status <3800>
PHY Extended Status <3000>
PCI Status <10>
0000:05:00.0: eth1: Detected Hardware Unit Hang:
TDH <1e>
TDT
next_to_use
next_to_clean <1d>
buffer_info[next_to_clean]:
time_stamp <33bae15>
next_to_watch <20>
jiffies <33bb397>
next_to_watch.status <0>
MAC Status <80080783>
PHY Status <796d>
PHY 1000BASE-T Status <3800>
PHY Extended Status <3000>
PCI Status <10>
------------[ cut here ]------------
WARNING: at net/sched/sch_generic.c:255 dev_watchdog+0x118/0x19c()
Hardware name: X7DCT
NETDEV WATCHDOG: eth1 (e1000e): transmit queue 0 timed out
Modules linked in:
Pid: 0, comm: swapper Not tainted 2.6.33.1 #2
Call Trace:
[<c1024e3d>] ? warn_slowpath_common+0x52/0x71
[<c1024e49>] ? warn_slowpath_common+0x5e/0x71
[<c1024e8e>] ? warn_slowpath_fmt+0x26/0x2a
[<c1261f54>] ? dev_watchdog+0x118/0x19c
[<c102135c>] ? wake_up+0x29/0x39
[<c10320c6>] ? insert_work+0x40/0x44
[<c1261e3c>] ? dev_watchdog+0x0/0x19c
[<c102cc15>] ? run_timer_softirq+0x11a/0x173
[<c1028e5b>] ? do_softirq+0x74/0xdf
[<c1028ee9>] ? do_softirq+0x23/0x27
[<c10290be>] ? irq_exit+0x26/0x58
[<c10102d7>] ? smp_apic_timer_interrupt+0x6c/0x76
[<c12c5f9a>] ? apic_timer_interrupt+0x2a/0x30
[<c1007e06>] ? mwait_idle+0x49/0x4e
[<c10017e8>] ? cpu_idle+0x41/0x5a
---[ end trace bcca9926a046332c ]---

More info about this problem:
http://www.spinics.net/lists/netdev/msg125726.html
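
One way to read the three dumps above: time_stamp is identical in all of them while jiffies keeps advancing, which suggests the same transmit buffer stays stuck between reports. The sketch below only does the arithmetic on values copied from the dumps. The function and variable names are illustrative, not part of the driver. Converting jiffies to seconds would need the kernel's CONFIG_HZ, which is not stated in this report, so the results are left in jiffies.

```python
# Values copied from the three "Detected Hardware Unit Hang" dumps above.
TIME_STAMP = 0x33bae15
JIFFIES = [0x33bafaf, 0x33bb1a3, 0x33bb397]


def stuck_ages(time_stamp, jiffies_samples):
    """Jiffies elapsed since the stuck buffer was queued, per dump."""
    return [j - time_stamp for j in jiffies_samples]


def report_intervals(jiffies_samples):
    """Jiffies between consecutive hang reports."""
    return [b - a for a, b in zip(jiffies_samples, jiffies_samples[1:])]


ages = stuck_ages(TIME_STAMP, JIFFIES)
intervals = report_intervals(JIFFIES)
print(ages)       # [410, 910, 1410]
print(intervals)  # [500, 500]
```

The hang is reported at a steady 500-jiffy interval and the age of the stuck buffer grows by the same amount each time, so the buffer is not being cleaned between reports.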

Discussion


  • Anonymous
    2010-04-01

    Ethtool stats/debug

     
    Attachments
  • Emil Tantilov
    2010-04-01

    dmesg

     
    Attachments
  • Emil Tantilov
    2010-04-01

    kernel config file

     
    Attachments
  • Emil Tantilov
    2010-04-06

    I haven't been able to reproduce the issue so far, but I think aside from the LOMs my system is fairly different. Could you please attach the output from lspci -vvv?

     
  • Pawel, why do you have a multi-bit mask set in smp_affinity? (You said in the mail thread that cat /proc/irq/30/smp_affinity == 0c.) Doesn't this cause interrupts to bounce between two CPUs in your system?

    There was also a recent patch to fix a condition where the transmit could get out of sync, but it is unclear to us whether that can cause a tx hang.
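
For readers unfamiliar with the mask format: smp_affinity is a hexadecimal bitmask in which bit N selects CPU N, so 0c selects two CPUs. A minimal sketch of the decoding follows. The helper name is made up for illustration, and the values 01 and 02 are the other settings mentioned later in this thread.

```python
def cpus_in_affinity_mask(mask_hex):
    """Return the CPU indices selected by a hex smp_affinity mask."""
    # On large systems the mask is printed as comma-separated 32-bit groups.
    mask = int(mask_hex.replace(",", ""), 16)
    return [cpu for cpu in range(mask.bit_length()) if mask & (1 << cpu)]


for value in ("0c", "01", "02"):
    print(value, cpus_in_affinity_mask(value))
# 0c [2, 3]
# 01 [0]
# 02 [1]
```

A mask with more than one bit set allows the interrupt to be delivered to any of the listed CPUs, which is the situation the question above is asking about.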

     

  • Anonymous
    2010-04-15

    Yes, it was 0c/01/02, and I tried many other settings as well, because I was testing whether changing it made any difference.

    What I wanted to say is that with 2.6.34-rc3-next-20100412 I no longer see this problem.

     
  • Pawel, is this issue closed now? Did the 2.6.34 release fix the issue?

     

  • Anonymous
    2010-10-20

    I'm also having this problem after upgrading to Ubuntu Maverick 10.10:

    [ 290.849474] e1000e 0000:02:00.0: irq 44 for MSI/MSI-X
    [ 290.905196] e1000e 0000:02:00.0: irq 44 for MSI/MSI-X
    [ 290.906045] ADDRCONF(NETDEV_UP): eth0: link is not ready
    [ 290.914625] e1000e: eth0 NIC Link is Up 10 Mbps Full Duplex, Flow Control: RX/TX
    [ 290.914633] e1000e 0000:02:00.0: eth0: 10/100 speed: disabling TSO
    [ 290.915129] ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
    [ 301.296068] eth0: no IPv6 routers present
    [ 337.005358] e1000e 0000:02:00.0: eth0: Detected Hardware Unit Hang:
    [ 337.005362] TDH <59>
    [ 337.005364] TDT <76>
    [ 337.005366] next_to_use <76>
    [ 337.005368] next_to_clean <58>
    [ 337.005370] buffer_info[next_to_clean]:
    [ 337.005372] time_stamp <146d>
    [ 337.005375] next_to_watch <59>
    [ 337.005377] jiffies <2423>
    [ 337.005379] next_to_watch.status <0>
    [ 337.005381] MAC Status <80080703>
    [ 337.005383] PHY Status <796d>
    [ 337.005385] PHY 1000BASE-T Status <4000>
    [ 337.005387] PHY Extended Status <3000>
    [ 337.005389] PCI Status <10>
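
A rough sketch of how the dump above can be read programmatically. It parses the `name <hex>` lines, counts the descriptors between the hardware head (TDH) and the tail (TDT), computes how long the stuck buffer has waited, and decodes the link bits of MAC Status. The helper names are illustrative. The bit positions (full duplex in bit 0, link up in bit 1, speed in bits 6-7) are my reading of the driver's E1000_STATUS_* definitions and should be checked against the datasheet for the specific part.

```python
import re

# Lines copied from the dump above; the "[ time ]" prefix is the dmesg timestamp.
DUMP = """\
[ 337.005362] TDH <59>
[ 337.005364] TDT <76>
[ 337.005366] next_to_use <76>
[ 337.005368] next_to_clean <58>
[ 337.005372] time_stamp <146d>
[ 337.005377] jiffies <2423>
[ 337.005381] MAC Status <80080703>
"""

# Optional dmesg timestamp, then a field name, then a hex value in <...>.
FIELD = re.compile(r"^\s*(?:\[[^\]]*\]\s*)?(.+?)\s*<([0-9a-fA-F]+)>\s*$")


def parse_dump(text):
    """Map each field name to its integer value; lines without <hex> are skipped."""
    fields = {}
    for line in text.splitlines():
        match = FIELD.match(line)
        if match:
            fields[match.group(1)] = int(match.group(2), 16)
    return fields


def decode_mac_status(status):
    """Decode link bits (assumed layout, see the note above the code)."""
    speed_bits = status & 0xC0
    return {
        "full_duplex": bool(status & 0x01),
        "link_up": bool(status & 0x02),
        "speed_mbps": {0x00: 10, 0x40: 100}.get(speed_bits, 1000),
    }


fields = parse_dump(DUMP)
# Valid only while TDT >= TDH; the wrapped case needs the ring size,
# which does not appear in the log.
queued = fields["TDT"] - fields["TDH"]
age = fields["jiffies"] - fields["time_stamp"]
link = decode_mac_status(fields["MAC Status"])
print(queued, age, link)
```

With these values the head is 29 descriptors behind the tail, the oldest buffer has been waiting 4022 jiffies, and the decoded status (link up, full duplex, 10 Mbps) agrees with the "NIC Link is Up 10 Mbps Full Duplex" line earlier in the same log.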

     
  • Bruce Allan
    2010-11-19

    Pawel, is this issue resolved for you? Can it be closed?

    Evan, I see you opened another bug so I assume your issue can be tracked there.

     
  • bugreporta
    2012-08-31

    Issue not solved. See #356.

     
  • Todd Fujinaka
    2013-07-08

    • status: open --> closed
     
  • Todd Fujinaka
    2013-07-08

    Closing issue.