#370 82541GI e1000 linux 2.6.32 7.3.21-k8-NAPI kernel panics

closed
e1000 (138)
in-kernel_driver
1
2015-02-26
2013-01-14
D Miles
No

CentOS 6.3 system with two e1000 ports on two distinct controllers. One Ethernet controller is on the motherboard; the other is on an add-in card.

NOTE: udev swaps the device names eth0<>eth1 at boot.
After boot completes, eth0 is the only port in use; eth1 is not connected.

eth0 is HWaddr 00:14:22:75:D7:1E and on PCI bus address 03:07.0
eth1 is HWaddr 00:03:47:6B:45:6D and on PCI bus address 02:05.0

The problem port is eth0 as eth1 has not been used in a while.

PROBLEM

Many kernel panics/oopses over just 1 month of operation. I am unable to make this system's network stable.

The easiest way to make it panic is to issue:

ethtool -K eth0 gso off sg off rx off tx off

This causes:

e1000 0000:03:07.0: eth0: TSO is Disabled
e1000 0000:03:07.0: eth0: TSO is Disabled
e1000 0000:03:07.0: eth0: Reset adapter

After that the port no longer works and the system starts to oops and panic. I include the stack trace in a follow-up comment.

SYSTEM DATA

Ask if you need more data.

# ethtool -i eth0
driver: e1000
version: 7.3.21-k8-NAPI
firmware-version:
bus-info: 0000:03:07.0

# IGNORE THIS ETHERNET PORT; THIS DATA IS HERE FOR COMPLETENESS
# ethtool -i eth1
driver: e1000
version: 7.3.21-k8-NAPI
firmware-version:
bus-info: 0000:02:05.0

# KERNEL DATA
# uname -r
2.6.32-279.19.1.el6.i686

# BOOTUP MESSAGES. BEWARE of device name swapping eth0<>eth1
# dmesg | egrep -i "eth|e1000"
e1000: Intel(R) PRO/1000 Network Driver - version 7.3.21-k8-NAPI
e1000: Copyright (c) 1999-2006 Intel Corporation.
e1000 0000:02:05.0: PCI->APIC IRQ transform: INT A -> IRQ 29
e1000 0000:02:05.0: eth0: (PCI:66MHz:64-bit) 00:03:47:6b:45:6d
e1000 0000:02:05.0: eth0: Intel(R) PRO/1000 Network Connection
e1000 0000:03:07.0: PCI->APIC IRQ transform: INT A -> IRQ 53
e1000 0000:03:07.0: eth1: (PCI:66MHz:32-bit) 00:14:22:75:d7:1e
e1000 0000:03:07.0: eth1: Intel(R) PRO/1000 Network Connection
udev: renamed network interface eth0 to rename2
udev: renamed network interface eth1 to eth0
udev: renamed network interface rename2 to eth1
ADDRCONF(NETDEV_UP): eth0: link is not ready
e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
ADDRCONF(NETDEV_UP): eth1: link is not ready

# lspci -s 03:07.0 -vv
03:07.0 Ethernet controller: Intel Corporation 82541GI Gigabit Ethernet Controller (rev 05)
    Subsystem: Dell Device 0183
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
    Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 32 (63750ns min), Cache Line Size: 64 bytes
    Interrupt: pin A routed to IRQ 53
    Region 0: Memory at fe7e0000 (32-bit, non-prefetchable) [size=128K]
    Region 2: I/O ports at dcc0 [size=64]
    Capabilities: [dc] Power Management version 2
            Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
            Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
    Capabilities: [e4] PCI-X non-bridge device
            Command: DPERE- ERO+ RBC=512 OST=1
            Status: Dev=00:00.0 64bit- 133MHz- SCD- USC- DC=simple DMMRBC=2048 DMOST=1 DMCRS=8 RSCEM- 266MHz- 533MHz-
    Kernel driver in use: e1000
    Kernel modules: e1000

# IGNORE THIS ETHERNET PORT; THIS DATA IS HERE FOR COMPLETENESS
# lspci -s 02:05.0 -vv
02:05.0 Ethernet controller: Intel Corporation 82543GC Gigabit Ethernet Controller (Copper) (rev 02)
    Subsystem: Intel Corporation PRO/1000 T Server Adapter
    Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
    Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 32 (63750ns min), Cache Line Size: 64 bytes
    Interrupt: pin A routed to IRQ 29
    Region 0: Memory at fe9c0000 (32-bit, non-prefetchable) [size=128K]
    Region 1: Memory at fe9b0000 (32-bit, non-prefetchable) [size=64K]
    Expansion ROM at fe900000 [disabled] [size=64K]
    Capabilities: [dc] Power Management version 2
            Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
            Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
    Kernel driver in use: e1000
    Kernel modules: e1000

# ethtool eth0
Settings for eth0:
   Supported ports: [ TP ]
   Supported link modes:   10baseT/Half 10baseT/Full
                           100baseT/Half 100baseT/Full
                           1000baseT/Full
   Supports auto-negotiation: Yes
   Advertised link modes:  10baseT/Half 10baseT/Full
                           100baseT/Half 100baseT/Full
                           1000baseT/Full
   Advertised pause frame use: No
   Advertised auto-negotiation: Yes
   Speed: 1000Mb/s
   Duplex: Full
   Port: Twisted Pair
   PHYAD: 0
   Transceiver: internal
   Auto-negotiation: on
   MDI-X: Unknown
   Supports Wake-on: umbg
   Wake-on: d
   Current message level: 0x00000007 (7)
   Link detected: yes

# ethtool -k eth0
Offload parameters for eth0:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp-segmentation-offload: on
udp-fragmentation-offload: off
generic-segmentation-offload: on
generic-receive-offload: off
large-receive-offload: off
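
To confirm which offloads actually changed after a command like the one in the PROBLEM section, the listing above can be captured before and after and the two compared. Below is a minimal Python sketch, assuming the `name: on|off` layout shown above. `parse_offloads` and `changed_offloads` are hypothetical helper names, not part of ethtool.

```python
def parse_offloads(text):
    """Parse `ethtool -k <dev>` output into {feature_name: bool}."""
    features = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        tokens = value.split()
        # Skip the "Offload parameters for eth0:" header and blank lines.
        # Only the first token is read, so a trailing annotation after
        # on/off (an assumption about other ethtool versions) is ignored.
        if not sep or not tokens or tokens[0] not in ("on", "off"):
            continue
        features[name.strip()] = (tokens[0] == "on")
    return features


def changed_offloads(before, after):
    """Return {name: (old, new)} for features whose state differs."""
    return {
        name: (before[name], after[name])
        for name in before
        if name in after and before[name] != after[name]
    }


# Sample copied from the `ethtool -k eth0` output in this report.
SAMPLE = """\
Offload parameters for eth0:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp-segmentation-offload: on
udp-fragmentation-offload: off
generic-segmentation-offload: on
generic-receive-offload: off
large-receive-offload: off
"""

state = parse_offloads(SAMPLE)
enabled = sorted(name for name, on in state.items() if on)
print(enabled)
```

Feeding `changed_offloads` a snapshot taken before the reset and one taken after would show whether the adapter came back with the requested settings.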

# I HAVE ALWAYS PUT THE 100baseTx-FD REPORT down to a bug in mii-tool and/or the user/kernel interface,
#  but the Ethernet switch it is plugged into reports 1000 Mbit and throughput above
#  100 Mbit is easily achieved.  Maybe this is a separate driver bug, or a mii-tool bug?

# mii-tool eth0
eth0: negotiated 100baseTx-FD flow-control, link ok
# mii-tool eth1
eth1: no link
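
The reporter's suspicion that mii-tool is at fault is plausible. mii-tool is an older utility whose own manual page points to ethtool as its replacement, and it is commonly reported to misdescribe gigabit links. That explanation is an assumption here, not something this ticket confirms. The sketch below only makes the disagreement between the two tools explicit, using the outputs quoted in this report. The function names are hypothetical.

```python
import re


def ethtool_speed_mbps(text):
    """Return the link speed in Mb/s from `ethtool <dev>` output, or None."""
    match = re.search(r"^\s*Speed:\s*(\d+)Mb/s", text, re.MULTILINE)
    return int(match.group(1)) if match else None


def mii_tool_speed_mbps(line):
    """Return the negotiated speed in Mb/s from a `mii-tool <dev>` line, or None."""
    match = re.search(r"negotiated\s+(\d+)base", line)
    return int(match.group(1)) if match else None


# Excerpts copied from the outputs quoted in this report.
ETHTOOL_SAMPLE = "Settings for eth0:\n   Speed: 1000Mb/s\n   Duplex: Full\n"
MII_SAMPLE = "eth0: negotiated 100baseTx-FD flow-control, link ok"

e_speed = ethtool_speed_mbps(ETHTOOL_SAMPLE)
m_speed = mii_tool_speed_mbps(MII_SAMPLE)
if e_speed != m_speed:
    print(f"mismatch: ethtool={e_speed} Mb/s, mii-tool={m_speed} Mb/s")
```

Since the switch also reports a gigabit link, the ethtool figure is the one that agrees with the rest of the evidence in this report.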

Discussion

  • D Miles - 2013-01-14
    # THE FOLLOWING IS FROM ISSUING THE 'ethtool -K eth0 gso off sg off rx off tx off'
    #  COMMAND AT Jan 14 06:03:03. ABOUT 1 min 40 seconds LATER THIS OUTPUT APPEARED,
    #  AS EXPLAINED IN THE BUG SUMMARY.
    Jan 14 06:04:44 tyr kernel: ------------[ cut here ]------------
    Jan 14 06:04:44 tyr kernel: WARNING: at drivers/net/e1000/e1000_main.c:1394 e1000_close+0xa7/0xb0 [e1000]() (Not tainted)
    Jan 14 06:04:44 tyr kernel: Hardware name: PowerEdge 1800
    Jan 14 06:04:44 tyr kernel: Modules linked in: nfsd lockd nfs_acl auth_rpcgss exportfs sunrpc ipt_REJECT ipt_LOG nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6t_rt ip6table_filter ip6_tables ipv6 ppdev parport_pc parport e1000 snd_cmipci snd_seq snd_pcm snd_page_alloc snd_opl3_lib snd_timer snd_hwdep snd_mpu401_uart snd_rawmidi snd_seq_device snd soundcore dcdbas iTCO_wdt iTCO_vendor_support sg e752x_edac edac_core ext4 mbcache jbd2 sd_mod crc_t10dif video output 3w_9xxx mptspi mptscsih mptbase scsi_transport_spi sr_mod cdrom ata_generic ata_piix radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
    Jan 14 06:04:44 tyr kernel: Pid: 3261, comm: ip Not tainted 2.6.32-279.19.1.el6.i686 #1
    Jan 14 06:04:44 tyr kernel: Call Trace:
    Jan 14 06:04:44 tyr kernel: [<c04550a1>] ? warn_slowpath_common+0x81/0xc0
    Jan 14 06:04:44 tyr kernel: [<f8f96d57>] ? e1000_close+0xa7/0xb0 [e1000]
    Jan 14 06:04:44 tyr kernel: [<f8f96d57>] ? e1000_close+0xa7/0xb0 [e1000]
    Jan 14 06:04:44 tyr kernel: [<c04550fb>] ? warn_slowpath_null+0x1b/0x20
    Jan 14 06:04:44 tyr kernel: [<f8f96d57>] ? e1000_close+0xa7/0xb0 [e1000]
    Jan 14 06:04:44 tyr kernel: [<c07869cb>] ? dev_close+0x5b/0xb0
    Jan 14 06:04:44 tyr kernel: [<c0784920>] ? dev_set_rx_mode+0x20/0x40
    Jan 14 06:04:44 tyr kernel: [<c0786307>] ? dev_change_flags+0x87/0x1a0
    Jan 14 06:04:44 tyr kernel: [<c0522ee3>] ? __mem_cgroup_commit_charge.clone.3+0x33/0x80
    Jan 14 06:04:44 tyr kernel: [<c0790a18>] ? do_setlink+0x188/0x720
    Jan 14 06:04:44 tyr kernel: [<c06060b1>] ? nla_parse+0x21/0xd0
    Jan 14 06:04:44 tyr kernel: [<c0791e14>] ? rtnl_newlink+0x424/0x4f0
    Jan 14 06:04:44 tyr kernel: [<c07919f0>] ? rtnl_newlink+0x0/0x4f0
    Jan 14 06:04:44 tyr kernel: [<c0791706>] ? rtnetlink_rcv_msg+0x146/0x230
    Jan 14 06:04:44 tyr kernel: [<c07915c0>] ? rtnetlink_rcv_msg+0x0/0x230
    Jan 14 06:04:44 tyr kernel: [<c07a6a4e>] ? netlink_rcv_skb+0x7e/0xa0
    Jan 14 06:04:44 tyr kernel: [<c07915a0>] ? rtnetlink_rcv+0x0/0x20
    Jan 14 06:04:44 tyr kernel: [<c07915b4>] ? rtnetlink_rcv+0x14/0x20
    Jan 14 06:04:44 tyr kernel: [<c07a6740>] ? netlink_unicast+0x250/0x280
    Jan 14 06:04:44 tyr kernel: [<c07a6f1c>] ? netlink_sendmsg+0x1bc/0x2a0
    Jan 14 06:04:44 tyr kernel: [<c07759d5>] ? sock_sendmsg+0xe5/0x120
    Jan 14 06:04:44 tyr kernel: [<c0475d20>] ? autoremove_wake_function+0x0/0x40
    Jan 14 06:04:44 tyr kernel: [<c04eea94>] ? __alloc_pages_nodemask+0xf4/0x870
    Jan 14 06:04:44 tyr kernel: [<c0475d20>] ? autoremove_wake_function+0x0/0x40
    Jan 14 06:04:44 tyr kernel: [<c04dd97d>] ? find_get_page+0x1d/0x90
    Jan 14 06:04:44 tyr kernel: [<c05fd885>] ? copy_from_user+0x35/0x120
    Jan 14 06:04:44 tyr kernel: [<c077f8f2>] ? verify_iovec+0x62/0xb0
    Jan 14 06:04:44 tyr kernel: [<c07771fd>] ? __sys_sendmsg+0x2ad/0x2c0
    Jan 14 06:04:44 tyr kernel: [<c0439b90>] ? kmap_atomic_prot+0x120/0x150
    Jan 14 06:04:44 tyr kernel: [<c0503c81>] ? handle_mm_fault+0x131/0x1d0
    Jan 14 06:04:44 tyr kernel: [<c0433a5a>] ? __do_page_fault+0x1aa/0x430
    Jan 14 06:04:44 tyr kernel: [<c0777379>] ? sys_sendmsg+0x39/0x70
    Jan 14 06:04:44 tyr kernel: [<c07774aa>] ? sys_socketcall+0xfa/0x2e0
    Jan 14 06:04:44 tyr kernel: [<c04af32e>] ? audit_syscall_entry+0x1be/0x1e0
    Jan 14 06:04:44 tyr kernel: [<c08302fa>] ? do_page_fault+0x2a/0x90
    Jan 14 06:04:44 tyr kernel: [<c04099bf>] ? sysenter_do_call+0x12/0x28
    Jan 14 06:04:44 tyr kernel: ---[ end trace a47fd97d66ac12c3 ]---
    
    # THIS APPEARS TO BE FROM THE  "events/2" PROCESS
    Jan 14 06:05:10 tyr kernel: INFO: task events/2:21 blocked for more than 120 seconds.
    Jan 14 06:05:10 tyr kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    Jan 14 06:05:10 tyr kernel: events/2      D f7109e38     0    21      2 0x00000000
    Jan 14 06:05:10 tyr kernel: f70df000 00000046 00000002 f7109e38 c1f04024 00000000 00000400 00000499
    Jan 14 06:05:10 tyr kernel: 00000000 f08f3d00 00000036 40a88682 00000036 c0b14680 c0b14680 f70df2a8
    Jan 14 06:05:10 tyr kernel: c0b14680 c0b10024 c0b14680 f70df2a8 fffefa53 f708a000 c04082c7 f70df000
    Jan 14 06:05:10 tyr kernel: Call Trace:
    Jan 14 06:05:10 tyr kernel: [<c04082c7>] ? __switch_to+0xd7/0x1a0
    Jan 14 06:05:10 tyr kernel: [<c082b0c0>] ? schedule+0x3c0/0xad0
    Jan 14 06:05:10 tyr kernel: [<c082bda5>] ? schedule_timeout+0x195/0x250
    Jan 14 06:05:10 tyr kernel: [<c0691980>] ? vt_console_print+0x0/0x300
    Jan 14 06:05:10 tyr kernel: [<c045520b>] ? __call_console_drivers+0x5b/0x70
    Jan 14 06:05:10 tyr kernel: [<c082bb09>] ? wait_for_common+0xe9/0x150
    Jan 14 06:05:10 tyr kernel: [<c044de30>] ? default_wake_function+0x0/0x10
    Jan 14 06:05:10 tyr kernel: [<c0471fab>] ? __cancel_work_timer+0x15b/0x180
    Jan 14 06:05:10 tyr kernel: [<c0471a90>] ? wq_barrier_func+0x0/0x10
    Jan 14 06:05:10 tyr kernel: [<f8f90bb6>] ? e1000_down_and_stop+0x16/0x40 [e1000]
    Jan 14 06:05:10 tyr kernel: [<f8f95d8f>] ? e1000_down+0x12f/0x1b0 [e1000]
    Jan 14 06:05:10 tyr kernel: [<f8f962d0>] ? e1000_reset_task+0x0/0xc0 [e1000]
    Jan 14 06:05:10 tyr kernel: [<f8f96331>] ? e1000_reset_task+0x61/0xc0 [e1000]
    Jan 14 06:05:10 tyr kernel: [<c047168b>] ? worker_thread+0x11b/0x230
    Jan 14 06:05:10 tyr kernel: [<c0475d20>] ? autoremove_wake_function+0x0/0x40
    Jan 14 06:05:10 tyr kernel: [<c0471570>] ? worker_thread+0x0/0x230
    Jan 14 06:05:10 tyr kernel: [<c0475ae4>] ? kthread+0x74/0x80
    Jan 14 06:05:10 tyr kernel: [<c0475a70>] ? kthread+0x0/0x80
    Jan 14 06:05:10 tyr kernel: [<c0409f1f>] ? kernel_thread_helper+0x7/0x10
    
    # THIS APPEARS TO BE FROM THE 'ntpd' PROCESS
    Jan 14 06:07:10 tyr kernel: INFO: task ntpd:1542 blocked for more than 120 seconds.
    Jan 14 06:07:10 tyr kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    Jan 14 06:07:10 tyr kernel: ntpd          D ebd8de40     0  1542      1 0x00000080
    Jan 14 06:07:10 tyr kernel: f5250000 00200082 00000002 ebd8de40 c1e04024 00000000 00100100 00200200
    Jan 14 06:07:10 tyr kernel: f4a43090 f525e580 0000004e 372de5b7 0000004e c0b14680 c0b14680 f52502a8
    Jan 14 06:07:10 tyr kernel: c0b14680 c0b10024 c0b14680 f52502a8 00008b36 c053bb30 00100100 f5250000
    Jan 14 06:07:10 tyr kernel: Call Trace:
    Jan 14 06:07:10 tyr kernel: [<c053bb30>] ? pollwake+0x0/0x60
    Jan 14 06:07:10 tyr kernel: [<c05a7d04>] ? avc_has_perm+0x64/0x80
    Jan 14 06:07:10 tyr kernel: [<c082c428>] ? __mutex_lock_slowpath+0xd8/0x140
    Jan 14 06:07:10 tyr kernel: [<c082c32d>] ? mutex_lock+0x1d/0x40
    Jan 14 06:07:10 tyr kernel: [<c0788375>] ? dev_ioctl+0xe5/0x6a0
    Jan 14 06:07:10 tyr kernel: [<c05a89ed>] ? selinux_sk_alloc_security+0x3d/0x50
    Jan 14 06:07:10 tyr kernel: [<c051bec7>] ? kmem_cache_alloc_trace+0x107/0x110
    Jan 14 06:07:10 tyr kernel: [<c051b71d>] ? kmem_cache_alloc+0xfd/0x110
    Jan 14 06:07:10 tyr kernel: [<c07d5c50>] ? udp_ioctl+0x0/0x70
    Jan 14 06:07:10 tyr kernel: [<c07dceee>] ? inet_ioctl+0x2e/0xb0
    Jan 14 06:07:10 tyr kernel: [<c0774abf>] ? sock_ioctl+0x6f/0x260
    Jan 14 06:07:10 tyr kernel: [<c0774a50>] ? sock_ioctl+0x0/0x260
    Jan 14 06:07:10 tyr kernel: [<c0539c8b>] ? vfs_ioctl+0x1b/0xa0
    Jan 14 06:07:10 tyr kernel: [<c0539e6c>] ? do_vfs_ioctl+0x6c/0x5c0
    Jan 14 06:07:10 tyr kernel: [<c053a436>] ? sys_ioctl+0x76/0x90
    Jan 14 06:07:10 tyr kernel: [<c04af0a0>] ? __audit_syscall_exit+0x220/0x250
    Jan 14 06:07:10 tyr kernel: [<c04099bf>] ? sysenter_do_call+0x12/0x28
    
    # WHAT FOLLOWS ARE MORE RANDOM OOPSES FROM THE SAME SYSTEM THAT
    # OCCURRED 'NATURALLY' DURING NORMAL USE OF THE SYSTEM.
    
    # THE SYSTEM HAS 6 GB RAM AND NEVER USES SWAP; THE WORKING SET FOR PROCESSES
    # IS ABOUT 1.2 GB, SO ALTHOUGH IT SAYS 'page allocation failure' THE SYSTEM
    # WAS NOT SHORT OF MEMORY OVERALL.
    # THE SYSTEM ONLY HAD 30 HOURS UPTIME AT THE TIME OF THE CRASH.
    
    Jan 14 05:50:46 tyr kernel: kswapd0: page allocation failure. order:5, mode:0x20
    Jan 14 05:50:46 tyr kernel: Pid: 58, comm: kswapd0 Not tainted 2.6.32-279.19.1.el6.i686 #1
    Jan 14 05:50:46 tyr kernel: Call Trace:
    Jan 14 05:50:46 tyr kernel: [<c04ef05c>] ? __alloc_pages_nodemask+0x6bc/0x870
    Jan 14 05:50:46 tyr kernel: [<f946c702>] ? nf_conntrack_find_get+0x22/0x110 [nf_conntrack]
    Jan 14 05:50:46 tyr kernel: [<c051b9ec>] ? cache_alloc_refill+0x2bc/0x510
    Jan 14 05:50:46 tyr kernel: [<c051bd82>] ? __kmalloc+0x142/0x180
    Jan 14 05:50:46 tyr kernel: [<c077da83>] ? pskb_expand_head+0x53/0x200
    Jan 14 05:50:46 tyr kernel: [<c077da83>] ? pskb_expand_head+0x53/0x200
    Jan 14 05:50:46 tyr kernel: [<c077e05c>] ? __pskb_pull_tail+0x4c/0x2b0
    Jan 14 05:50:46 tyr kernel: [<c07a8b16>] ? nf_iterate+0x66/0x80
    Jan 14 05:50:46 tyr kernel: [<c07892bd>] ? dev_queue_xmit+0x1ed/0x6f0
    Jan 14 05:50:46 tyr kernel: [<c07b65a0>] ? ip_finish_output+0x0/0x280
    Jan 14 05:50:46 tyr kernel: [<c07a8c82>] ? nf_hook_slow+0x62/0xf0
    Jan 14 05:50:46 tyr kernel: [<c07b65a0>] ? ip_finish_output+0x0/0x280
    Jan 14 05:50:46 tyr kernel: [<c07b66a5>] ? ip_finish_output+0x105/0x280
    Jan 14 05:50:46 tyr kernel: [<c07b68aa>] ? ip_output+0x8a/0xb0
    Jan 14 05:50:46 tyr kernel: [<c07b5d65>] ? ip_local_out+0x15/0x20
    Jan 14 05:50:46 tyr kernel: [<c07b61a5>] ? ip_queue_xmit+0x145/0x3b0
    Jan 14 05:50:46 tyr kernel: [<c07c3a06>] ? tcp_data_snd_check+0xc6/0xe0
    Jan 14 05:50:46 tyr kernel: [<c051c282>] ? slab_destroy+0x22/0x70
    Jan 14 05:50:46 tyr kernel: [<c07c8b63>] ? tcp_transmit_skb+0x3a3/0x710
    Jan 14 05:50:46 tyr kernel: [<c07cab3a>] ? tcp_write_xmit+0x1ea/0x9c0
    Jan 14 05:50:46 tyr kernel: [<c07cb441>] ? __tcp_push_pending_frames+0x31/0xe0
    Jan 14 05:50:46 tyr kernel: [<c07c3963>] ? tcp_data_snd_check+0x23/0xe0
    Jan 14 05:50:46 tyr kernel: [<c07c71fa>] ? tcp_rcv_established+0x37a/0x760
    Jan 14 05:50:46 tyr kernel: [<c07ce59f>] ? tcp_v4_do_rcv+0x27f/0x3c0
    Jan 14 05:50:46 tyr kernel: [<f94f94c8>] ? ipv4_confirm+0x68/0x190 [nf_conntrack_ipv4]
    Jan 14 05:50:46 tyr kernel: [<c07cfb9e>] ? tcp_v4_rcv+0x48e/0x7c0
    Jan 14 05:50:46 tyr kernel: [<c07b1710>] ? ip_local_deliver_finish+0x0/0x260
    Jan 14 05:50:46 tyr kernel: [<c07a8c82>] ? nf_hook_slow+0x62/0xf0
    Jan 14 05:50:46 tyr kernel: [<c07b17af>] ? ip_local_deliver_finish+0x9f/0x260
    Jan 14 05:50:46 tyr kernel: [<c07b19bf>] ? ip_local_deliver+0x4f/0x90
    Jan 14 05:50:46 tyr kernel: [<c07b1043>] ? ip_rcv_finish+0xf3/0x390
    Jan 14 05:50:46 tyr kernel: [<c07b0f50>] ? ip_rcv_finish+0x0/0x390
    Jan 14 05:50:46 tyr kernel: [<c0785361>] ? __netif_receive_skb+0x401/0x5f0
    Jan 14 05:50:46 tyr kernel: [<c078700f>] ? netif_receive_skb+0x3f/0x50
    Jan 14 05:50:46 tyr kernel: [<c079c9ed>] ? eth_type_trans+0x2d/0x120
    Jan 14 05:50:46 tyr kernel: [<c07870df>] ? napi_skb_finish+0x2f/0x40
    Jan 14 05:50:46 tyr kernel: [<c0788e35>] ? napi_gro_receive+0x25/0x40
    Jan 14 05:50:46 tyr kernel: [<f8f8bf31>] ? e1000_clean_rx_irq+0x241/0x4a0 [e1000]
    Jan 14 05:50:46 tyr kernel: [<f8f89e38>] ? e1000_clean+0x198/0x8e0 [e1000]
    Jan 14 05:50:46 tyr kernel: [<c04f3ef9>] ? shrink_page_list.clone.0+0x3e9/0x520
    Jan 14 05:50:46 tyr kernel: [<c0788f2e>] ? net_rx_action+0xde/0x280
    Jan 14 05:50:46 tyr kernel: [<c045c3da>] ? __do_softirq+0x8a/0x1a0
    Jan 14 05:50:46 tyr kernel: [<c042a65f>] ? ack_apic_level+0x5f/0x1f0
    Jan 14 05:50:46 tyr kernel: [<c04b5675>] ? handle_fasteoi_irq+0x85/0xc0
    Jan 14 05:50:46 tyr kernel: [<c045c52d>] ? do_softirq+0x3d/0x50
    Jan 14 05:50:46 tyr kernel: [<c045c685>] ? irq_exit+0x65/0x70
    Jan 14 05:50:46 tyr kernel: [<c040b030>] ? do_IRQ+0x50/0xc0
    Jan 14 05:50:46 tyr kernel: [<c0409f10>] ? common_interrupt+0x30/0x38
    Jan 14 05:50:46 tyr kernel: [<c053dba0>] ? d_callback+0x0/0x10
    Jan 14 05:50:46 tyr kernel: [<c04b7732>] ? __call_rcu+0x22/0x110
    Jan 14 05:50:46 tyr kernel: [<c053d4f6>] ? d_kill+0x36/0x50
    Jan 14 05:50:46 tyr kernel: [<c053d7a3>] ? __shrink_dcache_sb+0x293/0x2e0
    Jan 14 05:50:46 tyr kernel: [<c053d8f2>] ? shrink_dcache_memory+0x102/0x1a0
    Jan 14 05:50:46 tyr kernel: [<c04f379b>] ? shrink_slab+0x11b/0x180
    Jan 14 05:50:46 tyr kernel: [<c04f5beb>] ? kswapd+0x57b/0x920
    Jan 14 05:50:46 tyr kernel: [<c04f5f90>] ? isolate_pages_global+0x0/0x2b0
    Jan 14 05:50:46 tyr kernel: [<c0475d20>] ? autoremove_wake_function+0x0/0x40
    Jan 14 05:50:46 tyr kernel: [<c04f5670>] ? kswapd+0x0/0x920
    Jan 14 05:50:46 tyr kernel: [<c0475ae4>] ? kthread+0x74/0x80
    Jan 14 05:50:46 tyr kernel: [<c0475a70>] ? kthread+0x0/0x80
    Jan 14 05:50:46 tyr kernel: [<c0409f1f>] ? kernel_thread_helper+0x7/0x10
    Jan 14 05:50:46 tyr kernel: Mem-Info:
    Jan 14 05:50:46 tyr kernel: DMA per-cpu:
    Jan 14 05:50:46 tyr kernel: CPU    0: hi:    0, btch:   1 usd:   0
    Jan 14 05:50:46 tyr kernel: CPU    1: hi:    0, btch:   1 usd:   0
    Jan 14 05:50:46 tyr kernel: CPU    2: hi:    0, btch:   1 usd:   0
    Jan 14 05:50:46 tyr kernel: CPU    3: hi:    0, btch:   1 usd:   0
    Jan 14 05:50:46 tyr kernel: Normal per-cpu:
    Jan 14 05:50:46 tyr kernel: CPU    0: hi:  186, btch:  31 usd:  94
    Jan 14 05:50:46 tyr kernel: CPU    1: hi:  186, btch:  31 usd: 170
    Jan 14 05:50:46 tyr kernel: CPU    2: hi:  186, btch:  31 usd: 164
    Jan 14 05:50:46 tyr kernel: CPU    3: hi:  186, btch:  31 usd: 183
    Jan 14 05:50:46 tyr kernel: HighMem per-cpu:
    Jan 14 05:50:46 tyr kernel: CPU    0: hi:  186, btch:  31 usd:  26
    Jan 14 05:50:46 tyr kernel: CPU    1: hi:  186, btch:  31 usd:  24
    Jan 14 05:50:46 tyr kernel: CPU    2: hi:  186, btch:  31 usd: 152
    Jan 14 05:50:46 tyr kernel: CPU    3: hi:  186, btch:  31 usd:  63
    Jan 14 05:50:46 tyr kernel: active_anon:119292 inactive_anon:17142 isolated_anon:0
    Jan 14 05:50:46 tyr kernel: active_file:77328 inactive_file:460484 isolated_file:0
    Jan 14 05:50:46 tyr kernel: unevictable:0 dirty:15650 writeback:0 unstable:0
    Jan 14 05:50:46 tyr kernel: free:801387 slab_reclaimable:21875 slab_unreclaimable:12758
    Jan 14 05:50:46 tyr kernel: mapped:14393 shmem:1011 pagetables:1385 bounce:0
    Jan 14 05:50:46 tyr kernel: DMA free:3528kB min:64kB low:80kB high:96kB active_anon:0kB inactive_anon:0kB active_file:3588kB inactive_file:372kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15868kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:412kB slab_unreclaimable:60kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
    Jan 14 05:50:46 tyr kernel: lowmem_reserve[]: 0 863 6075 6075
    Jan 14 05:50:46 tyr kernel: Normal free:25264kB min:3724kB low:4652kB high:5584kB active_anon:0kB inactive_anon:0kB active_file:248336kB inactive_file:248380kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:883912kB mlocked:0kB dirty:440kB writeback:0kB mapped:4kB shmem:0kB slab_reclaimable:87088kB slab_unreclaimable:50972kB kernel_stack:4944kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
    Jan 14 05:50:46 tyr kernel: lowmem_reserve[]: 0 0 41701 41701
    Jan 14 05:50:46 tyr kernel: HighMem free:3176756kB min:512kB low:6132kB high:11756kB active_anon:477168kB inactive_anon:68568kB active_file:57388kB inactive_file:1593184kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:5337780kB mlocked:0kB dirty:62160kB writeback:0kB mapped:57568kB shmem:4044kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:5540kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
    Jan 14 05:50:46 tyr kernel: lowmem_reserve[]: 0 0 0 0
    Jan 14 05:50:46 tyr kernel: DMA: 25*4kB 11*8kB 5*16kB 2*32kB 4*64kB 1*128kB 1*256kB 1*512kB 0*1024kB 1*2048kB 0*4096kB = 3532kB
    Jan 14 05:50:46 tyr kernel: Normal: 5868*4kB 150*8kB 19*16kB 2*32kB 0*64kB 0*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 25296kB
    Jan 14 05:50:46 tyr kernel: HighMem: 15*4kB 23*8kB 4*16kB 0*32kB 2*64kB 1*128kB 1*256kB 7*512kB 4*1024kB 3*2048kB 772*4096kB = 3176756kB
    Jan 14 05:50:46 tyr kernel: 538831 total pagecache pages
    Jan 14 05:50:46 tyr kernel: 0 pages in swap cache
    Jan 14 05:50:46 tyr kernel: Swap cache stats: add 0, delete 0, find 0/0
    Jan 14 05:50:46 tyr kernel: Free swap  = 1557496kB
    Jan 14 05:50:46 tyr kernel: Total swap = 1557496kB
    Jan 14 05:50:46 tyr kernel: 1572863 pages RAM
    Jan 14 05:50:46 tyr kernel: 1346050 pages HighMem
    Jan 14 05:50:46 tyr kernel: 47775 pages reserved
    Jan 14 05:50:46 tyr kernel: 521984 pages shared
    Jan 14 05:50:46 tyr kernel: 220111 pages non-shared
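
The "order:5" in the failure message means the kernel wanted 2^5 physically contiguous pages, which is 128 kB with the 4 kB pages shown in the free lists above. The call trace goes through `__kmalloc`, so the request presumably had to be met from low memory (the DMA and Normal zones) rather than HighMem; that reading is an assumption, not something the log states. The Python sketch below tallies the per-zone free lists from the dump to show how few large blocks low memory had left. The helper names are hypothetical.

```python
import re

PAGE_KB = 4  # smallest block size listed in the dump


def parse_buddy_line(line):
    """Parse 'Zone: N*4kB N*8kB ... = TOTALkB' into {order: free_block_count}."""
    counts = {}
    for count, size_kb in re.findall(r"(\d+)\*(\d+)kB", line):
        # 4 kB -> order 0, 8 kB -> order 1, ..., 4096 kB -> order 10
        order = (int(size_kb) // PAGE_KB).bit_length() - 1
        counts[order] = int(count)
    return counts


def free_kb(counts):
    """Total free memory in kB implied by the per-order block counts."""
    return sum(n * PAGE_KB * (1 << order) for order, n in counts.items())


def blocks_at_or_above(counts, order):
    """Number of free blocks large enough to satisfy a request of `order`."""
    return sum(n for o, n in counts.items() if o >= order)


# Free-list lines copied from the dump (line wrapping removed).
NORMAL = ("Normal: 5868*4kB 150*8kB 19*16kB 2*32kB 0*64kB 0*128kB 1*256kB "
          "0*512kB 0*1024kB 0*2048kB 0*4096kB = 25296kB")
HIGHMEM = ("HighMem: 15*4kB 23*8kB 4*16kB 0*32kB 2*64kB 1*128kB 1*256kB "
           "7*512kB 4*1024kB 3*2048kB 772*4096kB = 3176756kB")

normal = parse_buddy_line(NORMAL)
highmem = parse_buddy_line(HIGHMEM)
print("Normal :", free_kb(normal), "kB free,", blocks_at_or_above(normal, 5), "blocks >= order 5")
print("HighMem:", free_kb(highmem), "kB free,", blocks_at_or_above(highmem, 5), "blocks >= order 5")
```

Run as-is, this reports 25296 kB free in Normal with a single block of order 5 or higher, against 788 such blocks in HighMem. That pattern looks like fragmentation of low memory rather than an overall shortage, which fits the reporter's remark about 6 GB of RAM and no swap use. Why the one remaining large block was not handed out is not visible in the log.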
    
     
  • Tushar Dave - 2013-01-14

    Sorry to hear this.
    Could you make sure that your kernel is running with the following patch?

    commit 8ce6909f77ba1b7bcdea65cc2388fd1742b6d669
    Author: Tushar Dave <tushar.n.dave@intel.com>
    Date: Thu May 17 01:04:50 2012 +0000

    e1000: Prevent reset task killing itself.
    
    Killing reset task while adapter is resetting causes deadlock.
    Only kill reset task if adapter is not resetting.
    Ref bug #43132 on bugzilla.kernel.org
    

    -Tushar

     
  • Tushar Dave - 2013-01-14
    • assigned_to: Tushar Dave
     
  • Akemi Yagi - 2013-01-15

    The current RHEL/CentOS 6.3 kernel (2.6.32-279.19.1.el6) does not have the referenced patch. I can rebuild the kernel with the patch applied so that the OP can test this fix.

     
    Last edit: Tushar Dave 2013-01-16
    • Tushar Dave - 2013-01-16

      OK. Thanks.

       
  • Todd Fujinaka - 2013-07-09
    • assigned_to: Tushar Dave --> Todd Fujinaka
     
  • Todd Fujinaka - 2013-07-09

    Are you still seeing this problem?

     
  • Akemi Yagi - 2013-07-09

    I just checked the current RHEL/CentOS kernel 2.6.32-358.11.1.el6. It does have the patch referenced.
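
For anyone checking their own machine against the two data points in this thread (2.6.32-279.19.1.el6 without the patch, 2.6.32-358.11.1.el6 with it), here is a rough Python sketch that classifies a `uname -r` string. It assumes builds at or after 358.11.1 keep the patch and builds at or before 279.19.1 lack it. The thread does not say which build first carried the patch, so anything in between is reported as unknown. `el6_build` and `patch_status` are hypothetical names.

```python
import re

LACKS_PATCH = (279, 19, 1)  # 2.6.32-279.19.1.el6: reported in this thread without the patch
HAS_PATCH = (358, 11, 1)    # 2.6.32-358.11.1.el6: reported in this thread with the patch


def el6_build(release):
    """Extract the build tuple from e.g. '2.6.32-279.19.1.el6.i686'."""
    match = re.match(r"2\.6\.32-(\d+(?:\.\d+)*)\.el6", release)
    if not match:
        raise ValueError(f"not an EL6 2.6.32 kernel release: {release!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def patch_status(release):
    """Classify a release as 'patched', 'unpatched' or 'unknown'."""
    build = el6_build(release)
    if build >= HAS_PATCH:
        return "patched"      # assumption: later builds keep the fix
    if build <= LACKS_PATCH:
        return "unpatched"    # assumption: earlier builds never had it
    return "unknown"          # between the two builds named in the thread


for rel in ("2.6.32-279.19.1.el6.i686", "2.6.32-358.11.1.el6"):
    print(rel, "->", patch_status(rel))
```

On a live system the release string could come from `platform.release()` in the standard library, which returns the same value as `uname -r`.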

     
  • Todd Fujinaka - 2013-07-09

    Please let us know if this resolves your issue. Thanks.

     
  • Todd Fujinaka - 2013-08-09
    • status: open --> closed
     
  • Todd Fujinaka - 2013-08-09

    Closing issue due to inactivity.

     
