ethtool speed detection for interfaces fails for speeds that aren't one of the predefined header values from linux/ethtool.h and falls back to MII detection, which reports the interface speed incorrectly
net-snmp-5.7.1/agent/mibgroup/if-mib/data_access/interface_linux.c
if (edata.speed != SPEED_10 && edata.speed != SPEED_100
#ifdef SPEED_10000
&& edata.speed != SPEED_10000
#endif
#ifdef SPEED_2500
&& edata.speed != SPEED_2500
#endif
&& edata.speed != SPEED_1000 ) {
DEBUGMSGTL(("mibII/interfaces", "fallback to mii for %s\n",
ifr.ifr_name));
/* try MII */
return netsnmp_linux_interface_get_if_speed_mii(fd,name,defaultspeed);
}
alternative mechanism actually used by ethtool (from ethtool-copy.h)
static __inline__ __u32 ethtool_cmd_speed(struct ethtool_cmd *ep)
{
return (ep->speed_hi << 16) | ep->speed;
}
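Note that the speed is simply read directly from the structure: the high 16 bits (speed_hi) are shifted up and combined with the low 16 bits (speed), so a standard value is reported as-is and a high speed value is not truncated. Nothing is matched against the fixed constants.
Example of an affected system: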
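To illustrate the point above, here is a minimal standalone sketch of the same combination. The struct and function names (demo_ethtool_cmd, demo_cmd_speed) are hypothetical stand-ins so the example compiles without the kernel headers; only the two 16-bit speed fields are modelled, and the real struct ethtool_cmd contains many more members.

```c
#include <stdint.h>

/* Hypothetical stand-in for the two speed fields of struct ethtool_cmd. */
struct demo_ethtool_cmd {
    uint16_t speed;     /* low 16 bits of the link speed in Mb/s */
    uint16_t speed_hi;  /* high 16 bits of the link speed in Mb/s */
};

/* Same arithmetic as ethtool_cmd_speed(): no comparison against SPEED_* constants. */
uint32_t demo_cmd_speed(const struct demo_ethtool_cmd *ep)
{
    return ((uint32_t)ep->speed_hi << 16) | ep->speed;
}
```

With speed = 5000 and speed_hi = 0 this yields 5000, which is the value ethtool prints for the host shown in this report.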
[root@zzzzzz ~]# ethtool eth0
Settings for eth0:
Supported ports: [ AUI ]
Supported link modes: 10000baseT/Full
Supports auto-negotiation: No
Advertised link modes: Not reported
Advertised auto-negotiation: No
Speed: 5000Mb/s
Duplex: Full
Port: AUI
PHYAD: 0
Transceiver: internal
Auto-negotiation: off
Supports Wake-on: g
Wake-on: d
Link detected: no
Note the 5Gb/s speed available on this blade enclosure using the..
Ethernet controller: ServerEngines Corp. Emulex OneConnect 10Gb NIC (rev 02)
..controller.
snmpwalk of this host reports this NIC as follows..
IF-MIB::ifName.2 = STRING: eth0
IF-MIB::ifSpeed.2 = Gauge32: 10000000
IF-MIB::ifHighSpeed.2 = Gauge32: 10
Whereas another one of our (identical SW revision) hosts with an uncapped speed (10Gb) reports the correct values.
This host initially showed up with 290% NIC utilisation in Zenoss, and the problem was tracked down as follows.
The problem is clearly related to the use of fixed constant matches from linux/ethtool.h which have no meaning in a gigabit network environment with variable speed allocation. Examining the fixed constants from linux/ethtool.h we find..
/* The following are all involved in forcing a particular link
* mode for the device for setting things. When getting the
* devices settings, these indicate the current mode and whether
* it was foced up into this mode or autonegotiated.
*/
/* The forced speed, 10Mb, 100Mb, gigabit, 2.5Gb, 10GbE. */
#define SPEED_10 10
#define SPEED_100 100
#define SPEED_1000 1000
#define SPEED_2500 2500
#define SPEED_10000 10000
It seems these speeds are useful as arbitrary increments only when attempting to select rather than retrieve the interface speed from the host.
It would seem to make more sense on the face of it to simply read and utilise the value when retrieving speed rather than confine the selections to the values present in the header.
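A rough sketch of what "simply read and utilise the value" could look like is given below. This is not the submitted patch. The function names are hypothetical, and the choice of which values count as "unknown" (0, 0xFFFF and 0xFFFFFFFF) is an assumption about how drivers signal a missing speed, so treat it as illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed "speed unknown" markers; not taken from the net-snmp source. */
#define DEMO_SPEED_UNKNOWN_16 0xFFFFu
#define DEMO_SPEED_UNKNOWN_32 0xFFFFFFFFu

/* Accept any reported speed in Mb/s unless it looks like "unknown". */
bool demo_speed_is_usable(uint32_t speed_mbps)
{
    return speed_mbps != 0 &&
           speed_mbps != DEMO_SPEED_UNKNOWN_16 &&
           speed_mbps != DEMO_SPEED_UNKNOWN_32;
}

/* Use the ethtool value when usable, otherwise a fallback (e.g. from MII). */
uint32_t demo_pick_speed(uint32_t ethtool_mbps, uint32_t fallback_mbps)
{
    return demo_speed_is_usable(ethtool_mbps) ? ethtool_mbps : fallback_mbps;
}
```

Under these assumptions a reported 5000 Mb/s is kept as-is, and only an unusable value triggers the fallback path.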
Have patch, will upload following tests for comment.
Just awaiting approval to test on one of the problem hosts.
Patch tested... snmpwalk results
snmpwalk results... before...
IF-MIB::ifDescr.2 = STRING: eth0
IF-MIB::ifSpeed.2 = Gauge32: 10000000
IF-MIB::ifHighSpeed.2 = Gauge32: 10
and with patched version of SNMP module..
IF-MIB::ifDescr.2 = STRING: eth0
IF-MIB::ifSpeed.2 = Gauge32: 4294967295
IF-MIB::ifHighSpeed.2 = Gauge32: 5000
This correctly reports the 5GBit/s allocated via the blade enclosure.
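For anyone puzzled by ifSpeed showing 4294967295 here: ifSpeed is a 32-bit gauge in bits per second and is pinned at its maximum once the real speed exceeds it, while ifHighSpeed carries the value in Mb/s. A small sketch of that mapping follows; the struct and function names are hypothetical and this is not the net-snmp code itself.

```c
#include <stdint.h>

/* Hypothetical pair of IF-MIB values derived from one link speed. */
struct demo_if_speed {
    uint32_t if_speed;       /* bits per second, saturating at 2^32 - 1 */
    uint32_t if_high_speed;  /* Mb/s */
};

struct demo_if_speed demo_map_speed(uint32_t speed_mbps)
{
    struct demo_if_speed out;
    uint64_t bps = (uint64_t)speed_mbps * 1000000u;

    /* Anything above the 32-bit range is reported as the maximum value. */
    out.if_speed = bps > UINT32_MAX ? UINT32_MAX : (uint32_t)bps;
    out.if_high_speed = speed_mbps;
    return out;
}
```

For 5000 Mb/s this gives ifSpeed 4294967295 and ifHighSpeed 5000, matching the patched output above; for 10 Mb/s it gives 10000000 and 10, matching the incorrect output from the MII fallback.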
Stock version of patch against current (5.7.1)
RedHat 5.5 patch applied as patch 76 using net-snmp.spec
Representative system exhibiting this reporting issue: HP ProLiant BL465c G7 inside an HP C7000 blade enclosure.
Is there anything else I can provide to help this get some attention? It's quite a prominent problem across a managed estate for an application infrastructure that will have about 500 servers eventually for us.. that's one single app estate and we would like to get the monitoring clean.
Please let me know if you need more information or if the nature of the report is unclear in any way.
moving to patches
Does patch 0001-CHANGES-BUG-3440752-IF-MIB-Report-interface-speed-co.patch help ?
Thanks Bart, I could test that but I'll need to schedule a slot on the affected servers. The patch title suggests something different from the problem I was reporting, and the patch that I submitted actually does fix my problem - which was catering for speeds up to 10Gb that weren't defined in ethtool.h. If you want to kill 2 birds with one stone I'll test this patch next week once I have approval to work on those hosts.
Cheers,
Andy
Proposed fix, v2
I've cleaned up my patch a little, and as you can see it works correctly with 10 GbE interfaces:
$ apps/snmpwalk localhost IF-MIB::ifSpeed; apps/snmpwalk localhost IF-MIB::ifHighSpeed
IF-MIB::ifSpeed.1 = Gauge32: 10000000
IF-MIB::ifSpeed.2 = Gauge32: 100000000
IF-MIB::ifSpeed.5 = Gauge32: 0
IF-MIB::ifSpeed.6 = Gauge32: 0
IF-MIB::ifSpeed.7 = Gauge32: 4294967295
IF-MIB::ifSpeed.8 = Gauge32: 0
IF-MIB::ifHighSpeed.1 = Gauge32: 10
IF-MIB::ifHighSpeed.2 = Gauge32: 100
IF-MIB::ifHighSpeed.5 = Gauge32: 0
IF-MIB::ifHighSpeed.6 = Gauge32: 0
IF-MIB::ifHighSpeed.7 = Gauge32: 10000
IF-MIB::ifHighSpeed.8 = Gauge32: 0
Since your patch also includes the proposed changes in my patch as anticipated it does indeed fix our issues..
Here are 2 of our 5Gbit NICs
IF-MIB::ifHighSpeed.2 = Gauge32: 5000
IF-MIB::ifHighSpeed.3 = Gauge32: 5000
The new patch possibly crosses more issues than this one though since you attend to NICs beyond 10Gbit don't you? Worth pointing out that it caters for both on the change log but thanks again Bart.
Thanks for testing. Applied a slightly modified patch as commit d059fb878b1436599953cea5a077499ddcdcbcb5 on the 5.4, 5.5, 5.6, 5.7 and master branches with the following commit message:
CHANGES: BUG: 3440752: IF-MIB: Report interface speed correctly for Ethernet interfaces if other than 10 Mbps, 100 Mbps, 1 Gbps, 2.5 Gbps or 10 Gbps. Add support for NICs faster than 65 Gbps.