Submitter Email: smackinlay@mail.com
Software Version: 5.4.1
Operating System: Solaris 8 (Generic_117350-39)
Agent-only compilation compiled for Solaris 8 per...
./configure --prefix=/opt/NetSNMPAgent --enable-as-needed
--disable-applications --disable-mibs --disable-mib-loading --disable-des
--disable-privacy --disable-md5 --enable-ipv6 --enable-mfd-rewrites
--disable-embedded-perl --disable-perl-cc-checks --disable-shared
--with-cc=gcc --without-openssl --with-default-snmp-version=2
--with-sys-contact='Not RFC822 Email' --with-sys-location='MyCorp'
--with-mib-modules='ucd-snmp/diskio ucd-snmp/lmSensors if-mib tcp-mib
udp-mib agentx' --with-defaults
... (build host is Generic_117350-43) finds itself unable to
produce sane ifOperStatus values, viz...
# ifconfig -a
lo0: flags=1000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4> mtu 8232 index 1
inet 127.0.0.1 netmask ff000000
bge0: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 2
:
groupname ipmp
:
bge0:1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
:
bge1: flags=39040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER,FAILED,STANDBY> mtu 1500 index 3
:
groupname ipmp
:
# ndd -get /dev/bge0 link_status
1
# ndd -get /dev/bge1 link_status
0
# snmpwalk -v2c -c public localhost ifDescr ifOperStatus
ifDescr :
interfaces.ifTable.ifEntry.ifDescr.1 : DISPLAY STRING- (ascii): lo0
interfaces.ifTable.ifEntry.ifDescr.2 : DISPLAY STRING- (ascii): bge0
interfaces.ifTable.ifEntry.ifDescr.3 : DISPLAY STRING- (ascii): bge1
ifOperStatus :
interfaces.ifTable.ifEntry.ifOperStatus.1 : INTEGER: up
interfaces.ifTable.ifEntry.ifOperStatus.2 : INTEGER: up
interfaces.ifTable.ifEntry.ifOperStatus.3 : INTEGER: up
... obviously bge0/ bge1 are configured in an IPMP group.
All thoughts appreciated - unfortunately I don't have
ready access to an equivalent test system (the fault
shows in production) - so while I can readily debug
the SNMP agent and run previously compiled test code
on the host, I can't play about with reconfiguring the
network interfaces to find out whether or not IPMP is
triggering a side-effect here.
Thanks in advance (if anyone has any ideas on how to
progress)...
Logged In: YES
user_id=1557771
Originator: NO
Could you run:
kstat bge:{0,1}:mac:link_*
and post the output?
Thanks.
Logged In: NO
Hmm, no output at all from that (not sure we have a
:mac: "name" at all on any of these sol8 interface
types; I have checked on eri, hme, bge, ce). However,
pressing on, many of the *other* iftypes have...
ce:::link_{asmpause,duplex,pause,speed,T4,up}
eri:::link_{duplex,up}
hme:::link_{down_cnt,duplex,up}
... but for bge::bge{0,1} we win the booby prize
once again...
module: bge instance: 0
name: bge0 class: net
align_errors 0
blocked 0
brdcstrcv 766
brdcstxmt 13756
carrier_errors 0
collisions 0
crtime 54.766630697
defer_xmts 0
duplex full
ex_collisions 0
fcs_errors 0
first_collisions 0
ierrors 0
ifspeed 100000000
intr 177947734
ipackets 78705420
ipackets64 78705420
macrcv_errors 0
macxmt_errors 0
media twpair
missed 0
multi_collisions 0
multircv 706
multixmt 0
norcvbuf 0
noxmtbuf 0
obytes 1481059039
obytes64 22955895519
oerrors 0
oflo 0
opackets 129699016
opackets64 129699016
promisc off
rbytes 3849632391
rbytes64 12439566983
rcv_badinterp 0
runt_errors 0
snaptime 4932773.51471441
sqe_errors 0
toolong_errors 0
tx_late_collisions 0
uflo 0
unknowns 0
xmt_badinterp 0
xmtretry 0
module: bge instance: 1
name: bge1 class: net
align_errors 0
blocked 0
brdcstrcv 0
brdcstxmt 2330592
carrier_errors 0
collisions 0
crtime 55.772321779
defer_xmts 0
duplex unknown
ex_collisions 0
fcs_errors 0
first_collisions 0
ierrors 0
ifspeed 0
intr 13080517
ipackets 0
ipackets64 0
macrcv_errors 0
macxmt_errors 0
media twpair
missed 0
multi_collisions 0
multircv 0
multixmt 0
norcvbuf 0
noxmtbuf 0
obytes 176740614
obytes64 176740614
oerrors 0
oflo 0
opackets 3907707
opackets64 3907707
promisc off
rbytes 0
rbytes64 0
rcv_badinterp 0
runt_errors 0
snaptime 4932773.51746266
sqe_errors 0
toolong_errors 0
tx_late_collisions 0
uflo 0
unknowns 0
xmt_badinterp 0
xmtretry 0
... so I'm guessing that this is a driver thing,
and about the only sane workaround I could ask
for on the NetSNMP side of the fence, would be
to assume that "duplex" be (ab)used as a
tristate, with "unknown" signalling no-carrier?
I'm guessing that (as an implementation-specific
detail) it couldn't really mean anything else
_which would also result_ in ifOperStatus = up?
#define CONFIG_BROKENLY_BROKEN_WTF_USE_BGE_DRIVERAUTHOR_CRACKPIPE 1
Just maybe, the bge::phydata: is worth a peek?
module: bge instance: 0
name: phydata class: net
an_advert 1281
an_expansion 4
an_lp_ability 0
an_lp_nextpage 0
aux_control 1024
aux_status 1284
crtime 54.76633078
false_carrier_count 0
gbit_control 0
gbit_status 0
hcd_status 0
ieee_ext_status 12288
intr_mask 65535
intr_status 0
mii_control 8448
mii_status 31053
phy_ext_control 0
phy_ext_status 768
phy_identifier 2122128
receive_error_count 0
receiver_not_ok_count 0
snaptime 4933244.44712054
module: bge instance: 1
name: phydata class: net
an_advert 1281
an_expansion 4
an_lp_ability 0
an_lp_nextpage 0
aux_control 1024
aux_status 1280
crtime 55.772065112
false_carrier_count 0
gbit_control 0
gbit_status 0
hcd_status 0
ieee_ext_status 12288
intr_mask 65535
intr_status 0
mii_control 8448
mii_status 31049
phy_ext_control 0
phy_ext_status 0
phy_identifier 2122128
receive_error_count 0
receiver_not_ok_count 0
snaptime 4933244.44945337
... note "phy_ext_status" (but I assume other things
like "gbit_status" and "mii_status" are probably
interesting too, if you have the secret sauce recipe
for what I gather are probably bitmasks).
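A minimal sketch of decoding one of those bitmasks, assuming the phydata "mii_status" statistic is a raw copy of the standard MII basic status register (IEEE 802.3 clause 22, register 1), where bit 2 is the link-status bit. That mapping is an assumption about the bge driver, not something confirmed in this thread, and the helper name is hypothetical:

```python
# Bit 2 of the IEEE 802.3 clause 22 basic status register is "link status".
# ASSUMPTION: the kstat value bge:N:phydata:mii_status mirrors that register.
MII_STATUS_LINK_UP = 0x0004


def mii_link_is_up(mii_status: int) -> bool:
    """Return True when the link-status bit is set in a raw MII status word."""
    return bool(mii_status & MII_STATUS_LINK_UP)


# Values copied from the phydata output above.
for name, value in (("bge0", 31053), ("bge1", 31049)):
    state = "up" if mii_link_is_up(value) else "down"
    print(f"{name}: mii_status=0x{value:04x} link {state}")
```

Under that assumption the two posted values (0x794d and 0x7949) differ only in bit 2, which lines up with the ndd link_status results of 1 for bge0 and 0 for bge1.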
Logged In: NO
Have just had a look at a dmfe* system, and
we have similar statistics to bge...
module: dmfe instance: 0
name: dmfe0 class: net
... (link is up here - I don't have a system I can
readily plumb a dead interface into just to test
this) but here we don't have :phydata: at all.
Logged In: NO
fyi: here's dmfe on an ifconfig up, but datalink-down
(cable pulled) interface...
module: dmfe instance: 0
name: dmfe0 class: net
Logged In: YES
user_id=1557771
Originator: NO
So for these cases, when the driver does not provide an explicit variable for state, the best we can do is to rely on heuristics. I have gone through a few drivers, and so far they all implement ifspeed correctly. So on NICs that do not provide "link_up", we could use "ifspeed == 0" as an indication of OperStatus being "down". However, as always with these things I am a little concerned about false negatives...
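The decision order described above can be sketched roughly as follows. This is an illustration only, not the code from any patch; the function name and the convention of passing None for a missing "link_up" statistic are assumptions made for the example:

```python
from typing import Optional


def oper_status(link_up: Optional[int], ifspeed: int) -> str:
    """Pick ifOperStatus from driver statistics.

    link_up is the driver's explicit link statistic, or None when the
    driver does not export one (hypothetical calling convention).
    """
    if link_up is not None:
        # An explicit statistic always wins over the heuristic.
        return "up" if link_up else "down"
    # Heuristic fallback: a speed of zero is taken to mean no carrier.
    return "down" if ifspeed == 0 else "up"


# ifspeed values copied from the bge kstat output earlier in this report.
print("bge0:", oper_status(None, 100000000))
print("bge1:", oper_status(None, 0))
```

The explicit statistic is checked first so that drivers such as ce, eri and hme, which do export a link statistic, never depend on the speed-based guess.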
Logged In: YES
user_id=1557771
Originator: NO
Please download and apply the patch available at:
http://sourceforge.net/tracker/index.php?func=detail&aid=1824196&group_id=12694&atid=312694
I would appreciate it if you could report back on whether the patch addresses your problem.
Logged In: NO
Hmm, that's busted it for the loopback interface...
ifName :
ifMIB.ifMIBObjects.ifXTable.ifXEntry.ifName.1 : DISPLAY STRING- (ascii): lo0
ifMIB.ifMIBObjects.ifXTable.ifXEntry.ifName.2 : DISPLAY STRING- (ascii): bge0
ifMIB.ifMIBObjects.ifXTable.ifXEntry.ifName.3 : DISPLAY STRING- (ascii): bge1
ifOperStatus :
interfaces.ifTable.ifEntry.ifOperStatus.1 : INTEGER: down
interfaces.ifTable.ifEntry.ifOperStatus.2 : INTEGER: up
interfaces.ifTable.ifEntry.ifOperStatus.3 : INTEGER: down
... which obviously doesn't even have an ifspeed property...
# kstat lo
module: lo instance: 0
name: lo0 class: net
crtime 53.492251364
ipackets 1181553
opackets 1181553
snaptime 5692062.23888096
... so to my untrained eye, the correct approach would be
to discriminate between "no ifspeed/ ifSpeed property at
all" and "ifspeed/ ifSpeed == 0" in
kernel_sunos5.c:set_if_info() if we're going to then feed
this as an input to our new heuristic?
As it happens my management suite doesn't care about
ifOperStatus on ifType==softwareLoopback interfaces
anyway - but I bet someone's does. :)
Thoughts? Thanks once again for getting us this far...
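To make the distinction concrete, here is a rough sketch of treating "no ifspeed statistic at all" differently from "ifspeed == 0". It works on pasted kstat text rather than the real kstat API, and the function names and the flag_status parameter (the status the agent would otherwise have reported) are hypothetical, chosen only for this example:

```python
from typing import Dict

# Statistics copied from the "kstat lo" output above.
LO0_KSTAT = """\
module: lo instance: 0
name: lo0 class: net
crtime 53.492251364
ipackets 1181553
opackets 1181553
snaptime 5692062.23888096
"""


def parse_kstat(text: str) -> Dict[str, str]:
    """Collect 'statistic value' pairs, skipping the module/name header lines."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2:
            stats[fields[0]] = fields[1]
    return stats


def apply_speed_heuristic(flag_status: str, stats: Dict[str, str]) -> str:
    """Override flag_status with 'down' only when ifspeed exists and is zero."""
    if "ifspeed" not in stats:
        # No statistic at all (e.g. loopback): the heuristic has no opinion.
        return flag_status
    return "down" if int(stats["ifspeed"]) == 0 else flag_status


print("lo0: ", apply_speed_heuristic("up", parse_kstat(LO0_KSTAT)))
print("bge1:", apply_speed_heuristic("up", {"ifspeed": "0"}))
```

With this shape, lo0 keeps whatever status it had before the heuristic ran, while an interface that reports an ifspeed of 0 is still marked down.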
Logged In: YES
user_id=1557771
Originator: NO
Thanks for testing. I have created a new patch that addresses that problem. You can grab it from the same place as before. Please test it if you can.
Logged In: YES
user_id=848638
Originator: NO
Fixed by patch 1824196 applied in SVN Rev 16736.
Logged In: YES
user_id=848638
Originator: NO
Thanks for the bug report!
We've fixed the problem in the 5.4.x code branch
and the main development tree, so it should be
fixed in future releases of the Net-SNMP package.
Logged In: NO
Confirming patch works as expected.