#1886 if-mib ifOperstatus - wrongly "up" on Solaris 8

closed
agent (1105)
5
2012-11-08
2007-10-29
Anonymous
No

Submitter Email: smackinlay@mail.com
Software Version: 5.4.1
Operating System: Solaris 8 (Generic_117350-39)

Agent-only compilation compiled for Solaris 8 per...

./configure --prefix=/opt/NetSNMPAgent --enable-as-needed
--disable-applications --disable-mibs --disable-mib-loading --disable-des
--disable-privacy --disable-md5 --enable-ipv6 --enable-mfd-rewrites
--disable-embedded-perl --disable-perl-cc-checks --disable-shared
--with-cc=gcc --without-openssl --with-default-snmp-version=2
--with-sys-contact='Not RFC822 Email' --with-sys-location='MyCorp'
--with-mib-modules='ucd-snmp/diskio ucd-snmp/lmSensors if-mib tcp-mib
udp-mib agentx' --with-defaults

... (build host is Generic_117350-43) finds itself unable to
produce sane ifOperStatus values, viz...

# ifconfig -a
lo0: flags=1000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4> mtu 8232 index 1
inet 127.0.0.1 netmask ff000000
bge0: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 2
:
groupname ipmp
:
bge0:1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
:
bge1: flags=39040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER,FAILED,STANDBY> mtu 1500 index 3
:
groupname ipmp
:

# ndd -get /dev/bge0 link_status
1

# ndd -get /dev/bge1 link_status
0

# snmpwalk -v2c -c public localhost ifDescr ifOperStatus

ifDescr :
interfaces.ifTable.ifEntry.ifDescr.1 : DISPLAY STRING- (ascii): lo0
interfaces.ifTable.ifEntry.ifDescr.2 : DISPLAY STRING- (ascii): bge0
interfaces.ifTable.ifEntry.ifDescr.3 : DISPLAY STRING- (ascii): bge1

ifOperStatus :
interfaces.ifTable.ifEntry.ifOperStatus.1 : INTEGER: up
interfaces.ifTable.ifEntry.ifOperStatus.2 : INTEGER: up
interfaces.ifTable.ifEntry.ifOperStatus.3 : INTEGER: up

... obviously bge0/bge1 are configured in an IPMP group, and
bge1 (flagged FAILED, link_status 0) should not be reporting
ifOperStatus up.

All thoughts appreciated - unfortunately I don't have
ready access to an equivalent test system (the fault
shows in production) - so while I can readily debug
the SNMP agent and run previously compiled test code
on the host, I can't play about with reconfiguring the
network interfaces to find out whether or not IPMP is
triggering a side-effect here.

Thanks in advance (if anyone has any ideas on how to
progress)...

Discussion

  • Anders Persson

    Anders Persson - 2007-10-29

    Logged In: YES
    user_id=1557771
    Originator: NO

    Could you run:

    kstat bge:{0,1}:mac:link_*

    and post the output?

    Thanks.

     
  • Nobody/Anonymous

    Logged In: NO

    Hmm, no output at all from that (not sure we have a
    :mac: "name" at all on any of these sol8 interface
    types; I have checked on eri, hme, bge, ce). However,
    pressing on, many of the *other* iftypes have...

    ce:::link_{asmpause,duplex,pause,speed,T4,up}
    eri:::link_{duplex,up}
    hme:::link_{down_cnt,duplex,up}

    ... but for bge::bge{0,1} we win the booby prize
    once again...

    module: bge instance: 0
    name: bge0 class: net
    align_errors 0
    blocked 0
    brdcstrcv 766
    brdcstxmt 13756
    carrier_errors 0
    collisions 0
    crtime 54.766630697
    defer_xmts 0
    duplex full
    ex_collisions 0
    fcs_errors 0
    first_collisions 0
    ierrors 0
    ifspeed 100000000
    intr 177947734
    ipackets 78705420
    ipackets64 78705420
    macrcv_errors 0
    macxmt_errors 0
    media twpair
    missed 0
    multi_collisions 0
    multircv 706
    multixmt 0
    norcvbuf 0
    noxmtbuf 0
    obytes 1481059039
    obytes64 22955895519
    oerrors 0
    oflo 0
    opackets 129699016
    opackets64 129699016
    promisc off
    rbytes 3849632391
    rbytes64 12439566983
    rcv_badinterp 0
    runt_errors 0
    snaptime 4932773.51471441
    sqe_errors 0
    toolong_errors 0
    tx_late_collisions 0
    uflo 0
    unknowns 0
    xmt_badinterp 0
    xmtretry 0

    module: bge instance: 1
    name: bge1 class: net
    align_errors 0
    blocked 0
    brdcstrcv 0
    brdcstxmt 2330592
    carrier_errors 0
    collisions 0
    crtime 55.772321779
    defer_xmts 0
    duplex unknown
    ex_collisions 0
    fcs_errors 0
    first_collisions 0
    ierrors 0
    ifspeed 0
    intr 13080517
    ipackets 0
    ipackets64 0
    macrcv_errors 0
    macxmt_errors 0
    media twpair
    missed 0
    multi_collisions 0
    multircv 0
    multixmt 0
    norcvbuf 0
    noxmtbuf 0
    obytes 176740614
    obytes64 176740614
    oerrors 0
    oflo 0
    opackets 3907707
    opackets64 3907707
    promisc off
    rbytes 0
    rbytes64 0
    rcv_badinterp 0
    runt_errors 0
    snaptime 4932773.51746266
    sqe_errors 0
    toolong_errors 0
    tx_late_collisions 0
    uflo 0
    unknowns 0
    xmt_badinterp 0
    xmtretry 0

    ... so I'm guessing that this is a driver thing,
    and about the only sane workaround I could ask
    for on the NetSNMP side of the fence, would be
    to assume that "duplex" be (ab)used as a
    tristate, with "unknown" signalling no-carrier?
    I'm guessing that (as an implementation-specific
    detail) it couldn't really mean anything else
    _which would also result_ in ifOperStatus = up?
    #define CONFIG_BROKENLY_BROKEN_WTF_USE_BGE_DRIVERAUTHOR_CRACKPIPE 1

    Just maybe, the bge::phydata: is worth a peek?

    module: bge instance: 0
    name: phydata class: net
    an_advert 1281
    an_expansion 4
    an_lp_ability 0
    an_lp_nextpage 0
    aux_control 1024
    aux_status 1284
    crtime 54.76633078
    false_carrier_count 0
    gbit_control 0
    gbit_status 0
    hcd_status 0
    ieee_ext_status 12288
    intr_mask 65535
    intr_status 0
    mii_control 8448
    mii_status 31053
    phy_ext_control 0
    phy_ext_status 768
    phy_identifier 2122128
    receive_error_count 0
    receiver_not_ok_count 0
    snaptime 4933244.44712054

    module: bge instance: 1
    name: phydata class: net
    an_advert 1281
    an_expansion 4
    an_lp_ability 0
    an_lp_nextpage 0
    aux_control 1024
    aux_status 1280
    crtime 55.772065112
    false_carrier_count 0
    gbit_control 0
    gbit_status 0
    hcd_status 0
    ieee_ext_status 12288
    intr_mask 65535
    intr_status 0
    mii_control 8448
    mii_status 31049
    phy_ext_control 0
    phy_ext_status 0
    phy_identifier 2122128
    receive_error_count 0
    receiver_not_ok_count 0
    snaptime 4933244.44945337

    ... note "phy_ext_status" (but I assume other things
    like "gbit_status" and "mii_status" are probably
    interesting too, if you have the secret sauce recipe
    for what I gather are probably bitmasks).
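On the bitmask question: the two mii_status values above differ by exactly 4 (31053 = 0x794D on bge0, 31049 = 0x7949 on bge1), i.e. only bit 2 differs. In the standard MII Basic Mode Status Register (IEEE 802.3 clause 22, register 1), bit 2 is the Link Status bit. The following is a minimal sketch of decoding it, assuming (this is not confirmed anywhere in this thread) that the bge "mii_status" kstat is the raw register value. The macro and function names are made up for illustration and are not part of Net-SNMP or the bge driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Link Status bit of the MII Basic Mode Status Register
 * (IEEE 802.3 clause 22, register 1, bit 2). */
#define MII_BMSR_LINK_STATUS 0x0004u

/* Hypothetical helper: returns true if the link-status bit is set.
 * Assumption: mii_status holds the unmodified BMSR contents as
 * published by the driver's "phydata" kstat. */
bool mii_link_is_up(uint32_t mii_status)
{
    return (mii_status & MII_BMSR_LINK_STATUS) != 0;
}
```

Under that assumption, the bge0 value (31053) decodes as link up and the bge1 value (31049) as link down, which matches the ndd link_status output in the original report.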

     
  • Nobody/Anonymous

    Logged In: NO

    Have just had a look at a dmfe* system, and
    we have similar statistics to bge...

    module: dmfe instance: 0
    name: dmfe0 class: net

        align_errors                    21587
        blocked                         0
        brdcstrcv                       1147571
        brdcstxmt                       38059
        carrier_errors                  24
        collisions                      1
        crtime                          65.611272904
        defer_xmts                      6
        ex_collisions                   0
        fcs_errors                      43402
        ierrors                         43402
        ifspeed                         100000000
        intr                            1534104443
        ipackets                        1534806959
        ipackets64                      1534806959
        media                           PHY/MII
        missed                          0
        multircv                        26
        multixmt                        0
        norcvbuf                        0
        noxmtbuf                        0
        obytes                          4025452449
        obytes64                        145759373217
        oerrors                         28
        oflo                            0
        opackets                        922746681
        opackets64                      922746681
        promisc                         off
        rbytes                          2745215148
        rbytes64                        659875211436
        rcv_badinterp                   0
        runt_errors                     0
        snaptime                        91326175.5382303
        tx_late_collisions              4
        uflo                            0
        unknowns                        0
        xmt_badinterp                   0
        xmtretry                        13
    

    ... (link is up here - I don't have a system I can
    readily plumb a dead interface into just to test
    this) but here we don't have :phydata: at all.

     
  • Nobody/Anonymous

    Logged In: NO

    fyi: here's dmfe on an ifconfig up, but datalink-down
    (cable pulled) interface...

    module: dmfe instance: 0
    name: dmfe0 class: net

        align_errors                    0
        blocked                         0
        brdcstrcv                       869
        brdcstxmt                       12
        carrier_errors                  1
        collisions                      0
        crtime                          65.217458875
        defer_xmts                      0
        ex_collisions                   0
        fcs_errors                      0
        ierrors                         0
        ifspeed                         0
        intr                            2179
        ipackets                        2176
        ipackets64                      2176
        media                           PHY/MII
        missed                          0
        multircv                        1106
        multixmt                        0
        norcvbuf                        0
        noxmtbuf                        0
        obytes                          504
        obytes64                        504
        oerrors                         1
        oflo                            0
        opackets                        12
        opackets64                      12
        promisc                         off
        rbytes                          142060
        rbytes64                        142060
        rcv_badinterp                   0
        runt_errors                     0
        snaptime                        4735.906873021
        tx_late_collisions              0
        uflo                            0
        unknowns                        0
        xmt_badinterp                   0
        xmtretry                        4
    
     
  • Anders Persson

    Anders Persson - 2007-10-31

    Logged In: YES
    user_id=1557771
    Originator: NO

    So for these cases, when the driver does not provide an explicit variable for state, the best we can do is to rely on heuristics. I have gone through a few drivers, and so far they all implement ifspeed correctly. So on NICs that do not provide "link_up", we could use "ifspeed == 0" as an indication of OperStatus being "down". However, as always with these things I am a little concerned about false negatives...
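The heuristic described in this comment could be sketched roughly as follows. This is an illustration only, not the code from the actual patch: the struct, enum, and function names are hypothetical, and the kstat values are modelled as plain fields so the logic can be read without libkstat. The numeric values of up and down are the ifOperStatus values defined in IF-MIB.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of the per-interface kstat values discussed above. */
struct if_link_stats {
    bool     has_link_up;  /* driver publishes a "link_up" statistic */
    uint32_t link_up;      /* value of "link_up" when present */
    uint64_t ifspeed;      /* value of "ifspeed" in bits per second */
};

/* ifOperStatus values as defined in IF-MIB: up(1), down(2). */
enum oper_status { OPER_UP = 1, OPER_DOWN = 2 };

/* Prefer the explicit "link_up" statistic; otherwise fall back to
 * treating ifspeed == 0 as "down". */
enum oper_status guess_oper_status(const struct if_link_stats *ks)
{
    if (ks->has_link_up)
        return ks->link_up ? OPER_UP : OPER_DOWN;
    return ks->ifspeed == 0 ? OPER_DOWN : OPER_UP;
}
```

With the bge figures posted earlier (no link_up statistic, ifspeed 100000000 on bge0 and 0 on bge1), this sketch yields up for bge0 and down for bge1. The false-negative concern applies to any interface whose speed reads as 0 for a reason other than loss of link.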

     
  • Nobody/Anonymous

    Logged In: NO

    Hmm, that's busted it for the loopback interface...

    ifName :
    ifMIB.ifMIBObjects.ifXTable.ifXEntry.ifName.1 : DISPLAY STRING- (ascii): lo0
    ifMIB.ifMIBObjects.ifXTable.ifXEntry.ifName.2 : DISPLAY STRING- (ascii): bge0
    ifMIB.ifMIBObjects.ifXTable.ifXEntry.ifName.3 : DISPLAY STRING- (ascii): bge1

    ifOperStatus :
    interfaces.ifTable.ifEntry.ifOperStatus.1 : INTEGER: down
    interfaces.ifTable.ifEntry.ifOperStatus.2 : INTEGER: up
    interfaces.ifTable.ifEntry.ifOperStatus.3 : INTEGER: down

    ... which obviously doesn't even have an ifspeed property...

    # kstat lo
    module: lo instance: 0
    name: lo0 class: net
    crtime 53.492251364
    ipackets 1181553
    opackets 1181553
    snaptime 5692062.23888096

    ... so to my untrained eye, the correct approach would be
    to discriminate between "no ifspeed/ ifSpeed property at
    all" and "ifspeed/ ifSpeed == 0" in
    kernel_sunos5.c:set_if_info() if we're going to then feed
    this as an input to our new heuristic?

    As it happens my management suite doesn't care about
    ifOperStatus on ifType==softwareLoopback interfaces
    anyway - but I bet someone's does. :)

    Thoughts? Thanks once again for getting us this far...
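The distinction proposed in this comment, between "no ifspeed statistic at all" and "ifspeed == 0", could look something like the sketch below. It is a hedged illustration, not the contents of any patch referenced in this ticket and not the real set_if_info() code: the struct, enum, and function names are hypothetical, and the kstat lookups are modelled as presence flags plus values.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical record of which link-related kstats a driver exposes. */
struct if_link_info {
    bool     has_link_up;  /* "link_up" statistic exists */
    uint32_t link_up;      /* its value, when present */
    bool     has_ifspeed;  /* "ifspeed" statistic exists */
    uint64_t ifspeed;      /* its value, when present */
};

/* ifOperStatus values as defined in IF-MIB: up(1), down(2). */
enum if_oper_status { IF_OPER_UP = 1, IF_OPER_DOWN = 2 };

enum if_oper_status oper_status_with_presence(const struct if_link_info *li)
{
    /* An explicit link flag is the most trustworthy signal. */
    if (li->has_link_up)
        return li->link_up ? IF_OPER_UP : IF_OPER_DOWN;

    /* Only a *reported* speed of zero counts as evidence of no link. */
    if (li->has_ifspeed && li->ifspeed == 0)
        return IF_OPER_DOWN;

    /* No evidence either way (e.g. lo0, which has neither statistic):
     * report "up", as the agent did before the heuristic existed. */
    return IF_OPER_UP;
}
```

With the kstat output shown in this thread, lo0 (no ifspeed) stays up, bge0 (ifspeed 100000000) is up, and bge1 (ifspeed 0) is down.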

     
  • Anders Persson

    Anders Persson - 2007-11-08

    Logged In: YES
    user_id=1557771
    Originator: NO

    Thanks for testing. I have created a new patch that addresses that problem. You can grab it from the same place as before. Please test it if you can.

     
  • Thomas Anders

    Thomas Anders - 2007-11-08

    Logged In: YES
    user_id=848638
    Originator: NO

    Fixed by patch 1824196 applied in SVN Rev 16736.

     
  • Thomas Anders

    Thomas Anders - 2007-11-08

    Logged In: YES
    user_id=848638
    Originator: NO

    Thanks for the bug report!
    We've fixed the problem in the 5.4.x code branch
    and the main development tree, so it should be
    fixed in future releases of the Net-SNMP package.

     
  • Nobody/Anonymous

    Logged In: NO

    Confirming patch works as expected.

     
