#1767 Sensors not disabled when switch is down

Milestone: Future
Status: closed-invalid
Priority: 5
Updated: 2013-10-31
Created: 2012-09-25
Creator: dr_mohan
Private: No

Problem Description: with BladeSystem firmware 3.56:
While a switch is down or booting, the OA command 'show interconnect status #' shows "Device Failure Failed", yet hpitree doesn't show the corresponding sensor as disabled. The alarms for the down switch DON'T appear unless the OA is restarted. Only if the switch is kept down AND the OA is restarted do these alarms appear:

Discussion

  • dr_mohan

    dr_mohan - 2012-09-26

    Steps to reproduce the case:
    1) Boot the switch opposite the active OA (e.g., if OA1 is active, boot switch 2).
    2) Since it boots quickly, press 0 at the Boot Profiles prompt to keep it from coming up, so there is time to collect data.
    3) verify on the OA that the interconnect status doesn't match reality and/or that hpitree doesn't

    Switch 2 is down. The OA shows the interconnect as Failed, with "Device Failure" and "Health LED", but the hpitree output suggests that only the Health LED is being reported.

     
  • dr_mohan

    dr_mohan - 2012-09-26

    Analysis

    On OA 3.55, show interconnect status 2 gives different output depending on whether the switch is powered on or off. The two outputs follow, powered on first and powered off second:
    Interconnect Module #2 Status:
    Status : OK
    Thermal: OK
    CPU Fault: OK
    Health LED: OK
    UID: Off
    Powered: On
    Diagnostic Status:
    Internal Data OK
    Management Processor OK
    Thermal Warning OK
    Thermal Danger OK
    I/O Configuration OK
    Power OK
    Device Failure OK
    Device Degraded OK
    ==========================
    Interconnect Module #2 Status:
    Status : OK
    Thermal: OK
    CPU Fault: OK
    Health LED: OK
    UID: Off
    Powered: Off
    Diagnostic Status:
    Internal Data OK
    Management Processor OK
    Thermal Warning OK
    Thermal Danger OK
    I/O Configuration OK
    Power OK

    Effectively, three fields differ:
    The Powered: field changes from On to Off.
    Device Failure OK and Device Degraded OK are removed.
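
The comparison above can be reproduced mechanically. The following is a minimal sketch, not part of OpenHPI or the OA tooling: the helper names parse_status and diff_status are made up for illustration, and the parser assumes the output keeps the layout quoted in this ticket (header lines end in a colon, summary lines use "name: value", diagnostic lines end in a one-word value).

```python
# Sample outputs copied from this ticket (OA 3.55, interconnect 2).
POWERED_ON = """\
Interconnect Module #2 Status:
Status : OK
Thermal: OK
CPU Fault: OK
Health LED: OK
UID: Off
Powered: On
Diagnostic Status:
Internal Data OK
Management Processor OK
Thermal Warning OK
Thermal Danger OK
I/O Configuration OK
Power OK
Device Failure OK
Device Degraded OK
"""

POWERED_OFF = """\
Interconnect Module #2 Status:
Status : OK
Thermal: OK
CPU Fault: OK
Health LED: OK
UID: Off
Powered: Off
Diagnostic Status:
Internal Data OK
Management Processor OK
Thermal Warning OK
Thermal Danger OK
I/O Configuration OK
Power OK
"""


def parse_status(text):
    """Parse the quoted output layout into a {field: value} dict (hypothetical helper)."""
    fields = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if ":" in line:
            # Summary line such as "Powered: On"; headers have nothing after the colon.
            key, _, value = line.partition(":")
            if value.strip():
                fields[key.strip()] = value.strip()
        else:
            # Diagnostic line such as "Device Failure OK": the last word is the value.
            key, _, value = line.rpartition(" ")
            if key:
                fields[key.strip()] = value
    return fields


def diff_status(before, after):
    """Return {field: (before_value, after_value)} for fields that differ; None means absent."""
    changes = {}
    for key in sorted(set(before) | set(after)):
        old, new = before.get(key), after.get(key)
        if old != new:
            changes[key] = (old, new)
    return changes


changes = diff_status(parse_status(POWERED_ON), parse_status(POWERED_OFF))
for field, (old, new) in changes.items():
    print(f"{field}: {old} -> {new}")
```

For the two samples it reports exactly the three differences listed above: Powered changes, and the two Device fields are absent in the powered-off output.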

    On OpenHPI-2.16.0 only the POWER_STATE is set to Off but the two sensors Health status operational (28) and Health status predictive failure (29) are left in the ENABLED state.

    On OpenHPI-3.2.0 these sensors are disabled.
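
To make the contrast between the two releases concrete, here is a toy model of the behaviour described above. It is not OpenHPI code: the function name and the disable_on_power_off flag are invented for illustration, and only the sensor numbers and names come from this ticket.

```python
# Sensor numbers and names as quoted in this ticket.
HEALTH_SENSORS = {
    28: "Health status operational",
    29: "Health status predictive failure",
}


def sensor_enable_states(powered_on, disable_on_power_off):
    """Return {sensor_number: enabled} for the two health sensors.

    disable_on_power_off=False models the behaviour reported for
    OpenHPI-2.16.0 (sensors stay ENABLED when the switch is powered off);
    True models the behaviour reported for OpenHPI-3.2.0.
    """
    enabled = powered_on or not disable_on_power_off
    return {number: enabled for number in HEALTH_SENSORS}


# Switch powered off, as in the second output quoted in this comment.
states_2_16_0 = sensor_enable_states(powered_on=False, disable_on_power_off=False)
states_3_2_0 = sensor_enable_states(powered_on=False, disable_on_power_off=True)

for number, name in HEALTH_SENSORS.items():
    print(f"{name} ({number}): 2.16.0 enabled={states_2_16_0[number]}, "
          f"3.2.0 enabled={states_3_2_0[number]}")
```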

     
  • dr_mohan

    dr_mohan - 2012-09-27

    There are cases where
    show interconnect status x
    shows
    ============
    Powered: Off
    but
    Device Failure OK
    Device Degraded OK
    ================
    Not sure why this is the case. In these situations, too, the hpitree output matches the show interconnect status output.

     
  • dr_mohan

    dr_mohan - 2012-10-18

    This looks like an Onboard Administrator firmware (OA FW) problem. It has been addressed by changes in the OA, and a new OA version (> OA 3.60) will fix it, so I am closing this for now. It can be reopened later if needed.

     
  • dr_mohan

    dr_mohan - 2012-10-18
    • status: open --> closed-invalid
     
  • dr_mohan

    dr_mohan - 2013-10-31
    • 3.4.0: 3.3.x --> Future
     
