#1768 PSU alarms lost if the active OA restarts

3.2.1
closed-fixed
dr_mohan
5
2013-06-20
2012-09-27
dr_mohan
No

Power supplies 5 and 6 are not hooked up. The alarms can still be lost if the OA is rebooted or fails over. Does openhpi report sensor state for PSUs?

Discussion

  • dr_mohan
    dr_mohan
    2012-10-01

    This Interconnect health LED Failed status is being reported by the OA. From the data I am given, I would expect someone to be able to tell me why the OA is reporting this state; otherwise it is a problem with either the OA or the switch, not openhpid.

    The data I'm including is a very simple case of the Interconnect health LED sensor showing Failed and not clearing. All I did was reboot the switch, press 0 to put it at the recovery prompt, and after a few minutes quit that and let the switch finish booting. After the switch came back up the Interconnect health LED continued to show Failed and hpitree continues to show it as DISABLED. This data set also includes verbose openhpid data. Why is it Failed when the switch is up?

     
  • dr_mohan
    dr_mohan
    2012-10-01

    The previous comment does not belong to this issue.

     
  • dr_mohan
    dr_mohan
    2012-10-01

    If you fail (pull?) a PSU (say PSU 6), you get the following
    sensor events:
    - Chassis predictive failure
    - Power Management 1 Redundancy lost
    - PM 1 predictive failure
    - PM 1 PSU slot 6 operational status failure
    - PM 1 PSU 6 predictive failure
    - PM 1 PSU 6 device failure
    - PM 1 PSU 6 AC failure

    When the client to openhpid is restarted and I do a re-discovery and query all existing sensors, I get errors attempting to read the sensors for the failed PSU (the same sensors that generated sensor events when the PSU failed). Only the first 3 sensors (at the chassis level) come back.
    The last 4 (PSU-specific) sensor events are lost because openhpid can no longer read the sensors from the OA.
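
    The symptom described above can be sketched as a small model. This is only an illustration, not the OpenHPI API: `read_sensor` and `lost_alarms` are hypothetical names, and the rule that every PSU-specific sensor read fails is an assumption that mirrors the behaviour reported here.

```python
# Sensor events reported when PSU 6 fails (names taken from the list above).
EVENTS_BEFORE_RESTART = [
    "Chassis predictive failure",
    "Power Management 1 Redundancy lost",
    "PM 1 predictive failure",
    "PM 1 PSU slot 6 operational status failure",
    "PM 1 PSU 6 predictive failure",
    "PM 1 PSU 6 device failure",
    "PM 1 PSU 6 AC failure",
]


def read_sensor(name):
    """Hypothetical stand-in for a sensor read after re-discovery.

    Assumption: models the reported behaviour, where PSU-specific
    sensors can no longer be read and chassis-level sensors still can.
    """
    if "PSU" in name:
        raise IOError(f"cannot read sensor: {name}")
    return "asserted"


def lost_alarms(events):
    """Return the events whose sensors cannot be re-read after a restart."""
    lost = []
    for name in events:
        try:
            read_sensor(name)
        except IOError:
            lost.append(name)
    return lost


LOST = lost_alarms(EVENTS_BEFORE_RESTART)
for name in LOST:
    print("lost:", name)
```

    Under that assumption, the alarms that a restarted client cannot recover are exactly the ones whose sensors fail to read back, which is why comparing the event list against a fresh sensor query exposes the gap.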

     
  • dr_mohan
    dr_mohan
    2012-10-02

    Tested with OpenHPI-2.16.0 and OpenHPI-3.2.0.
    We did not get any events when the power supply was pulled, but got 7 events when the power cable was pulled:
    six events (Chassis predictive failure, PM 1 predictive failure, PM 1 PSU slot 6 operational status failure, PM 1 PSU 6 predictive failure, PM 1 PSU 6 device failure, PM 1 PSU 6 AC failure) plus a Power Management 1 predictive failure (instead of redundancy lost).
    Obviously there is a setup difference. Did an OA failover or something else happen during this time? Did the power supply come back up between the two runs of the HPI client? Need to try other paths.

     
  • dr_mohan
    dr_mohan
    2012-10-18

    • status: open --> closed-fixed
     
  • dr_mohan
    dr_mohan
    2012-10-18

    Fixed in checkin #7518.

     
  • dr_mohan
    dr_mohan
    2013-06-20

    • Group: 3.3.x --> 3.2.1
     
  • Tariq Shureih
    Tariq Shureih
    2013-06-20

    **ATTENTION**
    This account is disabled and is no longer accessed by the recipient.
    Please remove it from your address book.

    Thanks