#1381 OpenHpi fails on call to saHpiIdrAreaHeaderGet()

Milestone: 2.13.0
Status: closed-fixed
Priority: 5
Updated: 2008-08-29
Created: 2008-07-24
Private: No

This defect report concerns the OpenSAF HiSV service, a program that makes calls into OpenHPI. For reference, I'm running OpenHPI 2.11.1 on c-Class with the OA Soap plugin. The OpenSAF version is 1.2.1.

HiSV makes calls into OpenHPI whenever a server blade is powered up, powered down, inserted, or removed. When an existing blade is powered up, powered down, or removed, HiSV does not run into any issues with the OpenHPI calls it makes. Problems occur only when a server blade is inserted.

When a server blade is inserted into the c-Class enclosure, two events are generated, about a minute apart.

The first event occurs as the new blade goes from the "NOT-PRESENT" state to the "INSERTION-PENDING" state. The second event occurs about a minute later as the blade goes from the "INSERTION-PENDING" state to the "ACTIVE" state.
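The two transitions can be told apart by comparing the previous and current hot swap state carried in each event. A minimal sketch, assuming a hypothetical event record: the enum and struct below are local stand-ins for the states named in this report, not the types defined in SaHpi.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Local stand-ins for the three states named in this report.
 * These are NOT the SaHpiHsStateT values from SaHpi.h. */
typedef enum {
    HS_NOT_PRESENT,
    HS_INSERTION_PENDING,
    HS_ACTIVE
} hs_state_t;

/* Hypothetical event record: the state before and after the transition. */
typedef struct {
    hs_state_t previous;
    hs_state_t current;
} hs_event_t;

/* First insertion event: NOT-PRESENT -> INSERTION-PENDING. */
static bool is_first_insertion_event(const hs_event_t *ev)
{
    assert(ev != NULL);
    return ev->previous == HS_NOT_PRESENT &&
           ev->current == HS_INSERTION_PENDING;
}

/* Second insertion event, about a minute later:
 * INSERTION-PENDING -> ACTIVE. */
static bool is_second_insertion_event(const hs_event_t *ev)
{
    assert(ev != NULL);
    return ev->previous == HS_INSERTION_PENDING &&
           ev->current == HS_ACTIVE;
}
```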

HiSV makes OpenHPI calls on the new resource after receiving the first (INSERTION-PENDING) event, and some of these calls fail.

The first OpenHPI call that fails is:
saHpiHotSwapPolicyCancel()
which fails with a return value of:
-1002 (SA_ERR_HPI_UNSUPPORTED_API)

OpenHPI should never return this value on a managed hotswap device.

The second OpenHPI call that fails is:
saHpiIdrAreaHeaderGet()
which fails with a return value of:
-1011 (SA_ERR_HPI_NOT_PRESENT)

Note that if I add a sleep of 60 seconds in HiSV after receiving the first (INSERTION-PENDING) event, but prior to making these OpenHPI calls, then the calls do succeed. This implies some kind of race condition in OpenHPI.

Also note that when powering up an existing blade, the same two events are generated and HiSV makes these same OpenHPI calls on the powered-up resource, but no failures occur.

I suspect that on blade insertion, the OpenHPI data structures do not yet have full integrity at the time the first event (INSERTION-PENDING) is generated, and this causes the above OpenHPI calls that HiSV is making to fail. The OA Soap plugin should perhaps force a re-discovery of the blade prior to sending out the first event, so that the internal OpenHPI data structures will have full integrity if a client application (such as HiSV) starts making OpenHPI calls on the new resource after receiving the first (INSERTION-PENDING) event.
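As a client-side stopgap, the fixed 60-second sleep could be replaced by a bounded retry that stops as soon as the call succeeds. The following is only a sketch under stated assumptions, not HiSV or OpenHPI code: `hpi_call_fn`, `wait_fn`, `retry_while_not_present`, `fake_blade`, `fake_idr_call`, and `no_wait` are hypothetical names, and the simulated call merely stands in for a request such as saHpiIdrAreaHeaderGet(). It retries only on the -1011 value quoted above and does not address the plugin-side race itself.

```c
#include <assert.h>
#include <stddef.h>

/* Return values: 0 is SA_OK, -1011 is the value quoted in this report.
 * A real client would take both from SaHpi.h. */
#define RC_OK          0
#define RC_NOT_PRESENT (-1011)  /* SA_ERR_HPI_NOT_PRESENT */

/* Hypothetical callback types: `call` wraps one OpenHPI request on the
 * new resource; `wait` pauses for the given number of seconds. */
typedef int (*hpi_call_fn)(void *ctx);
typedef void (*wait_fn)(unsigned seconds);

/* Retry `call` while it reports NOT_PRESENT, pausing `interval_s` seconds
 * between attempts, for at most `max_attempts` attempts.
 * Returns the result of the last attempt. */
static int retry_while_not_present(hpi_call_fn call, void *ctx,
                                   wait_fn wait, unsigned interval_s,
                                   unsigned max_attempts)
{
    assert(call != NULL && wait != NULL && max_attempts > 0);

    int rc = RC_NOT_PRESENT;
    for (unsigned attempt = 0; attempt < max_attempts; attempt++) {
        rc = call(ctx);
        if (rc != RC_NOT_PRESENT)
            return rc;          /* success, or an error a retry won't fix */
        if (attempt + 1 < max_attempts)
            wait(interval_s);   /* no pause after the final attempt */
    }
    return rc;
}

/* Simulated resource for illustration: the call fails with NOT_PRESENT
 * `failures_left` times, then succeeds. */
struct fake_blade {
    unsigned failures_left;
    unsigned calls;
};

static int fake_idr_call(void *ctx)
{
    struct fake_blade *blade = ctx;
    blade->calls++;
    if (blade->failures_left > 0) {
        blade->failures_left--;
        return RC_NOT_PRESENT;
    }
    return RC_OK;
}

/* Stand-in for a real sleep so the sketch runs instantly. */
static void no_wait(unsigned seconds)
{
    (void)seconds;
}
```

With a real sleep function and, say, 12 attempts at 5-second intervals, the worst-case delay stays close to the 60-second workaround, while a blade whose data is ready sooner is handled without waiting the full minute.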

The same applies to switch interconnects (on c-Class), as they also support the managed hotswap model. So a solution to the problem described here for server blades should also be applied to switch interconnects.

It is important that these defects are fixed; otherwise, OpenSAF/HiSV will not work properly on c-Class enclosures.

Discussion

  • peter dinh phan

    peter dinh phan - 2008-08-07
    • milestone: 761458 --> 2.13.0
     
  • Bryan Sutula

    Bryan Sutula - 2008-08-29

    Note that this tracker issue really addresses two problems:

    1) Returning SA_ERR_HPI_UNSUPPORTED_API
    * Duplicate of: [ 2021749 ] Correct plugin return values for [In]ActiveSet, PolicyCancel
    * Fixed in commit 6852

    2) The timing of the insertion event
    * Fixed in commit 6871

    This entire issue is now closed. Requesting (as much as possible) that defect reports don't combine unrelated problems.

     
  • Bryan Sutula

    Bryan Sutula - 2008-08-29
    • status: open --> closed-fixed