#1348 Invalid handling of ATCA Led Controls in Manual Mode

2.11.3
closed-fixed
5
2008-06-16
2008-03-31
Jeff Clyne
No

If an ATCA LED is in manual mode, the ipmidirect plugin will revert it to auto mode when creating an RDR for the control. In the method cIpmiMcVendor::CreateControlAtcaLed, there is a comparison against the the third bit which results in led_default_local_color being set to 0. The third bit is the lamp test indicator. It should actually be checking the second bit to test for override mode.

Line 1035

if ( (ledrsp.m_data[2] & 0x01) == 0 led_default_local_color = 0;

should be:

if ( (ledrsp.m_data[1] & 0x01) == 0 led_default_local_color = 0;

We are using an uTCA chassis with a coredge MCH. This MCH puts all the AMC BLUE Leds into manual mode immediately. The openhpi daemon was setting them back to auto mode once they where discovered, resulting in the BLUE Led's being fully lit.

Discussion

  • Pierre Sangouard

    Logged In: YES
    user_id=944149
    Originator: NO

    This must be a different issue that you should investigate further. What the plugin does is test bit 0 of byte 3 (byte number here is according to PICMG 3.0 spec where byte number starts at 1) of the Get FRU LED State command response which indicates if the IPMC has Local Control over the LED. If it is the case, led_default_local_color will be != 0 and Local Control will be restored in cIpmiControlAtcaLed::CreateRdr().

     
  • Jeff Clyne

    Jeff Clyne - 2008-04-01

    Logged In: YES
    user_id=2050929
    Originator: YES

    I'm sorry, I was misinterpreting the above code to be checking the bitmask.
    The issue I am seeing occurs if byte 3 (ledrsp.m_data[2]) has a value of 0x03, both bit 0 and 1 are set, the LED reverts to Local Control. If an LED has a local control state that is currently overridden and in a manual state, both bits will be set. The PICMG 3.0 spec defines the priority of these states. In my case, the MCH takes control of the Blue LEDs on all the AMCs when they are inserted.

    The check should probably be something like:
    if ( ((ledrsp.m_data[2] & 0x01) == 0) || (ledrsp.m_data[2] & 0x02) )
    led_default_local_color = 0;

     
  • Renier Morales

    Renier Morales - 2008-06-09
    • milestone: 774587 --> 831933
    • assigned_to: psangouard --> shuah
     
  • Jeff Clyne

    Jeff Clyne - 2008-06-09
    • assigned_to: shuah --> psangouard
    • milestone: 831933 --> 774587
     
  • Jeff Clyne

    Jeff Clyne - 2008-06-09

    Logged In: YES
    user_id=2050929
    Originator: YES

    UPDATE:
    The issue is a race condition when creating the RDR for atca LED’s in ipmi_control_atca_led.h/cpp. The RDR defines a control mode(Auto or Manual) and a control mode read-only flag. The implementation also defines internal flags for whether an override state (auto to manual) or lamp test state are supported. The issue is that the only way to determine override support is by the return code, meaning the state is changed and then reverted, via the Set FRU Led State command. Any intentional state changes in that time are “lost”.

    The changes I am proposing rely on a the LED state definitions on page 115 of the HPI-to-AdvancedTCA Mapping specification. It states that the Control Mode Read-Only applies to the SetFruLedState IPMI command to support both a lamp test or override state. There is no real need to identify individual states, just that the command is supported. Therefore, the patch uses only the lamp test to set the Control Mode Read-Only flag. This solves the problem of the incorrectly modifying the state since the lamp test is only temporary.

    This is also consistent with the ATCAr.3.0 spec that also defines a control's support for the Set FRU LED State command, not individual lamp test or override states. It is still true that an LED may support lamp test and not override status, but that will be dealt with by a proper return code. The Read-Only status in the RDR only deals with support of the SetFruLedState command.

    Also, since the HPI-to-AdvancedTCA spec states that the Default Control Mode set to SAHPI_CTRL_MODE_MANUAL and the Control Mode Read-Only status set to SAHPI_FALSE is not valid, I set the Control Mode Read-Only status SAHPI_TRUE without issuing the lamp test if the default mode is manual.

    File Added: ipmi_control_atca_led.cpp.diff

     
  • Jeff Clyne

    Jeff Clyne - 2008-06-09

    Logged In: YES
    user_id=2050929
    Originator: YES

    File Added: ipmi_control_atca_led.h.diff

     
  • Shuah Khan

    Shuah Khan - 2008-06-10

    Logged In: YES
    user_id=1884926
    Originator: NO

    Reviewed the patch. A few questions on the patch. First the race condition itself and your comment on "It is still true that an LED may support lamp test and not override status, but that will be dealt with by a proper return code."

    Could you please describe the race condition in more detail. I can see there is a race potential as the create RDR code is changing the status by sending messages to set the LED into override state and lamp test state. However, I am not clear on how it manifests. If you could go over scenarios such as:

    1. LED hardware in override mode when discovery happens
    2. LED hardware in local control mode when discovery happens
    3. LED hardware in lamp test mode when discovery happens

    I think I understand the rationale for removing override test and lamp test when the LED is in Manual mode i.e when m_led_default_local_color = 0, however it is not clear why m_set_led_state_supported gets set to true. This should be set to false in this case based on the HPI-to-AdvancedTCA Mappinf Spec:
    "If the Default Control Mode is set to SAHPI_CTRL_MODE_MANUAL and the Control Mode Read-Only
    status is set to SAHPI_TRUE, this indicates that there is no AdvancedTCA® Local Control support for
    that LED."

    as m_set_led_state_supported tracks Set FRU LED State support for override or lamp test. From the description of your specific hardware, it sounds like it is possible to have hw in a manual state even though it supports Set FRU LED State, but I don't think it would correct to set this flag without testing for override or lamp test case.

    Also the AUTO case is not very clear. In the Auto case , the patch removes override test and keeps lamp test and sets m_set_led_state_supported to true when lamp test is supported. This is fine. However this code will not the handle the case when override test is supported and lamp test is not as it doesn't test for override case.

    There is spacing issue in the conditional:

    if ( m_led_default_local_color == 0 )

    I fixed it the code now reads as follows:

    if ( m_led_default_local_color == 0 )
    {
    rec.DefaultMode.Mode = SAHPI_CTRL_MODE_MANUAL;
    rec.DefaultMode.ReadOnly = SAHPI_TRUE;
    m_set_led_state_supported = true;
    oem_rec.ConfigData[1] = 0;
    }
    else
    {
    rec.DefaultMode.Mode = SAHPI_CTRL_MODE_AUTO;

    // Check if Set FRU LED State is supported by
    // issuing a lamp test
    ledmsg.m_data[3] = 0xFB;
    ledmsg.m_data[4] = 0x01;
    ledmsg.m_data[5] = 0x0F;

    rv = Resource()->SendCommand( ledmsg, ledrsp );

    if ( rv != SA_OK
    || ledrsp.m_data_len < 2
    || ledrsp.m_data[0] != eIpmiCcOk
    || ledrsp.m_data[1] != dIpmiPicMgId )
    {
    m_set_led_state_supported = false;
    rec.DefaultMode.ReadOnly = SAHPI_TRUE;
    oem_rec.ConfigData[2] = 0;
    }
    else
    {
    m_set_led_state_supported = true;
    rec.DefaultMode.ReadOnly = SAHPI_FALSE;
    }
    }

     
  • Shuah Khan

    Shuah Khan - 2008-06-11

    Logged In: YES
    user_id=1884926
    Originator: NO

    Email update from jclyne.

    To review the problem I am having, after discovery and subsequent RDR creation, the states of all the Blud LEDs in my microTCA chassis where incorrect. The MCH in our chassis enables the override mode on all the Blue LEDs in order to provide uniform operation for device hot-swapping. After openhpi discovery, all the Blue LEDs are illuminated (incorrectly) and in incorrect states(local control).

    The reason became obvious in the original code. There was a check for the existence of the local control state, as indicated by the Get FRU LED state command which sets m_led_default_local_color, but no check for whether the override state was currently enabled. The subsequent logic to check for the support of the manual override mode when an LED has a local control state was to use the Set FRU LED state command to enable override, check the completion code and then revert it back to local control. If an LED has a local control state and is currently in override when discovery occurs, the RDR creation code will set it back to local(incorrectly) when it reverts. The original fix I propsed was to check both the existence of a local control state and a whether an override is currently enabled. The problem here is that this still suffers from the potential race of 1. Check override, not enabled 2. Set override mode in RDR creation 3. MCH enables override mode 4. RDR creation code reverts back to local(incorrectly). Without being able to lock the override state, there is no atomic and therefore 'safe' way to check for support of the override state. The PICMG 3.0 standard ATCA base spec says in REQ 3.222 "Managed FRUs that do not support the Set FRU LED State command for control fo the BLUE LED SHALL return the "Invalid data field in Request (CCH) completion code", but defines to flag or bit indicating support of the state.

    The lamp test, however, does not present a race in that it is a high-priority temporary state that will revert automatically. In the above example, 1. Check for local control state is true 2. Issue lamp test 3. MCH sets override mode 4. Lamp test expires, reverts to override mode, the LED state is correct after RDR creation. Therefore, checking for support of a Lamp Test state is not an issue.

    The conclusion seems to be that there is an issue with ATCA-HPI mapping in that Req 4.5.7.7 says that an LED with a local control state and either an Override or Lamp test state should set the Control Mode Read Only status to SAHPI_FALSE, however there is no 'safe' and way to check for support of the Override State. Therefore is no obvious "safe" way to implement the spec as stated (unless I am missing something), only best alternatives. The two I devised are:

    1. Set the Control Mode Read Only status to SAHPI_FALSE automatically if the LED has a local control state. This will allow for issuing the Set FRU LED state commands, however the unsupported states (Override and/or Lamp Test) will return the specified Completion Code of CCh(invalid date field), as stated in the ATCA base spec. This is what I meant by letting the completion codes take care of it.

    2. Since the lamp test is safe, use it's completion code to set the Control Mode Read Only status. This obviously works if both the lamp test and override mode are supported. If only the lamp test is supported, the Control Mode Read Only status will get set to True allowing attempts at enabling the override state, resulting in the Completion Code of CCh(invalid date field) as stated above. If only the override state is supported, it will incorrectly block attempts to enable it, however this is by far the least likely case. Rarely will a device support the IPMI command, have both a local control and override state and not support a simple lamp test. This is the alternative I suggested in my patch.

    Finally, the m_set_led_state_supported getting set to true in the manual case is a typo... sorry for that, it was a cut and paste error from the original code.

    Jeff Clyne

     
  • Shuah Khan

    Shuah Khan - 2008-06-11

    Logged In: YES
    user_id=1884926
    Originator: NO

    How about going with your alternative 1? I like that the best because it eliminates
    the problem when just override is supported. Any cons with this option? Am I missing
    something?

    1. Set the Control Mode Read Only status to SAHPI_FALSE automatically if
    the LED has a local control state. This will allow for issuing the Set FRU
    LED state commands, however the unsupported states (Override and/or Lamp
    Test) will return the specified Completion Code of CCh(invalid date
    field), as stated in the ATCA base spec. This is what I meant by letting
    the completion codes take care of it.

    The logic for override and local control mode will look as follows: No change to
    the override part - listing for completeness fixing the bug:

    if ( m_led_default_local_color == 0 )
    {
    rec.DefaultMode.Mode = SAHPI_CTRL_MODE_MANUAL;
    rec.DefaultMode.ReadOnly = SAHPI_TRUE;
    m_set_led_state_supported = false;
    oem_rec.ConfigData[1] = 0;
    }
    else
    {
    rec.DefaultMode.Mode = SAHPI_CTRL_MODE_AUTO;
    m_set_led_state_supported = true;
    rec.DefaultMode.ReadOnly = SAHPI_FALSE;
    }

     
  • Jeff Clyne

    Jeff Clyne - 2008-06-11

    Logged In: YES
    user_id=2050929
    Originator: YES

    The only con I see with going with alternative #1 would be that it essentially invalidates the Control Mode Read Only status for an LED with a local control state, which isn't really that big of a deal. Its already meaningless for LEDs without a local control state. I'm fine with taking that route, it will certainly fix the problem.

     
  • Shuah Khan

    Shuah Khan - 2008-06-11
    • assigned_to: psangouard --> shuah
     
  • Shuah Khan

    Shuah Khan - 2008-06-12

    Logged In: YES
    user_id=1884926
    Originator: NO

    jclyne will generate a new patch with alternative 1 and resend it.

     
  • Jeff Clyne

    Jeff Clyne - 2008-06-12

    Logged In: YES
    user_id=2050929
    Originator: YES

    File Added: ipmi_control_atca_led.cpp.diff

     
  • Jeff Clyne

    Jeff Clyne - 2008-06-12

    Logged In: YES
    user_id=2050929
    Originator: YES

    File Added: ipmi_control_atca_led.h.diff

     
  • Jeff Clyne

    Jeff Clyne - 2008-06-12

    Logged In: YES
    user_id=2050929
    Originator: YES

    Patches updated with changes described below. Tested in in Global Velocity's uTCA hardware.

     
  • Jeff Clyne

    Jeff Clyne - 2008-06-12

    Logged In: YES
    user_id=2050929
    Originator: YES

    File Added: ipmi_control_atca_led.cpp.diff

     
  • Shuah Khan

    Shuah Khan - 2008-06-16
    • milestone: 774587 --> 2.11.3
    • status: open --> open-fixed
     
  • Shuah Khan

    Shuah Khan - 2008-06-16
    • status: open-fixed --> closed-fixed
     

Log in to post a comment.

Get latest updates about Open Source Projects, Conferences and News.

Sign up for the SourceForge newsletter:





No, thanks