
#1757 BIOS sensors and events are not handled

Milestone: 3.2.1
Status: closed-fixed
Priority: 5
Updated: 2013-06-20
Created: 2012-08-29
Creator: Andy Cress
Private: No

On all Intel motherboards, the BMC logs various BIOS events into the SEL.
Some are informational, such as System Events that are logged normally during boot, but others report BIOS-detected faults related to CPU, memory, etc.
The ipmidirect plugin artificially prevents HandleEvents from passing up any BIOS-detected events.
It returns the message "remove event: system software event".
The BIOS fault events need to be handled.

Example Intel motherboards: S5520UR, S2600CO, S2400EP, S5000PAL, etc.
Sample command to simulate a BIOS Memory ECC error event for testing:
ipmiutil cmd 00 20 10 02 33 03 0c 02 6f a0 00 21
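
For readers unfamiliar with the raw bytes, here is a minimal sketch that decodes the sample command above. It assumes that the arguments to `ipmiutil cmd` are ordered bus, responder address, NetFn/LUN, command, then data bytes, and that the data bytes follow the IPMI Platform Event Message request layout for the system interface (Generator ID first). The function name and dictionary keys are hypothetical and chosen only for illustration.

```python
def decode_platform_event(hex_args):
    """Decode the byte string passed to `ipmiutil cmd` (assumed layout)."""
    b = [int(x, 16) for x in hex_args.split()]
    bus, rs_sa, netfn_lun, cmd = b[0:4]
    gen_id, evm_rev, sensor_type, sensor_num, dir_type, ed1, ed2, ed3 = b[4:12]
    return {
        "bus": bus,
        "responder_addr": rs_sa,
        "netfn": netfn_lun >> 2,            # upper six bits of the NetFn/LUN byte
        "lun": netfn_lun & 0x03,            # lower two bits
        "cmd": cmd,
        "generator_id": gen_id,
        "evm_rev": evm_rev,
        "sensor_type": sensor_type,
        "sensor_num": sensor_num,
        "deassertion": bool(dir_type & 0x80),  # bit 7: event direction
        "event_type": dir_type & 0x7F,         # bits 6:0: event/reading type
        "event_offset": ed1 & 0x0F,            # low nibble of Event Data 1
        "event_data": (ed1, ed2, ed3),
    }

fields = decode_platform_event("00 20 10 02 33 03 0c 02 6f a0 00 21")
for name, value in fields.items():
    print(name, value)
```

With the sample bytes, this gives NetFn 0x04 (Sensor/Event), command 0x02 (Platform Event Message), sensor type 0x0C (Memory), and event type 0x6F (sensor-specific). The Generator ID is 0x33 rather than the BMC address 0x20.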

In investigating this, I also see that SDRs that are owned by BIOS are not included in the RDRs.

Discussion

  • Anton Pak

    Anton Pak - 2012-08-29
    • milestone: 3074951 --> 3.3.x
     
  • Lars Wetzel

    Lars Wetzel - 2012-08-29

    Hi Andy,
    first of all: I don't have any IPMI hardware to debug it.
    Which means I have to guess and ask a little bit.
    Are we talking about an ATCA system?
    Can you check if the sensors have a valid IPMI SDR Entry?
    Does the Shelf Manager recognize these sensors?

    Regards
    Lars

     
  • Andy Cress

    Andy Cress - 2012-08-29

    This is not ATCA, and there is no shelf manager. It is an Intel server with a baseboard BMC. Yes, the IPMI sensors are fine and the IPMI events are fine; the problem is only with the openhpi events.
    The libipmi plugin does at least include the BIOS events as HPI events, but misinterprets them. The libipmidirect plugin artificially throws these events away.

     
  • Anton Pak

    Anton Pak - 2012-08-29

    IPMI Direct via system interface (SMI)?

     
  • Andy Cress

    Andy Cress - 2012-08-29

    Yes, ipmidirect with smi, although the interface wouldn't matter. This would be an issue with IPMI LAN (RMCP) also.

     
  • Anton Pak

    Anton Pak - 2012-08-29

    I guess those events come with an odd source address, right?
    The "remove event: system software event" message means that.
    I am not an IPMI expert, but I would expect every event to have SA = 0x20.
    Also, the entity in SDRs can be different from any MC/FRU/Entity Association entity.

     
  • Lars Wetzel

    Lars Wetzel - 2012-08-29

    The ipmi direct plugin is designed to fulfill the HPI-to-ATCA Mapping Specification. Forwarding events that are not covered by well-defined sensors is one issue; the other is that you also have no RDR entries for the BIOS sensors. For me, it's not a bug against the IPMI Direct plugin.

     
  • Andy Cress

    Andy Cress - 2012-08-29

    Lars,
    I was there when Thomas wrote this plugin, and although handling ATCA is an important facet of this plugin, it is also the only active plugin for all IPMI servers, so yes, it is a bug with ipmidirect. Since you don't have any hardware, it sounds like I'll need to submit a patch.
    Andy

     
  • Andy Cress

    Andy Cress - 2012-08-29

    Anton,
    Yes, the BIOS-related events come in with SA=0x01 or SA=0x33, and the ipmi_discover.cpp code apparently was written to exclude these. I understand that the code was written expecting the SA to always be 0x20, but that isn't always the case.
    Until now, we have mainly used HPI for the high-level stuff only, since it works best with full SDR sensors. Although the sensor readings for compact and event-only sensors are not important, the events are.
    Andy
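
To illustrate why these events do not carry SA=0x20, here is a minimal sketch of how the IPMI Generator ID byte can be interpreted: bit 0 distinguishes a system software ID from an IPMB slave address, and the upper seven bits carry the ID itself. The software-ID ranges are taken from the IPMI specification's System Software IDs table and should be checked against it. The function name is hypothetical, and this is not the ipmi_discover.cpp logic.

```python
def classify_generator_id(gen_id):
    """Return (kind, detail) for the low byte of an IPMI Generator ID."""
    if gen_id & 0x01 == 0:
        # Bit 0 clear: the byte is an IPMB slave address (0x20 is the BMC).
        return ("ipmb_device", "BMC" if gen_id == 0x20 else "satellite controller")
    sw_id = gen_id >> 1  # Bit 0 set: bits 7:1 are a system software ID.
    if sw_id <= 0x0F:
        return ("system_software", "BIOS")
    if sw_id <= 0x1F:
        return ("system_software", "SMI handler")
    if sw_id <= 0x2F:
        return ("system_software", "system management software")
    if sw_id <= 0x3F:
        return ("system_software", "OEM")
    return ("system_software", "other")

for sa in (0x20, 0x01, 0x33):
    print(hex(sa), classify_generator_id(sa))
```

Under these assumptions, 0x01 falls in the BIOS range and 0x33 in the SMI handler range, so a filter that only accepts 0x20 drops both.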

     
  • Anton Pak

    Anton Pak - 2012-08-29

    Compact and Event-Only SDRs can easily be converted to full ones (or to a set of full ones).
    I think the main expectations are:
    1) source address is the same as MC source address
    2) any sensor SDR has parent MC/FRU
    2a) SDR entity matches one of MC/FRU Device Locator
    2b) or SDR entity matches Entity Association Record or Device-Relative Entity Association Record
    2c) Device-Relative Entity Association Records form entity tree where root entity is MC or FRU.
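
The expectations above can be sketched as a simple check. This is a hypothetical model, not the plugin's data structures: entities are `(entity_id, instance)` tuples, `locators` is the set of entities named by MC/FRU Device Locator records, and `associations` maps a container entity to the entities it contains, standing in for both kinds of Entity Association records. As a simplification, every sensor entity is required to be reachable from a locator entity.

```python
def has_parent(sensor_entity, locators, associations):
    """True if the sensor's entity is, or is contained under, an MC/FRU entity."""
    seen = set()
    stack = list(locators)  # roots: MC/FRU Device Locator entities
    while stack:
        entity = stack.pop()
        if entity == sensor_entity:
            return True
        if entity in seen:
            continue
        seen.add(entity)
        stack.extend(associations.get(entity, ()))
    return False

def meets_expectations(event_sa, mc_sa, sensor_entity, locators, associations):
    # Expectation 1: event source address equals the MC address.
    # Expectation 2: the sensor entity has a parent MC/FRU.
    return event_sa == mc_sa and has_parent(sensor_entity, locators, associations)

locators = {(7, 1)}                          # hypothetical MC entity
associations = {(7, 1): [(3, 1), (32, 1)]}   # hypothetical containment
print(meets_expectations(0x20, 0x20, (3, 1), locators, associations))
print(meets_expectations(0x33, 0x20, (3, 1), locators, associations))
print(meets_expectations(0x20, 0x20, (34, 1), locators, associations))
```

In this model, an event from SA=0x33 fails the first expectation even when its sensor entity is known, which matches the behaviour reported in this ticket.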

     
  • Lars Wetzel

    Lars Wetzel - 2012-08-29

    Andy,
    I know Thomas too. Since Thomas worked with ATCA systems, as I did in the past, it makes sense that you submit the patch.
    Lars

     
  • Andy Cress

    Andy Cress - 2012-09-26

    Lars,

    I have a patch that resolves this problem, adding the capability to recognize and handle Event-Only SDRs. This works on Intel baseboards, and should work for any other firmware that supports the Event-Only SDRs. For platforms that do not use this SDR type, there should be no impact.

    Please review/comment.

    Andy

     
  • Andy Cress

    Andy Cress - 2012-09-26

    Patch to openhpi-3.2.0 for EventOnly SDRs

     
  • Lars Wetzel

    Lars Wetzel - 2012-09-26

    Hi Andy
    for me it looks good.
    Thx
    Lars

     
  • Andy Cress

    Andy Cress - 2012-09-27

    OK, changes submitted to SVN head as rev 7515.

     
  • Andy Cress

    Andy Cress - 2012-09-27
    • assigned_to: larswetzel --> arcress
    • status: open --> pending-fixed
     
  • Andy Cress

    Andy Cress - 2013-05-29
    • status: pending-fixed --> closed-fixed
     
  • dr_mohan

    dr_mohan - 2013-06-20
    • Group: 3.3.x --> 3.2.1
     
  • Tariq Shureih

    Tariq Shureih - 2013-06-20

    **ATTENTION**
    This account is disabled and is no longer accessed by the recipient.
    Please remove it from your address book.

    Thanks