Hi,
If the SEL buffer becomes full, IPMI stops logging new events. The IPMI
specification provides a mechanism for monitoring the status of the SEL
buffer, including a flag that indicates that the buffer is full and events
have been dropped. This is part of the data returned by the "Get SEL Info"
command, as defined in the IPMI 2.0 specification (section 31.2, page 386).
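To illustrate, here is a minimal sketch of how the usage percentage and the
overflow flag could be derived from a "Get SEL Info" response. It is not the
code from the patch. The names sel_status and sel_status_from_info are
hypothetical, and the sketch assumes the response layout of IPMI 2.0 section
31.2 (entry count and free space as little-endian 16-bit values, overflow in
bit 7 of the operation support byte) and 16-byte SEL records. Computing
"percentage used" as used bytes over used plus free bytes is also an
assumption.

```c
#include <stddef.h>
#include <stdint.h>

#define SEL_RECORD_SIZE 16 /* assumed size of one SEL record, in bytes */

/* Hypothetical summary of the fields of interest. */
struct sel_status {
    int pct_used; /* 0..100, or -1 if the total size is unknown */
    int overflow; /* 1 if the overflow flag is set, else 0 */
};

/*
 * data points at the response bytes that follow the completion code:
 *   data[0]      SEL version
 *   data[1..2]   number of entries, LS byte first
 *   data[3..4]   free space in bytes, LS byte first
 *   data[5..12]  addition and erase timestamps (unused here)
 *   data[13]     operation support, bit 7 = overflow flag
 * Returns 0 on success, -1 on bad arguments or a short response.
 */
static int sel_status_from_info(const uint8_t *data, size_t len,
                                struct sel_status *out)
{
    if (data == NULL || out == NULL || len < 14)
        return -1;

    uint32_t entries = (uint32_t)data[1] | ((uint32_t)data[2] << 8);
    uint32_t free_bytes = (uint32_t)data[3] | ((uint32_t)data[4] << 8);
    uint32_t used_bytes = entries * SEL_RECORD_SIZE;
    uint32_t total = used_bytes + free_bytes;

    out->pct_used = total ? (int)(100 * used_bytes / total) : -1;
    out->overflow = (data[13] & 0x80) != 0;
    return 0;
}
```

With 80 entries and 320 bytes free, for example, this reports 1280 of 1600
bytes in use.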
The attached patch improves "ipmievd" to monitor the percentage of the SEL
buffer in use. It logs a warning to syslog when usage is above 80%, logs it
again each time usage increases further, and logs an alert when "overflow"
occurs.
1) If the percentage used is above 80% a LOG_WARNING is emitted:
"SEL buffer used at nn%, please consider clearing the SEL buffer"
2) A new LOG_WARNING is emitted for any increase of percentage used above 80%
(e.g. a new message is logged for 81%, 82%, ... 99%)
3) If the percentage decreases, no warning is emitted
4) If the "overflow" flag is set, a LOG_ALERT is emitted
"SEL buffer overflow, no SEL message can be logged until the SEL buffer is
cleared"
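The four rules above can be sketched as a small decision function. This is an
illustration rather than the code in the patch: the names sel_watch,
sel_decide and sel_report are hypothetical, the threshold is taken as
strictly above 80% (matching the 81%, 82%, ... example), and emitting the
alert only when the overflow flag first becomes set is an assumption.

```c
#include <syslog.h>

#define SEL_WARN_THRESHOLD 80 /* warn only strictly above this percentage */

enum sel_action { SEL_NONE, SEL_WARN, SEL_ALERT };

/* Hypothetical state kept between two polls of the SEL. */
struct sel_watch {
    int last_pct;      /* percentage seen at the previous poll */
    int last_overflow; /* overflow flag seen at the previous poll */
};

/* Decide what to log for the current poll and remember the new state. */
static enum sel_action sel_decide(struct sel_watch *w, int pct, int overflow)
{
    enum sel_action action = SEL_NONE;

    if (overflow && !w->last_overflow)
        action = SEL_ALERT; /* rule 4 (assumed: once per overflow) */
    else if (!overflow && pct > SEL_WARN_THRESHOLD && pct > w->last_pct)
        action = SEL_WARN;  /* rules 1 and 2 */
    /* rule 3: a decrease or an unchanged value logs nothing */

    w->last_pct = pct;
    w->last_overflow = overflow ? 1 : 0;
    return action;
}

/* Emit the syslog message that corresponds to the decision. */
static void sel_report(struct sel_watch *w, int pct, int overflow)
{
    switch (sel_decide(w, pct, overflow)) {
    case SEL_WARN:
        syslog(LOG_WARNING, "SEL buffer used at %d%%, "
               "please consider clearing the SEL buffer", pct);
        break;
    case SEL_ALERT:
        syslog(LOG_ALERT, "SEL buffer overflow, no SEL message can be "
               "logged until the SEL buffer is cleared");
        break;
    case SEL_NONE:
        break;
    }
}
```

Keeping the decision separate from the syslog call makes the rules easy to
check without a BMC. Because the previous percentage is updated on every
poll, a drop followed by a new rise above 80% is reported again.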
It is then the sysadmin's responsibility to clear the SEL buffer (possibly
after saving the event log to a file), but at least "ipmievd" provides a
mechanism to help diagnose the problem before the SEL buffer becomes full.
Cheers,
Olivier