From: Jonathan F. <jon...@wi...> - 2008-05-15 19:48:37
For some unknown reason, when an alarm was raised and added to the DAT (but not persisted) and I then restarted the HPI daemon, the alarm was not always put back into the DAT. In my case the alarm had not been cleared and was still present on the Shelf Manager, so it should have been added back, but it wasn't.

I think the HPI daemon has to be alive to process the assert/deassert of the alarm properly. The store-to-disk logic is there to help with that, I guess.

But it's true that the domain manager should be aware that an alarm read from the persisted DAT can be a false positive; it must have a way of validating that information with the Shelf Manager.

Regards,
/jonathan

On Thu, 2008-05-15 at 17:15 +0000, Ganesha, Raghavendra Pandimakki wrote:
> Hi,
>
> While trying to analyze the second half (duplicate alarms) of bug
> 1794430, I got stuck on a basic question:
> why do we need to store the DAT entries in persistent memory?
>
> Please correct me if I'm wrong.
>
> According to the SAF HPI spec, the DAT contains entries for active
> alarms. It stores active alarms and deletes them when they are cleared.
>
> Below is an extract from the SAF HPI B.02.01 spec (section 6.6):
> The domain controller maintains a Domain Alarm Table (DAT) which
> contains entries for each active alarm in the domain. Alarms are added
> to and deleted from the DAT by the HPI implementation as the presence
> or absence of the corresponding conditions are detected by the domain
> controller.
>
> Storing entries in persistent memory is required for maintaining
> history. That is true for the Domain Event Log (DEL), but not for
> the DAT.
>
> I'll try to explain with examples.
>
> Scenario 1
> ----------
> 1. The user has enabled the option of saving DAT entries to
>    persistent memory.
> 2. The openhpi daemon is started.
> 3. A resource, say R1, reports an alarm related to a temperature
>    sensor.
> 4. An entry is created in the DAT and stored in the DAT file.
> 5. The openhpi daemon is brought down (for any reason).
> 6. Before the openhpi daemon is restarted, the problem on resource R1
>    is resolved and the temperature alarm is cleared.
> 7. The openhpi daemon is restarted.
>    The domain controller (openhpi daemon) reads the DAT file and
>    creates an alarm entry for the temperature sensor in the DAT.
>
> This entry is wrong, as the temperature sensor alarm of resource R1
> is no longer active.
> Since the domain controller adds these alarm entries by reading the
> DAT file, the openhpi plugin (which manages resource R1) is not aware
> of them.
>
> Hence, alarm entries whose condition has already cleared will never
> get deleted from the DAT.
>
> Scenario 2
> ----------
> Take the same situation as in scenario 1, up to step 5.
>
> 6. The alarm condition in resource R1 still persists.
> 7. The openhpi daemon is restarted.
>    The domain controller (openhpi daemon) reads the DAT file and
>    creates an alarm entry for the temperature sensor in the DAT.
> 8. Since the openhpi plugin (which manages resource R1) is not aware
>    of the alarm added by the domain controller, the plugin detects the
>    temperature sensor alarm condition and reports it into the DAT
>    again.
>
> Scenario 2 is the root cause of the duplicate alarms reported in bug
> 1794430.
>
> I'm not able to find a reference in the SAF HPI spec for storing the
> DAT entries in persistent memory.
>
> Regards,
> Raghavendra PG

--
Jonathan Fournier, Senior Engineer, Wind River
direct +1.613.270.5786 mobile +1.613.263.9223 fax +1.613.592.2283
350 Terry Fox Drive, Suite 200, Ottawa, Ontario, K2K 2W5, Canada
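To make the two quoted scenarios concrete, here is a small model of what happens on restart and of the kind of validation step suggested above. It is only a sketch and not OpenHPI code: the `Alarm` type, the (resource, sensor) key, and the functions `naive_restore` and `reconcile` are hypothetical names, and `reported` stands in for whatever the plugin or Shelf Manager currently reports as active.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple

# Hypothetical identity of an alarm condition: (resource id, sensor number).
AlarmKey = Tuple[int, int]


@dataclass(frozen=True)
class Alarm:
    resource_id: int
    sensor_num: int
    severity: str
    timestamp: int  # when the condition was first recorded

    @property
    def key(self) -> AlarmKey:
        return (self.resource_id, self.sensor_num)


def naive_restore(persisted: Iterable[Alarm], reported: Iterable[Alarm]) -> List[Alarm]:
    """Model of the behaviour in the scenarios: reload the DAT file as-is,
    then append whatever the plugin reports, with no cross-check."""
    return list(persisted) + list(reported)


def reconcile(persisted: Iterable[Alarm], reported: Iterable[Alarm]) -> List[Alarm]:
    """Assumed fix: treat persisted entries as unverified until the
    condition is confirmed as still active."""
    active: Dict[AlarmKey, Alarm] = {a.key: a for a in reported}
    dat: Dict[AlarmKey, Alarm] = {}
    for alarm in persisted:
        # Scenario 1: drop entries whose condition has cleared meanwhile.
        if alarm.key in active:
            dat[alarm.key] = alarm  # keep the original entry and timestamp
    for key, alarm in active.items():
        # Scenario 2: add a reported condition only if it is not already present.
        dat.setdefault(key, alarm)
    return sorted(dat.values(), key=lambda a: a.timestamp)
```

With one persisted temperature alarm and nothing reported after restart (scenario 1), `naive_restore` keeps the stale entry while `reconcile` returns an empty table. When the plugin reports the same condition again (scenario 2), `naive_restore` yields two entries and `reconcile` keeps only the original one.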