From: Buchan M. <bg...@st...> - 2009-09-01 11:54:58
On Tuesday, 1 September 2009 10:38:44 lor...@pn... wrote:
> Hi Buchan,
>
> Thanks for your reply.
> The temperature info is part of the "status" test, which includes uptime,
> CPU utilization, fan status, enclosure/tray(s) status and other bits I
> really care about.
> I'll try to "adjust" the template and remove the temperature bit.

In my opinion, it makes sense to split these tests. For example, a number of other templates have separate "temp" tests, and since they are relatively consistent, they allow graphing of temperatures on a number of devices. Splitting also allows you to separate alerts (on the xymon/hobbit side) for different aspects (rather than acknowledging a high temperature, only to miss a failed tray later ...).

I think the following should really be done with this template:

- netapptemp renamed to temp
- New tests created for:
  - power
  - fans
  - raid ?

For example, consider the compaq-server and dell-poweredge templates.

Once I have access to the NetApps here, I will try to take a look at this, or someone who is already monitoring NetApps is free to supply the relevant changes.

> Still, why does a missing OID for the "status" test cause all the other
> tests to turn grey?

It's a bit of a bug: any failed query, no matter the reason/result code, results in devmon assuming the device is down, and no more OIDs are queried. Disabling the tests that use those OIDs is a workaround at present, until I fix the root cause.

> I may be missing something here, but I would have expected this to affect
> just one test. No?

Regards,
Buchan