From: <kri...@th...> - 2009-09-15 07:02:53
Okay, I will try that later today. Thanks already for replying.

Kind regards,

Kristof Van Den Ouweland
System Engineer
Thomson Reuters
T: +32 3 220 76 40
F: +32 3 220 76 31
kri...@th...
http://compumark.thomson.com

-----Original Message-----
From: Buchan Milne [mailto:bg...@st...]
Sent: Monday, September 14, 2009 22:27
To: dev...@li...
Cc: Van Den Ouweland, Kristof (Prof II&RS)
Subject: Re: [Devmon] Some work, some don't

On Thursday, 10 September 2009 07:59:10 kri...@th... wrote:
> Hi all,
>
> I recently created some additional templates for our HP server types
> bl35p, bl45p, dl585 g2 and dl585 g5.

Hmm, at least dl585 should have worked with the existing compaq-server or compaq-servernohspare template.

> After completing the tests, I
> started to roll it out to all our hp machines one by one. Most of them
> work fine and deliver the results to devmon.
>
> What I basically check for those machines are:
>
> - Server model, serial nr and product id
> - ILO log entries and errors
> - CPU type, speed, cache and status
> - Mem type, speed and status
> - Power status + Power redundancy
> - Fan Status
> - Temp status
> - Drive status, size, cache, serial nr..
> ....
>
>
> The strange thing is that although some machines have the same
> configuration, (so same cpu, same memory, same sw version, same os,
> same hphealth agent version...), some respond with the results as
> expected, while others don't return the data correctly and show the error:
> 'Missing repeater data for primary OID [NAME]'.
> (devmon logs show the same error but nothing extra.)
>
> Another result of this is, if one specific oid is impacted, let's say
> Fan Status, all other OID's for that same test (temp status, drive
> status...) also fail with the same error.
>
> What I already tried to do
> - do a manual snmpget/snmpwalk of those OID's to check if they indeed
> return repeater or non-repeater data.
> Seems to be ok as it shows the
> same results as on a machine where it does work.

What I saw on the ProLiants was that some OID tables just disappeared totally, in the following cases:

1) No hot spare configured on a RAID controller. In this case, use the compaq-servernohspare template instead of the compaq-server template.
2) The IML log being totally empty. This occurs e.g. if you run "hpasmcli -e 'clear iml'" (IIRC); instead you should clear the IML log from the iLO web interface, or use hpimllogview (or something like that).

I will try and look at fixing the assumption that devmon currently has that if one tree doesn't respond the device is not responding .... but if you follow the two rules above you shouldn't have problems.

Regards,
Buchan
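
A rough illustration of the assumption described above, for readers who want to see why one empty table can take the other OIDs down with it. This is not devmon's code (devmon is written in Perl); the function names, tree names and sample data are all made up for the example. It only contrasts "one empty tree fails everything" with judging each tree on its own.

```python
# Hypothetical illustration only; this is NOT devmon code and uses no devmon API.
# The tree names and walk results below are invented sample data.


def strict_status(walks):
    """Mimic the assumption described above: one empty tree fails every tree."""
    if any(not rows for rows in walks.values()):
        return {name: "no data" for name in walks}
    return {name: "ok" for name in walks}


def tolerant_status(walks):
    """Judge each tree on its own, so an empty table affects only itself."""
    return {name: ("ok" if rows else "no data") for name, rows in walks.items()}


# Tree name -> {index: value}. The hot-spare table is empty, as it might be
# on a RAID controller with no hot spare configured.
walks = {
    "fan_status": {"1": "ok", "2": "ok"},
    "temp_status": {"1": "ok"},
    "hot_spare": {},
}

print(strict_status(walks))
print(tolerant_status(walks))
```

With the strict check, the empty hot-spare table marks fan and temperature as missing too, which matches the symptom reported in the original message; the tolerant check flags only the empty table.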