From: Mark S. <Mar...@hp...> - 2009-01-16 18:19:41
Al Chu wrote:
> Hi Mark,
>
> An additional comment following up some of the other ones assuming
> you're using IPMI over LAN.
>
No, I'm running it on each local box.

>> In any event, if it turns out that sending/receiving the command to
>> the sensors is where all the time is spent and startup is less of a big
>> deal, this whole discussion is moot.
>>
>
> We monitor, power control, do various things w/ IPMI on a number of
> large clusters w/ IPMI over LAN. I'm not sure how large the systems are
> that collectl is trying to monitor, but I've found that there can be
> slowdowns in IPMI over LAN with 1000+ nodes. As a UDP protocol,
> eventually one or two packets will get lost, so you can't monitor
> consistently all the time.
>
I guess this is all moot since I'm monitoring locally, but there are
people running collectl on multi-K clusters.

I also guess, based on previous notes, that I'm running into the whole
Heisenberg thing by getting in the way of the systems I'm trying to
monitor if I do it too frequently. So if 1-2 minutes is more appropriate,
then the performance is certainly acceptable where it is.

-mark

> ipmitool seems to indicate a default timeout of 2 seconds for over LAN.
> So if just one packet gets lost, you're hosed on your < 1 second goal.
>
> Al
>
> On Fri, 2009-01-16 at 07:48 -0500, Mark Seger wrote:
>
>> Carol Hebert wrote:
>>
>>> Hi and Happy '09!
>>>
>>> To celebrate the New Year, I propose we start working on rolling
>>> ipmitool to v1.8.11 to include all the great fixes and patches folks
>>> have been sending in since last summer! Do you have a fix you've been
>>> using in your local ipmitool version? Send it in for review! Do you
>>> have a patch or an idea for some new functionality or support you'd like
>>> to see get into the tool? Send it in for review!
>>>
>> Well you did ask, so here ya go...
>>
>> I'm the author of collectl [see: http://collectl.sourceforge.net/],
>> which is a tool for doing system monitoring. A couple of the things I
>> think set it apart from others are that it's very light-weight and that
>> it collects a very broad set of data, making it possible to drill down
>> and see what the state of all your system resources is at any point in
>> time.
>>
>> About 6 months ago I discovered ipmitool and now use it to monitor
>> temps, fans and power, though I could see adding other resources as
>> well. My own problem with ipmitool is that while it's pretty
>> light-weight, it's not light-weight enough and so I had to give it its
>> own monitoring interval - I monitor most things every 10 seconds when
>> running as a daemon, though there are those who sample every 5 seconds
>> or even 1. I did some tests with ipmitool running at those frequencies
>> and there was just too much system load compared to collectl proper,
>> which uses <0.1%, and so I'm only sampling sensors every 2 minutes, which
>> I thought at first was reasonable.
>>
>> However, since adding power sensor monitoring and watching the power
>> every second, just to see what's happening over short bursts of time, I
>> saw that the sensor changes much more frequently than I thought it would.
>> Given all the interest in green computing I can see where more frequent
>> power monitoring would be a good thing. I even went as far as to see what
>> it might take to query ipmi directly to save the overhead of running a
>> new instance of ipmitool every sample period, but I think I can now
>> appreciate just how complex ipmi is and would prefer to continue to use
>> ipmitool as it already does a great job.
>>
>> So, as for an enhancement, I'd like to see a way to run it at rates on
>> the order of a few seconds and not incur a lot of overhead. I'm basing
>> this on my assumption that the bulk of ipmitool's time is spent on
>> starting the image as well as establishing the internal communications
>> connection(s). If so, my logical conclusion would be there needs to be
>> a way to do the initialization one time and not exit. While I can think
>> of a few ways to do it, and I think the first is probably the easiest,
>> perhaps someone intimately familiar with ipmitool's innards such as
>> yourself would have better ways:
>> - run it as a daemon, allowing it to receive commands over a socket and
>>   send the results back
>> - build some sort of library that could be called via something like
>>   perl, one call to initialize and a second to query, probably others...
>>
>> In any event, if it turns out that sending/receiving the command to the
>> sensors is where all the time is spent and startup is less of a big deal,
>> this whole discussion is moot.
>>
>> Anyhow, you DID ask... ;-)
>> -mark
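
Editorial sketch, not part of the original thread: one rough way to check the assumption discussed above (whether the cost of each sample is CPU spent starting and running a new process, or time spent waiting on the sensors) is to compare wall-clock time with child CPU time over a few runs. The helper name `measure` and the stand-in command are hypothetical; the stand-in only starts a Python interpreter and exits, so substitute the ipmitool command line you actually run. The `resource` module is Unix-only.

```python
import resource
import subprocess
import sys
import time


def measure(cmd, runs=5):
    """Run cmd `runs` times; return (mean wall seconds, mean child CPU seconds)."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.perf_counter()
    for _ in range(runs):
        # check=True raises CalledProcessError if the command exits non-zero.
        subprocess.run(cmd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    wall = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    # User plus system CPU time consumed by the child processes we waited for.
    cpu = (after.ru_utime - before.ru_utime) + (after.ru_stime - before.ru_stime)
    return wall / runs, cpu / runs


if __name__ == "__main__":
    # Stand-in command (an assumption for illustration): replace it with the
    # real sensor query to get meaningful numbers.
    stand_in = [sys.executable, "-c", "pass"]
    wall, cpu = measure(stand_in)
    print(f"mean wall: {wall:.4f}s, mean child CPU: {cpu:.4f}s")
    # Sampling intervals mentioned in the thread: 1, 5, 10 and 120 seconds.
    for interval in (1, 5, 10, 120):
        print(f"every {interval:>3}s: {100 * cpu / interval:.2f}% of one CPU")
```

If the wall time is much larger than the child CPU time, most of each sample is spent waiting rather than computing, which would point at the sensor round trip rather than process startup. If the two are close, a long-running process that initializes once is more likely to help.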