From: Cook, G. <GW...@ma...> - 2004-08-26 17:22:34
I didn't see this original message on the PerfParse User's mailing list. Did this thread begin somewhere else, or am I missing some messages?

per...@li... wrote:
>> Hi all.
>>
>> I've observed my nagios restart every perfparse.sh run. I use
>> perfparse-0.99.07

You should upgrade to 0.99.09. It may or may not help with this issue, but there are a couple of other bugfixes/enhancements available in the latest version. And don't forget to run '/<path_to>/nagios/bin/perfparse-db-tool --update' after each upgrade...

>> [26-08-2004 12:35:00] Nagios 1.2 starting... (PID=12443)
>> [26-08-2004 12:35:00] Caught SIGHUP, restarting...
>>
>> Is this ok? If so, why is needed nagios restarting?
>
> If you are using --delete* option, yes, perfparse sends some SIGHUP
> signal to nagios to ask for restart and use a new serviceperf.log
> file.
> Otherwise, nothing to see with perfparse.

I've seen some postings (on Nagios Users or Devel lists) about Nagios restarting when sent a HUP, whereas it should only re-read its config files. I don't remember exactly, but I think the original posters may have been using Nagios 2.0 and/or Fedora Linux Distro. Are you using either of these? What versions of Nagios/Linux are you using?

>> I'm trying to figure out why my nagios stops working after few hours
>> and I'd like to know if this "nagios restarting" has some relation
>> to "nagios stopping".
>
> perfparse does not ask nagios to stop, but only to start
> writing to a new
> serviceperf.log file, which also mean restarting, but not stopping.
> In your log, strange that nagios starts and restarts at the
> same time. SIGHUP caught by
> change at that time ?

If Nagios fails to HUP properly, it could be leaving you with more than one Nagios process running. This can sometimes cause odd behavior, like checks not being run and/or notifications not being sent. Is Nagios actually shutting down? Or is it running but not performing actions as expected?
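
One rough way to check for a second Nagios process or a zombie is to look at the process table right after a perfparse.sh run. The sketch below is only an illustration, not part of PerfParse or Nagios: the function names are made up, it assumes a procps-style `ps` that accepts `-eo pid,ppid,stat,comm`, and it assumes the daemon's command name contains "nagios". Nagios may also fork short-lived children to run checks, so the sketch separates processes whose parent is not another Nagios process.

```python
import subprocess


def find_nagios_processes(ps_output, name="nagios"):
    """Split `ps -eo pid,ppid,stat,comm` output into (running, zombies).

    Each entry is a (pid, ppid, stat) tuple. `name` is an assumption about
    how the daemon shows up in the COMMAND column.
    """
    running, zombies = [], []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 3)
        if len(fields) < 4:
            continue
        pid, ppid, stat, comm = fields
        if name not in comm:
            continue
        entry = (int(pid), int(ppid), stat)
        # A STAT value starting with "Z" marks a defunct (zombie) process.
        if stat.startswith("Z"):
            zombies.append(entry)
        else:
            running.append(entry)
    return running, zombies


def top_level(running):
    """Keep only processes whose parent is not itself in the list."""
    pids = {pid for pid, _, _ in running}
    return [entry for entry in running if entry[1] not in pids]


def snapshot():
    """Run ps once and return its text output (assumes procps-style ps)."""
    result = subprocess.run(
        ["ps", "-eo", "pid,ppid,stat,comm"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Calling `find_nagios_processes(snapshot())` after a run gives the two lists. More than one entry from `top_level(running)`, or anything in `zombies`, would be consistent with the HUP not being handled cleanly, though it does not prove it.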

>> When nagios stops its checks, it always has a zombie process.
>
> How long do they live ? (just asking for if others can help :)
>
> Yves
>
>>
>> Regards,
>>
>> Wilson

It looks like you've already answered the question that I asked above. So, Nagios is running but not performing checks, correct? This is most likely related to the zombie/second running Nagios process. It may be triggered by the HUP sent by PerfParse, but it is not exactly caused by it. A HUP sent to Nagios by PP, the command line, or some other process should be interpreted correctly and not cause this problem. If it's not working correctly, I believe that the problem is related to Nagios itself.

If you can provide the version information that I asked for above (Nagios/Linux), I can look back through the archives and see what information I can find. Or, you could do this yourself if you are on the Nagios Users and/or Nagios Devel lists.

Garry W. Cook, CCNA
Network Infrastructure Manager
MACTEC, Inc. - http://www.mactec.com/
303.308.6228 (Office) - 720.220.1862 (Mobile)