From: John S. <jse...@sy...> - 2005-05-31 21:52:53
There's a race condition of sorts in daemon-init.in that can result in
multiple nagios daemons running.

We have a nagios machine that gets a little slow sometimes, and we were
seeing duplicate copies of nagios running (which of course made the
slowness problem worse). We're using nagmin for config, which restarts
nagios (on our redhat box) by running

    /etc/init.d/nagios restart

which is very much like

    /etc/init.d/nagios stop
    /etc/init.d/nagios start

My conclusion is that the likely cause of the duplicate nagios daemons
was this:

- busy machine
- /etc/init.d/nagios stop
  - which kill -TERMs nagios, and then immediately removes the pid
    lock file
- /etc/init.d/nagios start
  - new nagios daemon creates new pid lock file
- original nagios daemon finishes reaping and syncing, rounds the
  event loop, removes the new pid lock file and exits

So the next "/etc/init.d/nagios restart" didn't know nagios was already
running (since the newest lock file was removed by the exiting nagios),
and so another daemon got started.

The (or at least a) fix is to make sure that the exiting nagios daemon
has a chance to clean up after itself. Suggested patch against CVS head
below.

Cheers, and thanks!
John

*** daemon-init.in.old  Tue May 31 17:18:39 2005
--- daemon-init.in      Tue May 31 17:44:25 2005
***************
*** 131,136 ****
--- 131,156 ----
        stop)
                echo "Stopping network monitor: nagios"
                killproc_nagios nagios
+               # now we have to wait for nagios to exit and remove its
+               # own NagiosRunFile, otherwise a following "start" could
+               # happen, and then the exiting nagios will remove the
+               # new NagiosRunFile, allowing multiple nagios daemons
+               # to (sooner or later) run
+               echo -n 'Waiting for nagios to exit .'
+               for i in 1 2 3 4 5 6 7 8 9 10 ; do
+                   if status_nagios > /dev/null; then
+                       echo -n ' .'
+                       sleep 1
+                   else
+                       break
+                   fi
+               done
+               if status_nagios > /dev/null; then
+                   echo ''
+                   echo 'Warning - running nagios did not exit in time'
+               else
+                   echo ' done.'
+               fi
                rm -f $NagiosStatusFile $NagiosTempFile $NagiosRunFile $NagiosLockDir/$NagiosLockFile $NagiosCommandFile
                ;;
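
For anyone who wants to try the idea outside the init script, the wait
loop in the patch can be sketched as a small standalone helper. This is
only an illustration: wait_while is a made-up name, not something in
daemon-init.in, and it simply polls whatever command it is given once a
second until that command fails or the time limit runs out.

```shell
#!/bin/sh
# wait_while LIMIT CMD [ARGS...]
# Hypothetical helper (not part of daemon-init.in). Runs CMD once a
# second for as long as it keeps succeeding, giving up after LIMIT
# seconds. Returns 0 once CMD fails (the thing we waited on is gone),
# or 1 if CMD was still succeeding when the limit was reached.
# Note: plain sh has no local variables, so "limit" and "waited" are
# globals here.
wait_while() {
    limit=$1
    shift
    waited=0
    while "$@" > /dev/null 2>&1; do
        if [ "$waited" -ge "$limit" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

In the stop) branch this would be called as "wait_while 10 status_nagios"
after killproc_nagios and before the rm -f, printing the warning if it
returns non-zero. As in the patch, the rm -f still runs after a timeout,
so a nagios that takes longer than the limit to exit can still hit the
race.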