Thread: [Nagios-users] Log Rotations Issue
From: Alaric <pax...@gm...> - 2013-01-31 15:56:54
|
Hi, I was hoping that someone on this list might have some insight into an issue that I recently ran into after upgrading my Nagios core installation to 3.4.4 (out of the EPEL repo). After upgrading, log rotation stopped on one of my two servers, perfdata_file_processing_commands stopped working, and whatever job writes 'CURRENT HOST STATE' to the main Nagios logs stopped working. I upgraded both my Dev server and my Production server, and only my Prod server seems to have the issue. Both run the same code, and both have the same configs. I'm confident the configs are the same, as both get their configs deployed via Puppet. I've googled around quite a bit and haven't had any luck figuring it out. Has anyone seen anything similar? My feeling, based on the behavior of my Dev server, is that it's not a problem with the code but that something got "stuck", but I'm darned if I can figure out what. I've cleared out the logs, restarted Nagios, rebooted the server, audited the configs, and checked the debug output. Any help is much appreciated! -a |
From: Martin H. <Mar...@hb...> - 2013-01-31 16:24:43
|
Hi Alaric, I had a similar issue and never did figure it out. Unfortunately, I only had one server but it was virtual so I just built a new one from scratch and transferred my configs. Sometimes expediency demands undesirable methods. -- Martin T. Hugo Network Administrator Hilliard City Schools 614-921-7102 (Ph) 614-921-7243 (Fax) _______________________________________________ Nagios-users mailing list Nag...@li... 
https://lists.sourceforge.net/lists/listinfo/nagios-users ::: Please include Nagios version, plugin version (-v) and OS when reporting any issue. ::: Messages without supporting info will risk being sent to /dev/null |
From: Alaric <pax...@gm...> - 2013-01-31 16:55:07
|
Yikes, I was hoping I wouldn't have to do a complete rebuild! |
From: Assaf F. <na...@fl...> - 2013-01-31 16:35:17
|
What is the difference in the volume of activity on those servers? It could be that you have found an issue related to the number of checks or the traffic written to the log. If the internal log rotation is faulty, have you considered using logrotate as a dirty hack to fix your issue? |
From: Alaric <pax...@gm...> - 2013-01-31 16:53:27
|
While logrotate or a cron job will clean up the actual files, part of what I'm trying to troubleshoot is what looks like the failure of some internal Nagios processes. For example, some process normally adds entries like this to the top of the log each night:
[1359608400] CURRENT HOST STATE: example1;UP;HARD;1;FPING OK - 10.1.2.3 (loss=0%, rta=1.210000 ms)
These go missing, even if I manually rotate the logs.
The difference in volume seems pretty low; I've been trying to keep Dev and Prod as similar as possible.
Host/service checks from Dev:
# Active Host / Service Checks: 1486 / 7219
# Passive Host / Service Checks: 0 / 0
Host/service checks from Prod:
# Active Host / Service Checks: 1564 / 8264
# Passive Host / Service Checks: 0 / 84 |
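Editor's sketch: one way to confirm the symptom described above is to scan a log file for 'CURRENT HOST STATE' lines. This is a minimal sketch, assuming only the line format quoted in the message ([epoch] CURRENT HOST STATE: host;state;...). The function name find_host_state_entries is hypothetical and is not part of Nagios.

```python
import re

# Matches the entry format quoted in the thread, e.g.
# [1359608400] CURRENT HOST STATE: example1;UP;HARD;1;FPING OK - ...
# Only the timestamp, host name, and state are captured.
HOST_STATE = re.compile(r"^\[(\d+)\] CURRENT HOST STATE: ([^;]+);([^;]+);")


def find_host_state_entries(lines):
    """Return (epoch, host, state) tuples for each CURRENT HOST STATE line.

    `lines` is any iterable of strings, such as an open log file.
    Lines that do not match the assumed format are skipped.
    """
    entries = []
    for line in lines:
        match = HOST_STATE.match(line.strip())
        if match:
            epoch, host, state = match.groups()
            entries.append((int(epoch), host, state))
    return entries
```

Passing an open handle on the current log (wherever log_file points on your system) returns a list; an empty list on the morning after a rotation would match the behavior reported here.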
From: Randal, P. <phi...@ho...> - 2013-01-31 18:22:14
|
I'm seeing the same issue here :-( Phil “Any opinion expressed in this e-mail or any attached files are those of the individual and not necessarily those of Hoople Ltd. You should be aware that Hoople Ltd. monitors its email service. 
This e-mail and any attached files are confidential and intended solely for the use of the addressee. This communication may contain material protected by law from being passed on. If you are not the intended recipient and have received this e-mail in error, you are advised that any use, dissemination, forwarding, printing or copying of this e-mail is strictly prohibited. If you have received this e-mail in error please contact the sender immediately and destroy all copies of it. |
From: Alaric <pax...@gm...> - 2013-02-04 17:14:54
|
On Jan 31, 2013, at 12:53 PM, "Randal, Phil" <phi...@ho...> wrote: > I'm seeing the same issue here :-( > > Phil
I just wanted to bump this. I was thinking I might delete retention.dat and let Nagios recreate it, although I can't think of why that would have much effect. Can anyone think of anything I can do to trace back the root cause? It seems like most of the other cache or temp files get recreated each time Nagios starts. Thanks, -a |
From: Alaric <pax...@gm...> - 2013-02-06 15:43:38
|
On Jan 31, 2013, at 12:53 PM, "Randal, Phil" <phi...@ho...> wrote: > I'm seeing the same issue here :-( > > Phil
Phil, stopping Nagios completely and then flushing retention.dat worked for me. I stopped Nagios, NSCA, and NRPE, then did a "cat > retention.dat", then restarted all of the Nagios-related services (I reloaded xinetd too, which is where mk_livestatus runs for me). The following night, log rotation worked and the current status entries updated correctly. I have no real idea why this worked, but it did. I hope it helps you! -a |
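Editor's sketch: the file-flushing step of the workaround above could look like the following. This is a minimal sketch, not the poster's exact procedure. It assumes the Nagios, NSCA, and NRPE services have already been stopped by whatever means your system uses, and it adds a backup copy that the original message does not mention. The function name flush_retention_file and the ".bak" suffix are hypothetical.

```python
import shutil
from pathlib import Path


def flush_retention_file(path):
    """Back up a retention file, then truncate it to zero bytes.

    Assumption: the monitoring daemon is already stopped, so nothing
    rewrites the file while it is being emptied.
    Returns the path of the backup copy.
    """
    target = Path(path)
    # The backup is an addition for safety; the thread only empties the file.
    backup = target.with_name(target.name + ".bak")
    shutil.copy2(target, backup)
    # Leave an empty file in place, as the shell redirection in the thread does.
    target.write_bytes(b"")
    return backup
```

After the call, the services would be restarted as described in the message. Emptying the file discards retained state such as acknowledgements and scheduled downtime, which is why the sketch keeps a copy.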
From: Randal, P. <phi...@ho...> - 2013-02-07 12:43:07
|
Hmm, does Nagios clean out old entries in retention.dat on a daily basis? And is it this process that is failing, rather than anything log-related? I'm unlikely to find any time to check the source in the immediate future, alas. Phil |