From: Patrick M. <pat...@ga...> - 2007-08-30 18:19:32
Hi all,

I've been running Nagios 2.6 for about 6 months now, and every now and then we get critical pages about a machine being down, or at least about Nagios being unable to connect to it. It causes the CEO to freak out and believe something is up with our network. To me, it seems like the box is getting stressed out during the tests, and that is causing the plugins to time out. Here are some of the alerts from this morning:

#######################################
[08-30-2007 09:24:10] HOST ALERT: tu.xyz.com;DOWN;SOFT;1;CRITICAL - Plugin timed out after 10 seconds
[08-30-2007 09:24:00] SERVICE ALERT: seismo.xyz.com;PING;CRITICAL;SOFT;2;CRITICAL - Plugin timed out after 10 seconds
[08-30-2007 09:24:00] SERVICE ALERT: p.xyz.com;PING;CRITICAL;SOFT;2;CRITICAL - popen timeout received, but no child process
[08-30-2007 09:24:00] SERVICE ALERT: ap.xyz.com;PING;CRITICAL;SOFT;2;CRITICAL - popen timeout received, but no child process
[08-30-2007 09:24:00] SERVICE ALERT: cry.xyz.com;PING;CRITICAL;SOFT;2;CRITICAL - Plugin timed out after 10 seconds
[08-30-2007 09:24:00] SERVICE ALERT: wns.xyz.com;PING;CRITICAL;SOFT;2;CRITICAL - Plugin timed out after 10 seconds
[08-30-2007 09:24:00] SERVICE ALERT: qke.xyz.com;/work;CRITICAL;SOFT;1;CHECK_NRPE: Socket timeout after 10 seconds.
[08-30-2007 09:24:00] SERVICE ALERT: hl-hayes-br.xyz.com;PING;CRITICAL;SOFT;1;CRITICAL - popen timeout received, but no child process
[08-30-2007 09:24:00] SERVICE ALERT: pl.xyz.com;SMTP;CRITICAL;SOFT;1;CRITICAL - Socket timeout after 10 seconds
[08-30-2007 09:24:00] SERVICE ALERT: qke.xyz.com;/home2;CRITICAL;SOFT;1;CHECK_NRPE: Socket timeout after 10 seconds.
[08-30-2007 09:24:00] SERVICE ALERT: qke.xyz.com;/;CRITICAL;SOFT;1;CHECK_NRPE: Socket timeout after 10 seconds.
[08-30-2007 09:24:00] SERVICE ALERT: o.xyz.com;PING;CRITICAL;SOFT;1;CRITICAL - popen timeout received, but no child process
[08-30-2007 09:24:00] SERVICE ALERT: o.xyz.com;SSH;CRITICAL;SOFT;1;CRITICAL - Socket timeout after 10 seconds
[08-30-2007 09:24:00] SERVICE ALERT: o.xyz.com;DNS;CRITICAL;SOFT;1;CRITICAL - Plugin timed out while executing system call
[08-30-2007 09:23:41] SERVICE ALERT: sgull.xyz.com;PING;CRITICAL;HARD;3;CRITICAL - Plugin timed out after 10 seconds
[08-30-2007 09:23:40] SERVICE ALERT: hister.xyz.com;PING;CRITICAL;HARD;3;CRITICAL - Plugin timed out after 10 seconds
[08-30-2007 09:23:40] SERVICE ALERT: hs1.xyz.com;PING;CRITICAL;SOFT;2;CRITICAL - Plugin timed out after 10 seconds
[08-30-2007 09:23:40] SERVICE ALERT: nbridged.xyz.com;PING;CRITICAL;SOFT;2;CRITICAL - Plugin timed out after 10 seconds
[08-30-2007 09:23:40] SERVICE ALERT: h1.xyz.com;PING;CRITICAL;SOFT;2;CRITICAL - Plugin timed out after 10 seconds
[08-30-2007 09:23:40] SERVICE ALERT: dfied-1.xyz.com;PING;CRITICAL;SOFT;2;CRITICAL - Plugin timed out after 10 seconds
[08-30-2007 09:23:40] SERVICE ALERT: pes.xyz.com;PING;CRITICAL;SOFT;1;CRITICAL - Plugin timed out after 10 seconds
[08-30-2007 09:23:40] SERVICE ALERT: ruits.xyz.com;PING;CRITICAL;SOFT;1;CRITICAL - Plugin timed out after 10 seconds
[08-30-2007 09:23:40] SERVICE ALERT: nge-routed.xyz.com;PING;CRITICAL;SOFT;1;CRITICAL - Plugin timed out after 10 seconds
[08-30-2007 09:23:40] SERVICE ALERT: eng-1.xyz.com;PING;CRITICAL;SOFT;1;CRITICAL - Plugin timed out after 10 seconds
[08-30-2007 09:23:40] SERVICE ALERT: pe.xyz.com;FTP;CRITICAL;SOFT;1;CRITICAL - Socket timeout after 10 seconds
[08-30-2007 09:23:40] SERVICE ALERT: g1.xyz.com;PING;CRITICAL;SOFT;2;CRITICAL - Plugin timed out after 10 seconds
[08-30-2007 09:23:40] SERVICE ALERT: gb1.xyz.com;PING;CRITICAL;SOFT;2;CRITICAL - Plugin timed out after 10 seconds
[08-30-2007 09:23:40] SERVICE ALERT: jith.xyz.com;PING;CRITICAL;SOFT;1;CRITICAL - Plugin timed out after 10 seconds
[08-30-2007 09:23:40] SERVICE ALERT: pule.xyz.com;PING;WARNING;SOFT;1;PING WARNING - Packet loss = 44%, RTA = 3.64 ms
[08-30-2007 09:23:40] SERVICE ALERT: gd2.xyz.com;PING;CRITICAL;SOFT;1;CRITICAL - Plugin timed out after 10 seconds
[08-30-2007 09:23:40] SERVICE ALERT: hx1.xyz.com;PING;CRITICAL;SOFT;1;CRITICAL - Plugin timed out after 10 seconds
[08-30-2007 09:23:40] SERVICE ALERT: g2.xyz.com;PING;CRITICAL;SOFT;1;CRITICAL - Plugin timed out after 10 seconds
[08-30-2007 09:23:40] SERVICE ALERT: eo.xyz.com;PING;CRITICAL;SOFT;1;CRITICAL - Plugin timed out after 10 seconds
[08-30-2007 09:23:40] HOST ALERT: e.xyz.com;UP;SOFT;3;PING OK - Packet loss = 0%, RTA = 4.58 ms
[08-30-2007 09:23:40] HOST ALERT: e.xyz.com;DOWN;SOFT;2;CRITICAL - Plugin timed out after 10 seconds
[08-30-2007 09:23:20] HOST ALERT: e.xyz.com;DOWN;SOFT;1;CRITICAL - Plugin timed out after 10 seconds
[08-30-2007 09:23:10] SERVICE ALERT: cr.xyz.com;PING;CRITICAL;SOFT;1;CRITICAL - Plugin timed out after 10 seconds
[08-30-2007 09:23:02] SERVICE ALERT: wm.xyz.com;PING;CRITICAL;SOFT;1;CRITICAL - Plugin timed out after 10 seconds
[08-30-2007 09:23:00] SERVICE ALERT: t.xyz.com;PING;CRITICAL;SOFT;1;CRITICAL - Plugin timed out after 10 seconds
[08-30-2007 09:22:52] SERVICE ALERT: smo.xyz.com;PING;CRITICAL;SOFT;1;CRITICAL - Plugin timed out after 10 seconds
#######################################

The machine is a P4 2.4 GHz with 1 GB RAM. I'm not sure how to troubleshoot this - any ideas? What can I provide you folks in order to help me out?

Thanks in advance.
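
One way to test the "stressed box" theory is to tally the timeout alerts per minute and per host. If many unrelated hosts time out in the same minute, that would be consistent with the monitoring side (or its network link) being the bottleneck, though it does not prove it. Below is a minimal Python sketch, assuming alert lines in the same format as the ones pasted in this post. The names `parse_alert` and `summarize` are hypothetical helpers, not part of Nagios.

```python
import re
from collections import Counter

# Matches alert lines in the format pasted in this post, e.g.
# [08-30-2007 09:24:00] SERVICE ALERT: o.xyz.com;SSH;CRITICAL;SOFT;1;CRITICAL - Socket timeout after 10 seconds
ALERT_RE = re.compile(
    r"\[(?P<date>\d{2}-\d{2}-\d{4}) (?P<time>\d{2}:\d{2}:\d{2})\] "
    r"(?P<kind>HOST|SERVICE) ALERT: (?P<fields>.+)"
)


def parse_alert(line):
    """Return a dict describing one alert line, or None if it does not match."""
    m = ALERT_RE.search(line)
    if m is None:
        return None
    # SERVICE alerts: host;service;state;type;attempt;output (6 fields)
    # HOST alerts:    host;state;type;attempt;output         (5 fields)
    max_splits = 5 if m.group("kind") == "SERVICE" else 4
    fields = m.group("fields").split(";", max_splits)
    return {
        "time": m.group("time"),
        "kind": m.group("kind"),
        "host": fields[0],
        "output": fields[-1].strip(),
    }


def summarize(lines):
    """Count timeout alerts per HH:MM and collect the affected hosts."""
    per_minute = Counter()
    hosts = set()
    for line in lines:
        alert = parse_alert(line)
        if alert is None:
            continue
        text = alert["output"].lower()
        # Assumption: any output mentioning a timeout counts as a timeout alert.
        if "timed out" in text or "timeout" in text:
            per_minute[alert["time"][:5]] += 1
            hosts.add(alert["host"])
    return per_minute, hosts


# A few lines copied from the alerts in this post.
sample = [
    "[08-30-2007 09:24:10] HOST ALERT: tu.xyz.com;DOWN;SOFT;1;CRITICAL - Plugin timed out after 10 seconds",
    "[08-30-2007 09:24:00] SERVICE ALERT: o.xyz.com;SSH;CRITICAL;SOFT;1;CRITICAL - Socket timeout after 10 seconds",
    "[08-30-2007 09:23:40] SERVICE ALERT: pule.xyz.com;PING;WARNING;SOFT;1;PING WARNING - Packet loss = 44%, RTA = 3.64 ms",
    "[08-30-2007 09:23:40] HOST ALERT: e.xyz.com;UP;SOFT;3;PING OK - Packet loss = 0%, RTA = 4.58 ms",
]
counts, affected = summarize(sample)
for minute, n in sorted(counts.items()):
    print(minute, n)
print(len(affected), "hosts with timeouts")
```

To run it against the full alert history, pass the lines of the log file to `summarize` in place of `sample`.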