I have a problem monitoring a large number of Cisco and other non-Unix hosts (about 5000).
I use Nagios 3.0a5.
I do nothing but host checks on these hosts (no service checks), as all I need is availability.
I initially used check_ping for all of them, and the hosts were polled in about 7-8 minutes.
check_ping opens about 100 to 200 processes and takes my machine to loads between 2 and 7.
I have 50% free memory left and the CPU usage reaches 20%.
I then tried check_fping for all of them. For a few hosts check_fping works just fine and is much faster than normal ping, but with all hosts it opens a huge number of processes that stay in the sleep state for a while and then exit.
It reached 4000 processes, if you can believe that, and the load reached 270 :))
-- incredible, right...
This is not a server hardware problem. The machine has:
4 x Intel(R) Xeon(TM) CPU 3.40GHz
280 GB HD space
So is there any way to limit the number of fping processes?
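To make the question concrete: fping accepts many targets in one invocation, so one conceivable way to cap the process count is to ping hosts in batches, one fping process per batch, instead of one process per host. The sketch below only illustrates that idea; it is not how check_fping or Nagios works internally. The function names (batches, parse_fping, check_batch) and the batch size are made up for the example, and it assumes fping's default output lines of the form "host is alive" / "host is unreachable".

```python
import subprocess
from typing import Dict, Iterable, List


def batches(hosts: Iterable[str], size: int) -> List[List[str]]:
    """Split the host list into chunks of at most `size` entries."""
    items = list(hosts)
    return [items[i:i + size] for i in range(0, len(items), size)]


def parse_fping(output: str) -> Dict[str, bool]:
    """Map each host to True/False from lines like '10.0.0.1 is alive'.

    Assumption: fping's default per-host output format.
    """
    status: Dict[str, bool] = {}
    for line in output.splitlines():
        host, sep, state = line.partition(" is ")
        if sep:
            status[host.strip()] = state.strip() == "alive"
    return status


def check_batch(hosts: List[str]) -> Dict[str, bool]:
    """Run ONE fping process for the whole batch and parse its stdout."""
    # fping exits non-zero when any target is unreachable,
    # so the return code is deliberately not treated as an error.
    proc = subprocess.run(["fping", *hosts], capture_output=True, text=True)
    return parse_fping(proc.stdout)


# Hypothetical usage: 5000 hosts in batches of 250 means 20 fping
# processes in total rather than thousands.
# for chunk in batches(all_hosts, 250):
#     results = check_batch(chunk)
```

The results would still have to be fed back into Nagios somehow (for example as passive check results), which this sketch does not cover.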
Should I upgrade to 3.0b5? Is there any major difference?
Is this a known issue that can be solved by changing some config line?
Does anybody monitor a huge number of hosts with Nagios? If so, how do you do it?
Any help will be appreciated.
Thanks a lot