From: Joaquim H. <jo...@we...> - 2010-07-01 08:12:45
We realized that we had a (virtual) SuSE 9.0 system being monitored from the central server. Unfortunately, we need to keep that server in place because we need a binary distribution of MySQL 4.0.x (and it's hard to compile 4.0.x from source on a modern Linux system).

It turns out that Webmin could not use SSL when talking to it, because I had not noticed that the SSL (Perl) module that Webmin needs wasn't installed. By the time we realized this, I had already removed the monitors for that server from the central Webmin and placed them directly on the SuSE 9.0 server. As soon as this server was removed from the central monitoring, things began to act normally.

So I guess the question is how this could affect all the other monitors. Could it be that, because of timeouts on this particular server, the monitors "following" it in the list were delayed so much that they timed out as well, and thus generated a warning?

We found a heap of dead/zombie rpc processes too ...

-joho

On 06/30/2010 10:39 PM, Jamie Cameron wrote:
> Could the issue perhaps be networking problems between the monitoring
> Webmin system and the hosts being monitored? That could lead to false
> reports of outages.
>
> Also, you might want to increase the number of failures before alerting
> for each monitor from 1 to 2 or 3. This will prevent false alarms due to
> short transient failures.
>
> - Jamie
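
To make the two ideas in this thread concrete, here is a minimal Python sketch. It is not Webmin's code, and whether Webmin actually polls its monitors one after another is only the guess raised above. The names `sequential_start_delays` and `AlertGate` and all the timings are hypothetical. The first function shows how one slow host would push back every check queued behind it if checks ran sequentially; the class shows the "failures before alerting" idea, where an alert fires only after several consecutive failures.

```python
def sequential_start_delays(durations):
    """Return how long each check waits before it starts,
    assuming checks run strictly one after another."""
    delays = []
    elapsed = 0.0
    for duration in durations:
        delays.append(elapsed)
        elapsed += duration
    return delays


class AlertGate:
    """Signal an alert only after `threshold` consecutive failures."""

    def __init__(self, threshold=2):
        self.threshold = threshold
        self.consecutive = 0

    def record(self, ok):
        """Record one check result; return True if an alert should fire."""
        if ok:
            self.consecutive = 0  # any success resets the streak
            return False
        self.consecutive += 1
        return self.consecutive >= self.threshold


# Hypothetical timings in seconds: the second host hangs until a 60 s timeout.
durations = [0.5, 60.0, 0.5, 0.5]
print(sequential_start_delays(durations))  # [0.0, 0.5, 60.5, 61.0]

# With a threshold of 2, a single transient failure does not alert.
gate = AlertGate(threshold=2)
print([gate.record(ok) for ok in (False, True, False, False)])
# [False, False, False, True]
```

In this sketch the hosts queued behind the hanging one start a minute late, which would look like a failure to anything expecting a prompt answer, and the gate suppresses the isolated failure but still reports the one that repeats.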