From: Jamie C. <jca...@we...> - 2018-07-04 05:01:04
On 02/Jul/2018 22:48 Joaquim Homrighausen <jo...@we...> wrote ..
>
> We use Webmin for basic monitoring of a number of servers. We usually
> set the check interval to somewhere between 15 and 30 minutes, depending
> on which server it is.
>
> Quite often, we'll get a "monitor down" for a server, at which point we
> check the server and find it to be very much up and running. On the next
> check, Webmin will again report the server (or monitor) to be up. We
> typically see this reported from "monitoring servers" that have a lot of
> checks defined. Could this somehow confuse Webmin or congest the network
> stack somehow?
>
> If this only happened once or on one server, I wouldn't worry so much
> about it. But we recently switched one of our "bigger" monitoring
> servers from one cloud platform to another, with a completely different
> architecture behind it, and then re-created the monitors on the new
> server. Sure enough, pretty soon we began seeing false positives again.

One possible workaround is to edit the ping monitor and increase the
number of failures before emailing from the default of 1 to 2 or 3.

> (By the way, it'd be nice if one could export monitors and then
> re-import them on another server)

You could do this manually already by copying files from
/etc/webmin/status to the new system.
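To illustrate why raising the failure count helps with one-off false
positives, here is a rough Python sketch of the general idea. It is not
Webmin's actual code (Webmin itself is written in Perl); the names
FailureCounter, failures_before_alert and alert_indices are made up for
this example, and it assumes an alert is sent only once the given number
of consecutive failed checks is reached.

```python
class FailureCounter:
    """Hypothetical model of a 'failures before emailing' setting."""

    def __init__(self, failures_before_alert=1):
        if failures_before_alert < 1:
            raise ValueError("failures_before_alert must be at least 1")
        self.failures_before_alert = failures_before_alert
        self.consecutive_failures = 0

    def record(self, is_up):
        """Record one check result; return True if an alert is due now."""
        if is_up:
            # Any successful check resets the run of failures.
            self.consecutive_failures = 0
            return False
        self.consecutive_failures += 1
        # Alert once, at the moment the threshold is reached.
        return self.consecutive_failures == self.failures_before_alert


def alert_indices(checks, failures_before_alert):
    """Return the positions in 'checks' at which an alert would be sent."""
    counter = FailureCounter(failures_before_alert)
    return [i for i, is_up in enumerate(checks) if counter.record(is_up)]


# One isolated failure (index 1), then a real two-check outage (3 and 4).
checks = [True, False, True, False, False, True]
print(alert_indices(checks, 1))
print(alert_indices(checks, 2))
```

With a threshold of 1, the single blip at index 1 triggers an email; with
a threshold of 2, only the outage that lasts two checks in a row does.
The trade-off is that a real outage is reported one check interval later.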