From: Buchan M. <bg...@st...> - 2010-03-10 23:28:25
On Friday, 19 February 2010 17:58:57 Young, Tom wrote:
> Hi,
>
> I have one of three devmon pollers that keeps going purple, every few hours
> or so. Running wireshark shows it completely stops communicating with the
> xymon server. Is there a fix to this other than restarting it every time
> it goes purple, or restarting it every X hours?

In the various environments where I run (or have run) devmon, I have always
had trouble reproducing this problem within a useful timeframe. E.g., in a
previous position, the production environment had 2 servers polling a total
of about 80 devices, and I would see this problem about once every two weeks.

However, I have now committed some changes I have been working on, which
should hopefully:

1) Log more information in case of any socket communication errors
2) Provide information on fork behaviour in the dm test for the polling host
3) Provide for terminating forks that seem to be stalled (by sending data to
   idle forks to ensure they are alive), as a workaround that should remove
   the need for scripts that restart devmon
4) Add timeouts for all socket communication (using "alarm" and "eval"),
   hopefully fixing the original problem

If you can reproduce this issue more frequently, I would appreciate it if
you could run the version currently in subversion (rev 180 or later).
Preferably, run it in debug mode with high (5) verbosity, e.g.:

  ./devmon --debug -vvvvv -d /var/lib/devmon/hosts.db -c /etc/devmon.cfg

If the problem persists, watch to see whether the dm test for the polling
device goes yellow or red, or polled devices go purple, and provide the log
contents for at least the last two poll cycles before any of these changes
(or the relevant problems).

Ideally, provide your feedback on the SF tracker:
https://sourceforge.net/tracker/index.php?func=detail&aid=2897345&group_id=160720&atid=816977

Regards,
Buchan
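
A note on item 4, for anyone unfamiliar with the pattern: the idea behind an "alarm" plus "eval" timeout is to arm a timer before a blocking socket call and let the timer's signal handler abort the call if it never returns. The sketch below is not devmon's code; it is a rough Python analogue of that idiom, and the names recv_with_timeout and ReadTimeout are made up for illustration. It assumes a Unix system and must run in the main thread, since it relies on SIGALRM.

```python
import signal
import socket


class ReadTimeout(Exception):
    """Hypothetical exception raised by the SIGALRM handler."""


def _on_alarm(signum, frame):
    # Raising here aborts whatever blocking call is in progress.
    raise ReadTimeout()


def recv_with_timeout(sock, nbytes, seconds):
    """Return data from sock, or None if nothing arrives within `seconds`.

    Illustrative helper only: Unix, main thread, one timer at a time.
    """
    old_handler = signal.signal(signal.SIGALRM, _on_alarm)
    # setitimer(ITIMER_REAL, ...) delivers SIGALRM just as alarm() does,
    # but also accepts fractional seconds.
    signal.setitimer(signal.ITIMER_REAL, seconds)
    try:
        return sock.recv(nbytes)
    except ReadTimeout:
        return None
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)      # cancel any pending timer
        signal.signal(signal.SIGALRM, old_handler)   # restore previous handler


if __name__ == "__main__":
    parent, child = socket.socketpair()
    child.sendall(b"ping")
    print(recv_with_timeout(parent, 16, 0.5))  # data is already waiting
    print(recv_with_timeout(parent, 16, 0.5))  # nothing was sent: times out
```

The point of the pattern is that a stalled peer can no longer block the caller forever; the caller gets a definite "no answer" result it can log and act on.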