[Mon-devel] mon "fork bomb"
From: Anders S. <and...@ba...> - 2011-11-17 09:51:22
Hello,

We've been running mon for a decade now, and it's been working great.
However, over the last month we've started to run into a problem. The best
explanation I have is that mon "fork bombs", and the load goes through the
roof.

I've only been able to do a dump of ps before we had to reboot the system,
and it lists:

365 of these:

    qroot 607 0.0 3.0 225952 124644 ? S 13:36 0:00 /local/bin/perl /local/etc/mon/mon -l -f -c /etc/mon/mon.cf -P /var/run/mon.pid
    root 3044 2.6 3.3 225952 137156 ? D 13:38 0:09 /local/bin/perl /local/etc/mon/mon -l -f -c /etc/mon/mon.cf -P /var/run/mon.pid

and 1684 of these:

    root 3043 0.7 0.0 0 0 ? Z 13:38 0:02 [mon] <defunct>

This week it has happened 3 times already. This is something I haven't
seen in the past.

During normal use, the system doesn't seem to be overloaded:

    [root@mon03.osl mon]# free
                 total       used       free     shared    buffers     cached
    Mem:       4040056    1654504    2385552          0      68576    1119588
    -/+ buffers/cache:     466340    3573716
    Swap:      4192944          0    4192944
    [root@mon03.osl mon]# uptime
     10:10am  up 1:11,  5 users,  load average: 3.11, 3.05, 2.77

I'm still trying to debug this whenever it happens, but there are limits to
how long we can debug each time, as monitoring is down during this period,
and debugging under a triple-digit load is tedious. I haven't found any
indicators in any logfiles either.

Does anyone have any ideas what could be causing this?

--
Anders Synstad
Basefarm AS
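
Since there is little time to inspect the box before a reboot, one way to get the same per-state counts out of a saved ps dump afterwards is a small script like the one below. This is only a sketch under assumptions: it expects `ps aux`-style columns as in the lines quoted above, the function name `count_mon_states` is made up for illustration, and the two match strings are simply taken from the commands shown in this post.

```python
from collections import Counter

# Substrings taken from the ps lines quoted in this post (an assumption:
# adjust them if mon is installed under a different path).
MON_MARKERS = ("/mon/mon", "[mon]")


def count_mon_states(ps_text, markers=MON_MARKERS):
    """Count mon processes in `ps aux`-style text, grouped by the first
    letter of the STAT column (S = sleeping, D = uninterruptible sleep,
    Z = zombie/defunct)."""
    counts = Counter()
    for line in ps_text.splitlines():
        # USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
        fields = line.split(None, 10)
        if len(fields) < 11 or fields[0] == "USER":
            continue  # skip the header and anything that is not a full row
        stat, command = fields[7], fields[10]
        if any(marker in command for marker in markers):
            counts[stat[0]] += 1
    return counts


# Two of the lines from the dump above, used as sample input.
SAMPLE = (
    "root 3044 2.6 3.3 225952 137156 ? D 13:38 0:09 /local/bin/perl "
    "/local/etc/mon/mon -l -f -c /etc/mon/mon.cf -P /var/run/mon.pid\n"
    "root 3043 0.7 0.0 0 0 ? Z 13:38 0:02 [mon] <defunct>\n"
)

for state, number in sorted(count_mon_states(SAMPLE).items()):
    print(state, number)
```

Run against a full dump saved to a file, the Z count relative to the S and D counts gives a quick picture of how many children exited without being reaped compared with how many mon processes were still alive.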