I enabled the keepalive daemon on my 3-node SSI cluster (two of the nodes
are init-node capable). I am able to run my programs fine using the
spawndaemon interface.
This is the line I added to inittab for keepalive:
[root@... root]# grep keepalive /etc/inittab
I have a basic shell script which checks for a previous instance of the
program, unregisters it from keepalive using the slot number, then spawns a
fresh instance of the program for the given day. This part works fine.
However, after some time crond dies abruptly with nothing logged to syslog
or /var/log/cron. This has happened three or four times so far, and I have
no idea why.
crond was stable until I enabled the keepalive daemon. Does anyone have a
clue how I could get more information on this problem and fix it?
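
One thing that might help narrow down when crond dies is a small watcher
that polls the crond PID and writes a timestamp the moment it disappears,
so the time can be matched against keepalive activity. This is only a
minimal sketch: the function name watch_pid, the default log path, and the
pid file location /var/run/crond.pid in the usage comment are assumptions
for illustration, not something taken from the cluster itself.

```shell
#!/bin/sh
# watch_pid PID [INTERVAL] [LOGFILE]
# Polls until PID no longer exists, then appends a timestamped line to
# LOGFILE. All names and default paths here are hypothetical.
watch_pid() {
    pid=$1
    interval=${2:-5}
    logfile=${3:-/tmp/crond-watch.log}

    # kill -0 delivers no signal; it only tests whether the process exists.
    # Run this as root, otherwise a permission error looks like "gone".
    while kill -0 "$pid" 2>/dev/null; do
        sleep "$interval"
    done

    echo "$(date '+%Y-%m-%d %H:%M:%S') pid $pid is gone" >> "$logfile"
}

# Example use (the pid file path is an assumption; adjust for your system):
#   watch_pid "$(cat /var/run/crond.pid)" 5 /var/log/crond-watch.log &
```

The watcher only records when the process vanished, not why, so it is a
starting point for correlating with other logs rather than a diagnosis.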
Also, ever since I enabled keepalive, the pidof command hangs and never
returns, and while it is running the whole cluster hangs as well.
(pidof crond, to be specific; ps -ef | grep crond does not cause any
problems.)
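
Since ps has not caused any problems so far, a lookup built on ps could
stand in for pidof in scripts in the meantime. The following is a rough
sketch under that assumption; the function name pids_of is made up for
illustration, and it matches on the short command name only (which ps may
truncate for long names).

```shell
#!/bin/sh
# pids_of NAME
# Prints the PID of every process whose command name equals NAME, one per
# line, using ps instead of pidof. The function name is hypothetical.
pids_of() {
    # "-o pid= -o comm=" prints PID and command name with no header line.
    ps -e -o pid= -o comm= | awk -v name="$1" '$2 == name { print $1 }'
}

# Example use:
#   pids_of crond
```

This does not explain why pidof hangs; it only avoids calling it.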