From: Russ G. <gr...@la...> - 2003-03-06 00:46:43
Sorry for more confusion, but I am having some trouble getting some things working.

I am trying out the CI software to see if it will work in a loosely coupled cluster setup I am building, using ci-linux-2.4.18-v0.7.6 and cluster-tools-0.7.6. I have two nodes (for now), identical; both have fresh RH7.3 installs and fresh 2.4.18 patched kernels.

The CI kernel patches are installed and seem to be working. cluster -V gives:

    [root@b root]# cluster -V
    Node 1:
        State: UP
        Previous state: COMINGUP
        Reason for last transition: API
        Last transition ID: 4
        Last transition time: Wed Mar 5 16:49:45.654512 2003
        First transition ID: 3
        First transition time: Wed Mar 5 16:49:45.604512 2003
        Number of CPUs: 1
        Number of CPUs online: 1
    Node 2:
        State: UP
        Previous state: COMINGUP
        Reason for last transition: API
        Last transition ID: 2
        Last transition time: Wed Mar 5 16:49:37.174512 2003
        First transition ID: 1
        First transition time: Wed Mar 5 16:49:37.104512 2003
        Number of CPUs: 1
        Number of CPUs online: 1
    [root@b root]#

I installed the cluster-tools ssi components spawndaemon and keepalive. Here come the questions:

At first, the /dev/keepalivecfg pipe was not created as part of the install, so I made it by hand. I also had to add the keepalive section to the inittab by hand. Did I miss something in the install?

Now, after a reboot, the CI stuff seems to be working OK, but an attempt to use spawndaemon leaves the following log entry:

    Mar 5 17:06:00 b spawndaemon[1051]: spawndaemon: Could not open pipe /dev/keepalivecfg. Keepalive is not active. Retrying ...

I have to find the running keepalive and kill it, then let init restart it; after that it seems to be able to open the pipe. Is this a problem in how I am starting keepalive?

And finally, does keepalive run on each node, or only once on the cluster? Currently I am running keepalive on each node.

If I run spawndaemon on node 1 and try to register a daemon to run on node 2, the keepalive on node 1 registers a failure to start the daemon, yet the daemon does start, only on node 1. Node 2 seems oblivious to the whole thing. Hmmmm.

    [root@b log]# spawndaemon -L -v human
    keepalive running: TRUE
    quiesce flag: FALSE
    pid: 1058
    node number: -1
    registered processes: 0
    table size: 200
    max. possible processes: 200
    polling: FALSE
    polling interval: 5
    primary node: None
    secondary node: None
    [root@b log]#

It seems incorrect that the node number is -1.... Have I missed something fundamental in the configuration, or am I off in some other variable space?

TIA,
r.
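
[Editor's note] The hand-made pipe described in this message can be checked with a few lines of shell. This is only a minimal sketch: the path /dev/keepalivecfg comes from the spawndaemon log entry above, while the helper name ensure_fifo and the mode 600 are hypothetical and not taken from the cluster-tools install scripts. It does not address the restart ordering between keepalive and spawndaemon.

```shell
#!/bin/sh
# ensure_fifo (hypothetical helper): make sure a named pipe exists at PATH.
# Prints what it found or did; fails if PATH exists but is not a FIFO.
ensure_fifo() {
    path="$1"
    if [ -p "$path" ]; then
        # Already a named pipe, nothing to do.
        echo "exists: $path"
    elif [ -e "$path" ]; then
        # Something else occupies the path (regular file, directory, ...).
        echo "not a FIFO: $path" >&2
        return 1
    else
        # Mode 600 is an assumption, not a documented keepalive requirement.
        mkfifo -m 600 "$path" && echo "created: $path"
    fi
}

# Usage (as root): ensure_fifo /dev/keepalivecfg
```

Running the check before keepalive starts would at least rule out a missing or wrong-typed pipe as the cause of the "Could not open pipe" message.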
From: Brian J. W. <Bri...@hp...> - 2003-04-14 20:51:30
Russ Gritzo wrote:
> I am trying out the CI software to see if it will work in a loosely
> coupled cluster setup I am building. Using
> ci-linux-2.4.18-v0.7.6 and cluster-tools-0.7.6.
>
> [snip]
>
> Installed the cluster-tools ssi components spawndaemon and keepalive.

Sorry for the long delay. The ssi components of cluster-tools (including spawndaemon and keepalive) are only intended for use with the OpenSSI kernel (openssi.org). Both CI and OpenSSI are maintained by the same group of developers.

Unfortunately, there are also a few commands outside the ssi/ directory which are only intended for use with OpenSSI (e.g., loadlevel, loads, onall, onnode, etc.).

Only libcluster and the cluster* commands (and maybe ha-lvs, Aneesh?) are meant to be used with CI.

Sorry about the confusion,

Brian
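
[Editor's note] Since the cluster* commands are the ones meant for CI, node status can be summarised from `cluster -V` output like that shown in the first message. The sketch below is an assumption-laden illustration, not part of cluster-tools: the function name node_states is hypothetical, and it assumes each "Node N:" header and each "State:" field sits on its own line, which the flattened archive text does not confirm.

```shell
#!/bin/sh
# node_states (hypothetical helper): read "cluster -V"-style text on stdin
# and print one "<node number> <state>" pair per node.
# Assumes "Node N:" and "State: X" each appear on their own line.
node_states() {
    awk '
        # Remember the node number from a "Node N:" header line.
        /^[[:space:]]*Node [0-9]+:/ { sub(/:.*/, "", $2); node = $2 }
        # "Previous state:" does not match: the pattern is anchored
        # at the start of the line and is case-sensitive.
        /^[[:space:]]*State:/ { print node, $2 }
    '
}

# Usage: cluster -V | node_states
```

For the two-node output in the first message, this would be expected to list both nodes as UP, provided the line layout assumption holds.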
From: Aneesh K. K.V <ane...@di...> - 2003-04-15 03:04:10
On Tue, 2003-04-15 at 02:19, Brian J. Watson wrote:
> Russ Gritzo wrote:
> > I am trying out the CI software to see if it will work in a loosely
> > coupled cluster setup I am building. Using
> > ci-linux-2.4.18-v0.7.6 and cluster-tools-0.7.6.
> >
> > [snip]
> >
> > Installed the cluster-tools ssi components spawndaemon and keepalive.
>
> Sorry for the long delay. The ssi components of cluster-tools (including
> spawndaemon and keepalive) are only intended for use with the OpenSSI
> kernel (openssi.org). Both CI and OpenSSI are maintained by the same
> group of developers.
>
> Unfortunately, there are also a few commands outside the ssi/ directory
> which are only intended for use with OpenSSI (e.g., loadlevel, loads,
> onall, onnode, etc.).
>
> Only libcluster and the cluster* commands (and maybe ha-lvs, Aneesh?)
> are meant to be used with CI.
>

No, ha-lvs needs a cluster-wide root (/etc) for finding its config files. It also uses /etc/lvs.VIP.active to inform other nodes of the master director node for a CVIP, and it uses vproc chan for registering the cluster service. I have also kept the files under linux/cluster/ssi/net, so one would also have to play with the Makefiles to get it to build. That makes it a bit difficult.

-aneesh