Re: [SSI] Building a cluster with cheap parts...
From: john c. <ca...@wo...> - 2001-09-10 23:33:17
> > john casu wrote:
> >
> > We've been trying to build an SSI cluster, with some parts that we
> > had lying around, including a PII box, and a Pentium box
> >
> > We got to the point where we could start 2 single node clusters,
> > and have them share GFS storage, but as soon as we tried to get the
> > second machine (in this case, the pentium box) to join the cluster,
> > all hell would break loose:
> >
> > second box crashes with a NULL pointer,
> > and the first box gets stuck in a loop with memexpd error messages
> >
> > The question I have is what hardware configuration I need to have
> > in order to duplicate an existing successful SSI installation.
> > Is there a set of minimal hw requirements ?
> >
> We've only tested with memexpd running on a box outside the cluster. If
> you have a third box on the same LAN as your cluster interconnect, copy
> memexpd to it and run it there. Edit /etc/gfscf.cf on node 1 to specify
> the IP address of the third box and write it out to the first GFS
> partition using gfsconf (you might have to erase the old data first with
> the -e option). Bring up node 1 all the way, then bring up node 2 (our
> single init doesn't handle simultaneous boots, yet).

ok.. let me be a little more precise as to what's going on.
We did try to follow the instructions as closely as possible.

We boot our external separate 3rd machine running memexpd, and then
we boot node 1, our master. Once node 1 is completely booted, then we
boot node 2.

node 2 then crashes with a NULL pointer in process icssvr_daemon
(and that's all the information we can get from the console, and
there's nothing in the logs)

In node 1 I get a bunch of informational messages:

spawn_daemon_thread: Truncated daemon name: ics_accept_connection
(a whole bunch of these)
spawn_daemon_proc: Truncated daemon name: VPROC Slave Daemon
(just one of these)

Then, when I reboot node 2, the last message it gets to is:

Found node 1 as root node
Waiting to join cluster.

Then it just sits there, presumably waiting to join the cluster.

Now on node 1, I get a continuous loop of:

memexpd asking for lock (4,13414464) action 3 from 2
spawn_daemon_thread: Truncated daemon name: ics_accept_connection

hope this helps,
thanks for your time,
john c.
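
For anyone wanting to check from a longer console capture whether node 1 really is repeating the same lock request, a small script along these lines might help. It is only a sketch: the two message shapes are copied from the output quoted above, the group names (a, b, action, node) and the summarize() helper are hypothetical labels of mine, and nothing here relies on knowing what the numbers actually mean to memexpd.

```python
import re
from collections import Counter

# Message shapes taken from the console output quoted in this thread.
# The group names below are illustrative labels only; the thread does
# not say what the individual numbers mean.
LOCK_RE = re.compile(
    r"memexpd asking for lock \((?P<a>\d+),(?P<b>\d+)\) "
    r"action (?P<action>\d+) from (?P<node>\d+)"
)
TRUNC_RE = re.compile(
    r"(?P<func>spawn_daemon_\w+): Truncated daemon name: (?P<name>.+)"
)


def summarize(lines):
    """Count identical lock requests and truncated-daemon-name notices."""
    locks = Counter()
    truncated = Counter()
    for line in lines:
        line = line.strip()
        m = LOCK_RE.search(line)
        if m:
            key = (int(m["a"]), int(m["b"]), int(m["action"]), int(m["node"]))
            locks[key] += 1
        m = TRUNC_RE.search(line)
        if m:
            truncated[(m["func"], m["name"].strip())] += 1
    return locks, truncated


if __name__ == "__main__":
    # Sample input: the lines reported from node 1, repeated once.
    sample = [
        "memexpd asking for lock (4,13414464) action 3 from 2",
        "spawn_daemon_thread: Truncated daemon name: ics_accept_connection",
        "memexpd asking for lock (4,13414464) action 3 from 2",
        "spawn_daemon_thread: Truncated daemon name: ics_accept_connection",
    ]
    locks, truncated = summarize(sample)
    for key, count in locks.most_common():
        print("lock request", key, "seen", count, "times")
    for key, count in truncated.most_common():
        print("truncated name", key, "seen", count, "times")
```

If a single lock tuple dominates the counts, that would suggest the loop is one request being retried rather than many different ones, which may be useful detail for a follow-up report.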