Thread: [SSI] Building a cluster with cheap parts...
From: john c. <ca...@wo...> - 2001-09-10 21:33:41
We've been trying to build an SSI cluster with some parts that we had lying
around, including a PII box and a Pentium box.

We got to the point where we could start two single-node clusters and have
them share GFS storage, but as soon as we tried to get the second machine
(in this case, the Pentium box) to join the cluster, all hell would break
loose: the second box crashes with a NULL pointer, and the first box gets
stuck in a loop with memexpd error messages.

The question I have is what hardware configuration I need in order to
duplicate an existing successful SSI installation. Is there a set of
minimal hardware requirements?

thanks,
john c.
From: Brian J. W. <Bri...@co...> - 2001-09-10 22:34:28
john casu wrote:
>
> We've been trying to build an SSI cluster with some parts that we had
> lying around, including a PII box and a Pentium box.
>
> We got to the point where we could start two single-node clusters and
> have them share GFS storage, but as soon as we tried to get the second
> machine (in this case, the Pentium box) to join the cluster, all hell
> would break loose: the second box crashes with a NULL pointer, and the
> first box gets stuck in a loop with memexpd error messages.
>
> The question I have is what hardware configuration I need in order to
> duplicate an existing successful SSI installation. Is there a set of
> minimal hardware requirements?

We've only tested with memexpd running on a box outside the cluster. If you
have a third box on the same LAN as your cluster interconnect, copy memexpd
to it and run it there. Edit /etc/gfscf.cf on node 1 to specify the IP
address of the third box and write it out to the first GFS partition using
gfsconf (you might have to erase the old data first with the -e option).
Bring up node 1 all the way, then bring up node 2 (our single init doesn't
handle simultaneous boots yet).

If you're feeling adventurous and/or thrifty, I think you can make memexpd
work on node 1. The trick is to edit node 1's ramdisk: gunzip it and mount
it using the loopback device, copy the memexpd program into bin, and edit
the linuxrc script to start memexpd before doing the GFS mount. Then
unmount the ramdisk, gzip it, and re-run lilo. (There's a rough command
sketch at the end of this message.) I think this configuration will work
if you always bring up node 1 first. The problem is that you can't shut
off node 1 without bringing down the cluster.

A better solution is to integrate IBM's distributed lock manager (DLM), our
cluster membership subsystem (CLMS -- part of CI and SSI), and GFS. Then
the lock server would be part of the cluster, and it could gracefully
handle arbitrary node failures. Both Sistina and the OpenGFS project are
talking about doing this.

BTW, node 2 shouldn't have crashed with a NULL pointer dereference. There's
probably a bug in our code, much as it pains me to admit. ;)

--
Brian Watson                  | "The common people of England... so
Linux Kernel Developer        | jealous of their liberty, but like the
Open SSI Clustering Project   | common people of most other countries
Compaq Computer Corp          | never rightly considering wherein it
Los Angeles, CA               | consists..."
                              |   -Adam Smith, Wealth of Nations, 1776
mailto:Bri...@co...
http://opensource.compaq.com/
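In case it helps, that ramdisk surgery would look roughly like the following
(a sketch only -- the initrd filename, mount point, and memexpd location are
guesses; substitute whatever your lilo.conf actually points at):

    cp /boot/initrd-ssi.gz /boot/initrd-ssi.gz.bak  # keep a backup first
    gunzip /boot/initrd-ssi.gz                      # decompress the ramdisk image
    mkdir -p /mnt/initrd
    mount -o loop /boot/initrd-ssi /mnt/initrd      # mount it via the loopback device
    cp /path/to/memexpd /mnt/initrd/bin/            # copy memexpd into the ramdisk's bin
    vi /mnt/initrd/linuxrc                          # start memexpd before the GFS mount
    umount /mnt/initrd
    gzip /boot/initrd-ssi                           # recompress back to initrd-ssi.gz
    lilo                                            # re-run lilo to pick up the change

The linuxrc change is just an early line that launches memexpd (however it
is normally invoked on your setup) ahead of the GFS mount.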
From: john c. <ca...@wo...> - 2001-09-10 23:33:17
> john casu wrote:
> >
> > We've been trying to build an SSI cluster with some parts that we had
> > lying around, including a PII box and a Pentium box.
> >
> > We got to the point where we could start two single-node clusters and
> > have them share GFS storage, but as soon as we tried to get the second
> > machine (in this case, the Pentium box) to join the cluster, all hell
> > would break loose: the second box crashes with a NULL pointer, and the
> > first box gets stuck in a loop with memexpd error messages.
> >
> > The question I have is what hardware configuration I need in order to
> > duplicate an existing successful SSI installation. Is there a set of
> > minimal hardware requirements?
>
> We've only tested with memexpd running on a box outside the cluster. If
> you have a third box on the same LAN as your cluster interconnect, copy
> memexpd to it and run it there. Edit /etc/gfscf.cf on node 1 to specify
> the IP address of the third box and write it out to the first GFS
> partition using gfsconf (you might have to erase the old data first with
> the -e option). Bring up node 1 all the way, then bring up node 2 (our
> single init doesn't handle simultaneous boots yet).

OK, let me be a little more precise about what's going on. We did try to
follow the instructions as closely as possible.

We boot our separate third machine running memexpd, and then we boot
node 1, our master. Once node 1 is completely booted, we boot node 2.

Node 2 then crashes with a NULL pointer in process icssvr_daemon (that's
all the information we can get from the console, and there's nothing in
the logs).

On node 1 I get a bunch of informational messages:

    spawn_daemon_thread: Truncated daemon name: ics_accept_connection   (a whole bunch of these)
    spawn_daemon_proc: Truncated daemon name: VPROC Slave Daemon        (just one of these)

Then, when I reboot node 2, the last messages it gets to are:

    Found node 1 as root node
    Waiting to join cluster.

and it just sits there, presumably waiting to join the cluster. Now on
node 1, I get a continuous loop of:

    memexpd asking for lock (4,13414464) action 3 from 2
    spawn_daemon_thread: Truncated daemon name: ics_accept_connection

hope this helps, and thanks for your time,

john c.
From: Kai-Min S. <ks...@ca...> - 2001-09-11 02:16:30
Hi John,

This seems to be some race condition in the ICS code. I'm not sure why
we're not seeing this in our test cluster, but I'm currently looking into
it.

It would be helpful to get a dump of the kernel stack at the time of the
NULL dereference. At our lab, we normally patch our kernels with KDB so
that we can dump the kernel stack during a panic or NULL dereference.
Here's a link to the KDB site run by SGI:

    http://oss.sgi.com/projects/kdb/

We've modified their patch for our specific SSI source. You can retrieve
the patch from this link:

    http://ci-linux.sourceforge.net/download/kdb-v1.8-2.4.6-ssi

To apply the patch, run this command at the top of your source tree:

    patch -p1 < kdb-v1.8-2.4.6-ssi

There might be some fuzz, but don't worry about it as long as you don't see
any "failed" messages. Make sure the following CONFIG options are turned on
when you recompile your kernel:

    CONFIG_KDB=y
    CONFIG_KALLSYMS=y
    CONFIG_FRAME_POINTER=y

When you boot this kdb-enhanced kernel, you'll automatically be dropped
into the debugger when your kernel panics or hits a NULL pointer
dereference. You can also drop into the debugger manually by hitting the
Pause/Break key. At the kdb prompt, type "bt" to get a stack trace. It
would be helpful if you could send me the stack trace of the machine that
crashed. There is more documentation on the various kdb commands in the
Documentation/kdb directory of the patched source tree. (A condensed
command summary is sketched after the quoted text below.)

Thanks,
Kai-Min Sung
CI/SSI Linux Developer
kai...@co...

john casu wrote:
>
> > john casu wrote:
> > >
> > > We've been trying to build an SSI cluster with some parts that we had
> > > lying around, including a PII box and a Pentium box.
> > >
> > > We got to the point where we could start two single-node clusters and
> > > have them share GFS storage, but as soon as we tried to get the
> > > second machine (in this case, the Pentium box) to join the cluster,
> > > all hell would break loose: the second box crashes with a NULL
> > > pointer, and the first box gets stuck in a loop with memexpd error
> > > messages.
> > >
> > > The question I have is what hardware configuration I need in order
> > > to duplicate an existing successful SSI installation. Is there a set
> > > of minimal hardware requirements?
> >
> > We've only tested with memexpd running on a box outside the cluster.
> > If you have a third box on the same LAN as your cluster interconnect,
> > copy memexpd to it and run it there. Edit /etc/gfscf.cf on node 1 to
> > specify the IP address of the third box and write it out to the first
> > GFS partition using gfsconf (you might have to erase the old data
> > first with the -e option). Bring up node 1 all the way, then bring up
> > node 2 (our single init doesn't handle simultaneous boots yet).
>
> OK, let me be a little more precise about what's going on. We did try to
> follow the instructions as closely as possible.
>
> We boot our separate third machine running memexpd, and then we boot
> node 1, our master. Once node 1 is completely booted, we boot node 2.
>
> Node 2 then crashes with a NULL pointer in process icssvr_daemon (that's
> all the information we can get from the console, and there's nothing in
> the logs).
>
> On node 1 I get a bunch of informational messages:
>
>     spawn_daemon_thread: Truncated daemon name: ics_accept_connection   (a whole bunch of these)
>     spawn_daemon_proc: Truncated daemon name: VPROC Slave Daemon        (just one of these)
>
> Then, when I reboot node 2, the last messages it gets to are:
>
>     Found node 1 as root node
>     Waiting to join cluster.
>
> and it just sits there, presumably waiting to join the cluster. Now on
> node 1, I get a continuous loop of:
>
>     memexpd asking for lock (4,13414464) action 3 from 2
>     spawn_daemon_thread: Truncated daemon name: ics_accept_connection
>
> hope this helps, and thanks for your time,
>
> john c.
>
> _______________________________________________
> ssic-linux-devel mailing list
> ssi...@li...
> https://lists.sourceforge.net/lists/listinfo/ssic-linux-devel
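Condensed, the KDB procedure above might look something like this (a sketch
only -- the source-tree path and the 2.4-style build steps are assumptions;
build and install the kernel however you normally do for your SSI tree):

    cd /usr/src/linux-ssi                     # top of the patched SSI source tree (path is a guess)
    wget http://ci-linux.sourceforge.net/download/kdb-v1.8-2.4.6-ssi
    patch -p1 < kdb-v1.8-2.4.6-ssi            # some fuzz is OK; "failed" hunks are not

    # turn these on in the kernel config before rebuilding:
    #   CONFIG_KDB=y
    #   CONFIG_KALLSYMS=y
    #   CONFIG_FRAME_POINTER=y
    make menuconfig
    make dep bzImage modules modules_install  # 2.4-era build; install the image and re-run lilo as usual

After booting the kdb kernel, a panic or NULL dereference (or the
Pause/Break key) drops you to the kdb prompt on the console, where "bt"
prints the kernel stack trace to send back to the list.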