From: <er...@he...> - 2002-11-01 00:28:42
On Thu, Oct 31, 2002 at 04:04:52PM -0800, Joshua J. England wrote:
>
> I think I'm getting very close now. I'm finally catching some RARPs
> with beoserv when a slave boots, although the slave dies pretty
> quickly. The last thing seen on the slave is:
>
> boot: Server IP address: 10.0.4.100
> boot: My IP address : 10.0.4.10
> boot: starting bpslave: bpslave -d -i 10.0.4.100 2223
> bpslave: IO daemon started; pid=11
>
> beoserv on the master shows:
>
> beoserv: RARP: 00:30:59:00:98:26 == 10.0.4.10
> beoserv: Starting node_up worker for 1 clients.
> nodeup : Child process for node 0 died with signal 4
>
>
> I'm booting from an elf image created from a standard bproc kernel,
> along with the initrd created by 'beoboot -2'. Is this considered a
> badbadthing? Do I need to roll my own initrd and run 'bpslave' from it?

This is just the node setup program from beoboot. BProc is running
and it appears to be at least mostly happy. Try this:

/usr/lib/beoboot/bin/node_up -s ##

This is the way to run the node setup program in interactive mode.
This will let you muck around with it without having to reboot all the
time.

SIGILL (signal 4) sounds like there might be a migration problem of
some kind. Did I just say BProc appeared happy? Whups.

Here come the questions:

Are there mixed architectures between the slave and the front end?
(e.g. a P4 front end and an Athlon slave node) If so, you need to
make sure that the libraries you have installed will run on both
nodes. I believe Red Hat (and possibly others) have started shipping
libraries compiled specifically for i686, etc.

Are there any messages on the slave's console at all? Some kind of
mapping failure could be a clue here.

Make sure your library list (bplib -l) doesn't include everything in
/lib and /usr/lib.
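A quick way to sanity-check the size of that list is to count its
entries. The following is only a rough sketch: it assumes that
`bplib -l` prints one library path per line, and `summarize_libs` is a
made-up helper name, not part of BProc.

```shell
# Hypothetical helper: summarize a library list read from stdin.
# Assumes one path per line (the output format of `bplib -l` is an
# assumption here, not something confirmed above).
summarize_libs() {
    awk '
        /^\/usr\/lib\// { usr++ }   # entries under /usr/lib
        /^\/lib\//      { lib++ }   # entries under /lib
        END { printf "lib=%d usr_lib=%d total=%d\n", lib, usr, NR }
    '
}

# Intended use on the front end (not run here):
#   bplib -l | summarize_libs
```

If the totals look like the whole contents of /lib and /usr/lib rather
than a dozen or two entries, the list is probably too broad.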
Here are the "libraries" lines that I'm using in my
/etc/beowulf/config:

libraries /lib/ld-2* /lib/libc-2* /lib/libm-2* /lib/libcrypt*
libraries /lib/librt-2* /lib/libpthread-*
libraries /usr/lib/libbproc* /lib/libtermcap* /lib/libproc*
libraries /lib/libresolv-2*
libraries /lib/libpthread*
libraries /lib/libnss_bproc*
libraries /lib/libdl-2*
libraries /lib/libnsl*
libraries /usr/lib/libncurses*
libraries /lib/libutil-2*

> Also, what is the role of the 'bootfile' parameter in
> /etc/beowulf/config? It looks like beoserv feeds it to a slave after a
> RARP request, but changing it seems to have no effect.

Hrm. It should have some effect. Make sure you SIGHUP beoserv after
modifying the file.

- Erik