From: Joshua J. E. <jj...@sa...> - 2002-11-01 00:43:15
Uh-Oh. I think you might have hit it. I'm running RH8.0 on a PIII as
the master for smartcore PII slaves. I think i686 libs might not be
happy on the PIIs.

What to do? Install i386 libs in a separate partition or scrap the
master and go with an identical arch?

-JE

On Thu, 2002-10-31 at 17:20, er...@he... wrote:
> On Thu, Oct 31, 2002 at 04:04:52PM -0800, Joshua J. England wrote:
> >
> > I think I'm getting very close now. I'm finally catching some RARPs
> > with beoserv when a slave boots, although the slave dies pretty
> > quickly. The last thing seen on the slave is:
> >
> > boot: Server IP address: 10.0.4.100
> > boot: My IP address : 10.0.4.10
> > boot: starting bpslave: bpslave -d -i 10.0.4.100 2223
> > bpslave: IO daemon started; pid=11
> >
> > beoserv on the master shows:
> >
> > beoserv: RARP: 00:30:59:00:98:26 == 10.0.4.10
> > beoserv: Starting node_up worker for 1 clients.
> > nodeup : Child process for node 0 died with signal 4
> >
> >
> > I'm booting from an elf image created from a standard bproc kernel,
> > along with the initrd created by 'beoboot -2'. Is this considered a
> > badbadthing? Do I need to roll my own initrd and run 'bpslave' from it?
>
> This is just the node setup program from beoboot. BProc is running
> and it appears to be at least mostly happy.
>
> Try this:
> /usr/lib/beoboot/bin/node_up -s ##
>
> This is the way to run the node setup program in interactive mode.
> This will let you muck around with it without having to reboot all the
> time.
>
> SIGILL sounds like there might be a migration problem of some kind.
> Did I just say BProc appeared happy? Whups.
>
> Here come the questions:
>
> Are there mixed architectures between the slave and the front end?
> (e.g. a P4 front end and an athlon slave node) If so, you need to make
> sure that the libraries you have installed will run on both nodes. I
> believe Red Hat (and possibly others) have started shipping libraries
> compiled specifically for i686, etc.
>
> Are there any messages on the slave's console at all? Some kind of
> mapping failure could be a clue here. Make sure your library list
> (bplib -l) doesn't include everything in /lib and /usr/lib.
>
> Here are the "libraries" lines that I'm using in my /etc/beowulf/config
>
> libraries /lib/ld-2* /lib/libc-2* /lib/libm-2* /lib/libcrypt*
> libraries /lib/librt-2* /lib/libpthread-*
> libraries /usr/lib/libbproc* /lib/libtermcap* /lib/libproc*
> libraries /lib/libresolv-2*
> libraries /lib/libpthread*
> libraries /lib/libnss_bproc*
> libraries /lib/libdl-2*
> libraries /lib/libnsl*
> libraries /usr/lib/libncurses*
> libraries /lib/libutil-2*
>
>
> > Also, what is the role of the 'bootfile' parameter in
> > /etc/beowulf/config? It looks like beoserv feeds it to a slave after a
> > RARP request, but changing it seems to have no effect.
>
> Hrm. It should have some effect. Make sure you SIGHUP beoserv after
> modifying the file.
>
> - Erik
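
On the mixed-architecture question above, one rough way to see where the
two CPUs differ is to compare the "flags" line of /proc/cpuinfo from the
master and from a slave. The sketch below is only an illustration, not
part of BProc or beoboot. It assumes you have already saved each
machine's /proc/cpuinfo to a text file by whatever means is convenient.
The function names and the command-line usage are made up for this
example.

```python
import sys


def cpu_flags(cpuinfo_text):
    """Return the set of feature flags from /proc/cpuinfo-style text.

    Only the first "flags" line is used, i.e. the first CPU listed.
    """
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_on_slave(master_text, slave_text):
    """Sorted list of flags the master CPU reports but the slave does not."""
    return sorted(cpu_flags(master_text) - cpu_flags(slave_text))


# Hypothetical usage: python cpuflags.py master_cpuinfo.txt slave_cpuinfo.txt
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1]) as m, open(sys.argv[2]) as s:
        for flag in missing_on_slave(m.read(), s.read()):
            print(flag)
```

A flag showing up in the output does not prove that the migrated
libraries actually use that feature. It only shows where the master and
slave CPUs differ, which can help narrow down a SIGILL like the one
above.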