From: Larry B. <ba...@us...> - 2002-08-09 21:33:25
|
I noticed that the web page postings of my code for fixing NFS file locking turn every double < and > into a single < or >, so beware.

>$MTABFILE and >$FSTABFILE in /usr/lib/beoboot/bin/setup_fs, and <EOF in /usr/lib/beoboot/bin/node_up, should be double arrows.

Larry Baker
US Geological Survey |
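For anyone who copied the scripts from the web page, a rough check along these lines may help spot the damaged redirections. This is only a sketch: check_redirects is a hypothetical helper name, and the pattern covers just the three cases named in the message above (a single > before $MTABFILE or $FSTABFILE, and a single < before EOF).

```shell
#!/bin/sh
# Hypothetical helper (not part of beoboot): print, with line numbers, every
# line of a script that uses a single ">" before $MTABFILE or $FSTABFILE, or
# a single "<" before EOF. Correct ">>" and "<<" forms are not reported.
check_redirects() {
    grep -nE '(^|[^>])>[[:space:]]*\$(MTABFILE|FSTABFILE)|(^|[^<])<[[:space:]]*EOF' "$1"
}
```

Running check_redirects /usr/lib/beoboot/bin/setup_fs and check_redirects /usr/lib/beoboot/bin/node_up should print nothing if the copies are intact.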
From: Larry B. <ba...@us...> - 2002-08-09 21:25:57
|
My posting of 08/08/2002, "This time it's right: NFS locking support and other enhancements", fixes Daniel Widyono's posting of 05/29/2002, "getpwuid and getgrgid bombing".

Part of getting NFS file locking to work required getting the getpwxxx() functions to work. Without resorting to code modifications, I found the simplest fix is to copy the /etc/passwd and /etc/group files from the master node to the slave node, and create an nsswitch.conf on the slave node with entries that reference those files. I don't have a clue how the "bproc" NSS stuff works, so I left it first in the list of places NSS tries.

For example, given the /etc/beowulf/nsswitch.conf:

    #
    # /etc/beowulf/nsswitch.conf
    #
    hosts:  bproc
    passwd: bproc files
    group:  bproc files
    rpc:    files

Copy the following files to each slave node (e.g., in /etc/beowulf/node_up):

    bpcp 0 /etc/{passwd,group,rpc} /etc/beowulf/nsswitch.conf 0:/etc

This seems to be sufficient, as shown by my terminal session below.

Larry Baker
US Geological Survey

[baker@sfsmanet tmp]$ cat getpwuid.c
#include <stdio.h>
#include <sys/bproc.h>
#include <pwd.h>
#include <sys/types.h>

main() {
    struct passwd *pwent = getpwuid(0);
    printf("%s\n",pwent->pw_name);
}
[baker@sfsmanet tmp]$ su
Password:
[root@sfsmanet tmp]# bpcp nsswitch.conf 0:/etc
[root@sfsmanet tmp]# bpsh 0 cat /etc/nsswitch.conf
hosts: bproc
passwd: bproc
[root@sfsmanet tmp]# exit
exit
[baker@sfsmanet tmp]$ bpsh 0 ./a.out
bpsh: Child process exited abnormally.
[baker@sfsmanet tmp]$ su
Password:
[root@sfsmanet tmp]# bpcp /etc/beowulf/nsswitch.conf 0:/etc
[root@sfsmanet tmp]# bpsh 0 cat /etc/nsswitch.conf
#
# /etc/beowulf/nsswitch.conf
#
hosts: bproc
passwd: bproc files
group: bproc files
rpc: files
[root@sfsmanet tmp]# exit
exit
[baker@sfsmanet tmp]$ bpsh 0 ./a.out
root |
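The difference between the failing and the working session above is whether the passwd entry lists "files" as a source. A minimal sketch of a check for that follows; has_files_source is a hypothetical helper, it only inspects the text of an nsswitch.conf file and makes no claim about how the bproc NSS module itself behaves.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) if DATABASE in the given nsswitch.conf
# lists "files" as one of its sources, fail (exit 1) otherwise.
# Usage: has_files_source FILE DATABASE
has_files_source() {
    awk -v db="$2" '
        { sub(/#.*/, "") }              # drop comments before looking at fields
        $1 == (db ":") {                # e.g. "passwd:"
            for (i = 2; i <= NF; i++)
                if ($i == "files") found = 1
        }
        END { exit(found ? 0 : 1) }
    ' "$1"
}
```

For example, has_files_source /etc/beowulf/nsswitch.conf passwd could be run on the master before the bpcp step to catch a file that would leave getpwuid() without a local fallback.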
From: Erik A. H. <er...@he...> - 2002-08-09 21:11:47
|
I've released the scheduler we've been working on for BProc. PLEASE READ THE RELEASE NOTES!

http://sourceforge.net/project/showfiles.php?group_id=24453&release_id=104390

Here are the release notes and a change log:

1.0 ------------------------------------------------------------------------

This is the first public release of BJS. BJS is a simple/efficient scheduler for clusters running BProc.

--- WARNING --- WARNING --- WARNING --- WARNING --- WARNING --- WARNING ---

THIS IS NOT PRODUCTION QUALITY CODE. THERE ARE ALMOST CERTAINLY SEVERE SECURITY HOLES IN THIS SYSTEM. SINCE THE BJS SERVER MUST RUN AS ROOT, IT IS LIKELY THAT ANY SUCH HOLES CAN/WILL LEAD TO ROOT COMPROMISE.

--- WARNING --- WARNING --- WARNING --- WARNING --- WARNING --- WARNING ---

Ok, got that out of the way. Normally, I wouldn't call code like this version 1.0 but there are political reasons for doing so. This code is being released in this state so that people willing to work with something that's not 100% finished or secure will have something to work with. I hope that those people will also help us get this scheduler into a production ready state.

Please note that the student who wrote this code (Paul Ruth <ru...@ac...>) will be returning to school. It's good to include him in any discussions regarding the scheduler but I will be the one merging patches, making new releases, etc.

- Erik Hendriks <hen...@la...>

Version 1.0 - First release

* Scheduler crawls out of the slime.

--
Erik Arjan Hendriks     Printed On 100 Percent Recycled Electrons
er...@he...            Contents may settle during shipment |
From: Erik A. H. <er...@he...> - 2002-08-09 20:08:29
|
You'll need a modified mpirun to work with BProc 3.2.0 and MPICH. I've put the tarball (which contains patches for MPICH) in the BProc files section on sourceforge as well. - Erik |
From: Erik A. H. <er...@he...> - 2002-08-09 20:07:24
|
On Fri, Aug 09, 2002 at 10:33:55AM -0400, Mike Snitzer wrote:
> On Mar 25, 2002 Erik Arjan Hendriks (er...@he...) said:
>
> > A new version of the clustermatic CD was placed on
> > www.clustermatic.org today. The diff looks like this:
> >
> > * Updated BProc (3.1.9): fewer bugs, remote exec hacks, etc.
> >
> > * Updated Beoboot (lanl 1.2): minor changes.
> <snip>
> > The MPICH hacks which I've mentioned are included on the CD in the
> > mpirun-0.1 tar ball in the tarballs directory. We used those hacks to
> > run a 396 process job on Chiba city at Argonne National Lab the other
> > day. This clustermatic also survived some boot time torture testing
> > on about 200 nodes there. (Thanks guys!) The load spike on the front
> > end is severe when they all boot at once but all the nodes we started
> > with came up. We've got a guy working on that problem here.
> > Hopefully beoboot lanl 1.3 will mostly eliminate that problem.
>
> Has there been any progress on beoboot lanl 1.3 development?

Yup. Lots. It's gotten a bit more complicated than I had in mind but it's actually started working (as of a few days ago). I'm planning on cleaning up a bit and throwing it up "soon". I expect next week.

It's looking good right now. I set up 100 nodes (with 40MB of libraries) in 12 seconds yesterday. There's basically no more load spike, though 12 seconds isn't really long enough to tell via the usual mechanisms.

- Erik |
From: Erik A. H. <er...@he...> - 2002-08-09 19:58:15
|
I've put BProc 3.2.0 up on sourceforge in the usual place:

http://sourceforge.net/project/showfiles.php?group_id=24453&release_id=104379

Here are the release notes and change log.

3.2.0 ----------------------------------------------------------------------

Demand loading of libraries

  The file request stuff has been ifdef'ed so that building it in is optional. I anticipate removing it as soon as our boot-time environment, which doesn't use it, becomes stable. Support for it can be switched on and off with the FILEREQ:= line in Makefile.conf.

  I've been testing with this turned off. It builds fine with it turned on and I presume it still works. The file request stuff is essentially untested from here on in.

vrfork() and vexecmove()

  vrfork() and vexecmove() have been mostly rewritten. The interface changed a little bit to separate the input (node numbers) and the return values (pids). They are now more resilient to failure of individual moves. The returned array of pids may include negative numbers, which are error values. The return value is now a process's index in the list of children or, to the parent, the total number of nodes (regardless of failure). Hopefully this interface will be stabilizing somewhat.

There's also the usual round of bug fixes. See the change log for details.

Changes from 3.1.10 to 3.2.0

* Added another workaround for a Linux TCP bug. This problem resulted in occasional segfaults when using the ghost execve hook.
* Fixed a locking bug that could lead to master node crashes.
* Fixed a lingering bpsh problem which could lose output from child processes. (Patch from Sean Dilda <ag...@sc...>)
* Made flush_icache in vmadump conditional as suggested by Grant Taylor <gt...@sw...>
* Fixed problems with using PTRACE_TRACEME on slave nodes.
* Fixed a vrfork buglet which could lead to kernel oopses.
* Fixed a problem with the ghost execve hook not doing mm_release on the slave node like a real execve. This led to the parent process hanging if it used vfork() on the slave node.
* Changed vrfork return value semantics. The vrfork return value is now -1, your index in the list of nodes or the total number of nodes.
* Changed vrfork and vexecmove interface to separate input and output.
* Added a bunch of BProc-specific errno values to allow for more detailed error reporting.
* Fixed process migration hangs in the case of sender failure.
* Added rank reporting to vexecmove via the environment variable BPROC_RANK=XXXXXXX.
* Changed bpsh to use vexecmove instead of migrating each copy of the process off the front end manually.
* Fixed a case where permissions of the ghost process and the real process could get out of sync.
* Changed behavior for failed power off on the slave node from a reboot to a halt.
* Added a patch for Linux 2.4.19. |
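Since bpsh now goes through vexecmove, a program started with bpsh on several nodes should be able to read its rank from BPROC_RANK. The sketch below shows one way a shell script might consume it. It rests on assumptions the release notes do not confirm: that the value is a decimal number, and that it may be zero-padded (suggested only by the fixed-width XXXXXXX placeholder). bproc_rank is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print the rank from BPROC_RANK as a plain decimal
# number. If the variable is unset, empty, or not purely numeric, print the
# fallback given as $1 (default 0) instead.
bproc_rank() {
    case "${BPROC_RANK:-}" in
        ''|*[!0-9]*)
            echo "${1:-0}"
            ;;
        *)
            # Assumption: the value may be zero-padded. Strip leading zeros
            # so later shell arithmetic does not read it as octal.
            r=$(echo "$BPROC_RANK" | sed 's/^0*//')
            echo "${r:-0}"
            ;;
    esac
}
```

A script could then use something like RANK=$(bproc_rank) to pick its share of the work.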
From: <gor...@ph...> - 2002-08-09 15:53:15
|
On Tue, 16 Jul 2002 08:29:52 -0600 Erik Arjan Hendriks wrote:
>It is possible for a machine to be a master and slave for itself at
>the same time. It's a pretty confusing arrangement though since
>processes will show up several times in the process tree.
>
> . . .

Ideally, all of the processes in a parallel job would be created on the master node and migrated to slave nodes.

It also appears to be possible (see perl script at end) for a single bproc process on a slave node to rfork itself off to other slave nodes. The "ghost" processes only end up on the master node, which is fine.

If this is the case, what advantage is there to having bpmaster run on every node?

#!/usr/bin/perl
use Parallel::Bproc;
sub printhost {
    $host=`hostname`;
    print "I am $host\n";
}
printhost;
Parallel::Bproc::bproc_rfork(1) ;
printhost;
Parallel::Bproc::bproc_rfork(10) ;
printhost;
sleep 1000;

It appeared to work:

# /tmp/a.pl
I am lxsrvr0
I am lxsrva2
I am lxsrva11 |
From: Mike S. <msn...@pl...> - 2002-08-09 14:30:25
|
On Mar 25, 2002 Erik Arjan Hendriks (er...@he...) said:

> A new version of the clustermatic CD was placed on
> www.clustermatic.org today. The diff looks like this:
>
> * Updated BProc (3.1.9): fewer bugs, remote exec hacks, etc.
>
> * Updated Beoboot (lanl 1.2): minor changes.
<snip>
> The MPICH hacks which I've mentioned are included on the CD in the
> mpirun-0.1 tar ball in the tarballs directory. We used those hacks to
> run a 396 process job on Chiba city at Argonne National Lab the other
> day. This clustermatic also survived some boot time torture testing
> on about 200 nodes there. (Thanks guys!) The load spike on the front
> end is severe when they all boot at once but all the nodes we started
> with came up. We've got a guy working on that problem here.
> Hopefully beoboot lanl 1.3 will mostly eliminate that problem.

Has there been any progress on beoboot lanl 1.3 development?

Thanks,
Mike |
From: Larry B. <ba...@us...> - 2002-08-08 23:11:26
|
This supersedes the posting I made on July 25, 2002.

I have worked more on the support for NFS locking on the slave nodes. While I was at it, I made some additional fixes and enhancements to the beoboot slave node setup process. This fixes most of the problems I had running the ANL MPICH IO validation suite. However, I still occasionally have problems with parallel IO -- between the master and a slave node, usually. Also, it seems to matter whether the test file exists before the test is run. I am suspicious of the MPICH code. (E.g., errno comes back with a non-zero value, even though an MPI call returns MPI_SUCCESS.)

One problem I have seen with Clustermatic/bproc that I don't understand: sometimes input redirection (< or <<) results in an empty input stream. For example, "cat <file | bpsh -n 0 cat" will echo nothing, even though "cat <file" works fine. I found this out because my previous submission created a zero-length slave node /etc/nsswitch.conf file when I added more lines to the <<EOF input stream in /usr/lib/beoboot/bin/node_up. That's why I now use a bpcp option to copy /etc/beowulf/nsswitch.conf. If someone can explain what this is symptomatic of, I'd like to know how to fix it or avoid it.

I am using the Clustermatic CD image distribution of March 2002.

Larry Baker
US Geological Survey

Steps in slave node file system setup:

  Create directories/soft links specified in /etc/beowulf/config (mkdir option).
  As specified in /etc/beowulf/fstab[.$NODE]:
    Load kernel modules for all file system types.
    Create device nodes for all local file systems.
    Mount all local and "nolock" network file systems without "noauto".
    Create cooked version in /etc/fstab on slave node.
  Copy files specified in /etc/beowulf/config (bpcp option).
  Create a default slave node /etc/nsswitch.conf file, if none exists.
  If there are NFS file systems without the "nolock" option:
    Create the statd database directories.
    Start the portmapper and statd daemons.
    Complete any deferred NFS mounts.
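The last step above hinges on one test: does the node's fstab contain any NFS entry without the "nolock" option? The sketch below is a simplified stand-alone version of that test, not the code used in node_up. needs_lockd is a hypothetical name, and it matches the file system type "nfs" exactly and "nolock" as a whole option, which is slightly stricter than a plain grep.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) if the given fstab has at least one
# NFS entry whose options do not include "nolock", i.e. an entry that needs
# the portmapper and status daemon before it can be mounted.
# Usage: needs_lockd FSTAB
needs_lockd() {
    awk '
        $1 ~ /^#/ || NF < 4 { next }    # skip comments and short/blank lines
        $3 == "nfs" && ("," $4 ",") !~ /,nolock,/ { n++ }
        END { exit(n > 0 ? 0 : 1) }
    ' "$1"
}
```

With the sample /etc/beowulf/fstab from this posting, only the /home entry (rw,rsize=8192,wsize=8192,noac) would make the check succeed.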
Summary of file changes:

  /etc/beowulf/config
    Add mkdir option to create directories and soft links (no more hard-coded directories in /usr/lib/beoboot/bin/node_up).
    Add bpcp option to copy files to slave node.

  /etc/beowulf/fstab
    Add NODE to list of variables that will get substituted.
    "noauto" option is now honored.
    Cooked version is created as slave node /etc/fstab.

  /etc/beowulf/nsswitch.conf (new file)
    Name Server Switch configuration file to add local passwd, group, and rpc files to NSS search lists. (See bpcp entry in /etc/beowulf/config.)

  /etc/beowulf/node_up
    Define NODE, MASTER, and PATH variables.

  /usr/lib/beoboot/bin/node_up (#--- 1.17.1 --- brackets changes)
    Remove hard-coded creation of slave node /dev, /etc, /tmp and /scratch directories.
    Copy configuration files (bpcp option in config).
    Conditionally create a default slave node /etc/nsswitch.conf.
    If there are any NFS mounts in the slave node /etc/fstab without the "nolock" option: create the slave node statd database files, start the portmap and rpc.statd daemons, and "mount -a -t nfs".

  /usr/lib/beoboot/bin/setup_fs (#--- 1.4.1 --- brackets changes)
    Create default directories/soft links (mkdir option in config).
    Only tar device nodes that begin with "/dev".
    Always load file system kernel modules.
    Don't mount file systems with "noauto" option.
    Defer network mounts without "nolock" option until the portmapper and status daemons are running (completed in node_up).
    Add support for ext3 file systems.
    Create cooked version of /etc/beowulf/fstab[.$NODE] in slave node /etc/fstab.
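To make the "cooked version" of the fstab concrete, here is a minimal sketch of the variable substitution. It is an illustration only and not necessarily how setup_fs does it: cook_fstab is a hypothetical helper, and it handles just $MASTER and $NODE with sed.

```shell
#!/bin/sh
# Hypothetical helper: print FSTAB with the literal strings $MASTER and $NODE
# replaced by the given values.
# Usage: cook_fstab FSTAB MASTER_IP NODE_NUMBER
cook_fstab() {
    sed -e 's|\$MASTER|'"$2"'|g' \
        -e 's|\$NODE|'"$3"'|g' "$1"
}
```

For node 0 with master 192.168.50.209, an entry such as $MASTER:/var/node.$NODE becomes 192.168.50.209:/var/node.0, which is the form that appears in the node.0 boot log.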
Below are the files I have modified/use (watch out for extra e-mail line breaks):

  /etc/exports                    The NFS file systems exported by the master
  /etc/beowulf/config             The bproc/beoboot configuration file
  /etc/beowulf/fstab              The file systems file for the nodes
  /etc/beowulf/nsswitch.conf      The Name Server Switch configuration file for the nodes
  /etc/beowulf/node_up            The beoboot stub node startup script
  /usr/lib/beoboot/bin/node_up    The beoboot node startup script
  /usr/lib/beoboot/bin/setup_fs   The beoboot node file system setup script

After rebooting, this is what /var/log/beowulf/node.0 looks like:

node_up: Setting system clock.
node_up: Configuring loopback interface.
setup_fs: Configuring node filesystems...
setup_fs: mkdir -p /dev
setup_fs: mkdir -p /etc
setup_fs: ln -s /var/tmp /tmp
setup_fs: ln -s /home/node.0 /scratch
setup_fs: Using /etc/beowulf/fstab.
setup_fs: Checking 192.168.50.209:/bin (type=nfs)...
setup_fs: Mounting 192.168.50.209:/bin on /rootfs/bin... (type=nfs; options=ro,nolock,rsize=8192)
setup_fs: Checking 192.168.50.209:/home (type=nfs)...
setup_fs: Mounting 192.168.50.209:/home on /rootfs/home... (type=nfs; options=rw,rsize=8192,wsize=8192,noac)
setup_fs: Mount deferred until lock daemon running.
setup_fs: Checking 192.168.50.209:/opt (type=nfs)...
setup_fs: Mounting 192.168.50.209:/opt on /rootfs/opt... (type=nfs; options=ro,nolock,rsize=8192)
setup_fs: Checking 192.168.50.209:/sbin (type=nfs)...
setup_fs: Mounting 192.168.50.209:/sbin on /rootfs/sbin... (type=nfs; options=ro,nolock,rsize=8192)
setup_fs: Checking 192.168.50.209:/usr (type=nfs)...
setup_fs: Mounting 192.168.50.209:/usr on /rootfs/usr... (type=nfs; options=ro,nolock,rsize=8192)
setup_fs: Checking 192.168.50.209:/var/node.0 (type=nfs)...
setup_fs: Mounting 192.168.50.209:/var/node.0 on /rootfs/var... (type=nfs; options=rw,nolock,rsize=8192,wsize=8192)
setup_fs: Checking none (type=proc)...
setup_fs: Mounting none on /rootfs/proc... (type=proc; options=defaults)
setup_fs: Checking none (type=devpts)...
setup_fs: Mounting none on /rootfs/dev/pts... (type=devpts; options=gid=5,mode=620)
node_up: Copying over device nodes.
node_up: Copying over time zone info.
node_up: Copying /etc/{passwd,group,rpc} /etc/beowulf/nsswitch.conf to 0:/etc.
node_up: Starting the RPC portmapper and status daemon.
node_up: Completing deferred NFS mounts.
node_up: Node setup finished.

---------- /etc/exports ----------
#
# /etc/exports
#
# Read-only exports
#
/bin 192.168.50.209/255.255.255.224(ro)
/opt 192.168.50.209/255.255.255.224(ro)
/sbin 192.168.50.209/255.255.255.224(ro)
/usr 192.168.50.209/255.255.255.224(ro)
#
# Private read-write exports
#
/var/node.0 192.168.50.210(rw,no_root_squash)
/var/node.1 192.168.50.211(rw,no_root_squash)
#
# Shared read-write exports (MPICH 1.2.4, section 4.11.1: use "noac")
#
/home 130.118.45.45/255.255.252.0(rw) \
      192.168.50.209/255.255.255.224(rw,no_root_squash)

---------- /etc/beowulf/config ----------
#
# /etc/beowulf/config
#
# Sample Beowulf Configuration file
#
# $Id: config,v 1.7 2002/03/12 20:54:58 hendriks Exp $
# $Id: config,v 1.7.1 2002/08/05 L. M. Baker $
#
#
# Default cluster configuration (uses eth1, and 192.168.1.0/24)
# interface: internal cluster interface (the one connected to the nodes)
#
# iprange: range of IP addresses for nodes.
interface eth1 192.168.50.209 255.255.255.224

# Setup addresses in the cluster. The "nodes" line is REQUIRED here to specify
# cluster size. "iprange" and "ip" assign addresses to nodes. The "0" in
# iprange here tells it to start assigning at node zero.
nodes 2
iprange 0 192.168.50.210 192.168.50.211

# Default libraries (These are the libraries which will automagically be made
# available to the slaves.)
# No line continuation; multiple lines are concatenated.
libraries /lib /usr/lib /usr/X11R6/lib
libraries /opt/intel/compiler60/ia32/lib /opt/intel/mkl/lib/32

# Default directories. Syntax: mkdir { [ { -m mode | -s target } ] name } ...
# $NODE is slave node no. No line continuation; multiple lines are
# concatenated.
# /dev and /etc are required.
mkdir /dev /etc
# Useful (local) temporary and scratch directories.
#mkdir -m 1777 /tmp -m 1777 /scratch
# Use NFS for /tmp and /scratch.
# (NFS exports for /var and /home must be no_root_squash.)
mkdir -s /var/tmp /tmp -s /home/node.$NODE /scratch

# Optional bpcp file copy commands, executed one line at a time.
# Syntax: bpcp [ options ] from ... to. Do not specify slave node no. -- the
# destination is automatically translated to $NODE:to. $NODE is slave node no.
# Enable the following line for NFS file locking support.
bpcp /etc/{passwd,group,rpc} /etc/beowulf/nsswitch.conf /etc

# Default file system policies.
fsck full
mkfs if_needed

# Default location of boot images
bootfile /var/beowulf/boot.img
kernelimage /boot/vmlinuz-2.4.18-lanl.16
kernelcommandline apm=power-off

# Here we assign MAC addresses to nodes. Nodes can have multiple MAC
# addresses. Here the optional "0" zero argument states that the address
# should be assigned to node zero. Node lines following that will assign
# addresses to nodes sequentially
# Onboard RealTek RTL8100BL chip
node 0 00:40:63:c0:5e:08
node 00:40:63:c0:5f:b4

---------- /etc/beowulf/fstab ----------
#
# /etc/beowulf/fstab
#
# This file is the fstab for nodes.
# One difference is that we allow for shell variable expansions...
#
# Variables that will get substituted:
# MASTER = IP address of the master node. (good for doing NFS mounts)
# NODE = slave's node no.
# RAMDISK = device name (/dev/<ramdev>) of a device suitable for a root fs
#
# A cooked version (with variable substitution) of this file will be copied
# to /etc/fstab on the slave node.
#
# The root file system is a tmpfs provided by the boot scripts. You
# can mount something on / if you'd like but due to oddities in the file
# caching code it's not recommended right now.

# This is the default setup from beofdisk, once you setup your disks.
#/dev/hda2 swap swap defaults 0 0
#/dev/hda3 / ext2 defaults 0 0

# These should always be added
none /proc proc defaults 0 0
none /dev/pts devpts gid=5,mode=620 0 0

# NFS (for example and default friendliness)
# Note: Mounts without the "nolock" option are deferred until the RPC portmapper
# and status daemons are running -- see /usr/lib/beoboot/bin/{node_up,setup_fs}.
#
# Read-only mount points
#
$MASTER:/bin /bin nfs ro,nolock,rsize=8192 0 0
$MASTER:/opt /opt nfs ro,nolock,rsize=8192 0 0
$MASTER:/sbin /sbin nfs ro,nolock,rsize=8192 0 0
$MASTER:/usr /usr nfs ro,nolock,rsize=8192 0 0
#
# Private read-write mount points
#
$MASTER:/var/node.$NODE /var nfs rw,nolock,rsize=8192,wsize=8192 0 0
#
# Shared read-write mount points (MPICH 1.2.4, section 4.11.1: use "noac")
#
$MASTER:/home /home nfs rw,rsize=8192,wsize=8192,noac 0 0

---------- /etc/beowulf/nsswitch.conf ----------
#
# /etc/beowulf/nsswitch.conf
#
hosts: bproc
passwd: bproc files
group: bproc files
rpc: files

---------- /etc/beowulf/node_up ----------
#!/bin/sh
#
# /etc/beowulf/node_up
#
# This shell script is called automatically by BProc to perform any
# steps necessary to bring up the nodes. This is just a stub script
# pointing to the real script

NODE=$1
MASTER=`bpstat -a master`
BINDIR=/usr/lib/beoboot/bin
PATH=$BINDIR:/sbin:/usr/sbin:$PATH

$BINDIR/node_up $* || exit 1

# Clean out /tmp every boot
bpsh -n $NODE rm -r -f /tmp/*
bpsh -n $NODE rm -r -f /tmp/.* 2>/dev/null # Ignore rm errors

exit 0

---------- /usr/lib/beoboot/bin/node_up ----------
#!/bin/sh
#--- 1.17.1 ---
#
# /usr/lib/beoboot/bin/node_up
#
#--- 1.17.1 ---
#---------------------------------------------------------------------
# Erik Arjan Hendriks <hen...@la...>
# Copyright (C) 2000 Scyld Computing Corporation
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
#
# $Id: node_up,v 1.17 2002/01/04 00:39:59 hendriks Exp $
# $Id: node_up,v 1.17.1 2002/08/05 L. M. Baker $
#---------------------------------------------------------------------
umask 022 # Default umask for this stuff.
cd /

# Argument sanity checking
if [ "$1" = "" ] ; then
    echo "Usage: node_up <nodenumber>"
    exit 1
fi

NODE=$1
CONFIG=/etc/beowulf/config
BINDIR=/usr/lib/beoboot/bin
#--- 1.17.1 ---
PATH=$BINDIR:/sbin:/usr/sbin:$PATH
# Standard location of statd database files
SMDIR=/var/lib/nfs
# Location of statd database files on Red Hat Linux
if [ -f /etc/redhat-release ] ; then
    SMDIR=$SMDIR/statd
fi
#--- 1.17.1 ---

#--- 1.17.1 ---
# Usage: do_bpcp node [ options ] from [ ... ] to
do_bpcp() {
    if [ -z "$1" ] ; then
        return
    fi
    local NODE=$1
    shift
    local OPTS=
    while [ "${1:0:1}" = "-" ] ; do
        local OPTS="$OPTS $1"
        shift
    done
    local NFILES=$(( $# - 1 ))
    if [ $NFILES -lt 1 ] ; then
        return 1
    fi
    local FILES=
    for (( i = $NFILES ; i ; i-- )) ; do
        local FILES="$FILES $1"
        shift
    done
    echo "node_up: Copying$FILES to $NODE:$1."
    eval bpcp $OPTS $FILES $NODE:/rootfs$1
}
#--- 1.17.1 ---

# Usage: beoconfig tag [config_file]
beoconfig() {
    local FILE=$2
    if [ -z "$FILE" ] ; then FILE=${CONFIG} ; fi
    if [ ! -f ${FILE} ] ; then
        echo "Warning: ${FILE} file not found." >&2
        return
    fi
    # These sed bits:
    # - strip spaces
    # - strip leading + trailing space
    # - if line starts with $1, strip off $1 and print it.
    sed -ne "s/#.*//" < ${FILE} \
        -e "s/^[[:space:]]\+//;s/[[:space:]]\+\$//" \
        -e "/^$1[[:space:]]/{s/^$1[[:space:]]\+//;p;}"
}

die() {
    if [ -n "$1" ] ; then
        echo 1>&2 "$1"
    fi
    if [ -n "$2" ] ; then
        echo 1>&2 "Fatal error performing: $*"
    fi
    if [ -n "$MOUNTED" ] ; then
        umount $INITRD_BUILD
        rmdir $INITRD_BUILD
    fi
    exit 1
}

run_cmd() {
    eval "$*" || die "" "$*"
}

# A message for the console on the remote end.
bpsh $NODE --stdout /dev/console \
    echo -e "node_up: This is node $NODE.\nnode_up: boot log available in /var/log/beowulf/node.$NODE on the master."

#---------------------------------------------------------------------
# First things first... set the system clock
echo "node_up: Setting system clock."
run_cmd $BINDIR/bdate $NODE

# mapping of ram devices at this point.
# /dev/ram0 <- initrd goes here

#run_cmd bpsh $NODE mount -nt proc none /proc

# XXX We need a way to figure out what interface is up at this point
# so that we know which one to slap a netmask onto.
#--- 1.17.1 ---
#echo "node_up: TODO set interface netmask."
#--- 1.17.1 ---

# ... and kick on that loop back interface
echo "node_up: Configuring loopback interface."
run_cmd bpsh $NODE ifconfig lo 127.0.0.1 netmask 255.0.0.0
run_cmd bpsh $NODE route add -net 127.0.0.0 netmask 255.0.0.0 lo

#---------------------------------------------------------------------
# Kernel Modules
#
# We should probably pay attention to "insmod" lines in the config
# file here...
KVER=`bpsh $NODE uname -r` # Make note of the remote kernel version
for module in `$BINDIR/pcilookup $NODE`; do
    modprobe --node $NODE $module
done

#---------------------------------------------------------------------
# File Systems
#
# We need a way for setup_fs to let us know where the root filesystem
# is mounted...
$BINDIR/setup_fs $NODE || exit 1

# Populate it ?
# Setup scratch and tmp space...
#--- 1.17.1 ---
#run_cmd bpsh $NODE mkdir -p /rootfs/{tmp,scratch}
#run_cmd bpsh $NODE chmod 1777 /rootfs/{tmp,scratch}
#--- 1.17.1 ---

bplib -l | bpsh $NODE bplib -a -
#$BINDIR/setup_libs $NODE /rootfs || exit 1

# Copy over device nodes from the front end.
#--- 1.17.1 ---
#echo "node_up: Populating /dev and /etc."
#run_cmd bpsh $NODE mkdir -p /rootfs/{dev,etc}
#--- 1.17.1 ---
echo "node_up: Copying over device nodes."
run_cmd bpsh $NODE mkdir -p /rootfs/dev
#find /dev -mount -type b -o -type c | \
#    sed -e 's!^/!!' | tar cf - -T - | bpsh $NODE tar -C /rootfs -xf -
DEVLIST="console zero null"
tar -C /dev -cf - $DEVLIST | bpsh $NODE tar -C /rootfs/dev -xf -
[ "$?" = "0" ] || die "" "copying device nodes"

echo "node_up: Copying over time zone info."
run_cmd bpcp /etc/localtime $NODE:/rootfs/etc/localtime

#--- 1.17.1 ---
# Copy configuration files
beoconfig bpcp | (
    while read line ; do
        if ! do_bpcp $NODE $line ; then
            echo 1>&2 "Failed to copy files."
            exit 1
        fi
    done
) || die

# Supply a default /etc/nsswitch.conf, if needed
if ! bpsh -n $NODE ls /rootfs/etc/nsswitch.conf >/dev/null 2>&1 ; then
    echo "node_up: Copy over default nsswitch info."
    run_cmd cat << EOF | bpsh -n $NODE --stdout /rootfs/etc/nsswitch.conf cat
passwd: bproc
hosts: bproc
EOF
fi
#--- 1.17.1 ---

# nss_bproc is optional equipment so ignore errors....
#echo "node_up: Copying over bproc nss library."
#bpcp /lib/libnss_bproc.so.2 $NODE:/rootfs/lib

#---------------------------------------------------------------------
# Finish up...
#run_cmd bpsh $NODE umount -n /proc
run_cmd bpctl -S $NODE -r /rootfs

# This is a hack to make the dynamic linker work for things which are
# exec'ed remotely.
run_cmd bpsh -N $NODE /sbin/ldconfig -l /lib/ld-*

run_cmd bpsh -N $NODE hostname n$NODE
run_cmd $BINDIR/nodeinfo $NODE # Update node information DB

#--- 1.17.1 ---
# At this point, all file systems in $NODE:/etc/fstab have been mounted,
# except for network devices (host:export) without the "nolock" option.
# NFS devices without the "nolock" option require the RPC portmapper and
# status daemons. The status daemon requires read/write access to the
# $SMDIR/sm and $SMDIR/sm.bak directories, which must exist and be owned
# 700 by rpcuser (on Red Hat, see http://nfs.sourceforge.net, item 17).

# True if there are any NFS mounts in $NODE:/etc/fstab without the "nolock"
# option, i.e., that need the RPC portmapper and status daemon.
if [ `bpsh -n $NODE cat /etc/fstab | \
      while read line ; do
          if [ -n "$line" -a "${line:0:1}" != "#" ] ; then
              echo "$line" | (
                  read device mountpt fstype options rest && \
                  echo "$fstype" | grep -q nfs && \
                  echo "$options" | grep -q -v nolock \
              ) && echo "$line"
          fi
      done | \
      wc -l` -gt 0 ] ; then
    # Create $SMDIR/sm and $SMDIR/sm.bak owned 700 by rpcuser (on Red Hat)
    bpsh -n $NODE mkdir -m 700 -p $SMDIR/{sm,sm.bak}
    if [ -f /etc/redhat-release ] ; then
        bpsh -n $NODE chmod 700 $SMDIR
        bpsh -n $NODE chown rpcuser $SMDIR
        bpsh -n $NODE chgrp rpcuser $SMDIR
        bpsh -n $NODE chown rpcuser $SMDIR/{sm,sm.bak}
        bpsh -n $NODE chgrp rpcuser $SMDIR/{sm,sm.bak}
    fi
    # Start the RPC portmapper and status daemon
    echo "node_up: Starting the RPC portmapper and status daemon."
    bpsh -n $NODE initlog -c portmap
    bpsh -n $NODE initlog -c rpc.statd
    # Mount the network devices that were deferred earlier
    echo "node_up: Completing deferred NFS mounts."
    bpsh -n $NODE mount -a -t nfs
fi
#--- 1.17.1 ---

#--- A message for the log file and node's console.
echo "node_up: Node setup finished."
bpsh $NODE --stdout /dev/console echo "node_up: Node setup finished."
exit 0

---------- /usr/lib/beoboot/bin/setup_fs ----------
#!/bin/sh
#--- 1.4.1 ---
#
# /usr/lib/beoboot/bin/setup_fs
#
#--- 1.4.1 ---
#
# Erik Hendriks <hen...@la...>
#
# $Id: setup_fs,v 1.4 2001/11/30 17:52:40 hendriks Exp $
# $Id: setup_fs,v 1.4.1 2002/08/05 L. M. Baker $
#
# This bit of code is a first stab at understanding fstab for mount.
# It's a lot like mount dealing with its own fstab.
# Differences with just allowing mount to chew on an fstab:
# We can do fsck checks before attempting to mount.
# We can (re)create file systems before mounting.
# We can create mount points before mounting.
#
#--------------------------------------------------------------------------
# Generic functions to do operations on varUseful functions
#--------------------------------------------------------------------------
#--- 1.4.1 ---
# Usage: do_mkdir node { [ -s target ] name } ...
do_mkdir() {
    if [ -z "$1" ] ; then
        return
    fi
    local NODE=$1
    shift
    if [ -z "$1" ] ; then
        return
    fi
    while [ -n "$1" ] ; do
        if [ "$1" == "-s" ] ; then
            shift
            if [ -z "$1" -o -z "$2" ] ; then
                return 1
            fi
            local target=`eval echo "$1"`
            local name=`eval echo "$2"`
            echo "setup_fs: ln -s $target $name"
            if ! bpsh -n $NODE ln -s $target /rootfs$name ; then
                return 1
            fi
            shift
        else
            if [ "$1" == "-m" ] ; then
                shift
                if [ -z "$1" -o -z "$2" ] ; then
                    return 1
                fi
                local mode=$1
                local name=`eval echo "$2"`
                echo "setup_fs: mkdir -m $mode -p $name"
                if ! bpsh -n $NODE mkdir -m $mode -p /rootfs$name ; then
                    return 1
                fi
                shift
            else
                local name=`eval echo "$1"`
                echo "setup_fs: mkdir -p $name"
                if ! bpsh -n $NODE mkdir -p /rootfs$name ; then
                    return 1
                fi
            fi
        fi
        shift
    done
}

# Usage: do_safefsck node device fstype
#--- 1.4.1 ---
do_safefsck() {
    case $2 in
    /dev/ram*)
        echo "setup_fs: Hmmm...This appears to be a ramdisk. "
        echo -n "setup_fs: I'm going to try to try checking the "
        echo "filesystem (fsck) anyway."
        echo -n "setup_fs: If it is a RAM disk the following will "
        echo "fail harmlessly."
        ;;
    esac
    case $3 in
#--- 1.4.1 ---
    ext*) bpsh -n $1 e2fsck -p $2 ; ret=$?
#--- 1.4.1 ---
          if [ "$ret" = 1 ] ; then ret=0; fi ;;
    swap) bpsh -n $1 chkswap $2 ; ret=$? ;;
    *)    ret=0;;
    esac
    [ "$ret" = 0 ]
}

do_fsck() {
    echo "setup_fs: Checking $2 (type=$3)..."
    case $2 in
    /dev/ram*)
        echo "setup_fs: Hmmm...This appears to be a ramdisk. "
        echo -n "setup_fs: I'm going to try to try checking the "
        echo "filesystem (fsck) anyway."
        echo -n "setup_fs: If it is a RAM disk the following will "
        echo "fail harmlessly."
        ;;
    esac
    case $3 in
#--- 1.4.1 ---
    ext*) bpsh -n $1 e2fsck -y $2 ; ret=$?
#--- 1.4.1 ---
          if [ "$ret" = 1 ] ; then ret=0; fi ;;
    swap) bpsh -n $1 chkswap $2 ; ret=$? ;;
    *)    ret=0;;
    esac
    [ "$ret" = 0 ]
}

# Usage: do_mkfs node device fstype fssize
do_mkfs() {
    echo "setup_fs: Creating $3 on $2..."
    case $3 in
    ext2) bpsh -n $1 mke2fs -q $2 $4 ; ret=$? ;;
#--- 1.4.1 ---
    ext3) bpsh -n $1 mke2fs -q -j $2 $4 ; ret=$? ;;
#--- 1.4.1 ---
    swap) bpsh -n $1 mkswap $2 $4 ; ret=$? ;;
    *)    ret=0;;
    esac
    [ "$ret" = 0 ]
}

# Usage: load_fs node fstype
load_fs () {
    if [ -z "`bpsh -n $1 grep $2 /proc/filesystems`" ] ; then
        modprobe --node $1 $2
    fi
}

# Usage: do_mount node device mountpt fstype options
do_mount() {
#--- 1.4.1 ---
    # Load file system module for all fstypes so they can be mounted later
    if [ "$4" != "swap" ] ; then
        load_fs $1 $4
    fi
    # Don't mount devices with the "noauto" option
    if echo $5 | grep -q noauto ; then
        return
    fi
#--- 1.4.1 ---
    echo "setup_fs: Mounting $2 on $3... (type=$4; options=$5)"
    case $4 in
    swap) bpsh -n $1 swapon $2 ;;
#--- 1.4.1 ---
    # Defer mounts of network devices (host:export) without the "nolock" option
    *)  if [ -z "`echo $2 | grep :`" -o \
             -n "`echo $5 | grep nolock`" ] ; then
            if bpsh -n $1 mount -nt $4 -o $5 $2 $3 ; then
                if [ "${mountpt:0:1}" == "/" ] ; then
                    echo "$device $mountpt $fstype $options" >>$MTABFILE
                fi
            fi
        else
            echo "setup_fs: Mount deferred until lock daemon running."
        fi
        ;;
#--- 1.4.1 ---
    esac
}

# Usage: beoconfig tag [config_file]
beoconfig() {
    local FILE=$2
    if [ -z "$FILE" ] ; then FILE=${CONFIG} ; fi
    if [ ! -f ${FILE} ] ; then
        echo "Warning: ${FILE} file not found." >&2
        return
    fi
    # These sed bits:
    #  - strip spaces
    #  - strip leading + trailing space
    #  - if line starts with $1, strip off $1 and print it.
    sed -ne "s/#.*//" < ${FILE} \
        -e "s/^[[:space:]]\+//;s/[[:space:]]\+\$//" \
        -e "/^$1[[:space:]]/{s/^$1[[:space:]]\+//;p;}"
}

#--------------------------------------------------------------------------
# Argument sanity checking
if [ "$1" = "" ] ; then
    echo "Usage: setup_fs <nodenumber>"
    exit 1
fi

echo "setup_fs: Configuring node filesystems..."

NODE=$1
CONFIG=/etc/beowulf/config
#--- 1.4.1 ---
BINDIR=/usr/lib/beoboot/bin
PATH=$BINDIR:/sbin:/usr/sbin:$PATH
#--- 1.4.1 ---
MASTER=`bpstat -a master`
RAMDISK=/dev/ram3
FSCK=`beoconfig fsck`
MKFS=`beoconfig mkfs`
#--- 1.4.1 ---
MKDIR=`beoconfig mkdir`
#--- 1.4.1 ---

#--- 1.4.1 ---
# Select which FSTAB to use.
#if [ -r /etc/beowulf/fstab.$NODE ] ; then
#    FSTAB=/etc/beowulf/fstab.$NODE
#else
#    FSTAB=/etc/beowulf/fstab
#fi
#echo "setup_fs: Using $FSTAB"
#--- 1.4.1 ---

# XXX We need a way to pick up per-node commands!

# Control flags
#
#--- 1.4.1 ---
# FSCK =
#--- 1.4.1 ---
#   0 = Don't touch anything, just try to mount.
#   1 = Ok to fsck but don't do anything if it fails.
#   2 = fsck and do mkfs if it fails.
#   3 = skip fsck go straight to mkfs
#
#--- 1.4.1 ---
# Sanity check FSCK (default = 1)
#--- 1.4.1 ---
case $FSCK in
"never"|"safe"|"full") ;;
"") FSCK=safe ;;
*)  echo 1>&2 "Invalid value '$FSCK' for fsck tag in $CONFIG."
    exit 1 ;;
esac
case $MKFS in
"never"|"if_needed"|"always") ;;
"") MKFS=if_needed ;;
*)  echo 1>&2 "Invalid value '$MKFS' for mkfs tag in $CONFIG."
    exit 1 ;;
esac

#--- 1.4.1 ---
# Select which FSTAB to use.
FSTAB=/etc/beowulf/fstab.$NODE
if [ ! -r $FSTAB ] ; then
    FSTAB=/etc/beowulf/fstab
fi
#--- 1.4.1 ---
if [ ! -f $FSTAB ] ; then
    echo 1>&2 "setup_fs: $FSTAB (file system table) is missing."
    exit 1
fi

#--- 1.4.1 ---
# Create default directories
if ! do_mkdir $NODE $MKDIR ; then
    echo 1>&2 "Failed to create default directories."
    exit 1
fi
#--- 1.4.1 ---

# Ok... This is one big nasty pipe line... Here's what this mess does:
#  * Use sed to remove comments. (starting with #)
#  * Run it all though eval to do variable substitutions.
#  * Go through all the lines doing:
#    + Ignore the empty lines
#    + Remove trailing slashes from the mount points
#    + Prepend a number that will allow us to sort the mount points.
#  * Sort the mount points
#--- 1.4.1 ---
#  * On each point point (depending on the FSCK policy):
#--- 1.4.1 ---
#    + fsck the file system
#    + if bad, possibly recreate the file system.
#    + mount the file system
#--- 1.4.1 ---
#  * Create /etc/fstab for the new node.
#--- 1.4.1 ---
#  * Create /etc/mtab for the new node.
MTABFILE=/tmp/.setup_fs.mtab.$$
if ! rm -f $MTABFILE ; then
    echo 1>&2 "setup_fs: $MTABFILE already exists and can't remove."
    exit 1
fi
touch $MTABFILE
#--- 1.4.1 ---
FSTABFILE=/tmp/.setup_fs.fstab.$$
if ! rm -f $FSTABFILE ; then
    echo 1>&2 "setup_fs: $FSTABFILE already exists and can't remove."
    exit 1
fi
touch $FSTABFILE
echo "setup_fs: Using $FSTAB."
cat $FSTAB | \
while read line ; do
    if [ -z "$line" -o "${line:0:1}" == "#" ] ; then
        echo "$line" >>$FSTABFILE
    else
        line=`eval echo "$line"`
        echo "$line" >>$FSTABFILE
        echo "$line"
    fi
done | \
#--- 1.4.1 ---
while read device mountpt fstype options junk ; do
    if [ -z "$options" ] ; then
#--- 1.4.1 ---
#       if [ -n "$device" ] ; then
#--- 1.4.1 ---
        echo 1>&2 "Ignoring incomplete line: $device $mountpt $fstype $options $junk"
#--- 1.4.1 ---
#       fi
#--- 1.4.1 ---
        continue
    fi
    # Sanitize mount point... (squeeze multiple slashes, remove
    # any trailing slashes)
    mountpt=`echo $mountpt | sed -e 's!/\+!/!g' -e 's!/\+$!!'`
    slashct=`echo $mountpt | tr -cd / | wc -c`
    if [ -z $mountpt ] ; then mountpt=/ ; fi
    echo $slashct $device $mountpt $fstype $options
done | \
sort -n | \
(while read slashct device mountpt fstype options junk ; do
    if [ -z "$options" ] ; then
#--- 1.4.1 ---
#       if [ -n "$device" ] ; then
#--- 1.4.1 ---
        echo 1>&2 "Ignoring incomplete line: $device $mountpt $fstype $options $junk"
#--- 1.4.1 ---
#       fi
#--- 1.4.1 ---
        continue
    fi
    # Get a file system size option if it's there...
    fssize=`echo $options | sed -e 's/.*fs_size=\([0-9]\+\).*/\1/p;d'`
    options=`echo $options | sed -e 's/fs_size=[0-9]\+//g'`
    if [ -z "$options" ] ; then options=defaults; fi

    # Everything gets a "/rootfs" prefix at this stage. Also we create the
    # mount points as needed. This requires that people have their fstab
    # in some resonable order. (It might be hard for us to sort it....)
#--- 1.4.1 ---
#   if echo $mountpt | grep -q '^/' ; then
#       echo "$device $mountpt $fstype $options" >> $MTABFILE
#   fi
#--- 1.4.1 ---

    # see to it that the device node exists on the remote machine
#--- 1.4.1 ---
    if [ "${device:0:4}" == "/dev" ] ; then
        (cd / ; tar cf - $device) | bpsh -n $NODE tar xf -
#--- 1.4.1 ---
    fi

    mknewfs=0
    if [ $MKFS = "always" ]; then
        mknewfs=1
    else
        case $FSCK in
        "never") ;;     # No FSCK!
        "safe")
            if ! do_safefsck $NODE $device $fstype ; then
                echo 1>&2 "setup_fs: RAM disks fail FSCK, that's OK"
                echo 1>&2 "setup_fs: FSCK failure. (OK for RAM disks)"
                mknewfs=1
            fi
            ;;
        "full")
            if ! do_fsck $NODE $device $fstype ; then
                echo 1>&2 "setup_fs: FSCK failure. (OK for RAM disks)"
                mknewfs=1
            fi
            ;;
        esac
    fi

    if [ $MKFS != "never" -a "$mknewfs" = 1 ] ; then
        if ! do_mkfs $NODE $device $fstype $fssize ; then
            echo 1>&2 "Failed to create $fstype file system on $device."
            exit 1
        fi
    fi

    # See to it that the mount point exists before trying to mount.
#--- 1.4.1 ---
    if [ "${mountpt:0:1}" == "/" ] ; then
        if ! bpsh -n $NODE mkdir -p /rootfs$mountpt ; then
#--- 1.4.1 ---
            echo 1>&2 "Failed to create mount point."
            exit 1
        fi
    fi
#--- 1.4.1 ---
    if ! do_mount $NODE $device /rootfs$mountpt $fstype $options ; then
#--- 1.4.1 ---
        echo 1>&2 "Failed to mount $device on $mountpt."
        exit 1
    fi
done

#--- 1.4.1 ---
# Create fstab on the remote node...
if ! bpcp $FSTABFILE $NODE:/rootfs/etc/fstab ; then
    echo 1>&2 "Failed to create /etc/fstab."
    exit 1
fi
rm -f $FSTABFILE
#--- 1.4.1 ---

# Finally, create mtab on the remote node...
#--- 1.4.1 ---
# if ! bpsh -n $NODE mkdir -p /rootfs/etc ; then
#     echo 1>&2 "Failed to create /etc."
#     exit 1
# fi
#--- 1.4.1 ---
if ! bpcp $MTABFILE $NODE:/rootfs/etc/mtab ; then
    echo 1>&2 "Failed to create /etc/mtab."
    exit 1
fi
rm -f $MTABFILE
)
# Exit with status of this nutty pipeline.
|
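The mount-ordering step of the setup_fs pipeline can be tried on its own, without a cluster. The sketch below is only an illustration: order_mounts is a hypothetical helper name, not part of setup_fs, and it reads fstab-style lines from standard input. It counts the slashes in each cleaned-up mount point and sorts on that count, so a parent such as /var is handled before a child such as /var/tmp. The sed expressions are a portable spelling of the ones in the script.

```shell
#!/bin/sh
# order_mounts: hypothetical helper (not part of setup_fs).
# Reads fstab-style lines on stdin and prints them shallowest mount point first.
order_mounts() {
    while read device mountpt fstype options junk ; do
        # Skip blank lines, comments, and incomplete entries
        case "$device" in ''|'#'*) continue ;; esac
        [ -z "$options" ] && continue
        # Squeeze repeated slashes and drop trailing ones
        mountpt=`echo "$mountpt" | sed -e 's!//*!/!g' -e 's!//*$!!'`
        # The number of slashes left is the depth of the mount point
        slashct=`echo "$mountpt" | tr -cd / | wc -c`
        if [ -z "$mountpt" ] ; then mountpt=/ ; fi
        echo $slashct $device $mountpt $fstype $options
    done | sort -n | while read slashct rest ; do
        # Drop the sort key again
        echo "$rest"
    done
}
```

Entries at the same depth come out in whatever order sort gives them, which is fine here because siblings do not depend on each other.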
From: Erik A. H. <er...@he...> - 2002-07-30 15:36:59
|
On Sat, Jul 27, 2002 at 10:46:22PM -0300, Carlos Carvalho wrote:
> Folks,
>
> I've just discovered bproc and it looks very interesting. However I
> have a question that is crucial to our usage.
>
> At the moment we have a small cluster of identical machines. Our users
> have FORTRAN programs that right now are not parallelized and run on a
> single machine. If I understood the docs, bproc doesn't automatically
> move jobs between nodes.

That is correct. A process can only move itself.

> How then can one distribute the load among the
> nodes? The only way I see is that the users will have to launch their
> programs via bpsh, specifying the node where they want the program to
> run. However, how can one discover the load of each node? The only way
> I see is to run
>
> % bpsh -aps 'cat /proc/loadavg|cut -f1'
>
> or something similar. Is there a better way?
>
> It'd be really nice if bproc did load balancing :-)

I try to draw a pretty clear line between what BProc's job is and what
the scheduler's job is. As I see it, choosing which node to place a job
on is the scheduler's job and actually putting it there is BProc's
job.

In other words, BProc doesn't make those decisions because I've decided
that scheduling is a separate problem.

- Erik

P.S. I'm trying to get a simple BProc-oriented scheduler we've written
out the door.
|
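To make the split between "choosing a node" and "putting the job there" concrete, here is a minimal sketch of the choosing half. It is not the scheduler mentioned in the P.S.; least_loaded is a hypothetical helper name, and the input format (one "node load" pair per line) is an assumption chosen for the example, not the output format of any BProc tool.

```shell
#!/bin/sh
# least_loaded: hypothetical helper, not a BProc tool.
# Reads "<node> <1-minute load>" pairs on stdin and prints the node
# number with the smallest load (the first one seen wins on ties).
least_loaded() {
    awk 'NF >= 2 && (best == "" || $2 + 0 < min) { best = $1; min = $2 + 0 }
         END { if (best != "") print best }'
}
```

The pairs could be gathered with something like `for n in 0 1 ; do echo "$n `bpsh $n cat /proc/loadavg | cut -d' ' -f1`" ; done | least_loaded`, and the job then started on the printed node with bpsh. Note the `-d' '`: /proc/loadavg is space-separated, so a plain `cut -f1` returns the whole line.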
From: Carlos C. <ca...@fi...> - 2002-07-28 01:46:39
|
Folks,

I've just discovered bproc and it looks very interesting. However I
have a question that is crucial to our usage.

At the moment we have a small cluster of identical machines. Our users
have FORTRAN programs that right now are not parallelized and run on a
single machine. If I understood the docs, bproc doesn't automatically
move jobs between nodes. How then can one distribute the load among the
nodes? The only way I see is that the users will have to launch their
programs via bpsh, specifying the node where they want the program to
run. However, how can one discover the load of each node? The only way
I see is to run

% bpsh -aps 'cat /proc/loadavg|cut -f1'

or something similar. Is there a better way?

It'd be really nice if bproc did load balancing :-)
|
From: Sadanand K. <sa...@ci...> - 2002-07-26 07:01:47
|
No, I am running bpslave on a different machine from the one running
bpmaster.

Sadanand

On Thu, 25 Jul 2002, Wilton Wong wrote:

>
> Are you trying to run bpslave on the same machine where bpmaster and the
> rest of the bproc processes are running?
>
> - Wilton
>
> ----[ Wilton William Wong ]---------------------------------------------
> 11060-166 Avenue     Ph : 01-780-456-9771    High Performance UNIX
> Edmonton, Alberta    FAX: 01-780-456-9772    and Linux Solutions
> T5X 1Y3, Canada      URL: http://www.harddata.com
> -------------------------------------------------------[ Hard Data Ltd. ]----
>
|
From: Wilton W. <ww...@ha...> - 2002-07-26 01:29:52
|
Are you trying to run bpslave on the same machine where bpmaster and the
rest of the bproc processes are running?

- Wilton

----[ Wilton William Wong ]---------------------------------------------
11060-166 Avenue     Ph : 01-780-456-9771    High Performance UNIX
Edmonton, Alberta    FAX: 01-780-456-9772    and Linux Solutions
T5X 1Y3, Canada      URL: http://www.harddata.com
-------------------------------------------------------[ Hard Data Ltd. ]----
|
From: Erik A. H. <er...@he...> - 2002-07-25 23:12:18
|
On Thu, Jul 25, 2002 at 12:38:04AM -0500, Sadanand Kota wrote:
> Hi,
> Erik - thanks for reply.
> I started the slave with -d option and it's not exiting, but
> the node status is still down. Also, where are the log files?
> /var/log/beowulf has no files.
>
> How do I check the "magic cookies"?

It's tough to look at those. Both daemons will barf out with an error
if there's a mismatch though.

It sounds like something weird is going on with your setup. At this
point, I'd whip out strace and start looking at what's actually going
on with connection establishment.

- Erik

> On Tue, 23 Jul 2002, Erik Arjan Hendriks wrote:
>
> > On Tue, Jul 23, 2002 at 03:28:41AM -0500, Sadanand Kota wrote:
> > > Hi,
> > > I have installed bproc on 2 of my linux systems taking the RPM from
> > > Clustermatic.
> > > I am able to run bpmaster and bpslave successfully. But when I check the
> > > status of the machines using bpstat, it always gives status as down.
> > > If I try /etc/beowulf/node_up 0, the output is
> > >
> > > node_up: Setting system clock.
> > > error moving to node 0: Invalid argument
> > > Fatal error performing: /usr/lib/beoboot/bin/bdate 0
> > >
> > > (The same with /etc/beowulf/node_up 1)
> > >
> > > Any idea how to change node status to up?
> >
> > "down" means that the slave is not connected to the master. That
> > basically means that the slave either isn't connected with TCP or it
> > hasn't sent the right magic cookies.
> >
> > There's no way to change "down" to anything else manually. Once the
> > slave connects, the state will change to boot and the master daemon
> > will run /etc/beowulf/node_up. If that exits with status 0, the state
> > will change to up. Otherwise the state will change to error.
> > Manually setting the node state to "down" (with bpctl) will cause the
> > slave to be disconnected.
> >
> > I'd check to make sure bpslave is actually connecting to the master.
> > Try running bpslave with -d to make sure it's not just exiting with
> > some error. Also, check the system logs.
> >
> > - Erik
> >

--
Erik Arjan Hendriks            Printed On 100 Percent Recycled Electrons
er...@he...                    Contents may settle during shipment
|
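The node states described in the quoted reply can be written down as a small table. This is only a sketch of that description: next_state and the event names (connect, node_up_ok, node_up_fail, disconnect) are made up for the illustration and are not BProc identifiers.

```shell
#!/bin/sh
# next_state: hypothetical illustration of the node state transitions
# described in the thread.  $1 = current state, $2 = event.
next_state() {
    case "$1:$2" in
    down:connect)      echo boot ;;   # slave connected; master runs node_up
    boot:node_up_ok)   echo up ;;     # node_up exited with status 0
    boot:node_up_fail) echo error ;;  # node_up exited non-zero
    *:disconnect)      echo down ;;   # slave no longer connected
    *)                 echo "$1" ;;   # anything else leaves the state alone
    esac
}
```

In particular there is no event that takes a node from down straight to up, which is why running node_up by hand on a down node cannot help.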
From: Larry B. <ba...@us...> - 2002-07-25 18:33:53
|
I have modified the bproc/beoboot node startup scripts to automatically
start the NFS RPC portmapper and status daemon so that an NFS mount
without the "nolock" option succeeds. I also fixed a couple annoyances
(i.e., tar failures), and added support for ext3 fstypes.

Locking support is required for the MPI-2 parallel IO routines (I have
MPICH 1.2.4). I am not sure yet that it is completely working; while
things have improved, a couple of the MPICH IO test routines still fail.
I hope to track that down soon.

I'm going to try to add a bit more user control of the file system setup
done in setup_fs through the /etc/beowulf/config configuration file.
For example, I'd like to add entries that list the names of directories
that get automatically created, such as /proc, /etc, /tmp, and /scratch.
These are hard-coded now. Also, I'd like to make support for the RPC
portmapper and status daemon optional (e.g., always, auto, never).
Finally, I'd like to get the contents for /etc/nsswitch.conf either from
a file in /etc/beowulf, or from /etc/beowulf/config.

Below are the files I have modified/use:

/etc/exports                   The NFS file systems exported by the master
/etc/beowulf/fstab             The file systems file for the nodes
/etc/beowulf/config            The bproc/beoboot configuration file
/etc/beowulf/node_up           The beoboot stub node startup script
/usr/lib/beoboot/bin/setup_fs  The beoboot node file system setup script

After rebooting, this is what /var/log/beowulf/node.0 looks like:

node_up: Setting system clock.
node_up: TODO set interface netmask.
node_up: Configuring loopback interface.
setup_fs: Configuring node filesystems...
setup_fs: Using /etc/beowulf/fstab
setup_fs: Checking 192.168.50.209:/bin (type=nfs)...
setup_fs: Mounting 192.168.50.209:/bin on /rootfs/bin... (type=nfs; options=ro,nolock,rsize=8192)
setup_fs: Checking 192.168.50.209:/home (type=nfs)...
setup_fs: Mounting 192.168.50.209:/home on /rootfs/home... (type=nfs; options=rw,rsize=8192,wsize=8192,noac)
setup_fs: Mount deferred until lock daemon running.
setup_fs: Checking 192.168.50.209:/opt (type=nfs)...
setup_fs: Mounting 192.168.50.209:/opt on /rootfs/opt... (type=nfs; options=ro,nolock,rsize=8192)
setup_fs: Checking 192.168.50.209:/sbin (type=nfs)...
setup_fs: Mounting 192.168.50.209:/sbin on /rootfs/sbin... (type=nfs; options=ro,nolock,rsize=8192)
setup_fs: Checking 192.168.50.209:/usr (type=nfs)...
setup_fs: Mounting 192.168.50.209:/usr on /rootfs/usr... (type=nfs; options=ro,nolock,rsize=8192)
setup_fs: Checking 192.168.50.209:/var/node.0 (type=nfs)...
setup_fs: Mounting 192.168.50.209:/var/node.0 on /rootfs/var... (type=nfs; options=rw,nolock,rsize=8192,wsize=8192)
setup_fs: Checking none (type=proc)...
setup_fs: Mounting none on /rootfs/proc... (type=proc; options=defaults)
setup_fs: Checking none (type=devpts)...
setup_fs: Mounting none on /rootfs/dev/pts... (type=devpts; options=gid=5,mode=620)
node_up: populating /dev and /etc
node_up: Copying over device nodes.
node_up: Copying over time zone info.
node_up: Copy over nsswitch info.
node_up: Node setup finished.
/etc/beowulf/node_up: Copy files into /etc for /etc/nsswitch.conf.
/etc/beowulf/node_up: Start the RPC portmapper and status daemon.
/etc/beowulf/node_up: Complete deferred network mounts.
/etc/beowulf/node_up: Soft link /tmp to /var/tmp.
/etc/beowulf/node_up: Soft link /scratch to /home/node.0.

Larry Baker
US Geological Survey
ba...@us...

#
# /etc/exports
#
# Read-only exports
#
/bin  192.168.50.209/255.255.255.224(ro)
/opt  192.168.50.209/255.255.255.224(ro)
/sbin 192.168.50.209/255.255.255.224(ro)
/usr  192.168.50.209/255.255.255.224(ro)
#
# Private read-write exports
#
/var/node.0 192.168.50.210(rw,no_root_squash)
/var/node.1 192.168.50.211(rw,no_root_squash)
#
# Shared read-write exports (MPICH 1.2.4, section 4.11.1: use "noac")
#
/home 130.118.45.45/255.255.252.0(rw) \
      192.168.50.209/255.255.255.224(rw,no_root_squash)

#
# /etc/beowulf/fstab
#
# This file is the fstab for nodes.
# One difference is that we allow for shell variable expansions...
#
# Variables that will get substituted:
#   MASTER  = IP address of the master node. (good for doing NFS mounts)
#   NODE    = slave's node no.
#   RAMDISK = device name (/dev/<ramdev>) of a device suitable for a root fs
#
# A cooked version (with variable substitution) of this file will be copied
# to /etc/fstab on the slave node.
#
# The root file system is a tmpfs provided by the boot scripts. You
# can mount something on / if you'd like but due to oddities in the file
# caching code it's not recommended right now.

# This is the default setup from beofdisk, once you setup your disks.
#/dev/hda2 swap swap defaults 0 0
#/dev/hda3 / ext2 defaults 0 0

# These should always be added
none /proc proc defaults 0 0
none /dev/pts devpts gid=5,mode=620 0 0

# NFS (for example and default friendliness)
# Note: Mounts without the "nolock" option are deferred until the RPC portmapper
# and status daemons are running -- see the instructions in /etc/beowulf/node_up
#
# Read-only mount points
#
$MASTER:/bin /bin nfs ro,nolock,rsize=8192 0 0
$MASTER:/opt /opt nfs ro,nolock,rsize=8192 0 0
$MASTER:/sbin /sbin nfs ro,nolock,rsize=8192 0 0
$MASTER:/usr /usr nfs ro,nolock,rsize=8192 0 0
#
# Private read-write mount points
#
$MASTER:/var/node.$NODE /var nfs rw,nolock,rsize=8192,wsize=8192 0 0
#
# Shared read-write mount points (MPICH 1.2.4, section 4.11.1: use "noac")
#
$MASTER:/home /home nfs rw,rsize=8192,wsize=8192,noac 0 0

#
# /etc/beowulf/config
#
# Sample Beowulf Configuration file
#
# $Id: config,v 1.7 2002/03/12 20:54:58 hendriks Exp $
#
#
# Default cluster configuration (uses eth1, and 192.168.1.0/24)
# interface: internal cluster interface (the one connected to the nodes)
#
# iprange: range of IP addresses for nodes.
interface eth1 192.168.50.209 255.255.255.224

# Setup addresses in the cluster. The "nodes" line is REQUIRED here to specify
# cluster size. "iprange" and "ip" assign addresses to nodes. The "0" in
# iprange here tells it to start assigning at node zero.
nodes 2
iprange 0 192.168.50.210 192.168.50.211

# Default libraries (These are the libraries which will automagically be made
# available to the slaves.)
# No line continuation, multiple entries allowed
libraries /lib /usr/lib /usr/X11R6/lib
libraries /opt/intel/compiler60/ia32/lib /opt/intel/mkl/lib/32

# Default file system policies.
fsck full
mkfs if_needed

# Default location of boot images
bootfile /var/beowulf/boot.img
kernelimage /boot/vmlinuz-2.4.18-lanl.16
kernelcommandline apm=power-off

# Here we assign MAC addresses to nodes. Nodes can have multiple MAC
# addresses. Here the optional "0" zero argument states that the address
# should be assigned to node zero. Node lines following that will assign
# addresses to nodes sequentially
# D-Link DFE-500TX PCI card (DEC 21140-A chip)
#node 0 00:40:05:36:66:83
#node 00:40:05:40:60:e7
# Onboard RealTek RTL8100BL chip)
node 0 00:40:63:c0:5e:08
node 00:40:63:c0:5f:b4

#!/bin/sh
#
# /etc/beowulf/node_up
#
# This shell script is called automatically by BProc to perform any
# steps necessary to bring up the nodes. This is just a stub script
# pointing to the real script

NODE=$1
MASTER=`bpstat -a master`
BINDIR=/usr/lib/beoboot/bin
PATH=/sbin:/usr/sbin:$PATH:$BINDIR

# Standard location of statd database files
#SMDIR=/var/lib/nfs
# Location of statd database files on Red Hat Linux
SMDIR=/var/lib/nfs/statd

$BINDIR/node_up $* || exit 1

# At this point, all file systems in $NODE:/etc/fstab have been mounted,
# except for network devices (host:export) without the "nolock" option.
# The following sections finish preparing the node for "mount -a", below.
# (Currently, only the RPC portmapper and status daemon are started, if
# necessary, for NFS file systems (fstype=nfs). Other fstypes may require
# similar preparation.)

# NFS devices without the "nolock" option require the RPC portmapper and
# status daemons. The status daemon requires read/write access to the
# /var/lib/nfs/statd/sm and .../sm.bak directories, which must exist and
# be owned 700 by rpcuser (see http://nfs.sourceforge.net, item 17).

# True if there are any NFS mounts in $NODE:/etc/fstab without the "nolock"
# option, i.e., that need the RPC portmapper and status daemon.
if [ `bpsh -n $NODE cat /etc/fstab | \
      while read line ; do
          if [ -n "${line}" -a "${line:0:1}" != "#" ] ; then
              echo "${line}" | ( \
                  read device mountpt fstype options rest && \
                  echo ${fstype} | grep -q "nfs" && \
                  echo ${options} | grep -q -v "nolock" \
              ) && echo "${line}"
          fi
      done | \
      wc -l` -gt 0 ] ; then

    # Copy the files needed for the Name Service Switch (NSS) to /etc
    # (needed by getpwnam(), etc., in #include <pwd.h>, called by rpc.statd)
    echo "/etc/beowulf/node_up: Copy files into /etc for /etc/nsswitch.conf."
    bpcp /etc/passwd $NODE:/etc
    bpcp /etc/group $NODE:/etc
    bpcp /etc/rpc $NODE:/etc

    # Replace the NSS config file
    cat << EOF | bpsh -n $NODE --stdout /etc/nsswitch.conf cat
#
# /etc/nsswitch.conf
#
hosts: bproc
passwd: bproc files
group: bproc files
rpc: files
EOF

    # Create /var/lib/nfs/statd/sm and .../sm.bak owned 700 by rpcuser (Red Hat)
    bpsh -n $NODE mkdir -p $SMDIR/sm
    bpsh -n $NODE chmod 700 $SMDIR/sm
    bpsh -n $NODE mkdir -p $SMDIR/sm.bak
    bpsh -n $NODE chmod 700 $SMDIR/sm.bak
    if echo $SMDIR | grep -q "/statd" ; then
        bpsh -n $NODE chmod 700 $SMDIR
        bpsh -n $NODE chown rpcuser $SMDIR
        bpsh -n $NODE chgrp rpcuser $SMDIR
        bpsh -n $NODE chown rpcuser $SMDIR/sm
        bpsh -n $NODE chgrp rpcuser $SMDIR/sm
        bpsh -n $NODE chown rpcuser $SMDIR/sm.bak
        bpsh -n $NODE chgrp rpcuser $SMDIR/sm.bak
    fi

    # Start the RPC portmapper and status daemon
    echo "/etc/beowulf/node_up: Start the RPC portmapper and status daemon."
    bpsh -n $NODE initlog -c portmap
    bpsh -n $NODE initlog -c rpc.statd

fi

# Mount the network devices that were deferred earlier
echo "/etc/beowulf/node_up: Complete deferred network mounts."
bpsh -n $NODE mount -a

##### Add commands here to complete the setup of the node #####

# Soft link /tmp to /var/tmp (NFS /var must be no_root_squash)
echo "/etc/beowulf/node_up: Soft link /tmp to /var/tmp."
bpsh -n $NODE rmdir --ignore-fail-on-non-empty /tmp
bpsh -n $NODE mkdir -p /var/tmp
bpsh -n $NODE ln -s /var/tmp /tmp
bpsh -n $NODE chmod 1777 /var/tmp
# Clean out /tmp every boot
bpsh -n $NODE /bin/rm -r -f /var/tmp/*
bpsh -n $NODE /bin/rm -r -f /var/tmp/.* 2>/dev/null

# Soft link /scratch to /home/node.$NODE (NFS /home must be no_root_squash)
echo "/etc/beowulf/node_up: Soft link /scratch to /home/node.$NODE."
bpsh -n $NODE rmdir --ignore-fail-on-non-empty /scratch
bpsh -n $NODE mkdir -p /home/node.$NODE
bpsh -n $NODE ln -s /home/node.$NODE /scratch
bpsh -n $NODE chmod 1777 /home/node.$NODE

exit 0

#!/bin/sh
#
# /usr/lib/beoboot/bin/setup_fs
#
# Erik Hendriks <hen...@la...>
#
# $Id: setup_fs,v 1.4 2001/11/30 17:52:40 hendriks Exp $
#
# This bit of code is a first stab at understanding fstab for mount.
# It's a lot like mount dealing with its own fstab.
# Differences with just allowing mount to chew on an fstab:
#    We can do fsck checks before attempting to mount.
#    We can (re)create file systems before mounting.
#    We can create mount points before mounting.
#
#--------------------------------------------------------------------------
# Generic functions to do operations on varUseful functions
#--------------------------------------------------------------------------
# Usage: fsckfs node device fstype
do_safefsck() {
    case $2 in
    /dev/ram*)
        echo "setup_fs: Hmmm...This appears to be a ramdisk. "
        echo -n "setup_fs: I'm going to try to try checking the "
        echo "filesystem (fsck) anyway."
        echo -n "setup_fs: If it is a RAM disk the following will "
        echo "fail harmlessly."
        ;;
    esac
    case $3 in
    ext*) bpsh -n $1 e2fsck -p $2 ; ret=$?
          if [ "$ret" = 1 ] ; then ret=0; fi ;;
    swap) bpsh -n $1 chkswap $2 ; ret=$? ;;
    *)    ret=0;;
    esac
    [ "$ret" = 0 ]
}

do_fsck() {
    echo "setup_fs: Checking $2 (type=$3)..."
    case $2 in
    /dev/ram*)
        echo "setup_fs: Hmmm...This appears to be a ramdisk. "
        echo -n "setup_fs: I'm going to try to try checking the "
        echo "filesystem (fsck) anyway."
        echo -n "setup_fs: If it is a RAM disk the following will "
        echo "fail harmlessly."
        ;;
    esac
    case $3 in
    ext*) bpsh -n $1 e2fsck -y $2 ; ret=$?
          if [ "$ret" = 1 ] ; then ret=0; fi ;;
    swap) bpsh -n $1 chkswap $2 ; ret=$? ;;
    *)    ret=0;;
    esac
    [ "$ret" = 0 ]
}

# Usage: do_mkfs node device fstype fssize
do_mkfs() {
    echo "setup_fs: Creating $3 on $2..."
    case $3 in
    ext2) bpsh -n $1 mke2fs -q $2 $4 ; ret=$? ;;
    ext3) bpsh -n $1 mke2fs -q -j $2 $4 ; ret=$? ;;
    swap) bpsh -n $1 mkswap $2 $4 ; ret=$? ;;
    *)    ret=0;;
    esac
    [ "$ret" = 0 ]
}

# Usage: load_fs node fstype
load_fs () {
    if [ -z "`bpsh -n $1 grep $2 /proc/filesystems`" ] ; then
        modprobe --node $1 $2
    fi
}

# Usage: do_mount node device mountpt fstype options
do_mount() {
    # Load file system module for all fstypes so they can be mounted later
    if [ "$4" != "swap" ] ; then
        load_fs $1 $4
    fi
    # Don't mount devices with the "noauto" option
    if [ -n "`echo $5 | grep noauto`" ] ; then
        return
    fi
    echo "setup_fs: Mounting $2 on $3... (type=$4; options=$5)"
    case $4 in
    swap) bpsh -n $1 swapon $2 ;;
    # Defer mounts of network devices (host:export) without the "nolock" option
    *)  if [ -z "`echo $2 | grep :`" -o \
             -n "`echo $5 | grep nolock`" ] ; then
            if bpsh -n $1 mount -nt $4 -o $5 $2 $3 ; then
                if [ "${mountpt:0:1}" == "/" ] ; then
                    echo "$device $mountpt $fstype $options" >> $MTABFILE
                fi
            fi
        else
            echo "setup_fs: Mount deferred until lock daemon running."
        fi
        ;;
    esac
}

# Usage: beoconfig tag [config_file]
beoconfig() {
    local FILE=$2
    if [ -z "$FILE" ] ; then FILE=${CONFIG} ; fi
    if [ ! -f ${FILE} ] ; then
        echo "Warning: ${FILE} file not found." >&2
        return
    fi
    # These sed bits:
    #  - strip spaces
    #  - strip leading + trailing space
    #  - if line starts with $1, strip off $1 and print it.
    sed -ne "s/#.*//" < ${FILE} \
        -e "s/^[[:space:]]\+//;s/[[:space:]]\+\$//" \
        -e "/^$1[[:space:]]/{s/^$1[[:space:]]\+//;p;}"
}

#--------------------------------------------------------------------------
# Argument sanity checking
if [ "$1" = "" ] ; then
    echo "Usage: setup_fs <nodenumber>"
    exit 1
fi

echo "setup_fs: Configuring node filesystems..."

NODE=$1
PATH=/sbin:/usr/sbin:$PATH:/usr/lib/beoboot/bin
CONFIG=/etc/beowulf/config
MASTER=`bpstat -a master`
RAMDISK=/dev/ram3
FSCK=`beoconfig fsck`
MKFS=`beoconfig mkfs`

# Select which FSTAB to use.
FSTAB=/etc/beowulf/fstab.$NODE
if [ ! -r $FSTAB ] ; then
    FSTAB=/etc/beowulf/fstab
fi
echo "setup_fs: Using $FSTAB"

# XXX We need a way to pick up per-node commands!

# Control flags
#
# FSCK =
#   0 = Don't touch anything, just try to mount.
#   1 = Ok to fsck but don't do anything if it fails.
#   2 = fsck and do mkfs if it fails.
#   3 = skip fsck go straight to mkfs
#
# Sanity check FSCK (default = 1)
case $FSCK in
"never"|"safe"|"full") ;;
"") FSCK=safe ;;
*)  echo 1>&2 "Invalid value '$FSCK' for fsck tag in $CONFIG."
    exit 1 ;;
esac
case $MKFS in
"never"|"if_needed"|"always") ;;
"") MKFS=if_needed ;;
*)  echo 1>&2 "Invalid value '$MKFS' for mkfs tag in $CONFIG."
    exit 1 ;;
esac

if [ ! -f $FSTAB ] ; then
    echo 1>&2 "setup_fs: $FSTAB (file system table) is missing."
    exit 1
fi

# Ok... This is one big nasty pipe line... Here's what this mess does:
#  * Use sed to remove comments. (starting with #)
#  * Run it all though eval to do variable substitutions.
#  * Go through all the lines doing:
#    + Ignore the empty lines
#    + Remove trailing slashes from the mount points
#    + Prepend a number that will allow us to sort the mount points.
#  * Sort the mount points
#  * On each point point (depending on the FSCK policy):
#    + fsck the file system
#    + if bad, possibly recreate the file system.
#    + mount the file system (defer network mounts w/o the "nolock" option)
#  * Create /etc/fstab for the new node.
# * Create /etc/mtab for the new node. MTABFILE=3D/tmp/.setup_fs.mtab.$$ if ! rm -f $MTABFILE ; then echo 1>&2 "setup_fs: $MTABFILE already exists and can't remove." exit 1 fi touch $MTABFILE FSTABFILE=3D/tmp/.setup_fs.fstab.$$ if ! rm -f $FSTABFILE ; then echo 1>&2 "setup_fs: $FSTABFILE already exists and can't remove." exit 1 fi touch $FSTABFILE cat $FSTAB | \ while read line ; do if [ -z "$line" -o "${line:0:1}" =3D "#" ] ; then echo $line >>$FSTABFILE else line=3D`eval echo "$line"` echo $line >>$FSTABFILE echo $line fi done | \ while read device mountpt fstype options junk ; do if [ -z "$options" ] ; then echo 1>&2 "Ignoring incomplete line: $device $mountpt = $fstype $options $junk" continue fi # Sanitize mount point... (squeeze multiple slashes, remove # any trailing slashes) mountpt=3D`echo $mountpt | sed -e 's!/\+!/!g' -e 's!/\+$!!'` slashct=3D`echo $mountpt | tr -cd / | wc -c` if [ -z $mountpt ] ; then mountpt=3D/ ; fi echo $slashct $device $mountpt $fstype $options done | \ sort -n | \ (while read slashct device mountpt fstype options junk ; do if [ -z "$options" ] ; then if [ -n "$device" ] ; then echo 1>&2 "Ignoring incomplete line: $device $mountpt $fstype $options = $junk" fi continue fi # Get a file system size option if it's there... fssize=3D`echo $options | sed -e 's/.*fs_size=3D\([0-9]\+\).*/\1/p;d'` options=3D`echo $options | sed -e 's/fs_size=3D[0-9]\+//g'` if [ -z "$options" ] ; then options=3Ddefaults; fi =20 # Everything gets a "/rootfs" prefix at this stage. Also we create the # mount points as needed. This requires that people have their fstab # in some resonable order. (It might be hard for us to sort it....) # see to it that the device node exists on the remote machine if [ "${device:0:4}" =3D=3D "/dev" ] ; then (cd / ; tar cf - $device) | bpsh -n $NODE tar xf - fi mknewfs=3D0 if [ $MKFS =3D "always" ]; then mknewfs=3D1 else case $FSCK in "never") ;; # No FSCK! "safe") if ! 
do_safefsck $NODE $device $fstype ; then echo 1>&2 "setup_fs: RAM disks fail FSCK, that's OK" echo 1>&2 "setup_fs: FSCK failure. (OK for RAM disks)" mknewfs=3D1 fi ;; "full") if ! do_fsck $NODE $device $fstype ; then echo 1>&2 "setup_fs: FSCK failure. (OK for RAM disks)" mknewfs=3D1 fi ;; esac fi =20 if [ $MKFS !=3D "never" -a "$mknewfs" =3D 1 ] ; then if ! do_mkfs $NODE $device $fstype $fssize ; then echo 1>&2 "Failed to create $fstype file system on $device." exit 1 fi fi # See to it that the mount point exists before trying to mount. if echo $mountpt | grep -q '^/' ; then if ! bpsh -n $NODE mkdir -p /rootfs$mountpt ; then echo 1>&2 "Failed to create $mountpt." exit 1 fi fi if ! do_mount $NODE $device /rootfs$mountpt $fstype $options ; then echo 1>&2 "Failed to mount $device on $mountpt." exit 1 fi done # Create fstab on the remote node... if ! bpsh -n $NODE mkdir -p /rootfs/etc ; then echo 1>&2 "Failed to create /etc." exit 1 fi if ! bpcp $FSTABFILE $NODE:/rootfs/etc/fstab ; then echo 1>&2 "Failed to create /etc/fstab." exit 1 fi rm -f $FSTABFILE # Finally, create mtab on the remote node... if ! bpcp $MTABFILE $NODE:/rootfs/etc/mtab ; then echo 1>&2 "Failed to create /etc/mtab." exit 1 fi rm -f $MTABFILE ) # Exit with status of this nutty pipeline. |
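The "prepend a number so we can sort the mount points" step in the script above can be tried in isolation. The following is only a minimal sketch of that idea, not part of setup_fs: the function name sort_by_depth is made up for illustration, and it reads bare mount points (one per line) rather than full fstab entries.

```shell
# Hypothetical helper (not part of setup_fs): order mount points so that
# parents are mounted before their children.
sort_by_depth() {
    while read -r mountpt; do
        [ -z "$mountpt" ] && continue
        # Squeeze repeated slashes and drop a trailing slash.
        mountpt=$(printf '%s\n' "$mountpt" | sed -e 's!//*!/!g' -e 's!/$!!')
        # Depth = number of slashes left; "/" becomes empty, so its depth is 0.
        slashct=$(( $(printf '%s' "$mountpt" | tr -cd / | wc -c) ))
        [ -z "$mountpt" ] && mountpt=/
        echo "$slashct $mountpt"
    done | sort -n | while read -r slashct mountpt; do
        # Drop the sort key again.
        echo "$mountpt"
    done
}
```

For example, `printf '%s\n' /usr/local/ / /usr | sort_by_depth` lists / first, then /usr, then /usr/local, which is the order in which they have to be mounted.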
From: Sadanand K. <sa...@ci...> - 2002-07-25 05:38:08
|
Hi,

Erik - thanks for the reply. I started the slave with the -d option and it's not exiting, but the node status is still down. Also, where are the log files? /var/log/beowulf has no files. How do I check the "magic cookies"?

Sadanand

On Tue, 23 Jul 2002, Erik Arjan Hendriks wrote:

> On Tue, Jul 23, 2002 at 03:28:41AM -0500, Sadanand Kota wrote:
> > Hi,
> > I have installed bproc on 2 of my linux systems taking the RPM from
> > Clustermatic.
> > I am able to run bpmaster and bpslave successfully. But when I check the
> > status of the machines using bpstat, it always gives status as down.
> > If I try /etc/beowulf/node_up 0, the output is
> >
> > node_up: Setting system clock.
> > error moving to node 0: Invalid argument
> > Fatal error performing: /usr/lib/beoboot/bin/bdate 0
> >
> > (The same with /etc/beowulf/node_up 1)
> >
> > Any idea how to change node status to up?
>
> "down" means that the slave is not connected to the master. That
> basically means that the slave either isn't connected with TCP or it
> hasn't sent the right magic cookies.
>
> There's no way to change "down" to anything else manually. Once the
> slave connects, the state will change to boot and the master daemon
> will run /etc/beowulf/node_up. If that exits with status 0, the state
> will change to up. Otherwise the state will change to error.
> Manually setting the node state to "down" (with bpctl) will cause the
> slave to be disconnected.
>
> I'd check to make sure bpslave is actually connecting to the master.
> Try running bpslave with -d to make sure it's not just exiting with
> some error. Also, check the system logs.
>
> - Erik
>
|
From: Erik A. H. <er...@he...> - 2002-07-23 16:13:20
|
On Tue, Jul 23, 2002 at 03:28:41AM -0500, Sadanand Kota wrote:
> Hi,
> I have installed bproc on 2 of my linux systems taking the RPM from
> Clustermatic.
> I am able to run bpmaster and bpslave successfully. But when I check the
> status of the machines using bpstat, it always gives status as down.
> If I try /etc/beowulf/node_up 0, the output is
>
> node_up: Setting system clock.
> error moving to node 0: Invalid argument
> Fatal error performing: /usr/lib/beoboot/bin/bdate 0
>
> (The same with /etc/beowulf/node_up 1)
>
> Any idea how to change node status to up?

"down" means that the slave is not connected to the master. That basically means that the slave either isn't connected with TCP or it hasn't sent the right magic cookies.

There's no way to change "down" to anything else manually. Once the slave connects, the state will change to boot and the master daemon will run /etc/beowulf/node_up. If that exits with status 0, the state will change to up. Otherwise the state will change to error. Manually setting the node state to "down" (with bpctl) will cause the slave to be disconnected.

I'd check to make sure bpslave is actually connecting to the master. Try running bpslave with -d to make sure it's not just exiting with some error. Also, check the system logs.

- Erik
|
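The state changes Erik describes can be written down as a small table. The sketch below is only a model of that description, not BProc code: the function next_state and the event names (connect, node_up_ok, node_up_failed, set_down) are invented for illustration.

```shell
# Hypothetical model of the node states described above; not BProc code.
# Usage: next_state <current-state> <event>
next_state() {
    state=$1
    event=$2
    case "$state:$event" in
        down:connect)        echo boot ;;   # slave connected, master runs node_up
        boot:node_up_ok)     echo up ;;     # node_up exited with status 0
        boot:node_up_failed) echo error ;;  # node_up exited non-zero
        *:set_down)          echo down ;;   # manual "down" disconnects the slave
        *)                   echo "$state" ;;  # anything else leaves the state alone
    esac
}
```

Note that there is no event that takes a node from down straight to up, which is why running node_up by hand on a node that never connected cannot help.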
From: Sadanand K. <sa...@ci...> - 2002-07-23 08:28:45
|
Hi,

I have installed bproc on 2 of my linux systems taking the RPM from Clustermatic. I am able to run bpmaster and bpslave successfully. But when I check the status of the machines using bpstat, it always gives status as down. If I try /etc/beowulf/node_up 0, the output is

node_up: Setting system clock.
error moving to node 0: Invalid argument
Fatal error performing: /usr/lib/beoboot/bin/bdate 0

(The same with /etc/beowulf/node_up 1)

Any idea how to change node status to up?

Sadanand
|
From: Wilton W. <ww...@ha...> - 2002-07-16 22:38:10
|
I'm currently having the same difficulty here with NFS locking. I can tell you what the problem is and I can give you a "cheap and dirty" solution.

The problem with NFS locking is not with lockd, it's with rpc.statd (lockd is usually automagically started by the kernel anyway). The current rpc.statd (at least in my nfs-utils package) tries to drop privileges and run itself as "rpcuser" or "nobody", but unless you have ldap/nis running on your nodes, getpwuid(rpcuser) and getpwuid(nobody) will return 0, and rpc.statd will silently exit, failing to setuid().

Quickest fix: use files in nsswitch.conf instead of bproc and copy over the passwd/group files from the master node to the cluster node.

Quick fix: use nis/nis+/ldap on the master node and configure nsswitch.conf etc. on the cluster node accordingly.

Hack fix: remove the drop-privs patches from the nfs-utils package. Not a bad idea, since we are supposedly running on a "secure" network anyway.

Best fix: add functionality to beonss.

- Wilton

PS. Any other suggestions are welcome ;)

On Tue, 16 Jul 2002, Larry Baker wrote:

> Thank you for your help.
>
> I am neither a Unix nor a Linux expert, so I don't know how to determine
> which features were compiled into the kernel. NFS works fine, so I assume
> the nfs module is there. It is just NFS locking that does not work. I am
> using the Clustermatic bproc kernel. Is there a list of modules that are
> built into the kernel somewhere? After I find that, where is the list of
> modules that must be added (using modprobe) to get full NFS locking support?
> I can hack the Clustermatic setup_fs script from there.
>
> Larry Baker
>
> on 7/16/02 2:09 AM, Wilton Wong at ww...@ha... wrote:
>
> >
> > Have you inserted the lockd/nfs/sunrpc modules on the node? i.e.
> > "modprobe -N 0 nfs", then run portmap, then try mounting without the
> > "nolock" option?
> >
> > "lockdsvc: Function not implemented" seems to indicate that the lockd module
> > wasn't loaded or NFS file locking was not compiled into the kernel.
> >
> > - Wilton
> >
> > ----[ Wilton William Wong ]---------------------------------------------
> > 11060-166 Avenue     Ph : 01-780-456-9771   High Performance UNIX
> > Edmonton, Alberta    FAX: 01-780-456-9772   and Linux Solutions
> > T5X 1Y3, Canada      URL: http://www.harddata.com
> > -------------------------------------------------------[ Hard Data Ltd. ]----
> >
> > --
>

----[ Wilton William Wong ]---------------------------------------------
11060-166 Avenue     Ph : 01-780-456-9771   High Performance UNIX
Edmonton, Alberta    FAX: 01-780-456-9772   and Linux Solutions
T5X 1Y3, Canada      URL: http://www.harddata.com
-------------------------------------------------------[ Hard Data Ltd. ]----
|
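For the "quickest fix" above, it may be worth confirming that the accounts rpc.statd wants actually appear in the passwd file before copying it to a node. This is a minimal sketch under that assumption; the helper name missing_accounts is made up here, and it only inspects a passwd-format file, it does not talk to NSS or BProc.

```shell
# Hypothetical helper: print each requested account name that is absent
# from a passwd-format file.
# Usage: missing_accounts <passwd-file> <name>...
missing_accounts() {
    passwd_file=$1
    shift
    for name in "$@"; do
        # The account name is the first colon-separated field.
        if ! cut -d: -f1 "$passwd_file" | grep -qxF "$name"; then
            echo "$name"
        fi
    done
}
```

On the master one might run `missing_accounts /etc/passwd rpcuser nobody`; no output would mean both names are present, so copying /etc/passwd and /etc/group to the node together with a "files" entry in nsswitch.conf should let the lookups resolve there.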
From: Erik A. H. <er...@he...> - 2002-07-16 14:47:13
|
On Tue, Jul 16, 2002 at 01:37:25AM -0500, Sadanand Kota wrote:
> Also, if I try to run bpmaster, it says
> './bpmaster: BPROC_SYS_VERSION: Function not implemented'.

That means the BProc module isn't loaded.

> My /etc/bproc.conf is as follows
> bind manager 2223
> node 127.0.0.1 127.0.0.2

This looks like some ANCIENT config file syntax.

> There's no specific reason why I am trying a custom compiled kernel. My
> partner (sitting right next to me now) is trying the clustermatic binaries
> (i.e. he downloaded all rpms from clustermatic.org). He also has a problem as
> follows -
> On running the command /etc/rc.d/init.d/beowulf start it says
> Configuring network interface (eth0):Error: No netmask given for interface
>
> In the /etc/beowulf/config file, the netmask for eth0 is properly defined as
> follows
> interface eth0 255.255.255.0.

This is incorrect. The syntax is:

interface eth0 IPaddress Netmask

> Also, after installation of clustermatic RPMS, upon booting the machine
> says,
> eth0 - unknown hosts.
> This problem was not there before installation of clustermatic RPM.
>
> Any ideas for either (custom kernel or clustermatic RPM) are highly
> appreciated.

Another random note - you'll probably have to say depmod -a to get the dependencies rebuilt and the modules loaded after installing the clustermatic RPMs.

- Erik
--
Erik Arjan Hendriks Printed On 100 Percent Recycled Electrons
er...@he... Contents may settle during shipment
|
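The "No netmask given" error above comes from an interface line with too few fields. As a rough sketch, a config file could be checked for that before starting the daemons. The function name check_interface_lines is hypothetical, and the only rule it knows is the one stated above (interface, device, IP address, netmask); it does not validate the addresses themselves.

```shell
# Hypothetical check: flag "interface" lines that lack the device name,
# IP address and netmask fields. Returns non-zero if any are found.
# Usage: check_interface_lines <config-file>
check_interface_lines() {
    awk '
        { sub(/#.*/, "") }                    # ignore trailing comments
        $1 == "interface" && NF < 4 { print; bad = 1 }
        END { exit bad + 0 }
    ' "$1"
}
```

A line such as `interface eth0 255.255.255.0` is printed and the function fails, while `interface eth0 192.168.1.1 255.255.255.0` (the address here is just an example value) passes.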
From: Erik A. H. <er...@he...> - 2002-07-16 14:30:01
|
On Tue, Jul 16, 2002 at 01:04:16AM -0600, Wilton Wong wrote:
> > With Mac address, how do I set a single machine as both master and
> > slave (for testing purposes)?
>
> I don't believe that this is possible.

It is possible for a machine to be a master and slave for itself at the same time. It's a pretty confusing arrangement though, since processes will show up several times in the process tree.

You can't use the beoboot stuff (MAC addresses and so on) to do this though. You will have to run the slave daemon (bpslave) directly. I usually set this up on the loop-back interface. Tell the front end it is 127.0.0.1. Then give the slave nodes IPs in the range 127.0.0.2+. Then you can start like this: "bpslave -s 127.0.0.2 127.0.0.1 2223".

NOTE: Before doing this, you will want to comment out everything in the node setup script (/etc/beowulf/node_up) to avoid having the master try to set itself up.

- Erik
|
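For more than one loop-back slave, the bpslave command line above can be generated rather than typed. This is a sketch only: the function loopback_slave_cmds is made up, it assumes the master address 127.0.0.1 and port 2223 from the message above, and it prints the commands instead of running them so they can be reviewed first.

```shell
# Hypothetical helper: print (do not run) one bpslave command per
# loop-back slave, using addresses 127.0.0.2, 127.0.0.3, ...
# Usage: loopback_slave_cmds <number-of-slaves>
loopback_slave_cmds() {
    count=$1
    master=127.0.0.1   # front end address, as in the message above
    port=2223          # master port, as in the message above
    i=0
    while [ "$i" -lt "$count" ]; do
        echo "bpslave -s 127.0.0.$((i + 2)) $master $port"
        i=$((i + 1))
    done
}
```

`loopback_slave_cmds 2` prints the command from the message for 127.0.0.2 followed by the same command for 127.0.0.3.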
From: Erik A. H. <er...@he...> - 2002-07-16 14:26:41
|
On Tue, Jul 16, 2002 at 02:56:21AM -0500, Sadanand Kota wrote:
> Our scheduling technique involves selection of existing processes' pids
> based on certain criteria and moving (migrating) that process to a specified
> node (also identified by our scheduling algorithm). Do you think this is
> possible using BPROC? Hope you understand my question. Please let me know
> if there is any confusion. (As I understand it, bproc_execmove cannot be
> used in this situation.)

There is no third party migration in BProc. In other words, a process cannot force another process to migrate to another node. A process can migrate itself at any time. Keep in mind that migration is NOT transparent - you will lose open files, etc. when you migrate.

- Erik
|
From: Janez P. <jan...@fe...> - 2002-07-16 09:24:36
|
Wilton Wong wrote:
> > With Mac address, how do I set a single machine as both master and
> > slave (for testing purposes)?
>
> I don't believe that this is possible.

I did not get the original question; however, I am successfully running a bproc cluster with 6 machines, every one configured as master AND slave at the same time. No problems in bproc_moving whatsoever; both daemons coexist and perform their tasks simultaneously without conflicts. I admit that the process list is a mess. The cluster is used for teaching and demo purposes, so every machine can rfork processes to the other 5 nodes. No crashes so far.

Janez.
|
From: Wilton W. <ww...@ha...> - 2002-07-16 09:09:51
|
Have you inserted the lockd/nfs/sunrpc modules on the node? i.e. "modprobe -N 0 nfs", then run portmap, then try mounting without the "nolock" option?

"lockdsvc: Function not implemented" seems to indicate that the lockd module wasn't loaded or NFS file locking was not compiled into the kernel.

- Wilton

----[ Wilton William Wong ]---------------------------------------------
11060-166 Avenue     Ph : 01-780-456-9771   High Performance UNIX
Edmonton, Alberta    FAX: 01-780-456-9772   and Linux Solutions
T5X 1Y3, Canada      URL: http://www.harddata.com
-------------------------------------------------------[ Hard Data Ltd. ]----
|
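One way to see which of those modules are loaded is to look at a copy of the node's /proc/modules. The sketch below assumes such a copy is available as a plain file (how to fetch it from the node is left open), and the helper name missing_modules is made up. A feature compiled directly into the kernel will not appear in /proc/modules, so a name reported here is not necessarily a problem.

```shell
# Hypothetical helper: print each requested module that does not appear
# in a /proc/modules-format file (one module per line, name first).
# Usage: missing_modules <modules-file> <module>...
missing_modules() {
    modules_file=$1
    shift
    for mod in "$@"; do
        if ! awk -v m="$mod" '$1 == m { found = 1 } END { exit !found }' "$modules_file"; then
            echo "$mod"
        fi
    done
}
```

For example, `missing_modules /proc/modules lockd nfs sunrpc` on the machine in question prints only the names that are not currently loaded as modules.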
From: Wilton W. <ww...@ha...> - 2002-07-16 08:08:04
|
On Tue, 16 Jul 2002, Sadanand Kota wrote:

> Our scheduling technique involves selection of existing processes' pids
> based on certain criteria and moving (migrating) that process to a specified
> node (also identified by our scheduling algorithm). Do you think this is
> possible using BPROC? Hope you understand my question. Please let me know
> if there is any confusion. (As I understand it, bproc_execmove cannot be
> used in this situation.)

As far as I understand, BProc is just for a centrally managed pid space, and it does not have provisions for shared memory/shared user space, so I don't think this would be possible with the current BProc. Best ask Erik about this (er...@he...).

I believe that if your library has hooks to somehow save the program state and move it to a different node, you can use bproc to manage it. This would be tough.

- Wilton

----[ Wilton William Wong ]---------------------------------------------
11060-166 Avenue     Ph : 01-780-456-9771   High Performance UNIX
Edmonton, Alberta    FAX: 01-780-456-9772   and Linux Solutions
T5X 1Y3, Canada      URL: http://www.harddata.com
-------------------------------------------------------[ Hard Data Ltd. ]----
|