Setting up install servers lets your cluster scale. In this example I have 2 install servers, each serving 128 nodes. This is for SLES 10.1. If you're running another OS, mkinstall probably won't work for you. In that case, use the default in /opt/xcat/share/xcat/netboot/rh.
Also note that you'll need genimage.yum. If you use something else, change the script below to match. Treat it as a reference rather than copying it directly.
Here is the code I used to make a diskless install image. To use it, run:
./mkinstall <new image name>
This is not tested anywhere but on my cluster. YMMV.
mkinstall
#!/bin/bash
NAME=$1
GENIMAGEDIR=/opt/xcat/share/xcat/netboot/sles
GENIMAGE=$GENIMAGEDIR/genimage.yum
IMGROOT=/install/netboot/sles10.1/x86_64/$NAME/rootimg
MASTERIP=$(grep `hostname` /etc/hosts | head -1 | awk '{print $1}')
echo "master is: $MASTERIP"
sleep 5
#MASTERIP=10.4.2.30
ERROR=0

# create the package list if this image does not have one yet
if [ ! -f $GENIMAGEDIR/$NAME.pkglist ]
then
echo "Creating $GENIMAGEDIR/$NAME.pkglist"
cat > "$GENIMAGEDIR/$NAME.pkglist" <<EOF
atftp
bash
bind
bind-utils
bzip2
compat-libstdc++
dbus-1
dbus-1-glib
dhclient
dhcp-relay
dhcp
dhcpcd
expat
gcc
gcc-fortran
hal
iputils
kernel
kernel-smp
ksh
libgfortran
libxml2
make
nfs-utils
ntp
numactl
openssh
procps
psmisc
resmgr
rpm
rsh
stunnel
tar
tcsh
tk
vim
cron
vsftpd
wget
perl-DBD-Pg
apache2
xCATsn
conserver
expect
fping
ipmitool
perl-XML-Parser
perl-xCAT
postgresql
postgresql-server
syslinux
xCAT-client
xCAT-nbkernel-x86_64
xCAT-nbroot-core-x86_64
xCAT-nbroot-oss-x86_64
xCAT-server
dhcp-server
EOF
fi

# create the exclude list if this image does not have one yet
if [ ! -f $GENIMAGEDIR/$NAME.exlist ]
then
echo "Creating $GENIMAGEDIR/$NAME.exlist"
cat > "$GENIMAGEDIR/$NAME.exlist" <<EOF2
./usr/share/man*
./usr/share/locale*
./usr/share/i18n*
./var/cache/yum*
./usr/share/doc*
./usr/share/gnome*
./usr/share/zoneinfo*
./usr/share/cracklib*
./usr/share/info*
./usr/share/omf*
./usr/lib/locale*
./boot*
EOF2
fi

$GENIMAGE -i eth0 -n tg3,bnx2 -o sles10.1 -p $NAME

chroot $IMGROOT insserv boot.localnet
chroot $IMGROOT insserv haldaemon
chroot $IMGROOT insserv dbus
chroot $IMGROOT insserv network

# change syslog
echo "*.* @$MASTERIP" > $IMGROOT/etc/syslog.conf
chroot $IMGROOT insserv syslog
chroot $IMGROOT insserv portmap
chroot $IMGROOT insserv sshd

# NTP
echo "server $MASTERIP" >> $IMGROOT/etc/ntp.conf
cp /etc/localtime $IMGROOT/etc/
cp /etc/hosts $IMGROOT/etc/
chroot $IMGROOT insserv ntp
chroot $IMGROOT insserv apache2

# stop dhcp from starting up until xCAT does it.
chroot $IMGROOT chkconfig dhcpd off
chroot $IMGROOT chkconfig dhcrelay off

# copy head node's sysctl for kernel params
cp /etc/sysctl.conf $IMGROOT/etc/

# add more nfs threads
perl -pi -e 's/USE_KERNEL_NFSD_NUMBER="4"/USE_KERNEL_NFSD_NUMBER="64"/g' $IMGROOT/etc/sysconfig/nfs

# dhcp interface assignment
perl -pi -e 's/DHCPD_INTERFACE=""/DHCPD_INTERFACE="eth0"/g' $IMGROOT/etc/sysconfig/dhcpd

# NFS voodoo
echo '/install *(ro,no_root_squash,sync,fsid=13)' >>$IMGROOT/etc/exports

# FSTAB voodoo
perl -pi -e 's/tmpfs/#tmpfs/g' $IMGROOT/etc/fstab
echo "$NAME / tmpfs rw 0 1 " >>$IMGROOT/etc/fstab
#echo "none /tmp tmpfs defaults,size=10m 0 2" >>$IMGROOT/etc/fstab
#echo "none /var/tmp tmpfs defaults,size=10m 0 2" >>$IMGROOT/etc/fstab

# HTTP fix
mv $IMGROOT/etc/httpd/conf.d/xcat.conf $IMGROOT/etc/apache2/conf.d/

cp /etc/security/limits.conf $IMGROOT/etc/security
cp /usr/bin/strace $IMGROOT/usr/bin

echo "Packing Image..."
packimage -o sles10.1 -p $NAME -a x86_64

echo "Install Server Image: $NAME has been created. Please remember to edit"
#echo "the nodetype table: e.g.: tabedit nodetype"
#echo "then run: nodeset service netboot"
#echo "then reboot service nodes: rpower, or reboot"
#echo "make sure tabedit site has installloc set to /install"
#echo "you should also verify post install scripts."
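One thing to watch in mkinstall: MASTERIP comes from a plain grep of /etc/hosts, so a hostname that is a prefix of another one (say dnh01 and dnh010) can pick up the wrong line. Here is a rough sketch of a stricter lookup that only matches whole names. The lookup_ip helper and the addresses in the demo are made up for illustration; they are not part of xCAT or of my original script.

```shell
# lookup_ip HOSTSFILE NAME
# Print the address of the first line that lists NAME as a whole word.
# Hypothetical helper, not part of xCAT.
lookup_ip() {
    awk -v name="$2" '
        /^[[:space:]]*#/ { next }              # skip comment lines
        {
            for (i = 2; i <= NF; i++)          # field 1 is the address
                if ($i == name) { print $1; exit }
        }
    ' "$1"
}

# Demo against a throwaway hosts file with made-up addresses.
hosts=$(mktemp)
cat > "$hosts" <<'EOF'
10.1.2.130  dnh010
10.1.2.30   dnh01 dnh01.cluster
EOF
lookup_ip "$hosts" dnh01     # prints 10.1.2.30
rm -f "$hosts"
```

In mkinstall this could stand in for the grep pipeline, for example MASTERIP=$(lookup_ip /etc/hosts "$(hostname)"), assuming the head node's name appears in /etc/hosts exactly as hostname prints it.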
I'll show you the important tables here. The rest are just normal.
#node,postscripts,comments,disable
"service","servicenode,xcatclient,xcatserver,setupeth,restartxcat",,
As of xCAT 2.7, the servicenode postscript calls the xcatclient and xcatserver postscripts itself, so you don't need to list all three in the postscripts table. The table would look like the following:
#node,postscripts,comments,disable
"service","servicenode,setupeth,restartxcat",,
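If you are carrying an older postscripts list forward to xCAT 2.7 or later, trimming it is just string handling. A small sketch follows; drop_redundant is a name I made up, and the chtab line at the end is only my assumption of how you might write the value back, so check it against your xCAT version before using it.

```shell
# drop_redundant LIST
# Remove xcatclient and xcatserver from a comma-separated postscripts list.
# Hypothetical helper, not part of xCAT.
drop_redundant() {
    printf '%s\n' "$1" | tr ',' '\n' \
        | grep -v -x -e xcatclient -e xcatserver \
        | paste -s -d, -
}

old="servicenode,xcatclient,xcatserver,setupeth,restartxcat"
new=$(drop_redundant "$old")
echo "$new"    # servicenode,setupeth,restartxcat

# Assumption: on the management node the value could then be stored with
# something like this (not run here):
#   chtab node=service postscripts.postscripts="$new"
```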
All the scripts here are included with xCAT except setupeth and restartxcat. I added setupeth because I wanted to change the GbE cards. After doing that I needed to restart xCAT, so I made a script to do that too.
Here are the scripts:
setupeth
Here I have a 10GbE card that I needed to load the driver for. I also needed to change the interface ordering because I wanted my 10GbE card to be eth0. Here is how I did it. (Note that this is all on SLES 10.)
insmod /xcatpost/myri10ge.ko
ME=`hostname`
MMM=`grep $ME-bmm /etc/hosts | awk '{print $1}'`
TENGE=`grep $ME /etc/hosts | head -1 | awk '{print $1}'`

# flip the udev's around
cp /etc/udev/rules.d/30-net_persistent_names.rules /tmp/net_persistent_names.rules.ORIG
perl -pi -e 's/eth0/ethX/g' /etc/udev/rules.d/30-net_persistent_names.rules
perl -pi -e 's/eth2/eth0/g' /etc/udev/rules.d/30-net_persistent_names.rules
perl -pi -e 's/eth1/eth2/g' /etc/udev/rules.d/30-net_persistent_names.rules
perl -pi -e 's/ethX/eth1/g' /etc/udev/rules.d/30-net_persistent_names.rules

echo "BOOTPROTO='static'
IPADDR=$TENGE
NETMASK=255.255.0.0
STARTMODE=auto
MTU=1500
" >/etc/sysconfig/network/ifcfg-eth0

echo "BOOTPROTO='static'
IPADDR=$MMM
NETMASK=255.255.0.0
STARTMODE=auto
" >/etc/sysconfig/network/ifcfg-eth1

service network stop
rmmod bnx2
rmmod myri10ge
modprobe bnx2
insmod /xcatpost/myri10ge.ko
sleep 5
service network start
#sleep 10
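The four perl substitutions in setupeth rotate the interface names through a temporary placeholder (ethX), so eth0 becomes eth1, eth1 becomes eth2, and eth2 becomes eth0. If you want to see what that does before touching the real udev rules, here is a dry-run sketch of the same rotation using sed on standard input. The rotate_names function and the sample lines are made up for illustration and are not real udev rule syntax.

```shell
# rotate_names: apply the same four substitutions as setupeth, in the
# same order, to whatever arrives on standard input.
# Hypothetical helper for dry runs only.
rotate_names() {
    sed -e 's/eth0/ethX/g' \
        -e 's/eth2/eth0/g' \
        -e 's/eth1/eth2/g' \
        -e 's/ethX/eth1/g'
}

# Made-up sample lines standing in for the rules file.
printf '%s\n' 'card-a eth0' 'card-b eth1' 'card-c eth2' | rotate_names
# card-a eth1
# card-b eth2
# card-c eth0
```

To preview the real file without changing it, something like rotate_names < /etc/udev/rules.d/30-net_persistent_names.rules should do. Keep in mind these are plain substring matches, so on a box with ten or more interfaces a name like eth10 would be rewritten as well.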
restartxcat
This does just what the name says. Oh, and it restarts syslog too.
service syslog restart
service xcatd restart

"service",,"pxe",,"10.1.2.30",,,,,,,"eth0","eth0","10.1.2.30",,,,
"hgroup","dnh01","pxe","dnh01","dnh01",,,,,,,"eth0","eth0","dnh01",,,,
"igroup","dni01","pxe","dni01","dni01",,,,,,,"eth0","eth0","dni01",,,,
Here my two install servers are dnh01 and dni01. They service the hgroup and the igroup. Each of these groups has about 128 nodes. The nodes are assigned in the nodelist table.
The servicenode table looks like this:
#node,nameserver,dhcpserver,tftpserver,nfsserver,conserver,monserver,ldapserver,ntpserver,ftpserver,comments,disable
"dnh01","0","1","1","1","0","1","0","1","1",,
"dni01","0","1","1","1","0","1","0","1","1",,
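When you hand-edit table dumps like these it is easy to end up one comma short on a row. Here is a rough sketch of a check that compares every row's field count against the header. The check_fields name and the sample table are made up, and the split on commas is naive: it will miscount any table whose quoted values contain commas (the postscripts list, for example), so it only suits simple tables like this one.

```shell
# check_fields FILE
# Report rows whose comma-separated field count differs from the header.
# Returns non-zero if any row is off. Hypothetical helper, naive CSV split.
check_fields() {
    awk -F, '
        NR == 1 { want = NF; next }      # header row sets the expected count
        NF != want {
            printf "line %d: %d fields, expected %d\n", NR, NF, want
            bad = 1
        }
        END { exit bad + 0 }
    ' "$1"
}

# Demo on a made-up table where the last row is one comma short.
tbl=$(mktemp)
cat > "$tbl" <<'EOF'
#node,flag,comments,disable
"nodeA","1",,
"nodeB","1",
EOF
check_fields "$tbl" || echo "fix the rows listed above"
rm -f "$tbl"
```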
#node,os,arch,profile,nodetype,comments,disable
"service","sles10.1","x86_64","s10","osi",,
Notice that when I ran mkinstall above I ran it like:
mkinstall s10
So since that's the image I made, I want to boot to it.
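The os, arch, and profile values in the nodetype row line up with the directory mkinstall builds the image in (IMGROOT in the script). A small sketch to sanity-check that the image you point the service nodes at actually exists; image_root is a made-up helper that just mirrors the path used in mkinstall.

```shell
# image_root OS ARCH PROFILE
# Build the rootimg path the same way mkinstall sets IMGROOT.
# Hypothetical helper, not part of xCAT.
image_root() {
    printf '/install/netboot/%s/%s/%s/rootimg\n' "$1" "$2" "$3"
}

root=$(image_root sles10.1 x86_64 s10)
if [ -d "$root" ]; then
    echo "found $root"
else
    echo "missing $root (has mkinstall s10 been run?)"
fi
```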
The rest is all just done using normal xCAT commands. Run: nodeset <servicenodes> netboot.
The biggest problem I had when doing this initially was that the 10GbE card defaults to an MTU of 9000 and my switch wasn't set up to handle that. Once I set MTU=1500 in ifcfg-eth0 (as setupeth does), everything worked fine.
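If you suspect the same kind of MTU mismatch, one way to test a path is to ping with the largest payload that fits in a single unfragmented packet. For IPv4 that is the MTU minus 28 bytes (20 for the IP header, 8 for the ICMP header). A sketch of the arithmetic follows; max_ping_payload is a made-up helper, and the commented ping and sysfs lines assume a Linux iputils ping that supports -M do, so verify that on your distro.

```shell
# max_ping_payload MTU
# Largest ICMP echo payload that fits in one IPv4 packet of size MTU:
# MTU - 20 (IPv4 header) - 8 (ICMP header). Hypothetical helper.
max_ping_payload() {
    echo $(( $1 - 28 ))
}

max_ping_payload 9000    # 8972
max_ping_payload 1500    # 1472

# Assumptions, not run here:
#   cat /sys/class/net/eth0/mtu                            # current MTU of eth0
#   ping -M do -s "$(max_ping_payload 9000)" -c 3 dnh01    # fails if the path can't carry 9000
```

If the 8972-byte ping fails but the 1472-byte one gets through, something on the path (the switch, in my case) is not passing jumbo frames.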
Hopefully this helps someone else.