This document provides step-by-step instructions for setting up an example stateful or stateless cluster for a BladeCenter.
These steps prepare the Management Node for xCAT Installation.
Install one of the supported distros on the Management Node (MN). It is recommended that dhcp, bind (not bind-chroot), httpd, nfs-utils, and perl-XML-Parser be installed. (If they are not, the process of installing the xCAT software later will pull them in, assuming you follow the steps to make the distro RPMs available.)
Hardware requirements for your xCAT management node depend on your cluster size and configuration. An xCAT Management Node or Service Node that is dedicated to running xCAT to install a small cluster (< 16 nodes) should have at least 4-6 GB of memory; a medium-size cluster, 6-8 GB; and a large cluster, 16 GB or more. Keeping swapping to a minimum should be a goal.
For a list of supported OS and Hardware, refer to XCAT_Features.
To disable SELinux manually:
echo 0 > /selinux/enforce
sed -i 's/^SELINUX=.*$/SELINUX=disabled/' /etc/selinux/config
Note: you can skip this step in xCAT 2.8 and above, because xCAT does it automatically when it is installed.
The management node provides many services to the cluster nodes, but the firewall on the management node can interfere with this. If your cluster is on a secure network, the easiest thing to do is to disable the firewall on the Management Mode:
For RH:
service iptables stop
chkconfig iptables off
For SLES:
SuSEfirewall2 stop
If disabling the firewall completely isn't an option, configure iptables to allow the ports described in XCAT_Port_Usage.
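If you keep the firewall up, the rules might look like the following sketch. This is an illustrative subset only (3001/3002 are the default xcatdport/xcatiport values shown later in the site table; the authoritative list, which also includes NFS and rsync, is in XCAT_Port_Usage), it assumes eth1 is the cluster-facing NIC, and the helper function name is hypothetical:

```shell
# Hypothetical helper: open the main xCAT service ports on the cluster-facing NIC.
# Port list is illustrative -- see XCAT_Port_Usage for the full set.
open_xcat_ports() {
    nic="$1"
    # xcatd command/install ports (3001/3002), DNS (53), TFTP (69), HTTP (80)
    for p in 3001 3002 53 69 80; do
        iptables -I INPUT -i "$nic" -p tcp --dport "$p" -j ACCEPT
        iptables -I INPUT -i "$nic" -p udp --dport "$p" -j ACCEPT
    done
    # DHCP server port
    iptables -I INPUT -i "$nic" -p udp --dport 67 -j ACCEPT
}
```

Run `open_xcat_ports eth1` as root, then persist the rules (e.g. `service iptables save` on RH).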
The xCAT installation process will scan and populate certain settings from the running configuration. Having the networks configured ahead of time will aid in correct configuration. (After installation of xCAT, all the networks in the cluster must be defined in the xCAT networks table before starting to install cluster nodes.) When xCAT is installed on the Management Node, it will automatically run makenetworks to create an entry in the networks table for each of the networks the management node is on. Additional network configurations can be added to the xCAT networks table manually later if needed.
The networks that are typically used in a cluster are:
In our example, we only focus on the management network:
For a sample Networks Setup, see the following example: Setting_Up_a_Linux_xCAT_Mgmt_Node#Appendix_A:_Network_Table_Setup_Example
Configure the cluster facing NIC(s) on the management node.
For example edit the following files:
On RH: /etc/sysconfig/network-scripts/ifcfg-eth1
On SLES: /etc/sysconfig/network/ifcfg-eth1

DEVICE=eth1
ONBOOT=yes
BOOTPROTO=static
IPADDR=172.20.0.1
NETMASK=255.240.0.0
If the public facing NIC on your management node is configured by DHCP, you may want to set '''PEERDNS=no''' in the NIC's config file to prevent the dhclient from rewriting /etc/resolv.conf. This would be important if you will be configuring DNS on the management node (via makedns - covered later in this doc) and want the management node itself to use that DNS. In this case, set '''PEERDNS=no''' in each /etc/sysconfig/network-scripts/ifcfg-* file that has '''BOOTPROTO=dhcp'''.
On the other hand, if you '''want''' dhclient to configure /etc/resolv.conf on your management node, then don't set PEERDNS=no in the NIC config files.
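For example, a DHCP-configured public NIC file might contain (device name and values illustrative):

```shell
# /etc/sysconfig/network-scripts/ifcfg-eth0 -- public-facing NIC (illustrative)
DEVICE=eth0
BOOTPROTO=dhcp
ONBOOT=yes
PEERDNS=no    # keep dhclient from rewriting /etc/resolv.conf
```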
The xCAT management node hostname should be configured before installing xCAT on the management node. The hostname or its resolvable ip address will be used as the default master name in the xCAT site table, when installed. This name needs to be the one that will resolve to the cluster-facing NIC. Short hostnames (no domain) are the norm for the management node and all cluster nodes. Node names should never end in "-enx" for any x.
To set the hostname, edit /etc/sysconfig/network to contain, for example:
HOSTNAME=mgt
If you run the hostname command, it should return the same:

# hostname
mgt
Ensure that at least the management node is in /etc/hosts:
127.0.0.1 localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
172.20.0.1 mgt mgt.cluster
When using the management node to install compute nodes, the timezone configuration on the management node will be inherited by the compute nodes, so it is recommended to set up the correct timezone on the management node. To do this on RHEL, see http://www.redhat.com/advice/tips/timezone.html. The process is similar, but not identical, for SLES. (Just google it.)

You can also optionally set up the MN as an NTP server for the cluster. See Setting_up_NTP_in_xCAT.
It is not required, but recommended, that you create a separate file system for the /install directory on the Management Node. It should be at least 30 GB to allow space for several install images.
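One way to do that, sketched here assuming LVM with spare space in a volume group named rootvg (substitute your own volume group, size, and filesystem type; the helper name is hypothetical):

```shell
# Hypothetical helper: carve out a dedicated logical volume and mount it at /install.
make_install_fs() {
    vg="$1"; size="${2:-30G}"
    lvcreate -L "$size" -n install "$vg"
    mkfs -t ext4 "/dev/$vg/install"
    mkdir -p /install
    mount "/dev/$vg/install" /install
    # print the line to append to /etc/fstab so the mount persists across reboots
    echo "/dev/$vg/install /install ext4 defaults 0 0"
}
```

Run `make_install_fs rootvg 40G` as root and append the printed line to /etc/fstab.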
Note: in xCAT 2.8 and above, you do not need to restart the management node. Simply restart the cluster-facing NIC, for example: ifdown eth1; ifup eth1
For xCAT 2.7 and below, though it is possible to restart the correct services for all settings, the simplest step would be to reboot the Management Node at this point.
There are two options to get the installation source of xCAT:
Pick either one, but not both.
Note:
1. Because the packages "net-snmp-libs" and "net-snmp-agent-libs" (required by "net-snmp-perl" in xcat-dep) were updated in the Red Hat 7.1 ISO, an xcat-dep branch for Red Hat 7.0 was created. Use the repo under "xcat-dep/rh7.0" for Red Hat 7.0 and "xcat-dep/rh7" for other Red Hat 7 releases.
2. CentOS and Scientific Linux can use the same xcat-dep configuration as RHEL. For example, CentOS 7.0 can use xcat-dep/rh7.0/x86_64 as the xcat-dep repo.
If you are not able to, or do not want to, use the live internet repository, choose this option.
Go to the Download xCAT site and download the level of xCAT tarball you desire. Go to the xCAT Dependencies Download page and download the latest snap of the xCAT dependency tarball. (The latest snap of the xCAT dependency tarball will work with any version of xCAT.)
Copy the files to the Management Node (MN) and untar them:
mkdir /root/xcat2
cd /root/xcat2
tar jxvf xcat-core-2.*.tar.bz2    # or core-rpms-snap.tar.bz2
tar jxvf xcat-dep-*.tar.bz2
Point yum/zypper to the local repositories for xCAT and its dependencies:
[RH]:
cd /root/xcat2/xcat-dep/<release>/<arch>
./mklocalrepo.sh
cd /root/xcat2/xcat-core
./mklocalrepo.sh
[SLES 11, SLES12]:
zypper ar file:///root/xcat2/xcat-dep/<os>/<arch> xCAT-dep
zypper ar file:///root/xcat2/xcat-core xcat-core
[SLES 10.2+]:
zypper sa file:///root/xcat2/xcat-dep/sles10/<arch> xCAT-dep
zypper sa file:///root/xcat2/xcat-core xcat-core
When using the live internet repository, you need to first make sure that name resolution on your management node is at least set up enough to resolve sourceforge.net. Then make sure the correct repo files are in /etc/yum.repos.d.
You could use the official release, the latest snapshot build, or the development build, based on your requirements.
[RH]:
wget http://sourceforge.net/projects/xcat/files/yum/<xCAT-release>/xcat-core/xCAT-core.repo
for example:
cd /etc/yum.repos.d
wget http://sourceforge.net/projects/xcat/files/yum/2.8/xcat-core/xCAT-core.repo
[SLES11, SLES12]:
zypper ar -t rpm-md http://sourceforge.net/projects/xcat/files/yum/<xCAT-release>/xcat-core xCAT-core
for example:
zypper ar -t rpm-md http://sourceforge.net/projects/xcat/files/yum/2.8/xcat-core xCAT-core
[SLES10.2+]:
zypper sa http://sourceforge.net/projects/xcat/files/yum/<xCAT-release>/xcat-core xCAT-core
for example:
zypper sa http://sourceforge.net/projects/xcat/files/yum/2.8/xcat-core xCAT-core
[RH]:
wget http://sourceforge.net/projects/xcat/files/yum/<xCAT-release>/core-snap/xCAT-core.repo
for example:
cd /etc/yum.repos.d
wget http://sourceforge.net/projects/xcat/files/yum/2.8/core-snap/xCAT-core.repo
[SLES11, SLES12]:
zypper ar -t rpm-md http://sourceforge.net/projects/xcat/files/yum/<xCAT-release>/core-snap xCAT-core
for example:
zypper ar -t rpm-md http://sourceforge.net/projects/xcat/files/yum/2.8/core-snap xCAT-core
[SLES10.2+]:
zypper sa http://sourceforge.net/projects/xcat/files/yum/<xCAT-release>/core-snap xCAT-core
for example:
zypper sa http://sourceforge.net/projects/xcat/files/yum/2.8/core-snap xCAT-core
[RH]:
wget http://sourceforge.net/projects/xcat/files/yum/devel/core-snap/xCAT-core.repo
[SLES11, SLES12]:
zypper ar -t rpm-md http://sourceforge.net/projects/xcat/files/yum/devel/core-snap xCAT-core
[SLES10.2+]:
zypper sa http://sourceforge.net/projects/xcat/files/yum/devel/core-snap xCAT-core
To get the repo file for xCAT-dep packages:
[RH]:
wget http://sourceforge.net/projects/xcat/files/yum/xcat-dep/<OS-release>/<arch>/xCAT-dep.repo
for example:
wget http://sourceforge.net/projects/xcat/files/yum/xcat-dep/rh6/x86_64/xCAT-dep.repo
[SLES11, SLES12]:
zypper ar -t rpm-md http://sourceforge.net/projects/xcat/files/yum/xcat-dep/<OS-release>/<arch> xCAT-dep
for example:
zypper ar -t rpm-md http://sourceforge.net/projects/xcat/files/yum/xcat-dep/sles11/x86_64 xCAT-dep
[SLES10.2+]:
zypper sa http://sourceforge.net/projects/xcat/files/yum/xcat-dep/<OS-release>/<arch> xCAT-dep
for example:
zypper sa http://sourceforge.net/projects/xcat/files/yum/xcat-dep/sles10/x86_64 xCAT-dep
xCAT depends on several packages that come from the Linux distro. Follow this section to create the repository of the OS on the Management Node.
See the following documentation:
Setting Up the OS Repository on the Mgmt Node
[RH]: Use yum to install xCAT and all the dependencies:
yum clean metadata
or
yum clean all
then
yum install xCAT
[SLES]: Use zypper to install xCAT and all the dependencies:
zypper install xCAT
Note: sysclone is not supported on SLES.
In xCAT 2.8.2 and above, xCAT supports cloning new nodes from a pre-installed/pre-configured node; this provisioning method is called sysclone. It leverages the open source tool systemimager, and xCAT ships the required systemimager packages with xcat-dep. If you will be installing stateful (diskful) nodes using the sysclone provmethod, you need to install systemimager and all its dependencies:
[RH]: Use yum to install systemimager and all the dependencies:
yum install systemimager-server
[SLES]: Use zypper to install systemimager and all the dependencies:
zypper install systemimager-server
Add xCAT commands to the path by running the following:
source /etc/profile.d/xcat.sh
Check that the database is initialized:
tabdump site
The output should be similar to the following:
key,value,comments,disable
"xcatdport","3001",,
"xcatiport","3002",,
"tftpdir","/tftpboot",,
"installdir","/install",,
.
.
.
If the tabdump command does not work, see Debugging xCAT Problems.
If you encounter a problem where the xCAT daemon fails to function, you can try restarting it.
[If the xCAT daemon is running on a non-systemd Linux OS, such as RHEL 6.x or SLES 11.x:]
service xcatd restart
[If the xCAT daemon is running on a systemd-enabled Linux OS, such as RHEL 7.x or SLES 12.x, or on AIX:]
restartxcatd
Refer to the restartxcatd documentation for why it is needed on systemd-enabled systems.
If you want to restart the xCAT daemon without reconfiguring the network services on the management node (this restarts the daemon quickly on a large cluster):
[If the xCAT daemon is running on a non-systemd Linux OS, such as RHEL 6.x or SLES 11.x:]
service xcatd reload
[If the xCAT daemon is running on a systemd-enabled Linux OS, such as RHEL 7.x or SLES 12.x, or on AIX:]
restartxcatd -r
If you added a new plugin, or changed the handled_commands subroutine of an existing plugin, rescan the plugins:
rescanplugins
If you need to update the xCAT RPMs later:
To update xCAT:
[RH]:
yum clean metadata    # or you may need to use: yum clean all
yum update '*xCAT*'
[SLES]:
zypper refresh
zypper update -t package '*xCAT*'
Note: this will not apply updates that may have been made to some of the xCAT deps packages. (If there are brand new deps packages, they will get installed.) In most cases, this is ok, but if you want to make all updates for xCAT rpms and deps, run the following command. This command will also pick up additional OS updates.
[RH]:
yum update
[SLES]:
zypper refresh
zypper update
Note: Sometimes zypper refresh fails to refresh zypper local repository. Try to run zypper clean to clean local metadata, then use zypper refresh.
Note: If you are updating from xCAT 2.7.x (or earlier) to xCAT 2.8 or later, there are some additional migration steps that need to be considered:
All networks in the cluster must be defined in the networks table. When xCAT was installed, it ran makenetworks, which created an entry in this table for each of the networks the management node is connected to. Now is the time to add to the networks table any other networks in the cluster, or update existing networks in the table.
For a sample Networks Setup, see the following example: Setting_Up_a_Linux_xCAT_Mgmt_Node/#appendix-a-network-table-setup-example.
Set the password in the passwd table that will be assigned to root when a node is installed. You can modify this table using tabedit. To change the default password for root on the nodes, change the "system" line. To change the password used for the AMMs, change the "blade" line.
tabedit passwd
#key,username,password,cryptmethod,comments,disable
"system","root","cluster",,,
"blade","USERID","PASSW0RD",,,
To get the hostname/IP pairs copied from /etc/hosts to the DNS on the MN:

Set site.forwarders to your site-wide DNS servers, so names outside the cluster can still be resolved:

chdef -t site forwarders=1.2.3.4,1.2.5.6

Ensure /etc/resolv.conf on the management node points at the MN's own DNS, for example:

search cluster
nameserver 172.20.0.1
Run makedns
makedns -n
For more information about name resolution in an xCAT Cluster, see [Cluster_Name_Resolution].
You usually don't want your DHCP server listening on your public (site) network, so set site.dhcpinterfaces to your MN's cluster facing NICs. For example:
chdef -t site dhcpinterfaces=eth1
Then run the following to set up the network stanza part of the DHCP configuration (including the dynamic range):
makedhcp -n
The IP/MAC mappings for the nodes will be added to DHCP automatically as the nodes are discovered.
Nothing to do here - the TFTP server is set up by xCAT during the Management Node install.
makeconservercf
xCAT requires the Advanced Management Module (AMM); it does not support the older MMs.
nodeadd amm1-amm5 groups=mm,all
nodeadd sw1-sw5 groups=nortel,switch,all
chdef -t group mm mgt=blade hwtype=mm nodetype=mm password=<mypw> mpa='/\z//' ip='|10.20.0.($1)|'
Notes:
For more info see [Listing_and_Modifying_the_Database].
Verify the attributes are set to what you want:
lsdef mm -l
makehosts mm
makedns mm
Use rspconfig to configure the network settings on the MM and for the switch module.
Note: Using rspconfig to set up the MMs' network is an optional step; it is necessary only when the MMs' network is not already set up correctly.
Set up all of the MMs' network configuration based on what's in the xCAT DB:
rspconfig mm network='*'
Alternatively, you can configure each MM individually and manually:
rspconfig amm1 network=10.20.0.1,amm1,10.20.0.254,255.255.255.0
...
Note: the network parameters in the command above are: ip,host,gateway,netmask
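The per-AMM commands above follow a fixed pattern, so they can also be driven from a small loop. A sketch (function names are hypothetical; the 10.20.0.x addresses and 10.20.0.254 gateway are the example values used above):

```shell
# Build the ip,host,gateway,netmask argument for AMM number $1.
amm_net_args() {
    echo "10.20.0.$1,amm$1,10.20.0.254,255.255.255.0"
}

# Configure amm1-amm5 in turn.
configure_amm_network() {
    for i in 1 2 3 4 5; do
        rspconfig "amm$i" network="$(amm_net_args "$i")"
    done
}
```

Run `configure_amm_network` on the management node after the mm node definitions exist.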
Set up the switch module network information for each switch:
rspconfig amm1 swnet=10.20.0.101,10.20.0.254,255.255.255.0
...
Note: the IP addresses in the commands above are: ip,gateway,netmask
After setting the network settings of the MM and switch module, enable SNMP and password-less ssh between the xCAT management node and the AMM:
rspconfig mm snmpcfg=enable sshcfg=enable
rspconfig mm pd1=redwoperf pd2=redwoperf
rpower mm reset
Test the ssh set up to see if password-less ssh is enabled:
psh -l USERID mm "info -T mm[1]"
For SOL to work best, telnet to each Nortel switch (the default password is "admin") and run:
/cfg/port int1/gig/auto off
.
.
/cfg/port int14/gig/auto off
cd
apply
save
Do this for each port (i.e., int2, int3, etc.)
Be sure password-less ssh is enabled between the Management Node and the AMM. See above.
Set the password on the MM (must be the same as in the xCAT DB):
rspconfig mm USERID=<mypw>
Alternatively, if you want to set it manually, then for each AMM:
ssh -l USERID amm1 "users -T mm[1]"
amm:
amm: system> users -T mm[1]
amm: 1. USERID
amm: 3 active session(s)
amm: Password compliant
amm: Account active
amm: Role:supervisor
amm: Blades:1|2|3|4|5|6|7|8|9|10|11|12|13|14
amm: Chassis:1
amm: Modules:1|2|3|4|5|6|7|8|9|10
amm: Number of SSH public keys installed for this user: 3
amm: 2. <not used>
amm: 3. <not used>
amm: 4. <not used>
.
.
ssh -l USERID amm1 "users -T mm[1] **-1** -op PASSW0RD -p PASSW1RD"
amm:
amm: system> users -T mm[1] -1 -op PASSW0RD -p PASSW1RD
amm: OK
Verify the hardware control commands are working:
rvitals amm all
returns:
amm: Blower/Fan 1: 74% RPM Good state
amm: Blower/Fan 2: 74% RPM Good state
amm: Blower/Fan 3: % RPM Unknown state
amm: Blower/Fan 4: % RPM Unknown state
.
.
Updating AMM firmware can be done through the web GUI or in parallel with psh. To do it in parallel using psh:
Download Firmware from http://www-304.ibm.com/systems/support/supportsite.wss/docdisplay?brandind=5000008&lndocid=MIGR-5073383 to the management node:
cd /tftpboot/
unzip ibm_fw_amm_bpet36k_anyos_noarch.zip
Perform the update:
psh -l USERID mm "update -i 10.20.0.200 -l CNETCMUS.pkt -v -T mm[1]"
Note: 10.20.0.200 should be the IP address of the management node on the service network.
Reset the AMMs; they will take a few minutes to come back online:
psh -l USERID mm "reset -T mm[1]"
You can display the current version of firmware with:
psh -l USERID mm "info -T mm[1]" | grep "Build ID"
Note: For more information about the node attributes used in this section, see the node(7) man page: http://xcat.sourceforge.net/man7/node.7.html
nodeadd blade01-blade50 groups=blade,compute,all
chdef -t group blade mgt=blade cons=blade hwtype=blade nodetype=blade serialspeed=115200 serialport=1 netboot=xnba tftpserver=10.20.0.200
chdef -t group blade mpa='|amm(($1-1)/14+1)|' id='|(($1-1)%14+1)|' ip='|10.0.0.($1+0)|'
Notes:
If you are using JS blades, do not set serialspeed or serialport.
Verify the attributes are set to what you want by listing them for one blade:
lsdef blade20
makehosts blade
makedns blade
makeconservercf
rpower blade stat
Test rcons for a few nodes:
rcons blade01
If you have problems with conserver:
To check the BIOS version, check your docs (for example, for an HS21 blade): http://download.boulder.ibm.com/ibmdl/pub/systems/support/system_x_cluster/hs21-cmos-settings-v1.1.htm
If you want the MTM and serial numbers of the blades in the xCAT DB, you can run:
rscan mm -u
Collect the blade MACs from the MMs:
getmacs blade
Note: getmacs gets the MAC address of the first network interface; to get the MAC address of other network interfaces, use getmacs <nodename> -i eth<x>.
To verify the mac addresses are set in the DB:
lsdef blade -i mac -c
makedhcp blade
rbootseq <nodename> net,hd
This section describes deploying stateful nodes.
There are two options to install your nodes as stateful (diskful) nodes:
This section describes the process for setting up xCAT to install nodes; that is, how to install an OS on the disk of each node.

The copycds command copies the contents of the Linux distro media to /install/<os>/<arch> so that it is available for installing nodes or creating diskless images.
copycds <path>/RHEL6.2-*-Server-x86_64-DVD1.iso
copycds /dev/dvd # or whatever the device name of your dvd drive is
Tip: if this is the same distro version as your management node, create a .repo file in /etc/yum.repos.d with content similar to:
[local-rhels6.2-x86_64]
name=xCAT local rhels 6.2
baseurl=file:/install/rhels6.2/x86_64
enabled=1
gpgcheck=0
This way, if you need additional RPMs on your MN at a later time, you can simply install them using yum. Or if you are installing other software on your MN that requires additional RPMs from the distro, they will automatically be found and installed.

The copycds command also automatically creates several osimage definitions in the database that can be used for node deployment. To see them:
lsdef -t osimage # see the list of osimages
lsdef -t osimage <osimage-name> # see the attributes of a particular osimage
From the list above, select the osimage for your distro, architecture, provisioning method (in this case install), and profile (compute, service, etc.). Although it is optional, we recommend you make a copy of the osimage, changing its name to a simpler name. For example:
lsdef -t osimage -z rhels6.2-x86_64-install-compute | sed 's/^[^ ]\+:/mycomputeimage:/' | mkdef -z
This displays the osimage "rhels6.2-x86_64-install-compute" in a format that can be used as input to mkdef, but on the way there it uses sed to modify the name of the object to "mycomputeimage".
Initially, this osimage object points to templates, pkglists, etc. that are shipped by default with xCAT. And some attributes, for example otherpkglist and synclists, won't have any value at all because xCAT doesn't ship a default file for that. You can now change/fill in any osimage attributes that you want. A general convention is that if you are modifying one of the default files that an osimage attribute points to, copy it into /install/custom and have your osimage point to it there. (If you modify the copy under /opt/xcat directly, it will be over-written the next time you upgrade xCAT.)
But for now, we will use the default values in the osimage definition and continue on. (If you really want to see examples of modifying/creating the pkglist, template, otherpkgs pkglist, and sync file list, see the section [Using_Provmethod=osimagename]. Most of the examples there can be used for stateful nodes too.)
Create a postscript file called (for example) updatekernel:
vi /install/postscripts/updatekernel
Add the following lines to the file:
#!/bin/bash
rpm -Uvh data/kernel-*rpm
Change the permission on the file:
chmod 755 /install/postscripts/updatekernel
Make the new kernel RPM available to the postscript:
mkdir /install/postscripts/data
cp <kernel> /install/postscripts/data
Add the postscript to your compute nodes:
chdef -p -t group compute postscripts=updatekernel
Now when you install your nodes (done in a step below), it will also update the kernel.
Alternatively, you could install your nodes with the stock kernel and update them afterward using updatenode and the same postscript above; in this case, you need to reboot the nodes for the new kernel to take effect.
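That alternative path might look like the following sketch, assuming the updatekernel postscript created above and the compute group used in this document (updatenode -P runs the named postscript on the nodes; rpower boot reboots them so the new kernel takes effect):

```shell
# Run the updatekernel postscript on the already-installed compute nodes,
# then reboot them to pick up the new kernel.
updatenode compute -P updatekernel
rpower compute boot
```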
By default, xCAT will install the operating system on the first disk and with default partitions layout in the node. However, you may choose to customize the disk partitioning during the install process and define a specific disk layout. You can do this in one of two ways:
You could create a customized partition file, say /install/custom/my-partitions, that contains the disk partitioning definition, and then associate the partition file with the osimage. The nodeset command will insert the contents of this file directly into the generated autoinst configuration file used by the OS installer.

The partition file must follow the partitioning syntax of the installer (e.g., Kickstart for Red Hat, AutoYaST for SLES, Preseed for Ubuntu).
Here are examples of the partition file:
RedHat Standard Partitions for IBM Power machines
# Uncomment this PReP line for IBM Power servers
#part None --fstype "PPC PReP Boot" --size 8 --ondisk sda
# Uncomment this efi line for x86_64 servers
#part /boot/efi --size 50 --ondisk /dev/sda --fstype efi
part /boot --size 256 --fstype ext4
part swap --recommended --ondisk sda
part / --size 1 --grow --fstype ext4 --ondisk sda
RedHat LVM Partitions
# Uncomment this PReP line for IBM Power servers
#part None --fstype "PPC PReP Boot" --ondisk /dev/sda --size 8
# Uncomment this efi line for x86_64 servers
#part /boot/efi --size 50 --ondisk /dev/sda --fstype efi
part /boot --size 256 --fstype ext4 --ondisk /dev/sda
part swap --recommended --ondisk /dev/sda
part pv.01 --size 1 --grow --ondisk /dev/sda
volgroup system pv.01
logvol / --vgname=system --name=root --size 1 --grow --fstype ext4
RedHat RAID 1 configuration
See Use_RAID1_In_xCAT_Cluster for more details.
x86_64 SLES Standard Partitions
<drive>
<device>/dev/sda</device>
<initialize config:type="boolean">true</initialize>
<use>all</use>
<partitions config:type="list">
<partition>
<create config:type="boolean">true</create>
<filesystem config:type="symbol">swap</filesystem>
<format config:type="boolean">true</format>
<mount>swap</mount>
<mountby config:type="symbol">path</mountby>
<partition_nr config:type="integer">1</partition_nr>
<partition_type>primary</partition_type>
<size>32G</size>
</partition>
<partition>
<create config:type="boolean">true</create>
<filesystem config:type="symbol">ext3</filesystem>
<format config:type="boolean">true</format>
<mount>/</mount>
<mountby config:type="symbol">path</mountby>
<partition_nr config:type="integer">2</partition_nr>
<partition_type>primary</partition_type>
<size>64G</size>
</partition>
</partitions>
</drive>
x86_64 SLES LVM Partitions
<drive>
<device>/dev/sda</device>
<initialize config:type="boolean">true</initialize>
<partitions config:type="list">
<partition>
<create config:type="boolean">true</create>
<crypt_fs config:type="boolean">false</crypt_fs>
<filesystem config:type="symbol">ext3</filesystem>
<format config:type="boolean">true</format>
<loop_fs config:type="boolean">false</loop_fs>
<mountby config:type="symbol">device</mountby>
<partition_id config:type="integer">65</partition_id>
<partition_nr config:type="integer">1</partition_nr>
<pool config:type="boolean">false</pool>
<raid_options/>
<resize config:type="boolean">false</resize>
<size>8M</size>
<stripes config:type="integer">1</stripes>
<stripesize config:type="integer">4</stripesize>
<subvolumes config:type="list"/>
</partition>
<partition>
<create config:type="boolean">true</create>
<crypt_fs config:type="boolean">false</crypt_fs>
<filesystem config:type="symbol">ext3</filesystem>
<format config:type="boolean">true</format>
<loop_fs config:type="boolean">false</loop_fs>
<mount>/boot</mount>
<mountby config:type="symbol">device</mountby>
<partition_id config:type="integer">131</partition_id>
<partition_nr config:type="integer">2</partition_nr>
<pool config:type="boolean">false</pool>
<raid_options/>
<resize config:type="boolean">false</resize>
<size>256M</size>
<stripes config:type="integer">1</stripes>
<stripesize config:type="integer">4</stripesize>
<subvolumes config:type="list"/>
</partition>
<partition>
<create config:type="boolean">true</create>
<crypt_fs config:type="boolean">false</crypt_fs>
<format config:type="boolean">false</format>
<loop_fs config:type="boolean">false</loop_fs>
<lvm_group>vg0</lvm_group>
<mountby config:type="symbol">device</mountby>
<partition_id config:type="integer">142</partition_id>
<partition_nr config:type="integer">3</partition_nr>
<pool config:type="boolean">false</pool>
<raid_options/>
<resize config:type="boolean">false</resize>
<size>max</size>
<stripes config:type="integer">1</stripes>
<stripesize config:type="integer">4</stripesize>
<subvolumes config:type="list"/>
</partition>
</partitions>
<pesize></pesize>
<type config:type="symbol">CT_DISK</type>
<use>all</use>
</drive>
<drive>
<device>/dev/vg0</device>
<initialize config:type="boolean">true</initialize>
<partitions config:type="list">
<partition>
<create config:type="boolean">true</create>
<crypt_fs config:type="boolean">false</crypt_fs>
<filesystem config:type="symbol">swap</filesystem>
<format config:type="boolean">true</format>
<loop_fs config:type="boolean">false</loop_fs>
<lv_name>swap</lv_name>
<mount>swap</mount>
<mountby config:type="symbol">device</mountby>
<partition_id config:type="integer">130</partition_id>
<partition_nr config:type="integer">5</partition_nr>
<pool config:type="boolean">false</pool>
<raid_options/>
<resize config:type="boolean">false</resize>
<size>auto</size>
<stripes config:type="integer">1</stripes>
<stripesize config:type="integer">4</stripesize>
<subvolumes config:type="list"/>
</partition>
<partition>
<create config:type="boolean">true</create>
<crypt_fs config:type="boolean">false</crypt_fs>
<filesystem config:type="symbol">ext3</filesystem>
<format config:type="boolean">true</format>
<loop_fs config:type="boolean">false</loop_fs>
<lv_name>root</lv_name>
<mount>/</mount>
<mountby config:type="symbol">device</mountby>
<partition_id config:type="integer">131</partition_id>
<partition_nr config:type="integer">1</partition_nr>
<pool config:type="boolean">false</pool>
<raid_options/>
<resize config:type="boolean">false</resize>
<size>max</size>
<stripes config:type="integer">1</stripes>
<stripesize config:type="integer">4</stripesize>
<subvolumes config:type="list"/>
</partition>
</partitions>
<pesize></pesize>
<type config:type="symbol">CT_LVM</type>
<use>all</use>
</drive>
ppc64 SLES Standard Partitions
<drive>
<device>/dev/sda</device>
<initialize config:type="boolean">true</initialize>
<partitions config:type="list">
<partition>
<create config:type="boolean">true</create>
<crypt_fs config:type="boolean">false</crypt_fs>
<filesystem config:type="symbol">ext3</filesystem>
<format config:type="boolean">false</format>
<loop_fs config:type="boolean">false</loop_fs>
<mountby config:type="symbol">device</mountby>
<partition_id config:type="integer">65</partition_id>
<partition_nr config:type="integer">1</partition_nr>
<resize config:type="boolean">false</resize>
<size>auto</size>
</partition>
<partition>
<create config:type="boolean">true</create>
<crypt_fs config:type="boolean">false</crypt_fs>
<filesystem config:type="symbol">swap</filesystem>
<format config:type="boolean">true</format>
<fstopt>defaults</fstopt>
<loop_fs config:type="boolean">false</loop_fs>
<mount>swap</mount>
<mountby config:type="symbol">id</mountby>
<partition_id config:type="integer">130</partition_id>
<partition_nr config:type="integer">2</partition_nr>
<resize config:type="boolean">false</resize>
<size>auto</size>
</partition>
<partition>
<create config:type="boolean">true</create>
<crypt_fs config:type="boolean">false</crypt_fs>
<filesystem config:type="symbol">ext3</filesystem>
<format config:type="boolean">true</format>
<fstopt>acl,user_xattr</fstopt>
<loop_fs config:type="boolean">false</loop_fs>
<mount>/</mount>
<mountby config:type="symbol">id</mountby>
<partition_id config:type="integer">131</partition_id>
<partition_nr config:type="integer">3</partition_nr>
<resize config:type="boolean">false</resize>
<size>max</size>
</partition>
</partitions>
<pesize></pesize>
<type config:type="symbol">CT_DISK</type>
<use>all</use>
</drive>
SLES RAID 1 configuration
See Use_RAID1_In_xCAT_Cluster for more details.
Ubuntu standard partition configuration on PPC64le
8 1 32 prep
$primary{ }
$bootable{ }
method{ prep } .
256 256 512 ext3
$primary{ }
method{ format }
format{ }
use_filesystem{ }
filesystem{ ext3 }
mountpoint{ /boot } .
64 512 300% linux-swap
method{ swap }
format{ } .
512 1024 4096 ext3
$primary{ }
method{ format }
format{ }
use_filesystem{ }
filesystem{ ext4 }
mountpoint{ / } .
100 10000 1000000000 ext3
method{ format }
format{ }
use_filesystem{ }
filesystem{ ext4 }
mountpoint{ /home } .
Ubuntu standard partition configuration on X86_64
256 256 512 vfat
$primary{ }
method{ format }
format{ }
use_filesystem{ }
filesystem{ vfat }
mountpoint{ /boot/efi } .
256 256 512 ext3
$primary{ }
method{ format }
format{ }
use_filesystem{ }
filesystem{ ext3 }
mountpoint{ /boot } .
64 512 300% linux-swap
method{ swap }
format{ } .
512 1024 4096 ext3
$primary{ }
method{ format }
format{ }
use_filesystem{ }
filesystem{ ext4 }
mountpoint{ / } .
100 10000 1000000000 ext3
method{ format }
format{ }
use_filesystem{ }
filesystem{ ext4 }
mountpoint{ /home } .
If none of these examples fits your cluster, refer to the Kickstart, AutoYaST, or Preseed documentation to write your own partition layout. Red Hat and SuSE also provide tools that can help generate kickstart/autoyast templates, whose partition sections you can use as a reference for the partition layout information:
RedHat:
SLES
Ubuntu
chdef -t osimage <osimagename> partitionfile=/install/custom/my-partitions
nodeset <nodename> osimage=<osimage>
For Redhat, when nodeset runs and generates the /install/autoinst file for a node, it replaces the #XCAT_PARTITION_START#...#XCAT_PARTITION_END# directives in your osimage template with the contents of your custom partitionfile.
For Ubuntu, when nodeset runs and generates the /install/autoinst file for a node, it generates a script that writes the partition configuration to /tmp/partitionfile; this script replaces the #XCA_PARTMAN_RECIPE_SCRIPT# directive in /install/autoinst/<node>.pre.
Create a shell script that will be run on the node during the install process to dynamically create the disk partitioning definition. This script runs during the OS installer's %pre script on Redhat (or preseed/early_command on Ubuntu) and must write the correct partitioning definition into the file /tmp/partitionfile on the node.
The purpose of the partition script is to create the /tmp/partitionfile that will be inserted into the kickstart/autoyast/preseed template. The script can include complex logic, such as selecting which disk to install to or even configuring RAID.
Note: the partition script feature is not thoroughly tested on SLES and there might be problems; use this feature on SLES at your own risk.
Here is an example partition script for Redhat and SLES; the partitioning script is /install/custom/my-partitions.sh:
instdisk="/dev/sda"
modprobe ext4 >& /dev/null
modprobe ext4dev >& /dev/null
if grep ext4dev /proc/filesystems > /dev/null; then
FSTYPE=ext3
elif grep ext4 /proc/filesystems > /dev/null; then
FSTYPE=ext4
else
FSTYPE=ext3
fi
BOOTFSTYPE=ext3
EFIFSTYPE=vfat
if uname -r|grep ^3.*el7 > /dev/null; then
FSTYPE=xfs
BOOTFSTYPE=xfs
EFIFSTYPE=efi
fi
if [ `uname -m` = "ppc64" ]; then
echo 'part None --fstype "PPC PReP Boot" --ondisk '$instdisk' --size 8' >> /tmp/partitionfile
fi
if [ -d /sys/firmware/efi ]; then
echo 'bootloader --driveorder='$instdisk >> /tmp/partitionfile
echo 'part /boot/efi --size 50 --ondisk '$instdisk' --fstype '$EFIFSTYPE >> /tmp/partitionfile
else
echo 'bootloader' >> /tmp/partitionfile
fi
echo "part /boot --size 512 --fstype $BOOTFSTYPE --ondisk $instdisk" >> /tmp/partitionfile
echo "part swap --recommended --ondisk $instdisk" >> /tmp/partitionfile
echo "part / --size 1 --grow --ondisk $instdisk --fstype $FSTYPE" >> /tmp/partitionfile
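The example above hardcodes instdisk="/dev/sda". Since the partition script can contain arbitrary logic, disk selection can be made dynamic; the following is a minimal sketch, not part of the shipped example (the /sys/block scan and the fallback are assumptions about the installer environment):

```shell
# Hypothetical replacement for the hardcoded instdisk="/dev/sda": pick the
# first disk the kernel reports under /sys/block (assumes /sys is mounted
# in the installer environment), falling back to sda if none is found.
instdisk=""
for d in /sys/block/sd* /sys/block/vd* /sys/block/hd*; do
    [ -e "$d" ] || continue                 # skip unexpanded globs
    instdisk="/dev/$(basename "$d")"
    break
done
[ -n "$instdisk" ] || instdisk="/dev/sda"   # original default as fallback
echo "selected install disk: $instdisk"
```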
The following is an example partition script for Ubuntu; the partitioning script is /install/custom/my-partitions.sh:
if [ -d /sys/firmware/efi ]; then
echo "ubuntu-efi ::" > /tmp/partitionfile
echo " 512 512 1024 fat16" >> /tmp/partitionfile
echo ' $iflabel{ gpt } $reusemethod{ } method{ efi } format{ }' >> /tmp/partitionfile
echo " ." >> /tmp/partitionfile
else
echo "ubuntu-boot ::" > /tmp/partitionfile
echo "100 50 100 ext3" >> /tmp/partitionfile
echo ' $primary{ } $bootable{ } method{ format } format{ } use_filesystem{ } filesystem{ ext3 } mountpoint{ /boot }' >> /tmp/partitionfile
echo " ." >> /tmp/partitionfile
fi
echo "500 10000 1000000000 ext3" >> /tmp/partitionfile
echo " method{ format } format{ } use_filesystem{ } filesystem{ ext3 } mountpoint{ / }" >> /tmp/partitionfile
echo " ." >> /tmp/partitionfile
echo "2048 512 300% linux-swap" >> /tmp/partitionfile
echo " method{ swap } format{ }" >> /tmp/partitionfile
echo " ." >> /tmp/partitionfile
chdef -t osimage <osimagename> partitionfile='s:/install/custom/my-partitions.sh'
nodeset <nodename> osimage=<osimage>
Note: the 's:' preceding the filename tells nodeset that this is a script.
For Redhat, when nodeset runs and generates the /install/autoinst file for a node, it will add the execution of the contents of this script to the %pre section of that file. The nodeset command will then replace the #XCAT_PARTITION_START#...#XCAT_PARTITION_END# directives from the osimage template file with "%include /tmp/partitionfile" to dynamically include the tmp definition file your script created.
For Ubuntu, when nodeset runs and generates the /install/autoinst file for a node, it replaces the "#XCA_PARTMAN_RECIPE_SCRIPT#" directive and adds the execution of the contents of this script to /install/autoinst/<node>.pre; the /install/autoinst/<node>.pre script runs in the preseed/early_command.
The disk file contains the names of the disks to partition, in traditional, non-devfs format, delimited by spaces. For example:
/dev/sda /dev/sdb
If not specified, the default value will be used.
chdef -t osimage <osimagename> -p partitionfile='d:/install/custom/partitiondisk'
nodeset <nodename> osimage=<osimage>
Note: the 'd:' preceding the filename tells nodeset that this is a partition disk file.
For Ubuntu, when nodeset runs and generates the /install/autoinst file for a node, it generates a script that writes the content of the partition disk file to /tmp/boot_disk; the invocation of this script replaces the #XCA_PARTMAN_DISK_SCRIPT# directive in /install/autoinst/<node>.pre.
The disk script is a script that generates a partitioning disk file named "/tmp/boot_disk". For example:
rm /tmp/devs-with-boot 2>/dev/null || true;
for d in $(list-devices partition); do
mkdir -p /tmp/mymount;
rc=0;
mount $d /tmp/mymount || rc=$?;
if [[ $rc -eq 0 ]]; then
[[ -d /tmp/mymount/boot ]] && echo $d >>/tmp/devs-with-boot;
umount /tmp/mymount;
fi
done;
if [[ -e /tmp/devs-with-boot ]]; then
head -n1 /tmp/devs-with-boot | egrep -o '\S+[^0-9]' > /tmp/boot_disk;
rm /tmp/devs-with-boot 2>/dev/null || true;
else
DEV=`ls /dev/disk/by-path/* -l | egrep -o '/dev.*[s|h|v]d[^0-9]$' | sort -t : -k 1 -k 2 -k 3 -k 4 -k 5 -k 6 -k 7 -k 8 -g | head -n1 | egrep -o '[s|h|v]d.*$'`;
if [[ "$DEV" == "" ]]; then DEV="sda"; fi;
echo "/dev/$DEV" > /tmp/boot_disk;
fi;
If not specified, the default value will be used.
chdef -t osimage <osimagename> -p partitionfile='s:d:/install/custom/partitiondiskscript'
nodeset <nodename> osimage=<osimage>
Note: the 's:' prefix tells nodeset that this is a script; the 's:d:' preceding the filename tells nodeset that this is a script to generate the partition disk file.
For Ubuntu, when nodeset runs and generates the /install/autoinst file for a node, the invocation of this script replaces the #XCA_PARTMAN_DISK_SCRIPT# directive in /install/autoinst/<node>.pre.
To support other specific partition methods such as RAID or LVM in Ubuntu, some additional preseed configuration entries should be specified. These entries can be specified in two ways:
'c:<the absolute path of the additional preseed config file>': the additional preseed config file contains the additional preseed entries in "d-i ..." syntax. When nodeset runs, the #XCA_PARTMAN_ADDITIONAL_CFG# directive in /install/autoinst/<node> will be replaced with the content of the config file. An example:
d-i partman-auto/method string raid
d-i partman-md/confirm boolean true
's:c:<the absolute path of the additional preseed config script>': the additional preseed config script sets the preseed values with "debconf-set". When nodeset runs, the #XCA_PARTMAN_ADDITIONAL_CONFIG_SCRIPT# directive in /install/autoinst/<node>.pre will be replaced with the content of the script. An example:
debconf-set partman-auto/method string raid
debconf-set partman-md/confirm boolean true
If not specified, the default value will be used.
Associate additional preseed configuration file by:
chdef -t osimage <osimagename> -p partitionfile='c:/install/custom/configfile'
nodeset <nodename> osimage=<osimage>
Associate additional preseed configuration script by:
chdef -t osimage <osimagename> -p partitionfile='s:c:/install/custom/configscript'
nodeset <nodename> osimage=<osimage>
If the partition script has a problem, the OS installation will probably hang. To debug the partition script, you can enable ssh access to the installer during installation, then log in to the node through ssh after the installer has started sshd.
For Redhat, you can specify sshd as a kernel parameter; kickstart will then start sshd when Anaconda starts, and you can log in to the node using ssh to debug the problem:
chdef <nodename> addkcmdline="sshd"
nodeset <nodename> osimage=<osimage>
For Ubuntu, you can insert the following preseed entries into /install/autoinst/<node> to tell the debian installer to start the ssh server and wait for you to connect:
d-i anna/choose_modules string network-console
d-i preseed/early_command string anna-install network-console
d-i network-console/password-disabled boolean false
d-i network-console/password password cluster
d-i network-console/password-again password cluster
Note: for the entry "d-i preseed/early_command string anna-install network-console", if there is already a "preseed/early_command" entry in /install/autoinst/<node>, the value "anna-install network-console" should be carefully appended to the existing "preseed/early_command" entry; otherwise, the existing entry will be overwritten.
The attributes "linuximage.addkcmdline" and "bootparams.addkcmdline" are the interfaces for specifying additional kernel options to be passed to the kernel/installer for node deployment.
The added kernel parameters can be 'OS deployment Only' or 'Reboot Only' (added to grub2.conf). The prefix 'R::' identifies a parameter as 'Reboot Only'; otherwise, it is 'OS deployment Only'.
For example, to make the redhat7 kernel option "net.ifnames=0" persistent (Reboot Only), so that it takes effect even after reboot:
chdef -t osimage -o rhels7-ppc64-install-compute -p addkcmdline="R::net.ifnames=0"
Note: persistent kernel options with the 'R::' prefix are not passed to the OS installer during node deployment. If you want a parameter to be in effect for both 'OS deployment' and 'Reboot', you need to specify it twice, once with and once without the 'R::' prefix.
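To illustrate the effect of the 'R::' prefix, the following self-contained sketch splits an addkcmdline value the way described above (the variable names, the splitting code, and the extra console=ttyS0 option are illustrative only; xCAT performs this split internally):

```shell
# Illustrative only: partition an addkcmdline string into deploy-time
# options and reboot-only ('R::'-prefixed) options.
addkcmdline="net.ifnames=0 R::net.ifnames=0 R::console=ttyS0"
deploy=""
reboot=""
for opt in $addkcmdline; do
    case "$opt" in
        R::*) reboot="$reboot ${opt#R::}" ;;   # prefix stripped, goes to grub2.conf
        *)    deploy="$deploy $opt" ;;         # passed to the OS installer
    esac
done
echo "deploy-only:$deploy"
echo "reboot-only:$reboot"
```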
If there are quite a few (e.g. 12) network adapters on the SLES compute nodes, the OS provisioning process might hang because the kernel times out waiting for the network drivers to initialize. The symptom is that the compute node cannot find the OS provisioning repository, with the error message "Please make sure your installation medium is available. Retry?".
To avoid this problem, you can specify the kernel parameter "netwait" to have the kernel wait for the network adapters to initialize. On a node with 12 network adapters, netwait=60 did the trick.
chdef <nodename> -p addkcmdline="netwait=60"
After the initial install of the distro onto nodes, if you want to update the distro on the nodes (either with a few updates or a new SP) without reinstalling the nodes:
copycds <path>/RHEL6.3-*-Server-x86_64-DVD1.iso
Or, for just a few updated rpms, you can copy the updated rpms from the distributor into a directory under /install and run createrepo in that directory.
chdef -t osimage rhels6.2-x86_64-install-compute -p pkgdir=/install/rhels6.3/x86_64
Note: the above command will add a 2nd repo to the pkgdir attribute. This is only supported for xCAT 2.8.2 and above. For earlier versions of xCAT, omit the -p flag to replace the existing repo directory with the new one.
updatenode compute -P ospkgs
This section describes how to install or configure a diskful node (we call it a golden-client), capture an osimage from this golden-client, then the osimage can be used to install/clone other nodes. See Using_Clone_to_Deploy_Server for more information.
Note: this support is available in xCAT 2.8.2 and above.
If you want to use the sysclone provisioning method, you need a golden-client. In this way, you can customize and tweak the golden-client's software and configuration according to your needs, and verify its proper operation. Once the image is captured and deployed, the new nodes will behave in the same way the golden-client does.
To install a golden-client, follow the section Installing_Stateful_Linux_Nodes#Option_1:_Installing_Stateful_Nodes_Using_ISOs_or_DVDs.
To install the systemimager rpms on the golden-client, do these steps on the mgmt node:
Download the xcat-dep tarball which includes systemimager rpms. (You might already have the xcat-dep tarball on the mgmt node.)
Go to xcat-dep and get the latest xCAT dependency tarball. Copy the file to the management node and untar it in the appropriate sub-directory of /install/post/otherpkgs. For example:
(For RH/CentOS):
mkdir -p /install/post/otherpkgs/rhels6.3/x86_64/xcat
cd /install/post/otherpkgs/rhels6.3/x86_64/xcat
tar jxvf xcat-dep-*.tar.bz2
(For SLES):
mkdir -p /install/post/otherpkgs/sles11.3/x86_64/xcat
cd /install/post/otherpkgs/sles11.3/x86_64/xcat
tar jxvf xcat-dep-*.tar.bz2
(For RH/CentOS):
chdef -t osimage -o <osimage-name> otherpkglist=/opt/xcat/share/xcat/install/rh/sysclone.rhels6.x86_64.otherpkgs.pkglist
chdef -t osimage -o <osimage-name> -p otherpkgdir=/install/post/otherpkgs/rhels6.3/x86_64
updatenode <my-golden-client> -S
(For SLES):
chdef -t osimage -o <osimage-name> otherpkglist=/opt/xcat/share/xcat/install/sles/sysclone.sles11.x86_64.otherpkgs.pkglist
chdef -t osimage -o <osimage-name> -p otherpkgdir=/install/post/otherpkgs/sles11.3/x86_64
updatenode <my-golden-client> -S
On the mgmt node, use imgcapture to capture an osimage from the golden-client.
imgcapture <my-golden-client> -t sysclone -o <mycomputeimage>
Tip: when imgcapture is run, it pulls the osimage from the golden-client, and creates the image file system and a corresponding osimage definition on the xCAT management node.
lsdef -t osimage <mycomputeimage> to check the osimage attributes.
The nodeset command tells xCAT what you want to do next with this node, rsetboot tells the node hardware to boot from the network on the next boot, and powering on the node using rpower starts the installation process:
nodeset compute osimage=mycomputeimage
rpower compute boot
Tip: when nodeset is run, it processes the kickstart or autoyast template associated with the osimage, plugging in node-specific attributes, and creates a specific kickstart/autoyast file for each node in /install/autoinst. If you need to customize the template, make a copy of the template file that is pointed to by the osimage.template attribute and edit that file (or the files it includes).
It is possible to use the wcons command to watch the installation process for a sampling of the nodes:
wcons n1,n20,n80,n100
or rcons to watch one node
rcons n1
Additionally, nodestat may be used to check the status of a node as it installs:
nodestat n20,n21
n20: installing man-pages - 2.39-10.el5 (0%)
n21: installing prep
Note: the percentage complete reported by nodestat is not necessarily reliable.
You can also watch nodelist.status until it changes to "booted" for each node:
nodels compute nodelist.status | xcoll
Once all of the nodes are installed and booted, you should be able ssh to all of them from the MN (w/o a password), because xCAT should have automatically set up the ssh keys (if the postscripts ran successfully):
xdsh compute date
If there are problems, see [Debugging_xCAT_Problems].
Note: this section describes how to create a stateless image using the genimage command to install a list of rpms into the image. As an alternative, you can also capture an image from a running node and create a stateless image out of it. See [Capture_Linux_Image] for details.
The copycds command copies the contents of the linux distro media to /install/<os>/<arch> so that it will be available to install nodes with or create diskless images.
If using an ISO, copy it to (or NFS mount it on) the management node, and then run:
copycds <path>/RHEL6.2-Server-20080430.0-x86_64-DVD.iso
If using a DVD, put it in the DVD drive of the management node and run:
copycds /dev/dvd # or whatever the device name of your dvd drive is
Tip: if this is the same distro version as your management node, create a .repo file in /etc/yum.repos.d with content similar to:
[local-rhels6.2-x86_64]
name=xCAT local rhels 6.2
baseurl=file:/install/rhels6.2/x86_64
enabled=1
gpgcheck=0
This way, if you need some additional RPMs on your MN at a later time, you can simply install them using yum. Or if you are installing other software on your MN that requires some additional RPMs from the distro, they will automatically be found and installed.
Note: To use an osimage as your provisioning method, you need to be running xCAT 2.6.6 or later.
The provmethod attribute of your nodes should contain the name of the osimage object definition that is being used for those nodes. The osimage object contains paths for pkgs, templates, kernels, etc. If you haven't already, run copycds to copy the distro rpms to /install. Default osimage objects are also defined when copycds is run. To view the osimages:
lsdef -t osimage # see the list of osimages
lsdef -t osimage <osimage-name> # see the attributes of a particular osimage
From the list found above, select the osimage for your distro, architecture, provisioning method (install, netboot, statelite), and profile (compute, service, etc.). Although it is optional, we recommend you make a copy of the osimage, changing its name to a simpler name. For example:
lsdef -t osimage -z rhels6.3-x86_64-netboot-compute | sed 's/^[^ ]\+:/mycomputeimage:/' | mkdef -z
This displays the osimage "rhels6.3-x86_64-netboot-compute" in a format that can be used as input to mkdef, but on the way there it uses sed to modify the name of the object to "mycomputeimage".
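To see what the sed expression does, you can run it against a small stanza by hand; the stanza below is a trimmed, illustrative stand-in for real lsdef -z output, not the full attribute set:

```shell
# The sed replaces the object name at the start of the stanza header line;
# indented attribute lines do not match ^[^ ]\+: and are left untouched.
out=$(cat <<'EOF' | sed 's/^[^ ]\+:/mycomputeimage:/'
rhels6.3-x86_64-netboot-compute:
    objtype=osimage
    profile=compute
EOF
)
echo "$out"
```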
Initially, this osimage object points to templates, pkglists, etc. that are shipped by default with xCAT. And some attributes, for example otherpkglist and synclists, won't have any value at all because xCAT doesn't ship a default file for that. You can now change/fill in any osimage attributes that you want. A general convention is that if you are modifying one of the default files that an osimage attribute points to, copy it into /install/custom and have your osimage point to it there. (If you modify the copy under /opt/xcat directly, it will be over-written the next time you upgrade xCAT.) An important attribute to change is the rootimgdir which will contain the generated osimage files so that you don't over-write an image built with the shipped definitions. To continue the previous example:
chdef -t osimage -o mycomputeimage rootimgdir=/install/netboot/rhels6.3/x86_64/mycomputeimage
You likely want to customize the main pkglist for the image. This is the list of rpms or groups that will be installed from the distro. (Other rpms that they depend on will be installed automatically.) For example:
mkdir -p /install/custom/netboot/rh
cp -p /opt/xcat/share/xcat/netboot/rh/compute.rhels6.x86_64.pkglist /install/custom/netboot/rh
vi /install/custom/netboot/rh/compute.rhels6.x86_64.pkglist
chdef -t osimage mycomputeimage pkglist=/install/custom/netboot/rh/compute.rhels6.x86_64.pkglist
The goal is to install the fewest number of rpms that still provides the function and applications that you need, because the resulting ramdisk will use real memory in your nodes.
Also, check to see if the default exclude list excludes all files and directories you do not want in the image. The exclude list enables you to trim the image after the rpms are installed into the image, so that you can make the image as small as possible.
cp /opt/xcat/share/xcat/netboot/rh/compute.exlist /install/custom/netboot/rh
vi /install/custom/netboot/rh/compute.exlist
chdef -t osimage mycomputeimage exlist=/install/custom/netboot/rh/compute.exlist
Make sure nothing is excluded in the exclude list that you need on the node. For example, if you require perl on your nodes, remove the line "./usr/lib/perl5*".
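For reference, an exclude list is a plain text file with one entry per line, each a path (with optional trailing wildcard) relative to the image root. A short illustrative fragment (the man/doc entries are common choices for shrinking an image, not taken from the shipped file; the perl5 entry is the one discussed above):

```
./usr/share/man*
./usr/share/doc*
./usr/lib/perl5*
```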
The linuximage.pkgdir attribute names the directory where the distro packages are stored. It can be set to multiple paths, separated by ",". The first path is the value of osimage.pkgdir and must be the OS base package directory, e.g. pkgdir=/install/rhels6.5/x86_64,/install/updates/rhels6.5/x86_64. The OS base package path contains repository data by default; for the other path(s), make sure repository data exists, and if not, create it with the "createrepo" command.
If you have additional OS update rpms (rpms that may come directly from the OS website, or from one of the OS supplemental/SDK DVDs) that you also want installed, make a directory to hold them, create a list of the rpms you want installed, and add that information to the osimage definition:
mkdir -p /install/updates/rhels6.5/x86_64
cd /install/updates/rhels6.5/x86_64
cp /myrpms/* .
OR, if you have a supplemental or SDK iso image that came with your OS distro, you can use copycds:
copycds RHEL6.5-Supplementary-DVD1.iso -n rhels6.5-supp
If there is no repository data in the directory, you can run "createrepo" to create it:
createrepo .
The createrepo command is in the createrepo rpm, which for RHEL is in the 1st DVD, but for SLES is in the SDK DVD.
NOTE: when the management node is rhels6.x and the otherpkgs repository data is for rhels5.x, run createrepo with "-s md5". For example:
createrepo -s md5 .
...
myrpm1
myrpm2
myrpm3
Remember, if you add more rpms at a later time, you must run createrepo again.
chdef -t osimage mycomputeimage pkglist=/install/custom/install/rh/compute.rhels6.x86_64.pkglist
chdef -t osimage mycomputeimage -p pkgdir=/install/updates/rhels6.5/x86_64
OR, if you used copycds:
chdef -t osimage mycomputeimage -p pkgdir=/install/rhels6.5-supp/x86_64
Note: after making the above changes, rerun genimage and packimage so that the new packages are picked up in the image.
If you have additional rpms (rpms not in the distro) that you also want installed, make a directory to hold them, create a list of the rpms you want installed, and add that information to the osimage definition:
Create a directory to hold the additional rpms:
mkdir -p /install/post/otherpkgs/rh/x86_64
cd /install/post/otherpkgs/rh/x86_64
cp /myrpms/* .
createrepo .
NOTE: when the management node is rhels6.x and the otherpkgs repository data is for rhels5.x, run createrepo with "-s md5". For example:
createrepo -s md5 .
Create a file that lists the additional rpms that should be installed. For example, in /install/custom/netboot/rh/compute.otherpkgs.pkglist put:
myrpm1
myrpm2
myrpm3
Add both the directory and the file to the osimage definition:
chdef -t osimage mycomputeimage otherpkgdir=/install/post/otherpkgs/rh/x86_64 otherpkglist=/install/custom/netboot/rh/compute.otherpkgs.pkglist
If you add more rpms at a later time, you must run createrepo again. The createrepo command is in the createrepo rpm, which for RHEL is in the 1st DVD, but for SLES is in the SDK DVD.
If you have multiple sets of rpms that you want to keep separate to keep them organized, you can put them in separate sub-directories in the otherpkgdir. If you do this, you need to do the following extra things, in addition to the steps above:
In your otherpkgs.pkglist, list at least 1 file from each sub-directory. (During installation, xCAT will define a yum or zypper repository for each directory you reference in your otherpkgs.pkglist.) For example:
xcat/xcat-core/xCATsn
xcat/xcat-dep/rh6/x86_64/conserver-xcat
There are some examples of otherpkgs.pkglist in /opt/xcat/share/xcat/netboot/<distro>/service.*.otherpkgs.pkglist that show the format.
Note: the otherpkgs postbootscript should by default be associated with every node. Use lsdef to check:
lsdef node1 -i postbootscripts
If it is not, you need to add it. For example, add it for all of the nodes in the "compute" group:
chdef -p -t group compute postbootscripts=otherpkgs
Postinstall scripts for diskless images are analogous to postscripts for diskful installation. The postinstall script is run by genimage near the end of its processing. You can use it to do anything to your image that you want done every time you generate this kind of image. In the script you can install rpms that need special flags, or tweak the image in some way. There are some examples shipped in /opt/xcat/share/xcat/netboot/<distro>. If you create a postinstall script to be used by genimage, then point to it in your osimage definition. For example:
chdef -t osimage mycomputeimage postinstall=/install/custom/netboot/rh/compute.postinstall
Note: This is only supported for stateless nodes in xCAT 2.7 and above.
Sync lists contain a list of files that should be sync'd from the management node to the image and to the running nodes. This allows you to have 1 copy of config files for a particular type of node and make sure that all those nodes are running with those config files. The sync list should contain a line for each file you want sync'd, specifying the path it has on the MN and the path it should be given on the node. For example:
/install/custom/syncfiles/compute/etc/motd -> /etc/motd
/etc/hosts -> /etc/hosts
If you put the above contents in /install/custom/netboot/rh/compute.synclist, then:
chdef -t osimage mycomputeimage synclists=/install/custom/netboot/rh/compute.synclist
For more details, see Sync-ing_Config_Files_to_Nodes.
You can configure any noderange to use this osimage. In this example, we define that the whole compute group should use the image:
chdef -t group compute provmethod=mycomputeimage
Now that you have associated an osimage with nodes, if you want to list a node's attributes, including the osimage attributes all in one command:
lsdef node1 --osimage
There are other attributes that can be set in your osimage definition. See the osimage man page for details.
If you are building an image for a different OS/architecture than is on the Management node, you need to follow this process: [Building_a_Stateless_Image_of_a_Different_Architecture_or_OS]. Note: different OS in this case means, for example, RHEL 5 vs. RHEL 6. If the difference is just an update level/service pack (e.g. RHEL 6.0 vs. RHEL 6.3), then you can build it on the MN.
If the image you are building is for nodes that are the same OS and architecture as the management node (the most common case), then you can follow the instructions here to run genimage on the management node.
Run genimage to generate the image based on the mycomputeimage definition:
genimage mycomputeimage
Before you pack the image, you have the opportunity to change any files in the image that you want to, by cd'ing to the rootimgdir (e.g. /install/netboot/rhels6/x86_64/compute/rootimg). Although, instead, we recommend that you make all changes to the image via your postinstall script, so that it is repeatable.
The genimage command creates /etc/fstab in the image. If you want to, for example, limit the amount of space that can be used in /tmp and /var/tmp, you can add lines like the following to it (either by editing it by hand or via the postinstall script):
tmpfs /tmp tmpfs defaults,size=50m 0 2
tmpfs /var/tmp tmpfs defaults,size=50m 0 2
But probably an easier way to accomplish this is to create a postscript to be run when the node boots up with the following lines:
logger -t xcat "$0: BEGIN"
mount -o remount,size=50m /tmp/
mount -o remount,size=50m /var/tmp/
logger -t xcat "$0: END"
Assuming you call this postscript settmpsize, you can add this to the list of postscripts that should be run for your compute nodes by:
chdef -t group compute -p postbootscripts=settmpsize
Now pack the image to create the ramdisk:
packimage mycomputeimage
Note: This procedure assumes you are using xCAT 2.6.1 or later.
The kerneldir attribute in the linuximage table can be used to assign a directory containing kernel RPMs that can be installed into stateless/statelite images. The default for kerneldir is /install/kernels. To add a new kernel, create a directory named <kernelver> under the kerneldir, and genimage will pick the kernel up from there.
The following examples assume you have the kernel RPM in /tmp and are using the default value for kerneldir (/install/kernels).
The RPM names below are only examples, substitute your specific level and architecture.
The RPM kernel package is usually named: kernel-<kernelver>.rpm.
For example, kernel-2.6.32.10-0.5.x86_64.rpm means kernelver=2.6.32.10-0.5.x86_64.
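The mapping from rpm file name to <kernelver> amounts to simple string stripping, which can be sketched in shell (using the example file name above):

```shell
# Derive <kernelver> from the kernel rpm file name by stripping the
# leading "kernel-" and the trailing ".rpm".
rpmfile="kernel-2.6.32.10-0.5.x86_64.rpm"
kernelver=${rpmfile#kernel-}
kernelver=${kernelver%.rpm}
echo "$kernelver"    # 2.6.32.10-0.5.x86_64
```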
mkdir -p /install/kernels/2.6.32.10-0.5.x86_64
cp /tmp/kernel-2.6.32.10-0.5.x86_64.rpm /install/kernels/2.6.32.10-0.5.x86_64/
createrepo /install/kernels/2.6.32.10-0.5.x86_64/
Run genimage/packimage to update the image with the new kernel.
Note: If downgrading the kernel, you may need to first remove the rootimg directory.
genimage <imagename> -k 2.6.32.10-0.5.x86_64
packimage <imagename>
The RPM kernel package is usually separated into two parts: kernel-<arch>-base and kernel-<arch>.
For example, /tmp contains the following two RPMs:
kernel-ppc64-base-2.6.27.19-5.1.x86_64.rpm
kernel-ppc64-2.6.27.19-5.1.x86_64.rpm
Note that 2.6.27.19-5.1.x86_64 is NOT the kernel version; 2.6.27.19-5-x86_64 is the kernel version. The "5.1.x86_64" is replaced with "5-x86_64".
mkdir -p /install/kernels/2.6.27.19-5-x86_64/
cp /tmp/kernel-ppc64-base-2.6.27.19-5.1.x86_64.rpm /install/kernels/2.6.27.19-5-x86_64/
cp /tmp/kernel-ppc64-2.6.27.19-5.1.x86_64.rpm /install/kernels/2.6.27.19-5-x86_64/
Run genimage/packimage to update the image with the new kernel.
Note: If downgrading the kernel, you may need to first remove the rootimg directory.
Since the kernel version name is different from the kernel rpm package name, the -g flag MUST be specified on the genimage command.
genimage <imagename> -k 2.6.27.19-5-x86_64 -g 2.6.27.19-5.1
packimage <imagename>
The kernel drivers in the stateless initrd are used for the devices during the netboot. If you are missing one or more kernel drivers for specific devices (especially for the network device), the netboot process will fail. xCAT offers two approaches to add additional drivers to the stateless initrd during the running of genimage.
genimage <imagename> -n <new driver list>
Generally, the genimage command has a default driver list which will be added to the initrd. But if you specify the '-n' flag, the default driver list will be replaced with your <new driver list>. That means you need to include any drivers that you need from the default driver list into your <new driver list>.
The default driver list:
rh-x86: tg3 bnx2 bnx2x e1000 e1000e igb mlx_en virtio_net be2net
rh-ppc: e1000 e1000e igb ibmveth ehea
sles-x86: tg3 bnx2 bnx2x e1000 e1000e igb mlx_en be2net
sles-ppc: tg3 e1000 e1000e igb ibmveth ehea be2net
Note: With this approach, xCAT will search for the drivers in the rootimage. You need to make sure the drivers have been included in the rootimage before generating the initrd. You can install the drivers manually in an existing rootimage (using chroot) and run genimage again, or you can use a postinstall script to install drivers to the rootimage during your initial genimage run.
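Since '-n' replaces the default list, one way to avoid dropping defaults is to build the new list from the table above plus your additions. A sketch (ixgbe is a hypothetical extra driver, and the comma-separated form assumes genimage accepts a comma-delimited driver list):

```shell
# Build a '-n' argument that keeps the default rh-x86 drivers and adds one
# extra driver (ixgbe here is only an example addition).
default="tg3 bnx2 bnx2x e1000 e1000e igb mlx_en virtio_net be2net"
extra="ixgbe"
newlist=$(echo "$default $extra" | tr ' ' ',')
echo "genimage <imagename> -n $newlist"
```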
Refer to the doc Using_Linux_Driver_Update_Disk#Driver_RPM_Package.
nodeset compute osimage=mycomputeimage
(If you need to update your diskless image sometime later, change your osimage attributes and the files they point to accordingly, and then rerun genimage, packimage, nodeset, and boot the nodes.)
Now boot your nodes...
rpower compute boot
Now that your basic cluster is set up, here are suggestions for additional reading:
Wiki: Cluster_Name_Resolution
Wiki: IBM_HPC_Stack_in_an_xCAT_Cluster
Wiki: Listing_and_Modifying_the_Database
Wiki: Managing_Ethernet_Switches
Wiki: Managing_the_Mellanox_Infiniband_Network
Wiki: Monitoring_an_xCAT_Cluster
Wiki: Setting_Up_a_Linux_Hierarchical_Cluster
Wiki: Using_Updatenode
Wiki: XCAT_AIX_POWER_Blade_Nodes
Wiki: XCAT_Cluster_with_IBM_BladeCenter
Wiki: XCAT_Linux_Statelite
Wiki: XCAT_Virtualization_with_KVM
Wiki: XCAT_Virtualization_with_RHEV
Wiki: XCAT_Virtualization_with_VMWare