Support for xCAT in SoftLayer is still in development.
There are cases in which it is useful to use xCAT to manage SoftLayer bare metal servers or virtual machines (CCIs), when SoftLayer doesn't currently provide the provisioning or management options needed.
There are some things unique about the SoftLayer environment that make it a little challenging to use xCAT there. This document gives some tips on how to configure things so that xCAT can be used. For some of these procedures, this document shows both an automated procedure and a manual procedure.
Currently, this document has only been written and validated for managing bare metal servers (not CCI virtual machines) and for provisioning SLES.
Use the SoftLayer portal to request either a VM (CCI) or bare metal server to run the xCAT management node software on. At a minimum, your xCAT management node should have these specs:
If you will be provisioning multiple physical servers simultaneously, have specific provisioning performance requirements, or will be using sysclone, you will need to increase the resources. A fully capable xCAT management node that will be able to handle just about all the provisioning requirements you will have is:
Depending on your needs, you may choose to configure your management node somewhere between these 2 extremes to balance performance and cost.
A CentOS 6.x management node was used when creating this document. If you choose to run a different distro on your mgmt node, some of the steps in this document will be slightly different.
To configure the management node, follow the 1st half of [XCAT_iDataPlex_Cluster_Quick_Start] to install and configure the xCAT management node. Install xCAT 2.8.5 or later. Stop before you get to the section "Node Definition and Discovery". SoftLayer uses Supermicro servers, not iDataPlex, but they are both x86_64 IPMI-controlled servers, so they are very similar from an xCAT standpoint. You don't need to follow all of the steps in [XCAT_iDataPlex_Cluster_Quick_Start], so here is a summary of the steps from that document that you should perform:
Other useful xCAT documentation:
Now use the SoftLayer portal to request the bare metal servers that should be managed by xCAT. Request that they be loaded with the CentOS 6.x operating system. (Either CentOS or SLES needs to be on the node for pushinitrd to function correctly.) It will be more convenient, and it will probably perform a little better, if you request that all of the servers be on the same private (backend) VLAN as the xCAT mgmt node. But this is not a requirement; xCAT will work across VLANs.
The utilities useful for running xCAT in a SoftLayer environment have been gathered into an RPM for convenience. This RPM is not installed by default when you install xCAT core, so you must explicitly install it now.
Note: currently the xCAT-SoftLayer RPM is only available in the development branch. You probably installed xCAT core from the stable branch (currently 2.8.x), so you won't find the xCAT-SoftLayer RPM there. But the development branch version (2.9) of the xCAT-SoftLayer RPM can be used with xCAT 2.8.x.
yum install xCAT-SoftLayer-*.rpm
The xCAT-SoftLayer rpm requires the perl-ExtUtils-MakeMaker, perl-CPAN, perl-Test-Harness, and perl-SOAP-Lite rpms, so yum will also install those (if they aren't already installed).
cd /usr/local/lib
git clone https://github.com/softlayer/softlayer-api-perl-client
# Config file used by the xcat cmd getslnodes
userid = SL12345
apikey = 1a2b3c4d5e6f1a2b3c4d5e6f1a2b3c4d5e6f1a2b3c4d5e6f
apidir = /usr/local/lib/softlayer-api-perl-client
Note: this config file will be used by the xCAT utility getslnodes (described in the next section).
cpan App::cpanminus # hit enter (taking the default "yes") a bunch of times
cpanm XML::Hash::LX
To query all of the SL bare metal servers available to this account and display the xCAT node attributes that should be set:
getslnodes
To query a specific server or subset of servers:
getslnodes <hostname>
where <hostname> is the 1st part of one or more hostnames of the SL bare metal servers.
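For illustration, here is roughly what the stanza-format output for one server looks like. Every value below is fabricated, and the exact set of attributes getslnodes prints may differ slightly:

node01:
    objtype=node
    arch=x86_64
    bmc=10.54.51.21
    bmcpassword=abc123
    bmcusername=root
    groups=slnode,all
    ip=10.54.51.11
    mac=06:50:9a:2c:11:ee
    mgt=ipmi
    usercomment=hostname: node01.mycompany.com, root pw: s3cret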
To create the xCAT node objects in the database, either copy/paste/run the commands output by the command above, or run:
getslnodes | mkdef -z
If your xCAT management node is also a bare metal server, this will create a node definition in the xCAT db for it too, which is probably not what you want. (xCAT does support having the mgmt node in the db and using xCAT to maintain software and config files on it, but that is probably not your main goal here, and you could accidentally make changes to your mgmt node that you might not intend.) If you want to remove your mgmt node from the db:
rmdef <mgmt-node>
Now add the nodes to the /etc/hosts file:
makehosts
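For example (with fabricated names and addresses), makehosts adds lines like these to /etc/hosts:

10.54.51.11 node01 node01.mycompany.com
10.54.51.12 node02 node02.mycompany.com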
Follow the steps in [Cluster_Name_Resolution] to set up name resolution for the nodes, but the quick steps are:
lsdef -t network -l
chdef -t site nameservers=<private-mn-ip> forwarders=<SL-name-servers> domain=<bm-domain>
# put these lines in /etc/resolv.conf on the management node:
search <domain>
nameserver <private-mn-ip>
chdef -t site useflowcontrol=no
makedns -n # create named.conf and add all of the nodes
service dhcpd stop
chdef -t site dhcpsetup=n
chdef -t site managedaddressmode=static
* Note: currently this site setting doesn't work correctly with the SLES support for configuring the private NIC into bond0.
If the provisioning of nodes doesn't work perfectly the 1st time, access to the console can be critical in figuring out and correcting the problem. There are 3 options for getting access to each node's console:
Choose the option that suits you, and follow the instructions below.
Use the BMC web interface to open a video console to a specific node. You must 1st install VNC and firefox.
On the xCAT mgmt node:
lsdef <node> -i serialport,serialspeed
yum install tigervnc-server firefox java icedtea-web fluxbox metacity xterm xsetroot
vncserver -geometry 1280x960 -AlwaysShared &
On your client machine (desktop or laptop) connect to the VNC server:
vncviewer <xcat-mn-public-ip>:1 &
From inside the VNC session:
metacity &
sed -i s/twm/metacity/ /root/.vnc/xstartup
lsdef <node> -i bmc,bmcusername,bmcpassword
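Then, from inside the VNC session, point the browser at the BMC address displayed above and log in with the BMC credentials. The video console is launched from the BMC's web UI; the exact menu location varies by Supermicro model. (Some BMCs require https instead of http.)

firefox http://<node-bmc-ip> &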
Setting up VPN from your laptop/desktop to your SoftLayer account gives you direct access to the private network of your SoftLayer servers, so for access to the node consoles this is an alternative to using VNC. There are a couple different ways to configure SSL VPN for SoftLayer:
Note: you can also VPN to SoftLayer using PPTP or Cisco AnyConnect, but I wasn't able to get that to work, and you can only allow 1 userid from your SoftLayer account to do this.
Once you have a VPN connection to SoftLayer then there are a couple ways to get a console to a SoftLayer bare metal server:
xCAT has features to automatically configure additional NICs and routes on the nodes when they are being installed. In a SoftLayer environment, this can be convenient because there are usually additional NICs (other than the install NIC) and special routes that are needed. This section gives an example for nodes that have an eth1 NIC that is connected to the public VLAN and should be configured as part of bond1 and made the default gateway.
For more information about these features, see [Configuring_Secondary_Adapters] and the makeroutes man page.
lsdef -t network -l
mkdef -t network publicnet gateway=50.97.240.33 mask=255.255.255.240 mgtifname=eth1 net=50.97.240.32
* Note: in the networks table, mgtifname means the NIC on the xCAT mgmt node that directly connects to that VLAN. If the xCAT mgmt node is not directly connected to this VLAN (it reaches it via a router), then set mgtifname to "!remote!<nicname>" and add "!remote!" to "dhcpinterfaces" in the site table.
chdef <node> nicips.eth1=50.2.3.4 nichostnamesuffixes.bond1=-pub
* Note: In this example, the "-pub" will be added to the end of the node name to form the hostname of the eth1 IP address. If you also want a completely different hostname (that doesn't start with the node name), set nicaliases.eth1.
* Note: If you have a lot of nodes and your IP addresses follow a regular pattern, you can set them all at once, using xCAT's support for regular expressions; a hypothetical example is shown below. See [Listing and Modifying the Database](Listing_and_Modifying_the_Database/#using-regular-expressions-in-the-xcat-tables) for details.
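For example, a hypothetical regex for a group of nodes named node1, node2, ... might look like the following. This is only a sketch of the table regex pattern ("|match|replacement|"); verify the exact form for the nics table in the page linked above:

chdef <group-name> nicips.eth1='|node(\d+)|50.2.3.($1+10)|'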
makehosts <noderange>
makedns <noderange>
allow-recursion { any; };
After editing /etc/named.conf, run "service named restart". If you run "makedns -n" in the future, you will need to make this change to /etc/named.conf again (because the file will be overwritten). This will be fixed in xCAT in bug [#4144].
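Until the bug is fixed, a small scripted workaround is to re-apply the change after each "makedns -n". This sketch assumes the generated file contains an "options {" line and no allow-recursion entry of its own:

# add the allow-recursion line inside the options stanza if it isn't already there
grep -q allow-recursion /etc/named.conf || sed -i '/options {/a allow-recursion { any; };' /etc/named.conf
service named restart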
Normally, you would use the confignics postscript to configure eth1 at the end of the node provisioning. But since SoftLayer bare metal servers should have their NICs part of a bond, use the configbond postscript instead by adding it to the list of postscripts that should be run for these nodes:
chdef <noderange> -p postscripts='configbond bond1 eth1@eth3'
* Note: the -p flag adds the postscript to the end of the existing list.
updatenode <node> -P 'configbond bond1 eth1@eth3'
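To check that the bond actually came up on a node, you can read the bonding driver's status file (a standard Linux kernel interface), which lists the bond state and its slave NICs:

xdsh <node> cat /proc/net/bonding/bond1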
chdef <node> installnic=mac
* Note: there has been at least one case in which using installnic=mac (which results in the ksdevice kernel parameter (in RHEL) being set to the mac) doesn't work. We are still investigating it.
You can set up routes (both default gateway and more specific routes) to be configured on the nodes using the routes table, the routenames attribute and the setroute postscript. These 3 work together like this:
If you want to set the default gateway of the nodes to go out to the internet, create a route entry that points to the gateway IP address that SoftLayer defines for the public vlan for this node:
mkdef -t route def198_11_206 gateway=198.11.206.1 ifname=bond1 mask=0.0.0.0 net=0.0.0.0
Note: in this case, ifname is the NIC that the node will use to reach this gateway.
Add this route to the node definitions and add setroute to the postbootscripts list:
chdef <noderange> -p routenames=def198_11_206
chdef <node> -p postbootscripts='setroute'
If you are setting the node's default gateway to the public NIC, you will want a specific route for the private VLANs if you have servers in more than 1 private vlan:
mkdef -t route priv10_54_51 gateway=10.54.51.1 ifname=bond0 mask=255.0.0.0 net=10.0.0.0
chdef <noderange> -p routenames=priv10_54_51
chdef <node> -p postbootscripts='setroute'
chdef <node> xcatmaster=10.54.51.2
updatenode <node> -P 'setroute'
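To verify the result, display each node's routing table; you should see the default route pointing at the public gateway and the more specific route (if you added one) pointing at the private gateway:

xdsh <node> ip route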
Because SoftLayer switches often respond to NIC state changes slowly (when the NICs are not bonded) and because bare metal nodes are often allocated on different vlans from the xCAT MN, it is necessary to use a different method for initiating the network installation of the node. (Normally, xCAT relies on PXE and DHCP broadcasts during the network installation process, which by default don't go across vlan boundaries.) The basic approach we will use is to copy to the node the kernel, initrd, and IP address that xCAT will use to install the node. After that, the xCAT node installation process will proceed like usual.
Using the xCAT scripted install method is covered more fully in [XCAT_iDataPlex_Cluster_Quick_Start]. Use this section as a supplement that is specific to the SoftLayer environment.
chdef -t osimage <osimagename> driverupdatesrc=dud:/install/drivers/sles11.3/x86_64/aacraid-driverdisk-1.2.1-30300-sled11-sp2+sles11-sp2.img
* Note: the aacraid rpm has an unusual format, so you can't use that with xCAT.
* Note: details about adding drivers can be found in [Using_Linux_Driver_Update_Disk].
chdef -t osimage <osimagename> template=/opt/xcat/share/xcat/install/sles/compute.sles11.softlayer.tmpl
Note: so far only a template for SLES has been provided.
<drive>
  <device>XCATPARTITIONHOOK</device>
  <initialize config:type="boolean">true</initialize>
  <use>all</use>
  <partitions config:type="list">
    <partition>
      <create config:type="boolean">true</create>
      <filesystem config:type="symbol">swap</filesystem>
      <format config:type="boolean">true</format>
      <mount>swap</mount>
      <mountby config:type="symbol">path</mountby>
      <partition_nr config:type="integer">1</partition_nr>
      <partition_type>primary</partition_type>
      <size>32G</size>
    </partition>
    <partition>
      <create config:type="boolean">true</create>
      <filesystem config:type="symbol">ext3</filesystem>
      <format config:type="boolean">true</format>
      <mount>/</mount>
      <mountby config:type="symbol">path</mountby>
      <partition_nr config:type="integer">2</partition_nr>
      <partition_type>primary</partition_type>
      <size>64G</size>
    </partition>
  </partitions>
</drive>
Then use this file in your osimage:
chdef -t osimage <osimagename> partitionfile=/install/custom/my-partitions
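Before proceeding, it can be worth confirming that the osimage customizations above (driver disk, template, partition file) were all stored:

lsdef -t osimage <osimagename> -i template,partitionfile,driverupdatesrc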
rsetboot <noderange> hd
rpower <noderange> boot
lsdef <noderange> -ci usercomment # note the pw of each node
xdsh <node> -K # enter node pw when prompted
Note: you can skip this step if, when you originally requested the servers from the SoftLayer portal, you gave it the xCAT management node's public key to put on the servers.
nodeset <noderange> osimage=sles11.2-x86_64-install-compute
pushinitrd <noderange>
rsetboot <noderange> hd
xdsh <noderange> reboot
* Note: For some physical server types in SoftLayer, the rsetboot command fails. You can still proceed without it; it just means you will have to wait for the nodes to time out waiting for DHCP.
* Note: Do not use rpower to reboot the node in this situation, because that does not give the nodes a chance to sync the file changes (that pushinitrd made) to disk.
watch nodestat <noderange>
Some people prefer to use xCAT's sysclone method of capturing an image and deploying it to nodes, instead of using the scripted install method of node deployment (which is described in the previous chapter). The sysclone method (which uses the open source tool SystemImager underneath) enables you to install 1 golden node using the xCAT scripted install method, then further configure it exactly how you want it, capture the image, and then deploy that exact image on many nodes. It even enables you to subsequently capture updates to the golden node and push out just those deltas to the other nodes, making the updates much faster.
Using sysclone is covered more fully in [XCAT_iDataPlex_Cluster_Quick_Start]. Use this section as a supplement that is specific to the SoftLayer environment. These differences are needed because SoftLayer switches often respond to NIC state changes slowly (when the NICs are not bonded) and because bare metal nodes are often allocated on different VLANs from the xCAT MN.
These steps only need to be done once, to prepare the xCAT mgmt node for using sysclone.
zypper install systemimager-server
service systemimager-server-rsyncd start
chkconfig systemimager-server-rsyncd on
mkdir -p /install/post/otherpkgs/sles11.3/x86_64/xcat
cd /install/post/otherpkgs/sles11.3/x86_64/xcat
tar jxvf xcat-dep-*.tar.bz2
"Golden Node" is a term that means the server that you will configure the way you want many of your nodes to be and then take a snapshot of that image. You can have more than one golden node, if you have different types of nodes in your cloud. Normally, you should keep your golden nodes around long term, so that you can go back to them, apply updates, and capture the deltas.
Follow these steps to prepare a golden node and then capture its image. Some of these steps are described in more detail in XCAT_iDataPlex_Cluster_Quick_Start#Option_2:_Installing_Stateful_Nodes_Using_Sysclone, including its 2 subsections "Install or Configure the Golden Client" and "Capture image from the Golden Client". You should read the details in these 2 subsections, but the summary of what you will need to do is here, plus some additional steps.
chdef -t osimage -o <osimage-name> otherpkglist=/opt/xcat/share/xcat/install/rh/sysclone.sles11.x86_64.otherpkgs.pkglist
chdef -t osimage -o <osimage-name> -p otherpkgdir=/install/post/otherpkgs/sles11.3/x86_64
updatenode <my-golden-client> -S
# These are files/dirs that are created automatically on the node, either by SLES, or by xCAT.
/boot/grub
/etc/grub.conf
/etc/hosts
/etc/udev/rules.d/*
/etc/modprobe.d/bond0.conf
/etc/modprobe.d/bond1.conf
/etc/ssh
/etc/sysconfig/syslog
/etc/syslog-ng/syslog-ng.conf
/opt/xcat
/root/.ssh
/var/cache
/var/lib/*
/xcatpost
imgcapture <my-golden-client> -t sysclone -o <myimagename>
This will rsync the golden node's file system to the xCAT mgmt node and put it under /install/sysclone/images/<image-name>.
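You can sanity-check the capture by confirming that xCAT created an osimage definition for it and that the image directory is populated:

lsdef -t osimage <myimagename>
ls /install/sysclone/images/<myimagename>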
Once the image has been captured, use these steps to deploy it to one or more nodes. All of these steps are performed on the xCAT mgmt node.
rsetboot <noderange> hd
rpower <noderange> boot
lsdef <noderange> -ci usercomment # note the pw of each node
xdsh <node> -K # enter node pw when prompted
Note: you can skip this step if, when you originally requested the servers from the SoftLayer portal, you gave it the xCAT management node's public key to put on the servers.
nodeset <noderange> osimage=<captured-sysclone-image>
pushinitrd <noderange>
rsetboot <noderange> hd
xdsh <noderange> reboot
* Note: For some physical server types in SoftLayer, the rsetboot command fails. You can still proceed without it; it just means you will have to wait for the nodes to time out waiting for DHCP.
* Note: Do not use rpower to reboot the node in this situation, because that does not give the nodes a chance to sync the file changes (that pushinitrd made) to disk.
watch nodestat <noderange>
If, at a later time, you need to make changes to the golden client (install new rpms, change config files, etc.), you can capture the changes and push them to the already cloned nodes. This process will only transfer the deltas, so it will be much faster than the original cloning.
imgcapture <my-golden-client> -t sysclone -o <myimagename>
If you are running xCAT 2.8.5 or later:
updatenode <noderange> -S
If you are running xCAT 2.8.4 or older:
xdsh <node> -s 'si_updateclient --server <mgmtnode-ip> --dry-run --yes' # first, see what would be changed
xdsh <noderange> -s 'si_updateclient --server <mgmtnode-ip> --yes' # then do the actual update
xdsh <noderange> -s mkinitrd # only valid for sles/suse; for red hat use dracut
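For Red Hat based nodes, the equivalent of the mkinitrd step would be something like the following sketch (dracut -f regenerates the initramfs for the running kernel):

xdsh <noderange> -s 'dracut -f'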
If you want more information about the underlying SystemImager commands that xCAT uses, see the SystemImager user manual.
It is possible to configure the SoftLayer node's BMCs to work with the xCAT rcons command. Once this is set up, the rcons command is a very convenient way to view the node's consoles. But initially setting it up is not simple. Basically, you need to configure conserver on the xCAT mgmt node, and then determine each node's serial console port # and speed and configure the BMC accordingly.
To determine the console port number and speed that should be used, and to configure everything accordingly, follow this procedure. Because the different Supermicro server models that SoftLayer uses are not consistent about the COM port and speed, this process is currently a little bit trial and error.
On the xCAT management node:
chdef -t site consoleondemand=yes
chdef <node> cons=ipmi
makeconservercf <node>
getslnodes <node> # note the node's password
xdsh <node> -K # enter the password when prompted
ssh <node> date # verify that the date command runs on the node w/o being prompted for a pw
rcons <node>
Now ssh to the node and do:
yum install ipmitool # if not already installed
modprobe ipmi_devintf
ipmitool sol info 1 # note the speed that the bmc is currently using
dmesg | grep ttyS # to see com ports avail (the deprecated msg is ok)
yum install screen # if not already installed
screen /dev/ttyS1 115200 # try COM 2, use the speed the bmc is using
ipmitool sol set volatile-bit-rate 19.2 1 # use the speed the bios is using
ipmitool sol set non-volatile-bit-rate 19.2 1
Back on the xCAT MN:
chdef <node> serialport=2 serialspeed=19200
rsetboot <node> hd # set the next boot of the node to be the current OS on its local hard disk
xdsh <node> reboot # to test rcons
If you want to manually copy the network installation files and settings to the nodes, instead of using the pushinitrd command, follow these steps:
nodeset <node> osimage=sles11.2-x86_64-install-compute
nodels <node> bootparams
scp /tftpboot/xcat/osimage/sles11.2-x86_64-install-compute/linux <node>:/boot/xcat-sles-kernel
scp /tftpboot/xcat/osimage/sles11.2-x86_64-install-compute/initrd <node>:/boot/xcat-sles-initrd
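You also need to add a boot entry on the node that points at the copied kernel and initrd, using the kernel parameters shown by the nodels command above (this is part of what pushinitrd automates). A sketch of such an entry, assuming GRUB legacy and that /boot is not a separate partition; adjust the root device, paths, and parameters for your node:

# add to /boot/grub/menu.lst (SLES) or /boot/grub/grub.conf (CentOS), and make it the default entry
title xCAT network install
    root (hd0,0)
    kernel /boot/xcat-sles-kernel <kernel-parameters-from-bootparams>
    initrd /boot/xcat-sles-initrd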
rsetboot <node> hd
xdsh <node> reboot
Note: because of the slowness of the switches to respond to NICs coming up, the installation process will probably hang at one point. On the console, autoyast will ask if you want to retry. Wait about 15 seconds and then retry and the process should continue.
See Setup xCAT High Available Management Node in SoftLayer for details.
Bugs: #4144
Wiki: Cluster_Name_Resolution
Wiki: Configuring_Secondary_Adapters
Wiki: Setup_xCAT_High_Available_Management_Node_in_SoftLayer
Wiki: Using_Linux_Driver_Update_Disk
Wiki: XCAT_Cluster_in_SoftLayer
Wiki: XCAT_Documentation
Wiki: XCAT_iDataPlex_Cluster_Quick_Start