
This cookbook provides information about initializing, discovering, defining, and managing Power 775 hardware. Everything described in this document is supported only in xCAT 2.6.6 and above. If you have other System p hardware, see [XCAT_System_p_Hardware_Management] .
If you are setting up a hierarchical Power 775 cluster, it is best to first get an overview of the hardware and software management of the SNs and CNs from this doc:
For the whole flow of setting up a Hierarchical Cluster, refer to the following two docs:
More information about the Power 775 related software can be found at:
The following terms will be used in this document:
xCAT DFM: Direct FSP Management is the name that we will use to describe the ability for xCAT software to communicate directly to the System p server's service processor without the use of the HMC for management.
Frame node: A node with hwtype set to frame represents a high end System P server 24 inch frame.
BPA node: A node with hwtype set to bpa; it represents one port on one BPA (each BPA has two ports). For xCAT's purposes, the BPA is the service processor that controls the frame. From the system admin's perspective, the relationship between the Frame node and the BPA nodes is that the admin should always use the Frame node definition for xCAT hardware control commands, and xCAT will figure out which BPA nodes and IP addresses to use for the hardware service processor connections.
CEC node: A node with attribute hwtype set to cec which represents a System P CEC (i.e. one physical server).
FSP node: A node with hwtype set to fsp, representing one port on an FSP. A CEC with redundant FSPs has two FSPs, and each FSP has two ports, so xCAT defines four FSP nodes per server with redundant FSPs. Similar to the relationship between Frame node and BPA node, system admins always use the CEC node for hardware control commands; xCAT automatically uses the four FSP node definitions and their attributes for hardware connections.
For most operations, the Power 775 is managed directly by xCAT, not using the HMC. This requires the new xCAT Direct FSP Management plugin (xCAT-dfm-*.ppc64.rpm), which is not part of the core xCAT open source, but is available as a free download from IBM. You must download this and install it on your xCAT management node (and possibly on your service nodes, depending on your configuration) before proceeding with this document.
Download DFM and the prerequisite hardware server package from Fix Central :
Product Group: Power
Product: Cluster Software
Cluster Software: direct FSP management plug-in for xCAT
And
Product Group: Power
Product: Cluster Software
Cluster Software: HPC Hardware Server
Each downloaded package contains the installable rpm, which has been tar-ed and compressed. After downloading these packages, uncompress, untar, and install the hardware server package first, and then do the same for the DFM package:
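The uncompress/untar sequence can be sketched as follows. The package file name below is an illustrative placeholder, not the exact name on Fix Central (names vary by release); a dummy tarball is created first so the commands are concrete end to end.

```shell
# Sketch only: the package file name is a placeholder; substitute the files you downloaded.
workdir=$(mktemp -d)
cd "$workdir"

# Create a stand-in for a downloaded package so the commands below are runnable
mkdir pkg
touch pkg/ISNM-hdwr_svr-1.0-0.ppc64.rpm
tar -cf ISNM-hdwr_svr.tar pkg
gzip ISNM-hdwr_svr.tar
rm -rf pkg                      # pretend we start from just the download

gunzip ISNM-hdwr_svr.tar.gz     # uncompress the package
tar -xf ISNM-hdwr_svr.tar       # untar: yields the installable rpm
ls pkg                          # the rpm is now ready for yum/installp/rpm
```

The same two steps apply to the DFM package before installing it.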
[RH]:
If you have been following the xCAT documentation, you should already have the yum repositories set up to pull in whatever xCAT dependencies and distro RPMs are needed (libstdc++.ppc, libgcc.ppc, openssl.ppc, etc.).
yum install xCAT-dfm-*.ppc64.rpm ISNM-hdwr_svr-*.ppc64.rpm
[AIX]:
installp -d . -agQXY isnm.hdwr_svr
rpm -Uvh xCAT-dfm-*.ppc.rpm
When setting up a new cluster, you can use the xCAT commands xcatsetup and lsslp to specify the proper definition of all of the cluster hardware in the xCAT database, by automatically discovering and defining them. This is optional - you can define all of the hardware in the database by hand, but that can be confusing and error prone.
Note: this document focuses on the following environment:
If you want to discover and define older system p hardware, read [XCAT_System_p_Hardware_Management].
The hardware discovery should be performed after the Management Node is installed and configured with xCAT as indicated by this flow:
Starting with xCAT 2.6 in a Power 775 cluster, there are two ways to initiate a network boot of the compute nodes. One way is to use the xCAT rbootseq command to set the boot device of the compute nodes to the network adapter, and then issue the xCAT rpower command to power on or reset the compute nodes so they boot from the network. The other way is to use the xCAT rnetboot command directly. Comparing the two, the rbootseq/rpower approach does not require console support or operate through the console, so it performs better. It is therefore recommended to use rbootseq/rpower to set the boot device to the network adapter and initiate the network boot in a Power 775 cluster:
rbootseq computenodes hfi
rpower computenodes boot
Here is a summary of the steps needed to discover the hardware and define it properly in the database:
Easy, right? Each step is explained in more detail in the subsequent sections.
In the examples given below, it is assumed that you have redundant service LANs and that the xCAT management node has 2 NICs, each connected to one of the service LANs. The subnet of the first example service LAN is 10.230.0.0/255.255.0.0 and the subnet of the 2nd example service LAN is 10.231.0.0/255.255.0.0 .
xCAT uses SLP to discover the hardware components on the service networks. Before doing this, you must validate the following:
The P775 cluster may use a Juniper Ethernet switch to support the BPA/FSP hardware service VLANs. The information below, provided by a P775 network administrator, explains how to make a Juniper switch with igmp-snooping work with SLP. We recommend removing the hardware service VLANs from the igmp-snooping protocol settings. The default configuration for igmp-snooping on the IBM J48E switch enables igmp-snooping for all VLANs:
protocols {
    igmp-snooping {
        vlan all;
    }
}
Assuming multiple Ethernet VLANs (hardware service and management) are supported on the Juniper switch, the igmp-snooping configuration should enable igmp-snooping only for the cluster management VLAN and not for the hardware service VLANs:
protocols {
    igmp-snooping {
        vlan management;
    }
}
After entering edit mode, the Juniper administrator commands to make this change are the following:
{master:0}[edit]
admin@j48052# edit protocols igmp-snooping
{master:0}[edit protocols igmp-snooping]
admin@j48052# show vlan all;
{master:0}[edit protocols igmp-snooping]
admin@j48052# set vlan management
{master:0}[edit protocols igmp-snooping]
admin@j48052# delete vlan all
{master:0}[edit protocols igmp-snooping]
admin@j48052# show vlan management;
{master:0}[edit protocols igmp-snooping]
admin@j48052# commit check
if clean, then
{master:0}[edit protocols igmp-snooping]
admin@j48052# commit synchronize (use the 'synchronize' if more than one switch configured in virtual chassis)
This section provides the commands used to set up the xCAT HW service VLANs in the xCAT database and the DHCP server environment.
Note: if you are adding new hardware to an existing cluster (not initially creating the cluster), you can skip this section. If you are adding new networks to the cluster, you will need to execute some of the steps in this section.
If you haven't already, configure static IP addresses on the management node's NICs that are connected to the service VLANs and the cluster management LAN.
If the management node's service LAN NICs were already configured when you installed xCAT, it automatically ran "makenetworks" and created the necessary entries in the networks table. If not, run:
makenetworks
Now set the networks.dynamicrange attribute for each service LAN. For example the following represents 2 HW service VLANS :
chdef -t network 10_230_0_0-255_255_0_0 dynamicrange=10.230.200.1-10.230.200.200
chdef -t network 10_231_0_0-255_255_0_0 dynamicrange=10.231.200.1-10.231.200.200
Note: on AIX, the permanent IP addresses that the BPAs and FSPs will eventually be given must also be in the dynamic range. DHCP gives out dynamic addresses starting from the bottom of the dynamic range, so plan for the permanent addresses to be higher in the range so that there are no collisions. For example, make the dynamic range 10.230.1.1-10.230.3.200, then plan to have the BPA permanent IP addresses in the 10.230.2.x range and the FSPs in the 10.230.3.x range. Assuming you have fewer than 254 BPAs and FSPs, DHCP will initially give them dynamic addresses in the 10.230.1.x range. On Linux, the dynamic range should *not* include the permanent IP addresses, and therefore only needs to be big enough to contain the dynamic addresses.
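One way to sanity-check this plan is a small shell helper that converts dotted-quad addresses to integers and tests whether a planned permanent address falls inside the dynamic range (it must on AIX, and must not on Linux). The addresses below are just the examples from the note above; the helper names are illustrative.

```shell
# Convert a dotted-quad IPv4 address to an integer for range comparisons
ip2int() {
    local IFS=.
    set -- $1                       # split into octets on "."
    echo $(( ($1<<24) + ($2<<16) + ($3<<8) + $4 ))
}

# True (exit 0) if address $1 lies within [$2, $3]
in_range() {
    [ "$(ip2int "$1")" -ge "$(ip2int "$2")" ] && \
    [ "$(ip2int "$1")" -le "$(ip2int "$3")" ]
}

# Example from the note: dynamic range 10.230.1.1-10.230.3.200,
# planned BPA permanent address 10.230.2.14
if in_range 10.230.2.14 10.230.1.1 10.230.3.200; then
    echo "inside dynamic range (required on AIX, wrong on Linux)"
else
    echo "outside dynamic range (required on Linux)"
fi
```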
If you want the network definitions to have more user friendly names in the database, you can set them to anything you want. For example:
chdef -t network 10_230_0_0-255_255_0_0 -n servicelan1
chdef -t network 10_231_0_0-255_255_0_0 -n servicelan2
Set site.dhcpinterfaces to the list of NICs (on the management node and service nodes) that DHCP should listen on. For the management node, this is normally the NICs connected to the service LANs and the cluster management LAN. For a service node, it should be only the NIC connected to the compute node LAN:
chdef -t site clustersite dhcpinterfaces='mgmtnode|eth1,eth2,eth3,eth4;service|hf0'
Set the powerinterval attribute in the site table to 30 (by default it is 0). This is needed to meet the demands of the large number of LPARs in the Power 775 system. For more information:
See [Hints_and_Tips_for_Large_Scale_Clusters].
chdef -t site clustersite powerinterval=30
On AIX, you have to stop the bootp daemon before starting dhcp, because they listen on the same port number:
Stop bootp from starting on reboot or restart of inetd by commenting out the bootps line in /etc/inetd.conf file:
#bootps dgram udp wait root /usr/sbin/bootpd bootpd /etc/bootptab
Restart inetd and kill bootp just to make sure:
refresh -s inetd # restart the inetd subsystem
kill `ps -ef | grep bootp | grep -v grep | awk '{print $2}' ` # stop the bootp daemon
Uncomment this line in /etc/rc.tcpip so that dhcpsd will start after a reboot.
start /usr/sbin/dhcpsd "$src_running" # start up the DHCP Server
Have xCAT configure the service network stanza for dhcpd and then start the daemon:
makedhcp -n
service dhcpd restart # linux
startsrc -s dhcpsd # AIX
Look at the DHCP configuration file on the xCAT management node to ensure that it contains only the networks you want:
cat /etc/dhcpd.conf # Linux, except for RHEL6
cat /etc/dhcp/dhcpd.conf # RHEL6
cat /etc/dhcpsd.cnf # AIX
If you need to make updates to the DHCP configuration file, you should stop the DHCP daemon, edit the DHCP configuration file, and then restart the DHCP daemon on your xCAT MN.
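As a quick check before restarting the daemon, you can grep the subnet declarations out of the configuration file and compare them against the service networks you expect. The fragment below is fabricated for illustration and stands in for the real dhcpd.conf.

```shell
# Illustration only: a fabricated dhcpd.conf fragment standing in for the real file
conf=$(mktemp)
cat > "$conf" <<'EOF'
subnet 10.230.0.0 netmask 255.255.0.0 {
    range dynamic-bootp 10.230.200.1 10.230.200.200;
}
subnet 10.231.0.0 netmask 255.255.0.0 {
    range dynamic-bootp 10.231.200.1 10.231.200.200;
}
EOF

# List the declared subnets; only the networks you want should appear
subnets=$(awk '/^subnet/ {print $2}' "$conf")
echo "$subnets"
```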
Before running any DFM hardware control commands in a large cluster, make sure consoleondemand in the site table is set to yes. This is needed to meet the demands of the large number of LPARs and CECs. In the Power 775 system, the console is opened by fsp-api, which sends commands to the hardware server. When set to 'no', all consoles are opened, which degrades the performance of the DFM hardware control commands. When set to 'yes', conserver connects and creates the console output only when the user opens the console. The default is no on Linux and yes on AIX.
chdef -t site clustersite consoleondemand=yes
After changing consoleondemand to yes, run makeconservercf for the change to take effect.
makeconservercf
There is a new work item to support remote discovery and connectivity to the HMC from xCAT MN. This section is currently TBD, but will cover some of the following:
The xCAT admin will manually need to connect the HMC at this time.
Setting up the HMC network for use by xCAT
Refer to the HMC website and documentation for more details. The following are the minimal steps required to set up the HMC network with a static IP, and to enable the SLP and SSH ports through the HMC GUI.
This section will describe the hardware discovery of the HMCs and their requirement to support Service Focal Point (SFP) working with Power 775 clusters. You will execute xCAT commands "lsslp" and "mkdef" to define the HMC nodes in xCAT DB. See man page of lsslp for details.
Note: Even if you use xCAT Direct FSP Management, you still need to discover the HMC, and make the connections between HMC and the xCAT MN. The HMC will always be used for Service Focal Point, Service Repair and Verify procedures.
Run lsslp to locate the HMC information and write into a HMC stanza file. The IP address is the address assigned to the hardware service network on the EMS.
lsslp -s HMC -i 10.230.0.0,10.231.0.0 -z > /hmc/stanza/file
Review the HMC stanza file and make any necessary modifications. You will want to include the username and password attributes, and update the HMC ip attribute to the proper service network IP address for the target HMC node. Make sure the IP address and host name are resolvable by the xCAT cluster name resolution (/etc/hosts, DNS).
Write the HMC stanza information into xCAT DB with xCAT command mkdef.
cat /hmc/stanza/file | mkdef -z
You will need to supply the current hscroot password to xCAT by defining it in the passwd table in the database.
chtab key=hmc passwd.username=hscroot passwd.password=<current password>
You will want to enable the SSH interface between the xCAT MN and HMC, so the xCAT commands will run without being prompted for passwords. Run the "rspconfig" command to do this:
rspconfig <HMC node> sshcfg=enable
After you setup the ssh keys to the HMC with the rspconfig command, xCAT will no longer need the hscroot password in the database and it can be removed. It will be needed in the future, if root's ssh keys are ever regenerated on the EMS. If ssh keys are regenerated, then the rspconfig <HMC node> sshcfg=enable command will have to be rerun, and the new password will need to be available in the database on the EMS.
If you change the hscroot password on the HMC, update the xCAT database passwd table with the new password (see "Define the HMC hscroot password"). You do not need to rerun rspconfig <HMC node> sshcfg=enable, because changing the password does not affect the ssh keys.
SLP gives xCAT a list of hardware components on the network, without telling it the physical location of each. This means that xCAT does not have a way to give each component a sensible name without getting a little bit of information from you: the mapping between the name you want each frame to have and its MTMS (machine type, model, and serial #).
To provide this information, first manually power on the frames. (When the frames are first powered on, for example after an EPO, the BPAs come up in rack standby mode. At this point there is no power to the CEC FSPs, so they cannot yet be discovered. Therefore we must first discover the frames, define them in the database, and make connections to them so we can get them out of rack standby mode. This process is accomplished over the next several sections.)
At this point, you should do one of the next 2 sections, but not both. If you want to use xcatsetup to define nodes, follow the steps in the green section entitled "Use xcatsetup to Create Initial Definitions in the Database". If you want to create the nodes manually, follow the steps in the blue section entitled "Creating Initial Node Definitions Manually". After following either the green or blue section, continue with the section "Discover the BPAs, Modify Their Network Information, and Connect To Them".
The xcatsetup command creates initial node definitions in the xCAT database, based on naming conventions and IP address ranges that you provide via a cluster configuration file. In a later step, xCAT will combine this information with the SLP information discovered on the service network to create a complete picture of your cluster hardware components. Note: If the xcatsetup command does not apply well to your cluster because your naming patterns have too many exceptions, you can instead create node definitions manually. See the section
XCAT_Power_775_Hardware_Management/#creating-initial-node-definitions-manually for instructions on how to do that.
Create a cluster config file with information about the hardware components that should be defined. Note that you are not only specifying the naming pattern for the HMCs, frames, and CECs, but also the permanent IP addresses you want the BPAs and FSPs to have. (When the BPAs and FSPs initially power on, they will get dynamic IP addresses from DHCP. Once you are done with this whole discovery chapter, DHCP will always provide the IP addresses you define in the cluster config file. We call these the "permanent" IP addresses.) For a detailed description of the cluster config file, see the xcatsetup man page. Here's a sample config file:
Have xCAT generate a stanza file of frame definitions (with MTMS) so you can easily give each one a name:
lsslp -s FRAME -i 10.230.0.0,10.231.0.0 --vpdtable > vpd-frame.stanza
Note: starting with xCAT 2.7.3, the -m flag is no longer used, and multicast is one of the default discovery methods of lsslp.
Edit the stanza file to give the desired node name to each frame object, identifying the frames by MTMS. (The node name is the identifier before the colon. See the xcatstanzafile man page for details.) The node names should help indicate frame position (e.g. frame01, frame02, etc.) because xCAT will use that information to understand the physical order of the hardware.
Create a file called supernodelist.txt that specifies the supernode numbers for all of the CECs. For example,
frame61: 0,1,16
frame62: 17,32
frame63: 33,48,49
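A quick awk pass can sanity-check the file format (each line "frameNN: n,n,...") and total the supernode entries before handing the file to xcatsetup. The three-line example above is recreated here for illustration.

```shell
# Recreate the example supernodelist.txt for illustration
f=$(mktemp)
cat > "$f" <<'EOF'
frame61: 0,1,16
frame62: 17,32
frame63: 33,48,49
EOF

# Split on ":" and ",": field 1 is the frame, the rest are supernode numbers
summary=$(awk -F'[:,]' '
    { printf "%s: %d supernodes\n", $1, NF-1; total += NF-1 }
    END { print "total:", total }
' "$f")
echo "$summary"
```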
Now create the cluster config file. Here is a simple configuration file example.
# A small cluster config file for a single 2 frame bldg block.
# Just the hmcs, frames, bpas, cecs, and fsps are created.
xcat-site:
use-direct-fsp-control = 1
xcat-hmcs:
hostname-range = hmc[1-3]
starting-ip = 40.0.0.110
xcat-frames:
hostname-range = frame[1-3]
num-frames-per-hmc = 1
vpd-file = vpd-frame.stanza
# This assumes you have 2 service LANs: a primary service LAN 40.x.y.z/255.0.0.0 that all of the port 0's
# are connected to, and a backup service LAN 41.x.y.z/255.0.0.0 that all of the port 1's are connected to.
# "x" is the frame number and "z" is the bpa/fsp id (1 for the first BPA/FSP in the Frame/CEC, 2 for the
# second BPA/FSP in the Frame/CEC). For BPAs "y" is always 0 and for FSPs "y" is the cec id.
vlan-1 = 40
vlan-2 = 41
xcat-cecs:
hostname-range = f[1-3]c[01-12]
num-cecs-per-frame = 12
supernode-list = supernodelist.txt
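The vlan-1/vlan-2 comments in the config file above describe a <vlan>.x.y.z addressing scheme (x = frame number, y = 0 for BPAs or the cec id for FSPs, z = the BPA/FSP id). A tiny helper makes the convention concrete; it is a sketch of the naming scheme only, not anything xcatsetup requires.

```shell
# Compose a service-network address from the scheme in the comments above:
#   <vlan>.<frame>.<cec>.<id>   (cec = 0 for a BPA)
svc_ip() {   # usage: svc_ip <vlan> <frame> <cec> <id>
    echo "$1.$2.$3.$4"
}

svc_ip 40 1 0 1    # first BPA of frame1 on the primary service LAN
svc_ip 41 2 12 2   # second FSP of frame2 cec12 on the backup service LAN
```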
Run xcatsetup to create the initial node definitions:
xcatsetup <config-file-name>
This writes the following essential attributes to the database (more attributes are written, but these are the attributes that are necessary for running lsslp later on):
Note: unlike most nodes in the xCAT database, the BPAs and FSPs will use their IP address (the permanent one) as their node name. The BPA and FSP nodes are also hidden by default and will normally not be displayed by the lsdef command. Use the -S option of lsdef to display the hidden nodes.
If the xcatsetup command does not apply well to your cluster because your naming patterns have too many exceptions, you can create node definitions manually to prepare for running lsslp. (If you used xcatsetup, skip this section.)
Before running lsslp, ensure the BPAs have picked up a temporary IP address from DHCP. Otherwise lsslp will return no responses.
In an AIX DHCP environment, you can execute the "dadmin" command:
dadmin -s | grep -v Free # AIX only
In a RHEL 6 DHCP environment, you can check the "dhcpd.leases" file located in the /var/lib/dhcpd directory:
cat /var/lib/dhcpd/dhcpd.leases
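The leases file groups each lease into a `lease <ip> { ... }` block, so a short awk script can pull out the IP/MAC pairs instead of reading the raw file. The fragment below is fabricated for illustration.

```shell
# Illustration only: a fabricated dhcpd.leases fragment
leases=$(mktemp)
cat > "$leases" <<'EOF'
lease 10.230.200.3 {
  binding state active;
  hardware ethernet 00:09:6b:ad:07:b5;
}
lease 10.230.200.4 {
  binding state active;
  hardware ethernet 00:09:6b:ad:07:b3;
}
EOF

# Print "ip mac" for each lease block
pairs=$(awk '
    /^lease/            { ip = $2 }
    /hardware ethernet/ { mac = $3; sub(/;/, "", mac); print ip, mac }
' "$leases")
echo "$pairs"
```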
Run lsslp to produce a stanza file of the frames and BPAs:
lsslp -s FRAME -i 10.230.0.0,10.231.0.0 -z >frames.stanza
Edit the stanza file to give the frames the symbolic node names you want, and the BPAs the IP addresses you want. Adjust the parent and hcp attributes accordingly. Leave the otherinterfaces attribute set to the dynamic DHCP addresses. See the xcatstanzafile man page for the syntax.
Note: the IP addresses you choose for BPAs should not be in the DHCP dynamic range.
Frame node: Here is an example of the attributes that should be set for the frame nodes:
frame14:
objtype=node
groups=frame,all
hcp=frame14
id=14
mgt=bpa
mtm=78AC-100
nodetype=ppc
hwtype=frame
parent=1
serial=BB50026
sfp=hmc1
In the above example, the attribute meanings are:
BPA node: Here is an example of the attributes that should be set for the BPA nodes:
10.230.2.14:
objtype=node
groups=bpa,all
side=A-0
nodetype=ppc
hwtype=bpa
parent=frame14
mac=00:09:6b:ad:07:b5
hidden=1
Create the frame and BPA objects in the xCAT database:
cat frames.stanza | mkdef -z
** Frame in Rack Standby **: If you are working with brand new frames, they may still be in "rackstandby" mode, which means there is no power to the CECs, which means the FSPs can not respond to the lsslp commands below at this time. If this is your situation, complete the next section XCAT_Power_775_Hardware_Management/#discover-the-bpas-modify-their-network-information-and-connect-to-them now to get the frames out of rackstandby mode. Then return to this section and complete the lsslp and mkdef commands for the CECs and FSPs.
Run lsslp to produce a stanza file of the CECs and FSPs:
lsslp -s CEC -i 10.230.0.0,10.231.0.0 -z >cecs.stanza
Edit the stanza file to give the CECs the symbolic node names you want, and the FSPs the IP addresses you want. Adjust the parent and hcp attributes accordingly. Leave the otherinterfaces attribute set to the dynamic DHCP addresses.
Note: the IP addresses you choose for FSPs should not be in the DHCP dynamic range.
CEC node: Here is an example of the attributes that should be set for the CEC nodes:
cec06:
objtype=node
groups=cec,all
hcp=cec06
id=6
mgt=fsp
mtm=9125-F2C
nodetype=ppc
hwtype=cec
parent=frame14
serial=02D8B25
sfp=hmc1
supernode=7,0
In the above example, the attribute meanings are:
To go along with the CEC supernode numbers, set the HFI switch topology value in the xCAT site table. See the ISNM documentation for the correct value.
chdef -t site clustersite topology=32D
FSP node: Here is an example of the attributes that should be set for the FSP nodes:
10.230.4.6:
objtype=node
groups=all,fsp
nodetype=ppc
hwtype=fsp
parent=cec06
side=A-0
mac=00:09:6b:ad:07:b3
hidden=1
Create the CEC and FSP objects in the xCAT database:
cat cecs.stanza | mkdef -z
Discover the BPAs on the network, match them with the corresponding frames and BPAs in the database (using the MTMS), and write additional attributes (such as MAC addresses) to the database. The -i flag specifies the IP subnets configured for DHCP.
lsslp -s FRAME -i 10.230.0.0,10.231.0.0 -w
Verify that the frame and BPA definitions in the database are correct.
lsdef frame
lsdef bpa -S # normally BPAs are hidden from output, so need the -S flag
Verify that:
Configure DHCP with the permanent ip/mac pairs so that it will always give the BPAs their permanent IP address from now on:
makedhcp bpa
Create nodename/ip mapping for BPAs:
makehosts bpa
Verify that the proper IP/MAC pairs were configured in dhcp:
cat /var/lib/dhcpd/dhcpd.leases # RHEL 6
cat /var/lib/dhcp/db/dhcpd.leases # SLES 11
cat /etc/dhcpsd.cnf # AIX
To enable xCAT to connect to the BPAs, you must add the current passwords for the frames/BPAs in the xCAT database. If the password for all of the frames is the same, you can set the username/password in the passwd table:
chtab key=bpa,username=HMC passwd.password=xxx
chtab key=bpa,username=admin passwd.password=yyy
chtab key=bpa,username=general passwd.password=zzz
If the passwords for some of the frames/BPAs are different, you can set the passwords for individual frames/BPAs in the ppcdirect table:
chdef frame1 passwd.HMC=xxx passwd.admin=yyy passwd.general=zzz
The BPAs send a DHCP request to the DHCP server about every five minutes, so once DHCP is configured, the BPAs will get their permanent IP addresses within about five minutes. The pping command can help you see whether the BPAs have received their new IP addresses: once a BPA has its permanent IP address, the pping result will be "bpa1:ping".
pping bpa
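In a large cluster, grepping the pping output for BPAs that have not answered yet is quicker than reading it by eye. The sample output below is fabricated; pping reports "<node>:ping" for reachable nodes and "<node>:noping" otherwise.

```shell
# Fabricated pping output for illustration
out='bpa1:ping
bpa2:noping
bpa3:ping'

# BPAs still on dynamic addresses (or unreachable) show as noping
pending=$(echo "$out" | grep ':noping' | cut -d: -f1)
echo "$pending"
```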
For BPAs that cannot refresh their IP addresses, use the rspconfig command with the --resetnet option. The command expects each BPA's otherinterfaces attribute to be set to the dynamic IP address it currently has, and the BPA's node name to be the permanent IP address you want it to have.
rspconfig bpa1 --resetnet
Have xCAT's DFM daemon (hdwr_svr) on the xCAT EMS establish connections to all of the frames on the primary service VLAN. The defaults for mkhwconn are -T lpar and --port 0, which select the primary hardware service VLAN.
mkhwconn frame -t
Dual hardware service VLANs are not supported for the P775 cluster in xCAT 2.6.6. If you need to work with a second hardware service VLAN, specify the -T lpar and --port 1 options to establish connections to the second service VLAN for each BPA of the frames:
mkhwconn frame -t -T lpar --port 1
Verify that the hardware connections were made successfully on the primary service VLAN. If you have allocated a second service VLAN for the cluster, those hardware connections may show "LINE DOWN".
lshwconn frame
frame14: 40.14.0.1: side=a,ipadd=40.14.0.1,alt_ipadd=unavailable,state=LINE UP
frame14: 40.14.0.2: side=b,ipadd=40.14.0.2,alt_ipadd=unavailable,state=LINE UP
If the BPA passwords are still the factory defaults, you must change them before running any other commands against them:
rspconfig frame general_passwd=general,<newpd>
rspconfig frame admin_passwd=admin,<newpd>
rspconfig frame HMC_passwd=,<newpd>
Verify that the hardware control setup is correct for the frames. If this is the initial frame setup, or an EPO has occurred, the BPAs will be in the rack standby state. The normal state of the BPAs, when not in rack standby, is "Both BPAs at standby":
rpower frame state
frame14: BPA state - Both BPAs at rack standby
lshwconn frame
frame14(40.14.0.2): resource_type=frame,side=b,ipaddr=192.168.200.239,alt_ipaddr=unavailable,state=Connected
frame14(40.14.0.1): resource_type=frame,side=a,ipaddr=192.168.200.247,alt_ipaddr=unavailable,state=Connected
frame14(20.0.0.167): Connection not found
frame14(20.0.0.168): Connection not found
You may need to update the frame power code firmware as part of the P775 installation. Locate and download the supported P775 power code and firmware from IBM Fix Central to a directory on your xCAT MN. The rflash command references this GFW directory and updates the power code on your frames. Make sure there is space available in the /tmp file system, since rflash temporarily places GFW tracking files under /tmp/fwupdate. You will update the CEC firmware at a later time. Before doing the firmware update, make sure the pending power-on side of the frames' BPAs is temp; if not, set it to temp.
rspconfig frame pending_power_on_side
rspconfig frame pending_power_on_side=temp
And then, start the update.
rflash frame -p <directory> --activate disruptive
(output to be added here)
rinv frame firm
Set the frame number in each frame object. First check that the "id" attribute matches the proper frame number; if a frame's id=0, update it with "chdef" for that frame object.
rspconfig frame 'frame=*'
Note: at this point in the process, the CECs should not have power to them yet. But if for some reason they do, they must be powered off before the frame number can be set.
Set the system name of the frame to match the node name in the xCAT database. This will cause the frame names displayed in the HMC to match the frame names in the xCAT database.
rspconfig frame 'sysname=*'
To enable xCAT to connect frames to the target HMC, you must add the current HMC username and password for the frame nodes to the xCAT database. If the password for all of the frames is the same, you can set the username/password in the passwd table; if unique passwords are used, they must be set in the ppcdirect table.
chtab key=frame,username=HMC passwd.password=xxx
Associate the HMCs with the appropriate frames. First make sure each frame object's "sfp" attribute is set to the proper HMC node object; this is needed to allow the frame to support connections to the BPA through both DFM and the HMC. Also make sure there is a working SSH connection from the EMS to the HMC.
mkhwconn frame -s
Inform the hardware installation team that the HMCs should now be able to recognize the frames and that they may begin to fill the water in the frames.
Note: the HMC may list the frames as incomplete.
Have the hardware installation team inform you when the water fill procedure is complete. Then use xCAT to move the frames out of rack standby mode so the final top-off procedure can be completed on the HMC:
rpower frame exit_rackstandby
Verify the state of the frames:
rpower frame state
frame14: BPA state - Both BPAs at standby
When the frames exited rack standby mode at the end of the last section, the FSPs were powered on. This should have caused the FSPs to request and receive a dynamic IP address from DHCP. The MAC addresses cannot be collected by the lsslp command below until this has taken place.
Run lsslp to discover the CECs/FSPs, and match the discovered hardware with the corresponding objects in the database, and write additional attributes in the database. The -i flag specifies the ip subnet that is configured as the dynamic range in DHCP.
lsslp -s CEC -i 10.230.0.0,10.231.0.0 -w
For each FSP discovered on the network, the lsslp command uses the cage # and its parent (frame) MTMS to match the correct CEC entry in the database. The attributes that will be written to the database are:
You can confirm these settings by running:
lsdef -i mtm,serial cec
lsdef -S -i mac,otherinterfaces fsp
You can compare this information to the labels on the front of the CECs to verify that the matching worked correctly. Review/verify all of the attributes of the CECs and FSPs:
lsdef cec
lsdef fsp -S # normally FSPs are hidden from output, so need the -S flag
Verify that:
Configure DHCP with the permanent ip/mac pairs so that it will always give the FSPs their permanent IP address from now on:
makedhcp fsp
Create nodename/ip mapping for FSPs:
makehosts fsp
Verify that the proper IP/MAC pairs were configured in dhcp:
cat /var/lib/dhcpd/dhcpd.leases # RHEL 6
cat /var/lib/dhcp/db/dhcpd.leases # SLES 11
cat /etc/db_file.cr # AIX
The FSPs send a DHCP request to the DHCP server about every five minutes, so once DHCP is configured, the FSPs will get their permanent IP addresses within about five minutes. The pping command can help you see whether the FSPs have received their new IP addresses: once an FSP has its permanent IP address, the pping result will be "fsp1:ping".
pping fsp
For FSPs that cannot refresh their IP addresses, use the rspconfig command with the --resetnet option. The command expects each FSP's otherinterfaces attribute to be set to the dynamic IP address it currently has, and the FSP's node name to be the permanent IP address you want it to have.
rspconfig fsp1 --resetnet
If you want to verify that all the BPAs and FSPs now have the correct IP addresses and are defined in the database with the correct parents, you can run lsslp to have it match what it discovers on the network with what is defined in the database:
$ lsslp -i 10.230.0.0,10.231.0.0
BPA 78AC-100 9920035 A-0 40.11.0.1 f11c00bpca_a
BPA 78AC-100 9920035 B-0 40.11.0.2 f11c00bpcb_a
CEC 9125-F2C 02C4D86 f11c01
FSP 9125-F2C 02C4D86 A-0 40.11.1.1 f11c01fsp1_a
FSP 9125-F2C 02C4D86 B-0 40.11.1.2 f11c01fsp2_a
CEC 9125-F2C 02C4E06 f11c02
FSP 9125-F2C 02C4E06 A-0 40.11.2.1 f11c02fsp1_a
FSP 9125-F2C 02C4E06 B-0 40.11.2.2 f11c02fsp2_a
...
CEC 9125-F2C 02C5066 f12c12
FSP 9125-F2C 02C5066 A-0 40.12.12.1 f12c12fsp1_a
FSP 9125-F2C 02C5066 B-0 40.12.12.2 f12c12fsp2_a
FRAME 78AC-100 9920035 frame11
FRAME 78AC-100 9920033 frame12
To enable xCAT to connect to the FSPs, you must add the current passwords for the CEC/FSPs in the xCAT database. If the password for all of the CECs is the same, you can set the username/password in the passwd table:
chtab key=fsp,username=HMC passwd.password=xxx
chtab key=fsp,username=admin passwd.password=yyy
chtab key=fsp,username=general passwd.password=zzz
If the passwords for some of the CEC/FSPs are different, you can set the passwords for individual CEC/FSPs in the ppcdirect table:
chdef cec1 passwd.HMC=xxx passwd.admin=yyy passwd.general=zzz
Have xCAT's DFM daemon (called hw server) establish connections to all of the CECs:
mkhwconn cec -t
For Dual VLAN, you need to specify the --port option to establish the two connections for each FSP of the CECs:
mkhwconn cec -t -T lpar
mkhwconn cec -t -T lpar --port 1
If the FSP passwords are still the factory defaults, you must change them before running any other commands against the FSPs:
rspconfig cec general_passwd=general,<newpd>
rspconfig cec admin_passwd=admin,<newpd>
rspconfig cec HMC_passwd=abc123,<newpd>
Set the system name of the CEC to match the node name in the xCAT database. This will cause the CEC names displayed in the HMC to match the CEC names in the xCAT database.
rspconfig cec 'sysname=*'
To enable xCAT to connect CECs to the target HMC, you must add the current HMC username/password for the cec nodes in the xCAT database. If the password for all of the cecs is the same, you can set the username/password in the passwd table; if unique passwords are used, they must be updated in the ppcdirect table.
chtab key=cec,username=HMC passwd.password=xxx
Associate the HMCs with the appropriate CECs:
mkhwconn cec -s
Verify the connections were made successfully:
lshwconn cec
(output to be added here)
Verify the hardware control setup is correct for the CECs:
rpower cec state
(output to be added here)
lshwconn cec -s
cec1(192.168.200.239): resource_type=frame,side=b,ipaddr=192.168.200.239,alt_ipaddr=unavailable,state=Connected
cec1(192.168.200.247): resource_type=frame,side=a,ipaddr=192.168.200.247,alt_ipaddr=unavailable,state=Connected
cec1(20.0.0.167): Connection not found
cec1(20.0.0.168): Connection not found
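A quick way to verify that every expected connection is up is to count the Connected lines in the lshwconn output. The sketch below runs against the sample output shown above, saved to a temporary file; in practice you would pipe lshwconn directly into grep:

```shell
# Save the lshwconn output shown above as a sample (illustrative data).
cat > /tmp/conn.out <<'EOF'
cec1(192.168.200.239): resource_type=frame,side=b,ipaddr=192.168.200.239,alt_ipaddr=unavailable,state=Connected
cec1(192.168.200.247): resource_type=frame,side=a,ipaddr=192.168.200.247,alt_ipaddr=unavailable,state=Connected
cec1(20.0.0.167): Connection not found
cec1(20.0.0.168): Connection not found
EOF
# Count how many connections report state=Connected.
grep -c 'state=Connected' /tmp/conn.out   # 2 of 4 connections are up
```

Any "Connection not found" lines point at FSP/BPA ports that still need attention before proceeding.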
The admin should plan to upgrade both the Bulk Power Code (BPC) and the CEC firmware. This is accomplished with the rflash xCAT command from the xCAT EMS. The admin should download the supported GFW from the IBM Fix Central website and place it in a directory that the xCAT EMS can read.
Use the rinv command to get the current firmware levels of the frames and CECs:
rinv frame firm
rinv cec firm
(output to be added here)
Make sure the pending power-on side of the CECs' FSPs is set to temp. If not, set it to temp.
rspconfig cec pending_power_on_side
rspconfig cec pending_power_on_side=temp
Use the rflash command to update the firmware levels for the CECs. Then validate that the new firmware is loaded:
rflash cec -p <directory> --activate disruptive
(output to be added here)
rinv cec firm
Verify that the CECs are healthy:
rpower cec state
rvitals cec lcds
You may want to check that the CNM switch configuration data is properly defined in the xCAT DB prior to powering up the CECs and working with the Octant/LPAR definitions. This activity may save you some CEC reboot time later.
Check that the correct HFI switch topology has been set in the site table. The topology definition is based on the number of CECs and the type of HFI network configured for your Power 775 cluster.
lsdef -t site -l -i topology # should be one of supported configs: 8D, 32D, 128D
Check to make sure that the CEC node objects have the proper "supernode" attribute defined. The supernode specifies the HFI configuration being used by the CEC. Also make sure the cage id is properly defined, with the "id" attribute matching the cage position of the CEC node. The CNM daemon and configuration commands will set up the Master ISR identifier for each CEC. This allows HFI communications to work within the Power 775 cluster.
lsdef cec # check supernode and id attribute for each cec object
The P775 admin can now power on the CECs and validate that they come up to a working state. You can monitor the power-up of the CECs using the rpower and rvitals commands. You are looking for the CECs to reach the "Operating" state with a good LCD value. If they are not in the "Operating" state, additional hardware debug will be necessary to understand the failure.
rpower cec on
rvitals cec lcds
rpower cec state
When powering on all the CECs within a frame, you must put a 30-second delay between each CEC power-on within one frame. Use the syspowerinterval attribute in the site table to control the CEC boot-up pacing.
chdef -t site syspowerinterval=30
Then put all the CECs within one frame into a group, and power them on:
rpower cecswithin_one_frame on
For 12 CECs within one frame, the rpower command takes about 5m33s.
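That timing is roughly what the stagger arithmetic predicts:

```shell
# With syspowerinterval=30, each CEC power-on in a frame is staggered
# by 30 seconds, so the last of 12 CECs starts (12-1)*30 = 330 seconds
# (~5.5 minutes) after the first, plus command overhead.
cecs=12; interval=30
echo "$(( (cecs - 1) * interval ))s of stagger"
```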
Check the HMCs for any SFP service events that were generated during CEC boot. Then, make sure there are no unexpected deconfigured resources in the CECs:
rinv cec deconfig
You can define the LPAR nodes in different ways: use xcatsetup, or define them manually with xCAT commands. Using xcatsetup will be faster and easier for large clusters because it generates the LPAR configuration based on a cluster configuration file. Alternatively, the admin can define the LPARs in the database by using the xCAT rscan command to create an LPAR stanza file, then editing the stanza file to modify the node name of each LPAR. This approach is simpler for small clusters, but quickly becomes tedious for large clusters.
Follow either the green section entitled "Define LPAR Nodes with xcatsetup" or the blue section entitled "Define LPAR Nodes with rscan" (but not both). After that, continue on with the section "Splitting the Service Node Octant into Multiple LPARS".
Use the xcatsetup config file that you used earlier in this document to define the hardware components. To that file, add stanzas for xcat-lpars, xcat-service-nodes, xcat-storage-nodes, and xcat-compute-nodes. Here's an example:
# A small cluster config file for a single 2 frame bldg block.
# Just the hmcs, frames, bpas, cecs, and fsps are created.
xcat-site:
use-direct-fsp-control = 1
xcat-hmcs:
hostname-range = hmc[1-2]
xcat-frames:
hostname-range = frame[1-2]
num-frames-per-hmc = 1
vpd-file = vpd-frame.stanza
# This assumes you have 2 service LANs: a primary service LAN 40.x.y.z/255.0.0.0 that all of the port 0's
# are connected to, and a backup service LAN 41.x.y.z/255.0.0.0 that all of the port 1's are connected to.
# "x" is the frame number and "z" is the bpa/fsp id (1 for the first BPA/FSP in the Frame/CEC, 2 for the
# second BPA/FSP in the Frame/CEC). For BPAs "y" is always 0 and for FSPs "y" is the cec id.
vlan-1 = 40
vlan-2 = 41
xcat-cecs:
hostname-range = cec[01-24]
num-cecs-per-frame = 12
xcat-building-blocks:
num-frames-per-bb = 2
num-cecs-per-bb = 24
xcat-lpars:
num-lpars-per-cec = 8
xcat-service-nodes:
num-service-nodes-per-bb = 1
cec-positions-in-bb = 1
# this is for the ethernet NIC on each SN
hostname-range = sn1
starting-ip = 10.250.1.1
# this value is the same format as the
# hosts.otherinterfaces attribute except
# the IP addresses are starting IP addresses
otherinterfaces = -hf0:10.251.1.1,-hf1:11.251.1.1,-hf2:12.251.1.1,-hf3:13.251.1.1,-ml0:14.251.1.1
# if you want the service nodes to route traffic
# between the MN and compute nodes,
# then provide the netmask that should be used
# for each compute network the service
# nodes are connected to. The netmask should
# limit the ip range to just the compute
# nodes served by this service node.
route-masks = -hf0:255.255.0.0,-hf1:255.255.0.0,-hf2:255.255.0.0,-hf3:255.255.0.0,-ml0:255.255.0.0
xcat-storage-nodes:
num-storage-nodes-per-bb = 2
cec-positions-in-bb = 12,24
hostname-range = stor1-stor2
starting-ip = 10.252.1.1
aliases = -hf0
otherinterfaces = -hf1:11.253.1.1,-hf2:12.253.1.1,-hf3:13.253.1.1,-ml0:14.253.1.1
xcat-compute-nodes:
hostname-range = n001-n189
starting-ip = 10.1.1.1
aliases = -hf0
# ml0 is for aix. For linux, use bond0 instead.
otherinterfaces = -hf1:11.1.1.1,-hf2:12.1.1.1,-hf3:13.1.1.1,-ml0:14.1.1.1
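The service-LAN addressing described in the comments of the config file above can be sketched as a tiny helper (the prefixes 40/41 come from the vlan-1/vlan-2 values; the function name is just for illustration):

```shell
# fsp_ip <vlan-prefix> <frame> <y> <id>
# "y" is 0 for a BPA, or the cec id for an FSP; "id" is 1 or 2 for the
# first or second BPA/FSP in the Frame/CEC.
fsp_ip() { echo "$1.$2.$3.$4"; }

fsp_ip 40 11 1 1   # first FSP of cec 1 in frame 11 -> 40.11.1.1
fsp_ip 41 11 0 2   # second BPA of frame 11 on the backup LAN -> 41.11.0.2
```

These match the addresses shown in the earlier lsslp listing (e.g. 40.11.1.1 for f11c01fsp1_a).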
Now run xcatsetup with this config file, telling it to just process the new stanzas (since we already created the hardware components earlier):
xcatsetup -s xcat-lpars,xcat-service-nodes,xcat-storage-nodes,xcat-compute-nodes <config-file-name>
This will create definitions in the database for the service nodes, storage nodes, and compute nodes with the proper attributes and located in the proper LPARs/CECs. Use the lsdef command to display the node definitions to confirm they were created the way you want them.
By default, all of these LPARs already exist in the CECs themselves, with one exception: if you plan to split the service node octant into multiple LPARs (because you don't need all of the octant's resources for the service node), that must be done manually. An example of doing this will be covered in the next section.
The rscan command reads the actual LPAR configuration in the CEC and creates node definitions in the xCAT database to reflect it. Before using rscan, you should put the CEC in the operating or standby state. If you already used the xcatsetup command to create the LPAR node definitions, you can skip this section.
Run the rscan command against all of the CECs to create a stanza file of LPAR node definitions:
rscan cec -z >nodes.stanza
Edit the stanza file and give each LPAR definition the node name that you want it to have. Remember to name the service node and storage node LPARs the way you want them, and to set the servicenode and xcatmaster attributes of all the non-service node LPARs to the appropriate service node name. Then create the definitions in the database:
cat nodes.stanza | mkdef -z
Use the lsdef command to display the node definitions to confirm they were created the way you want them.
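Renaming the LPARs in the stanza file is ordinary text editing; a sed sketch is shown below. The original node name 10-ABCDE_1 and the stanza fragment are made up for illustration (real rscan output has more attributes):

```shell
# Minimal illustrative stanza fragment as rscan might emit it.
cat > /tmp/nodes.stanza <<'EOF'
10-ABCDE_1:
  objtype=node
  id=1
  hcp=cec01
EOF
# Rename the auto-generated node name to the name you want (sn01 here).
sed -i 's/^10-ABCDE_1:/sn01:/' /tmp/nodes.stanza
head -1 /tmp/nodes.stanza   # sn01:
```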
In most cases, a service node doesn't need all of the resources of the octant it is in. Normally, the service node only needs 25% of the CPUs and memory in the octant, unless you are also running the LoadLeveler central manager on that service node, in which case we recommend 50% of the resources. This leaves the other 75% or 50% of the resources for an additional LPAR of your choosing: utility node, login node, etc. To create a second LPAR in a service node octant, follow these steps:
lsvm cec01
1: 520/U78A9.001.312M001-P1-C14/0x21010208/0/0
1: 514/U78A9.001.312M001-P1-C17/0x21010202/0/0
1: 513/U78A9.001.312M001-P1-C15/0x21010201/0/0
1: 512/U78A9.001.312M001-P1-C16/0x21010200/0/0
1: 569/U78A9.001.312M001-P1-C1/0x21010239/0/0
1: 568/U78A9.001.312M001-P1-C2/0x21010238/0/0
1: 561/U78A9.001.312M001-P1-C3/0x21010231/0/0
1: 560/U78A9.001.312M001-P1-C4/0x21010230/0/0
1: 553/U78A9.001.312M001-P1-C5/0x21010229/0/0
1: 552/U78A9.001.312M001-P1-C6/0x21010228/0/0
1: 545/U78A9.001.312M001-P1-C7/0x21010221/0/0
1: 544/U78A9.001.312M001-P1-C8/0x21010220/0/0
1: 537/U78A9.001.312M001-P1-C9/0x21010219/0/0
1: 536/U78A9.001.312M001-P1-C10/0x21010218/0/0
1: 529/U78A9.001.312M001-P1-C11/0x21010211/0/0
1: 528/U78A9.001.312M001-P1-C12/0x21010210/0/0
1: 521/U78A9.001.312M001-P1-C13/0x21010209/0/0
cec01: PendingPumpMode=1,CurrentPumpMode=1,OctantCount=8:
OctantID=0,PendingOctCfg=1,CurrentOctCfg=1,PendingMemoryInterleaveMode=2,CurrentMemoryInterleaveMode=2;
OctantID=1,PendingOctCfg=1,CurrentOctCfg=1,PendingMemoryInterleaveMode=2,CurrentMemoryInterleaveMode=2;
OctantID=2,PendingOctCfg=1,CurrentOctCfg=1,PendingMemoryInterleaveMode=2,CurrentMemoryInterleaveMode=2;
OctantID=3,PendingOctCfg=1,CurrentOctCfg=1,PendingMemoryInterleaveMode=2,CurrentMemoryInterleaveMode=2;
OctantID=4,PendingOctCfg=1,CurrentOctCfg=1,PendingMemoryInterleaveMode=2,CurrentMemoryInterleaveMode=2;
OctantID=5,PendingOctCfg=1,CurrentOctCfg=1,PendingMemoryInterleaveMode=2,CurrentMemoryInterleaveMode=2;
OctantID=6,PendingOctCfg=1,CurrentOctCfg=1,PendingMemoryInterleaveMode=2,CurrentMemoryInterleaveMode=2;
OctantID=7,PendingOctCfg=1,CurrentOctCfg=1,PendingMemoryInterleaveMode=2,CurrentMemoryInterleaveMode=2
mkdef -t group util mgt=fsp cons=fsp netboot=yaboot nodetype=ppc,osi
chtab node=util ppc.nodetype=lpar # soon you can add hwtype=lpar to the mkdef cmd instead of this
mkdef util01 groups=util,all mgt=fsp hcp=cec01 parent=cec01 id=2 servicenode=sn01 xcatmaster=sn01-hf0
chvm sn01,util01 -i 1 -m non-interleaved -r 0:2
lsvm cec01
rpower cec01 off
rpower cec01 on
lsvm cec01
lsvm sn01 > resources.txt
1: 520/U78A9.001.312M001-P1-C14/0x21010208/0/0
1: 514/U78A9.001.312M001-P1-C17/0x21010202/0/0
1: 513/U78A9.001.312M001-P1-C15/0x21010201/0/0
1: 512/U78A9.001.312M001-P1-C16/0x21010200/0/0
1: 569/U78A9.001.312M001-P1-C1/0x21010239/0/0
1: 568/U78A9.001.312M001-P1-C2/0x21010238/0/0
1: 561/U78A9.001.312M001-P1-C3/0x21010231/0/0
1: 560/U78A9.001.312M001-P1-C4/0x21010230/0/0
1: 553/U78A9.001.312M001-P1-C5/0x21010229/0/0
1: 552/U78A9.001.312M001-P1-C6/0x21010228/0/0
1: 545/U78A9.001.312M001-P1-C7/0x21010221/0/0
1: 544/U78A9.001.312M001-P1-C8/0x21010220/0/0
1: 537/U78A9.001.312M001-P1-C9/0x21010219/0/0
1: 536/U78A9.001.312M001-P1-C10/0x21010218/0/0
1: 529/U78A9.001.312M001-P1-C11/0x21010211/0/0
1: 528/U78A9.001.312M001-P1-C12/0x21010210/0/0
1: 521/U78A9.001.312M001-P1-C13/0x21010209/0/0
rpower sn01,util01 off
1: 520/U78A9.001.312M001-P1-C14/0x21010208/0/0
1: 514/U78A9.001.312M001-P1-C17/0x21010202/0/0
1: 513/U78A9.001.312M001-P1-C15/0x21010201/0/0
1: 512/U78A9.001.312M001-P1-C16/0x21010200/0/0
1: 569/U78A9.001.312M001-P1-C1/0x21010239/0/0
1: 568/U78A9.001.312M001-P1-C2/0x21010238/0/0
1: 561/U78A9.001.312M001-P1-C3/0x21010231/0/0
1: 560/U78A9.001.312M001-P1-C4/0x21010230/0/0
1: 553/U78A9.001.312M001-P1-C5/0x21010229/0/0
1: 552/U78A9.001.312M001-P1-C6/0x21010228/0/0
1: 545/U78A9.001.312M001-P1-C7/0x21010221/0/0
1: 544/U78A9.001.312M001-P1-C8/0x21010220/0/0
1: 537/U78A9.001.312M001-P1-C9/0x21010219/0/0
1: 536/U78A9.001.312M001-P1-C10/0x21010218/0/0
2: 529/U78A9.001.312M001-P1-C11/0x21010211/0/0
2: 528/U78A9.001.312M001-P1-C12/0x21010210/0/0
2: 521/U78A9.001.312M001-P1-C13/0x21010209/0/0
cat resources.txt | chvm sn01,util01
lsvm sn01,util01
rpower sn01,util01 on
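The edit made to resources.txt between the two listings above amounts to changing the leading lpar id on the slots being reassigned. A sed sketch, using an abbreviated subset of the listings (adjust the pattern to the slot ids you actually want to move):

```shell
# The last three slots from the sn01 listing (illustrative subset).
cat > /tmp/resources.txt <<'EOF'
1: 529/U78A9.001.312M001-P1-C11/0x21010211/0/0
1: 528/U78A9.001.312M001-P1-C12/0x21010210/0/0
1: 521/U78A9.001.312M001-P1-C13/0x21010209/0/0
EOF
# Move slots 521, 528, and 529 from lpar id 1 (sn01) to lpar id 2 (util01).
sed -i 's/^1: \(52[189]\)/2: \1/' /tmp/resources.txt
cat /tmp/resources.txt
```

After the edit, piping the file to chvm (as shown above) applies the new slot ownership.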
Note: this chapter is an overview of xCAT's DFM capabilities and not a list of steps to be performed during cluster setup. It is not a continuation of the Discovery chapter.
xCAT has the capability to manage system p hardware by communicating directly with the FSPs and BPAs of the hardware, instead of using the HMC. This is called Direct FSP/BPA Management (DFM). (Note that the HMC is still used to collect hardware service events.) This section gives an overview of using these capabilities.
Several xCAT commands were modified to add DFM support:
DFM Support
| Command | Function |
|---|---|
| rpower | LPAR and Drawer Power |
| rpower | P7IH - Transition low power states |
| rcons | Remote Console |
| rflash | Firmware support for FSP and BPA |
| rinv | Get the firmware level of FSP and BPA; get the deconfigured resource of CEC |
| rvitals | Display LCD values; Get the rack environmental information |
| getmacs | Adapter information collection |
| rvitals | List environmental information |
| mkhwconn/rmhwconn | Make and remove FSP and BPA hardware connections |
| lshwconn | List hardware connection status |
| rnetboot | Remote network boot |
| lsvm, chvm | LPAR list, creation and removal; I/O slot assignment |
| rbootseq | Sets the net or hfi device as the first boot device for the specified PPC LPARs |
| rspconfig | FSP and BPA password support; get and modify the frame number |
For large clusters, we recommend that you set up the DFM hardware control hierarchically, so that each service node executes the hardware control operations for its nodes (instead of the EMS doing it for all nodes). Here is a summary of the procedure:
For the whole flow of setting up a Hierarchical Cluster, refer to the following two docs:
The xCAT administrator can set up the xCAT cluster to connect the Frames and CECs to selected HMCs, the xCAT management node, and xCAT service nodes that are attached to the xCAT cluster service VLAN. They can also set up a security environment with passwords used with the HMC, Frame, and CEC. This section describes what is needed to make the connections to the system p hardware and how to use the mkhwconn, lshwconn, and rmhwconn commands.
The passwords used with the CEC/Frame userids 'HMC', 'general', and 'admin' need to be set correctly in the xCAT ppcdirect table or passwd table if the cluster is not going to use the default passwords.
Here is an example of table ppcdirect:
#hcp,username,password,comments,disable
"frame1","HMC","abc123",,
"frame1","general","abc123",,
"frame1","admin","abc123",,
"cec16c1","HMC","abc123",,
"cec16c1","general","abc123",,
"cec16c1","admin","abc123",,
The password used with the HMC nodes for userid hscroot is located in the xCAT ppchcp table. If you are using xCAT Direct Management, the connections are made between the CECs/Frames and the xCAT management node instead of the HMC; you still need to set the passwords for any HMCs you use (for example, for SFP support).
Here is an example of table ppchcp:
#hcp,username,password,comments,disable
"c76v1hmc02","hscroot","abc123",,
Since the HMC is required for Service Focal Point (SFP) support, it needs to be connected to the designated frames and known to the xCAT MN. To allow the xCAT MN to support DFM and the HMC at the same time, xCAT provides an "sfp" attribute in the ppc table that can be assigned to each Power 775 frame node.
The mkhwconn command allows the xCAT administrator to properly set up the BPA/FSP connections between the xCAT management node and the Frames/CECs when working with DFM.
mkhwconn noderange -t [-T tooltype] [--port port_value]
Note: In xCAT 2.6.6, only P775 hardware connections over the primary hardware service VLAN are supported. The --port value specifies which service VLAN is used to create the connection to the FSP/BPA; the value can be 0 or 1. The default value is 0, the primary service VLAN, which corresponds to the vpd table rows whose side column is A-0 or B-0. A port value of 1 makes hardware connections over the backup hardware service VLAN, corresponding to side values A-1 and B-1.
This command makes the proper connections from the xCAT management node if the Frame is not already connected. To work with xCAT DFM, you should have defined the HMC, admin, and general passwords for the frames in the ppcdirect table, then run mkhwconn to create the hardware connections between the xCAT management node and the frames using DFM.
mkhwconn frame1 -t -T lpar
For Dual VLAN (not supported in xCAT 2.6.6), we need to specify the --port option to create backup connections for each BPA:
mkhwconn frame1 -t -T lpar --port 1
To assign the sfp attribute for the HMC, run the "chdef" command against the target frame node object. Then run "mkhwconn" to connect the target frame and its known CECs to the HMC node object that was previously defined.
chdef frame1 sfp=c76v1hmc02
mkhwconn frame1 -s
This will result in frame node frame1 being connected through HMC node c76v1hmc02.
See the mkhwconn man page for details of this command.
The lshwconn command provides the current Frame/CEC connection data for a target HMC, or for the xCAT management node if you are using xCAT Direct Management support. The output currently includes the Frame/CEC nodes, the FSP/BPA IP addresses, and the connection status of the BPA/FSP for the target HMC/xCAT management node.
Run the following to list the frame servers and their connections:
lshwconn <HMC node>
See lshwconn man page for the details.
The xCAT admin can run the rspconfig command to modify the HMC, admin, and general userid passwords on the Frame/CEC servers. The Frame/CEC servers are preset by System P manufacturing with default passwords.
You can use the same passwords for all the System P frames and CECs in your xCAT cluster, or specify unique HMC, admin, and general passwords for a selected frame or CEC server node. In xCAT 2.4, you can only change one Frame/CEC userid at a time with the rspconfig command. The following example uses rspconfig to change the HMC userid password from access to abc123 on a frame and a CEC:
rspconfig <frame> HMC_passwd=access,abc123
rspconfig <cec> HMC_passwd=access,abc123
Note:
The default password for userid HMC on a Frame/CEC is empty, so if the frame or CEC is new or has been reset to factory settings, you can use the following command to initialize the HMC userid's password:
rspconfig <cec> HMC_passwd=,abc123
The default passwords for userids admin and general are admin and general respectively, so there is no difference between initializing and changing the passwords for those userids.
The xCAT administrator can run the rspconfig command to specify frame number information when working with 24-inch frames that contain the BPA logic. This is helpful for large System P clusters where many frames are in use. The rspconfig command lets the admin list the current frame number, or set a frame server node to a specific frame number. The admin can set the frame number through the ppc table, or run against one frame server at a time. Setting the frame number is a disruptive command and requires all CECs to be powered off before issuing it.
rspconfig <frame> frame (list current frame number)
rspconfig <frame> frame=4 (change Frame number to now be frame 4)
Huge page memory can be used to increase performance for certain applications in specific customer environments, such as running DB2 on AIX or applications using large memory mappings on Linux.
rspconfig <cec> huge_page
rspconfig <cec> huge_page=<NUM>
Note:
If no value is specified, the command queries the huge page information for the specified CECs. If a value NUM is specified, it is used as the requested number of huge pages for each of the specified CECs.
Now that the definitions are in the database and the hardware connections have been made, the xCAT administrator can use the chvm and lsvm commands to define the LPARs within each CEC.
THIS SECTION IS STILL UNDER CONSTRUCTION AND WILL BE UPDATED WITH MORE DETAIL ON THE USE OF THESE COMMANDS
The Power 775 is configured with a default set of LPARs from manufacturing. These defaults are intended to support the HPC environment and are specific to the needs of the HPC clusters. The default rules are as follows:
These rules were defined to make the initial configuration match the cluster requirements of our HPC customers. They allow the CECs with the disk drives and Ethernet adapters to be set up with the hardware needed to act as the service node while using only a minimal number of cores and memory. They also allow for the automatic assignment of the CEC with the external disk drives to a partition to be used as the GPFS NSD server. These defaults are intended to simplify the bring-up process by preconfiguring the CECs with a set of defaults that should meet most of our customers' requirements. Should you need to change these default LPAR configurations, we provide commands to do so, which are discussed in this section.
Octant Overview
The Power 775 CEC contains up to eight octants with each octant containing up to four 8-core P7 processors, associated memory and a single Torrent hub. These octants are defined as octant 0 through octant 7.
Octant Partition Configurations
The octants can be logically partitioned into one of a preset number of configurations which define the split of resources for processors and memory per partition. The preset configurations are:
1. One partition containing all resources [100]
2. Two partitions with each containing equal resources [50, 50]
3. Three partitions with the first two partitions containing 25
percent of the resources and the third partition containing 50
percent of the resources [25, 25, 50]
4. Four partitions with each containing equal resources [25, 25, 25, 25]
5. Two partitions with the first partition containing 25 percent
of the resources and the second partition containing 75 percent
of the resources. [25, 75]
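To keep the preset values straight, here is a throwaway helper mapping each octant configuration value to the processor/memory split it implies (the function name is illustrative):

```shell
# octcfg_split <config-value> -> the per-partition resource split (%)
octcfg_split() {
  case "$1" in
    1) echo "100" ;;
    2) echo "50,50" ;;
    3) echo "25,25,50" ;;
    4) echo "25,25,25,25" ;;
    5) echo "25,75" ;;
    *) echo "invalid" ;;
  esac
}

octcfg_split 3   # -> 25,25,50
```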
The lparid is fixed regardless of how many partitions are defined, such that the first lpar in an octant always has a fixed value. If other partitions are defined for the octant, they are given the following lparids:
Octant ID 0 - lparid 1, lparid 2, lparid 3, lparid 4
Octant ID 1 - lparid 5, lparid 6, lparid 7, lparid 8
Octant ID 2 - lparid 9, lparid 10, lparid 11, lparid 12
Octant ID 3 - lparid 13, lparid 14, lparid 15, lparid 16
Octant ID 4 - lparid 17, lparid 18, lparid 19, lparid 20
Octant ID 5 - lparid 21, lparid 22, lparid 23, lparid 24
Octant ID 6 - lparid 25, lparid 26, lparid 27, lparid 28
Octant ID 7 - lparid 29, lparid 30, lparid 31, lparid 32
With the default configuration of eight octants with one partition per octant, the lpars are defined with lpar ids of 1, 5, 9, 13, 17, 21, 25 and 29.
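The layout above follows a simple rule: the first lpar id of octant N is N*4+1, with up to four consecutive ids per octant. A quick sketch:

```shell
# octant_lparids <octant-id> -> the four possible lpar ids for that octant
octant_lparids() {
  o=$1
  s=$(( o * 4 + 1 ))               # first lpar id is always fixed
  echo "$s $((s + 1)) $((s + 2)) $((s + 3))"
}

octant_lparids 0   # -> 1 2 3 4
octant_lparids 7   # -> 29 30 31 32
```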
Octant Memory
The memory of an octant can also be configured to use interleaved or non-interleaved memory. Non-interleaved mode means that memory allocations are only interleaved across the two memory controllers on the local chip in an octant. This is also known as 2MC mode. Interleaved means memory allocations are interleaved across all eight memory controllers in the octant. If an octant is to be partitioned then its memory interleaving value MUST BE non-interleaved. For more information on this please see the Firmware Pervasive Level Design Document for PERCS P7IH and Torrent IO Hub.
The Memory Interleaving Mode can be set to the following values:
1 - interleaved (also 8MC mode)
2 - non-interleaved (also 2MC mode)
You may see a value of "0" returned for the memory interleaving value in the lsvm output. This is the default value from the factory or may be seen after a firmware upgrade. If the current memory interleave value is set to 0 then the chvm command must be used to set it to either 1 or 2.
Pump Mode and memory interleaving
The pump mode determines what is allowed for the interleaving of memory per CEC. The pump mode has 2 valid values:
0x01 - Node Pump Mode
0x02 - Chip Pump Mode
The default pump mode on the CEC is 0x01 (Node Pump Mode). This value allows the memory interleave value to be set to either interleaved or non-interleaved for the octant. A pump mode of Chip Pump Mode forces the memory interleave mode to be non-interleaved for the octant. However, the pump mode value cannot be changed by customers and should always be 0x01 - Node Pump Mode.
xCAT partition related commands for Power 775
The partition commands for Power 775 differ from the P5 and P6 implementations because of the unique hardware configuration of the Power 775:
chvm is designed to set the octant configuration value to split the CPUs and memory into partitions, and to set the octant memory interleaving value. chvm only sets the pending attribute values; after running chvm, the CEC must be rebooted manually for the pending values to take effect. Before rebooting the CEC, the administrator can use chvm to change the partition plan. If a partition needs I/O slots, the administrator should use chvm to assign them.
chvm is also designed to assign I/O slots to a new LPAR. Both the current I/O-owning LPAR and the new I/O-owning LPAR must be powered off before an I/O assignment. If an I/O slot belongs to an LPAR that is powered on, the command will return an error when trying to assign that slot to a different LPAR.
syntax:
chvm [-V| --verbose] noderange -i id [-m memory_interleaving] -r partition_rule
chvm [-V| --verbose] noderange [-p profile]
options:
-i Starting numeric id of the newly created partitions. The id value can only be 1, 5, 9, 13, 17, 21, 25, or 29.
-m memory interleaving. The value can only be 1 or 2: 2 means non-interleaved mode (the memory cannot be shared across the processors in an octant); 1 means interleaved mode (the memory can be shared). The default value of memory interleaving in chvm is 1.
-r partition rule.
If all octants in a CEC use the same configuration value, specify "-r 0-7:value".
If the octants in a CEC use different configuration values, specify "-r 0:value1,1:value2,...,7:value7", or "-r 0:value1,1-7:value2", and so on.
The configuration value for an octant can be 1, 2, 3, 4, or 5.
The meanings of the octant configuration values are as follows:
1 - One partition with all cpus and memory of the octant
2 - Two partitions with a 50/50 split of cpus and memory
3 - Three partitions with a 25/25/50 split of cpus and memory
4 - Four partitions with a 25/25/25/25 split of cpus and memory
5 - Two partitions with a 25/75 split of cpus and memory
-p the I/O profile.
The administrator should use [lsvm](http://xcat.sourceforge.net/man1/lsvm.1.html) to get the profile content, edit it, and manually add the node name followed by ":" before each I/O slot that will be assigned to that node. It looks like:
lparid1:bus_id1/physical_location_code/drc_index/owner_type/owner/description
lparid1:bus_id2/physical_location_code/drc_index/owner_type/owner/description
...
lparid2:bus_id/physical_location_code/drc_index/owner_type/owner/description
...
lparidn:bus_id/physical_location_code/drc_index/owner_type/owner/description
...
chvm also accepts a file in the above profile format piped to the command.
lsvm lists all partition I/O slot information for the partitions specified in noderange. If noderange is a CEC, it gets the CEC's pump mode value, each octant's memory interleaving value, all the octant configuration values, and all the I/O slot information.
syntax:
lsvm noderange [-l|--long]
If no option is specified, the output is similar to:
lparid1:bus_id1/physical_location_code/drc_index/owner_type/owner/description
lparid1:bus_id2/physical_location_code/drc_index/owner_type/owner/description
...
lparid2:bus_id/physical_location_code/drc_index/owner_type/owner/description
...
lparidn:bus_id/physical_location_code/drc_index/owner_type/owner/description
cecname: octant configuration value
If the -l or --long option is specified, the output is similar to:
lpar_name1: lparid1: bus_id1/physical_location_code/drc_index/owner_type/owner/description: BSR_array_number1: Min1/Req1/Max1
lpar_name1: lparid1: bus_id2/physical_location_code/drc_index/owner_type/owner/description: BSR_array_number1: Min1/Req1/Max1
...
lpar_name2: lparid2: bus_id/physical_location_code/drc_index/owner_type/owner/description: BSR_array_number2: Min2/Req2/Max2
...
lpar_namen: lparidn: bus_id/physical_location_code/drc_index/owner_type/owner/description: BSR_array_numbern: Minn/Reqn/Maxn
cecname: octant configuration value
1. To create a new partition lpar1 on the first octant of the CEC, where lpar1 will use all the CPUs and memory of octant 0, enter:
mkdef -t node -o lpar1 mgt=fsp groups=all parent=cec01 nodetype=ppc,osi hwtype=lpar hcp=cec01
then:
chvm lpar1 -i 1 -m 1 -r 0:1
Output is similar to:
lpar1: Success
cec01: For Power 775, if chvm succeeds, please reboot the CEC cec01 before using chvm to assign the I/O slots
2. To create new partitions lpar1-lpar2 on the first octant of the CEC, where each LPAR will use 50% of the CPUs and memory of octant 0, enter:
mkdef -t node -o lpar1-lpar2 mgt=fsp groups=all parent=cec01 nodetype=ppc,osi hwtype=lpar hcp=cec01
then:
chvm lpar1-lpar2 -i 1 -m 2 -r 0:2
Output is similar to:
lpar1: Success
lpar2: Success
cec01: For Power 775, if chvm succeeds, please reboot the CEC cec01 before using chvm to assign the I/O slots
3. To create new partitions lpar1-lpar32 on the whole CEC, where each LPAR will use 25% of the CPU and memory of each octant, enter:
mkdef -t node -o lpar1-lpar32 nodetype=ppc,osi hwtype=lpar mgt=fsp groups=all parent=cec01 hcp=cec01
then:
chvm lpar1-lpar32 -i 1 -m 2 -r 0-7:4
Output is similar to:
lpar1: Success
lpar10: Success
lpar11: Success
lpar12: Success
lpar13: Success
lpar14: Success
lpar15: Success
lpar16: Success
lpar17: Success
lpar18: Success
lpar19: Success
lpar2: Success
lpar20: Success
lpar21: Success
lpar22: Success
lpar23: Success
lpar24: Success
lpar25: Success
lpar26: Success
lpar27: Success
lpar28: Success
lpar29: Success
lpar3: Success
lpar30: Success
lpar31: Success
lpar32: Success
lpar4: Success
lpar5: Success
lpar6: Success
lpar7: Success
lpar8: Success
lpar9: Success
cec01: For Power 775, if chvm succeeds, please reboot the CEC cec01 before using chvm to assign the I/O slots
4. To create new partitions lpar1-lpar8 on the whole CEC, where each LPAR will use all the CPU and memory of its octant, enter:
mkdef -t node -o lpar1-lpar8 nodetype=ppc,osi hwtype=lpar mgt=fsp groups=all parent=cec01 hcp=cec01
then:
chvm lpar1-lpar8 -i 1 -m 1 -r 0-7:1
Output is similar to:
lpar1: Success
lpar2: Success
lpar3: Success
lpar4: Success
lpar5: Success
lpar6: Success
lpar7: Success
lpar8: Success
cec01: For Power 775, if chvm succeeds, please reboot the CEC cec01 before using chvm to assign the I/O slots
5. To create new partitions lpar1-lpar9, where lpar1 will use 25% of the CPU and memory of the first octant, lpar2 will use 75% of the CPU and memory of the first octant, and lpar3-lpar9 will each use all the CPU and memory of one octant. Note that the chvm command does not support both memory interleaving values in one call. Therefore chvm must be entered twice: once with "-m 2" for the partitioned octant and again with "-m 1" for the single-LPAR octants. For example:
mkdef -t node -o lpar1-lpar9 mgt=fsp groups=all parent=cec1 nodetype=ppc,osi hwtype=lpar hcp=cec1
then:
chvm lpar1-lpar9 -i 1 -m 2 -r 0:5
chvm lpar3-lpar9 -i 5 -m 1 -r 1-7:1
Output is similar to:
lpar1: Success
lpar2: Success
cec1: For Power 775, if chvm succeeds, please reboot the CEC cec1 before using chvm to assign the I/O slots
lpar3: Success
lpar4: Success
lpar5: Success
lpar6: Success
lpar7: Success
lpar8: Success
lpar9: Success
cec1: For Power 775, if chvm succeeds, please reboot the CEC cec1 before using chvm to assign the I/O slots
6. To change the I/O slot profile for lpar4 using the configuration data in the file /tmp/lparfile, where the I/O slot information is similar to:
4: 514/U78A9.001.0123456-P1-C17/0x21010202/2/1
4: 513/U78A9.001.0123456-P1-C15/0x21010201/2/1
4: 512/U78A9.001.0123456-P1-C16/0x21010200/2/1
then run the command:
cat /tmp/lparfile | chvm lpar4
7. To change the I/O slot profile for lpar1-lpar8 using the configuration data in the file /tmp/lparfile: users can take the output of lsvm, remove the CEC information, modify the LPAR id before each I/O slot entry, and then run the command as follows:
chvm lpar1-lpar8 -p /tmp/lparfile
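A hedged sketch of preparing /tmp/lparfile from captured lsvm output; the slot records below are sample data in the format shown in example 6, and the id rewrite (partition 1 to partition 5) is purely illustrative:

```shell
# Stand-in for "lsvm lpar1-lpar8 > /tmp/lparfile.raw" -- sample records only.
cat > /tmp/lparfile.raw <<'EOF'
1: 514/U78A9.001.0123456-P1-C17/0x21010202/2/1
1: 513/U78A9.001.0123456-P1-C15/0x21010201/2/1
cec01: PendingPumpMode=1,CurrentPumpMode=1,OctantCount=8:
EOF
# Remove the CEC information line, then modify the lpar id before each
# I/O slot record (here: reassign partition 1's slots to partition 5).
grep -v '^cec' /tmp/lparfile.raw | sed 's/^1:/5:/' > /tmp/lparfile
cat /tmp/lparfile
# The result can then be applied with: chvm lpar1-lpar8 -p /tmp/lparfile
```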
1. To list the I/O slot information of lpar1, enter:
lsvm lpar1
Output is similar to:
1: 514/U78A9.001.0123456-P1-C17/0x21010202/2/1
1: 513/U78A9.001.0123456-P1-C15/0x21010201/2/1
1: 512/U78A9.001.0123456-P1-C16/0x21010200/2/1
2. To list the I/O slot information and octant configuration of cec1, enter:
lsvm cec1
Output is similar to:
1: 514/U78A9.001.0123456-P1-C17/0x21010202/2/1
1: 513/U78A9.001.0123456-P1-C15/0x21010201/2/1
1: 512/U78A9.001.0123456-P1-C16/0x21010200/2/1
13: 537/U78A9.001.0123456-P1-C9/0x21010219/2/13
13: 536/U78A9.001.0123456-P1-C10/0x21010218/2/13
17: 545/U78A9.001.0123456-P1-C7/0x21010221/2/17
17: 544/U78A9.001.0123456-P1-C8/0x21010220/2/17
21: 553/U78A9.001.0123456-P1-C5/0x21010229/2/21
21: 552/U78A9.001.0123456-P1-C6/0x21010228/2/21
25: 569/U78A9.001.0123456-P1-C1/0x21010239/2/25
25: 561/U78A9.001.0123456-P1-C3/0x21010231/2/25
25: 560/U78A9.001.0123456-P1-C4/0x21010230/2/25
29: 568/U78A9.001.0123456-P1-C2/0x21010238/2/29
5: 521/U78A9.001.0123456-P1-C13/0x21010209/2/5
5: 520/U78A9.001.0123456-P1-C14/0x21010208/2/5
9: 529/U78A9.001.0123456-P1-C11/0x21010211/2/9
9: 528/U78A9.001.0123456-P1-C12/0x21010210/2/9
cec1: PendingPumpMode=1,CurrentPumpMode=1,OctantCount=8:
OctantID=0,PendingOctCfg=5,CurrentOctCfg=1,PendingMemoryInterleaveMode=2,CurrentMemoryInterleaveMode=1;
OctantID=1,PendingOctCfg=1,CurrentOctCfg=1,PendingMemoryInterleaveMode=2,CurrentMemoryInterleaveMode=1;
OctantID=2,PendingOctCfg=1,CurrentOctCfg=1,PendingMemoryInterleaveMode=2,CurrentMemoryInterleaveMode=1;
OctantID=3,PendingOctCfg=1,CurrentOctCfg=1,PendingMemoryInterleaveMode=2,CurrentMemoryInterleaveMode=1;
OctantID=4,PendingOctCfg=1,CurrentOctCfg=1,PendingMemoryInterleaveMode=2,CurrentMemoryInterleaveMode=1;
OctantID=5,PendingOctCfg=1,CurrentOctCfg=1,PendingMemoryInterleaveMode=2,CurrentMemoryInterleaveMode=1;
OctantID=6,PendingOctCfg=1,CurrentOctCfg=1,PendingMemoryInterleaveMode=2,CurrentMemoryInterleaveMode=1;
OctantID=7,PendingOctCfg=1,CurrentOctCfg=1,PendingMemoryInterleaveMode=2,CurrentMemoryInterleaveMode=1;
3. To list the detailed I/O slot information and octant configuration of cec1, enter:
lsvm cec1 -l
Output is similar to:
cec1:
lpar1: 1: 514/U78A9.001.0123456-P1-C17/0x21010202/2/1: 16: 0/3/3
lpar1: 1: 513/U78A9.001.0123456-P1-C15/0x21010201/2/1: 16: 0/3/3
lpar1: 1: 512/U78A9.001.0123456-P1-C16/0x21010200/2/1: 16: 0/3/3
lpar13: 13: 537/U78A9.001.0123456-P1-C9/0x21010219/2/13: 16: 0/3/3
lpar13: 13: 536/U78A9.001.0123456-P1-C10/0x21010218/2/13: 16: 0/3/3
lpar17: 17: 545/U78A9.001.0123456-P1-C7/0x21010221/2/17: 16: 0/0/0
lpar17: 17: 544/U78A9.001.0123456-P1-C8/0x21010220/2/17: 16: 0/0/0
lpar21: 21: 553/U78A9.001.0123456-P1-C5/0x21010229/2/21: 16: 0/0/0
lpar21: 21: 552/U78A9.001.0123456-P1-C6/0x21010228/2/21: 16: 0/0/0
lpar25: 25: 569/U78A9.001.0123456-P1-C1/0x21010239/2/25: 16: 0/0/0
lpar25: 25: 561/U78A9.001.0123456-P1-C3/0x21010231/2/25: 16: 0/0/0
lpar25: 25: 560/U78A9.001.0123456-P1-C4/0x21010230/2/25: 16: 0/0/0
lpar29: 29: 568/U78A9.001.0123456-P1-C2/0x21010238/2/29: 8: 0/0/0
lpar5: 5: 521/U78A9.001.0123456-P1-C13/0x21010209/2/5: 8: 0/3/3
lpar5: 5: 520/U78A9.001.0123456-P1-C14/0x21010208/2/5: 8: 0/3/3
lpar9: 9: 529/U78A9.001.0123456-P1-C11/0x21010211/2/9: 16: 0/3/3
lpar9: 9: 528/U78A9.001.0123456-P1-C12/0x21010210/2/9: 16: 0/3/3
PendingPumpMode=1,CurrentPumpMode=1,OctantCount=8:
OctantID=0,PendingOctCfg=5,CurrentOctCfg=1,PendingMemoryInterleaveMode=2,CurrentMemoryInterleaveMode=2;
OctantID=1,PendingOctCfg=1,CurrentOctCfg=1,PendingMemoryInterleaveMode=2,CurrentMemoryInterleaveMode=2;
OctantID=2,PendingOctCfg=1,CurrentOctCfg=1,PendingMemoryInterleaveMode=2,CurrentMemoryInterleaveMode=2;
OctantID=3,PendingOctCfg=1,CurrentOctCfg=1,PendingMemoryInterleaveMode=2,CurrentMemoryInterleaveMode=2;
OctantID=4,PendingOctCfg=1,CurrentOctCfg=1,PendingMemoryInterleaveMode=2,CurrentMemoryInterleaveMode=2;
OctantID=5,PendingOctCfg=1,CurrentOctCfg=1,PendingMemoryInterleaveMode=2,CurrentMemoryInterleaveMode=2;
OctantID=6,PendingOctCfg=1,CurrentOctCfg=1,PendingMemoryInterleaveMode=2,CurrentMemoryInterleaveMode=2;
OctantID=7,PendingOctCfg=1,CurrentOctCfg=1,PendingMemoryInterleaveMode=2,CurrentMemoryInterleaveMode=2;
Number of BSR arrays: 256, Bytes per BSR array: 4096, Available BSR arrays: 0;
Available huge page memory(in pages): 0
Configurable huge page memory(in pages): 12
Page Size(in GB): 16
Maximum huge page memory(in pages): 24
Requested huge page memory(in pages): 15;
This section outlines the general xCAT DFM rpower support as well as the new support for the P7IH servers. See the rpower man page for more detail on each option.
Here is a list of xCAT DFM supported rpower options.
rpower <noderange> [on|onstandby|off|reset|stat|state|boot|of|sms|lowpower|rackstandby|exit_rackstandby]
Administrators will need to control power for different hardware boundaries depending on the tasks being performed. During an initial startup of a cluster they will be issuing commands to the BPA to bring up the frame and start the FSPs. Once the FSPs are up they will issue commands to IPL the CEC and later commands to the FSP to start each LPAR. During normal operation they may wish to reboot LPARs to cause them to use an updated OS level. Each of these commands requires the administrator to select the appropriate Frame, CEC, or LPAR as the noderange.
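The bring-up ordering described above can be sketched as a sequence of rpower calls; frame1, cec01, and lpar1 are hypothetical node names:

```shell
# Hypothetical startup sequence; substitute your own Frame/CEC/LPAR nodes.
rpower frame1 exit_rackstandby   # power up the frame, which starts its FSPs
rpower frame1 stat               # confirm the frame state
rpower cec01 on                  # IPL the CEC once the FSPs are up
rpower lpar1 on                  # start an individual LPAR
rpower lpar1 boot                # during normal operation: reboot an LPAR,
                                 # e.g. to pick up an updated OS level
```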
The following actions are supported for a nodelist of Frame
rpower <noderange> [rackstandby|exit_rackstandby|stat|state]
The following actions are supported for a nodelist of CEC
rpower <noderange> [on|off|stat|state|lowpower]
For lowpower, the CEC is put into the low power state (state EOCT). This is a disruptive operation which requires the CEC to be powered off prior to entering low power mode. Use the "rpower off" command to get out of the lowpower state.
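A sketch of entering and leaving the low power state, assuming a hypothetical CEC node cec01:

```shell
rpower cec01 off        # the CEC must be powered off first (disruptive)
rpower cec01 lowpower   # enter the low power state (EOCT)
rpower cec01 stat       # verify the state
rpower cec01 off        # "off" is also how to get out of the lowpower state
```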
The following actions are supported for a nodelist of LPAR
rpower <noderange> [on|off|reset|stat|state|boot|of|sms]
The xCAT DFM support includes the ability to apply firmware updates to the Frame and CEC using the rflash command.
The xCAT DFM firmware-related commands are rspconfig, rinv and rflash. The noderange MUST contain only CECs and Frames.
1. Use rspconfig to query or set the pending power on side:
rspconfig <noderange> pending_power_on_side
rspconfig <noderange> pending_power_on_side={temp|perm}
2. Use rinv with the option firm to get the firmware level of the CECs or Frames
rinv <noderange> firm
3. Use the rflash command to perform firmware updates:
rflash <noderange> -p directory --activate <disruptive|deferred> [-d data_directory]
rflash <noderange> [--commit|--recover]
The rflash command initiates Firmware updates for the CECs and Frames.
The flash chip of a System p managed system or power subsystem stores firmware in two locations, referred to as the temporary side and the permanent side. By default, most System p systems boot from the temporary side of the flash. When the rflash command updates code, the current contents of the temporary side are written to the permanent side, and the new code is written to the temporary side. The new code is then activated. Therefore, the two sides of the flash will contain different levels of code when the update has completed.
In Direct FSP/BPA Management, there is a -d <data_directory> option, whose default value is /tmp. During a firmware update, rflash puts some related data from the rpm packages in the <data_directory> directory, so the execution of rflash requires available disk space in <data_directory>:
For one GFW rpm package and one power code rpm package, if the GFW rpm package size is gfw_rpmsize and the power code rpm package size is power_rpmsize, the available disk space must be more than:
1.5*gfw_rpmsize + 1.5*power_rpmsize
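The disk-space rule can be checked with ordinary shell arithmetic; the package sizes below are illustrative sample values, not real package sizes:

```shell
# Sample package sizes in bytes (assumptions for illustration only).
gfw_rpmsize=$((60 * 1024 * 1024))      # e.g. a 60 MB GFW rpm package
power_rpmsize=$((20 * 1024 * 1024))    # e.g. a 20 MB power code rpm package

# Required space: 1.5*gfw_rpmsize + 1.5*power_rpmsize, expressed in KB.
need_kb=$(( (gfw_rpmsize + power_rpmsize) * 3 / 2 / 1024 ))

# Compare against the free space in the data directory (default /tmp).
avail_kb=$(df -P /tmp | awk 'NR==2 {print $4}')
if [ "$avail_kb" -gt "$need_kb" ]; then
    echo "/tmp has enough space for rflash"
else
    echo "free up space or point rflash at another -d data_directory"
fi
```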
The --activate flag determines how the affected systems activate the new code. Currently, in Direct FSP/BPA Management, rflash does not support the concurrent option; it only supports disruptive and deferred.
The disruptive option will cause any affected systems that are powered on to be powered down before installing and activating the update, so the systems must be powered off before performing the firmware update.
The deferred option will load the new firmware into the T (temp) side, but will not activate it as the disruptive update does. The customer can continue to run the Frames and CECs from the P (perm) side and wait for a maintenance window in which to activate and boot the Frames/CECs with the new firmware levels. Refer to the section Perform_Deferred_Firmware_upgrades_for_frameFCEC_on_Power_775 below for more details.
Note: When doing the firmware update on the frame's BPAs, rflash flashes the LIC code into the BPAs, reboots the BPAs to activate the new firmware, and then issues the ACDL action. ACDL means Auto Code Down Load of the Power/Thermal subsystem, which updates other parts in the frame. All the ACDL actions are done within the frame automatically, and the ACDL usually finishes on its own in about 10 minutes. The ACDL does not affect any hardware control operations on the related CECs, and it only pertains to power/thermal FRUs. This is the expected behavior.
The --commit flag is used to copy the contents of the temporary side of the flash to the permanent side. This flag should be used after updating code and verifying correct system operation.
The --recover flag is used to copy the permanent side of the flash chip back to the temporary side. This flag should be used to recover from a corrupt flash operation, so that the previously running code can be restored.
For Power 775, the rflash command takes effect on the primary and secondary FSPs or BPAs almost in parallel.
The detailed information will be added to the xCAT rflash manpage.
To prepare for the firmware upgrade, download the Microcode update package and associated XML file from the IBM Web site:
http://www-933.ibm.com/support/fixcentral/ .
Go to Fix Central and use the following options:
Product Group = Power
Product = Firmware, SDMC, and HMC
Machine type-model = 9125-F2C
Check firmware level
rinv <noderange> firm
Update the firmware
Create the /tmp/fw directory, if necessary, and copy the downloaded files to the /tmp/fw directory.
Run the rflash command with the --activate flag to specify the update mode to perform the updates. Please see the rflash manpage for more information.
rflash <noderange> -p /tmp/fw --activate disruptive
Notes:
System Power 775 firmware updates can require time to complete and there is no visual indication that the command is proceeding.
It can take more than 2 hours to finish a disruptive firmware update in a large P775 cluster. To reduce the down time of the cluster, customers may want to flash new firmware levels while the CECs are up and running. The deferred firmware update will load the new firmware into the T (temp) side, but will not activate it as the disruptive update does. The customer can continue to run the Frames and CECs from the P (perm) side and wait for a maintenance window in which to activate and boot the Frames/CECs with the new firmware levels.
The deferred firmware update includes two parts: part (1) is to apply the firmware to the T (temp) sides of the Frames' BPAs and the CECs' FSPs while the cluster is up and running; part (2) is to activate the new firmware on the Frames and CECs at a scheduled time.
The default setting is that the CECs/FSPs are working from the temp side (current_power_on_side). During part (1) of the deferred firmware update, the CEC will continue to run on the perm side while rflash installs the new firmware levels to the temp side. It is very important that the perm side contains the current stable version of firmware; the perm side is usually only used as a recovery environment when working with firmware updates.
When the CEC (FSPs) is rebooted, the CEC will run on the side that the pending_power_on_side attribute is set to. After finishing part (1), the admin should make sure the pending_power_on_side attribute is set to "perm" if rebooted CECs are to run with the older stable firmware. When you are ready to activate the new firmware and reboot the CECs, make sure the pending_power_on_side attribute is set to "temp".
Before starting the deferred firmware update, the admin should first make sure that the most recent stable firmware level has been applied to the P (perm) side. Note that the T-side firmware is moved over to the P-side automatically when rflash loads the new firmware into the T (temp) side.
1. Apply the firmware to the Frames and CECs
In this part, the rflash command with --activate deferred is used to load the firmware to the Frame and CECs while the Frame and CECs are in the running state. The admins should make sure that the Frame power code is loaded first and that the GFW levels are compatible with the power code.
1.1 Check the Firmware level of the Frames or CECs:
rinv <noderange> firm
Then, download the suitable power code for the Frame or GFW code for the CEC. For example, if the Release Level of the CEC is '01AFxxx', the code which has the same prefix '01AF' is suitable for updating.
To prepare for the firmware upgrade download the Microcode update package and associated XML file from the IBM Web site:
http://www-933.ibm.com/support/fixcentral/ .
Go to Fix Central and use the following options:
Product Group = Power
Product = Firmware, SDMC, and HMC
Machine type-model = 9125-F2C
1.2 Apply the power code to the Frames' BPAs
rflash <frame> -p <rpm_directory> --activate deferred
1.3 Apply the GFW code to the CECs' FSPs
rflash <cec> -p <rpm_directory> --activate deferred
1.4 Check to make sure the proper firmware levels have been loaded into the temp side (new) and the perm side (previous) for the Frames or CECs. After the deferred rflash, rinv should now report the Current Power on side as "perm":
rinv <noderange> firm
2. Set the CECs' pending power on side to perm (needed for a CEC reboot -- power off/on)
After part 1, the new firmware is loaded on the temp side. If you need to keep the Frame and CECs active for a period of time (such as several days), make sure they are working with the previous firmware level, which is running on the P-side, by changing the pending_power_on_side attribute from temp to perm. The expectation is that the admin should only need to set the CECs' pending_power_on_side, but the admin can also set it for the frames if needed.
rspconfig <cec> pending_power_on_side
If not, set the CECs' pending power on side to the P-side:
rspconfig <cec> pending_power_on_side=perm
Note: The P775 system should continue to run on the P-side with the previous level of firmware until you are ready to activate the new firmware. Since the pending_power_on_side has been set to the perm side, the CECs will be powered up with the previous firmware level. This may be necessary if the CECs have an issue, or if the admin needs to reboot one of the CECs prior to the scheduled outage.
3. Activate the new firmware at the scheduled time
The new firmware level has been loaded on the temp side, and it is time to activate the Frame and CECs with the new firmware level. The admin should make sure the pending_power_on_side is now set back from perm to temp.
3.1 Check that the pending power on side for the Frames and CECs is the T-side
rspconfig <frame> pending_power_on_side
rspconfig <cec> pending_power_on_side
If not, set the pending power on side to the T-side:
rspconfig <frame> pending_power_on_side=temp
rspconfig <cec> pending_power_on_side=temp
3.2 Power off all CECs
rpower cec off
3.3 Reboot the service processors and run ACDL for the Frames
When all of the CECs are powered off, the admin should reboot the BPAs to activate the new power code in the BPAs, and run the Auto Code Down Load (ACDL) of the Power/Thermal subsystem, which updates other parts in the Frame. The ACDL is really a small code update of the firmware on all the miscellaneous electrical components, such as fans and pumps, that are part of the Frame and are controlled by the BPC.
At the end of the BPA reboot, the ACDL will run automatically, but we want to issue a manual ACDL to make sure the automatic ACDL was successful. The manual ACDL is a 'double-check' to make sure ACDL runs properly.
3.3.1 Reboot the Frames' BPAs
rpower <frame> resetsp
Wait 5-10 minutes for the BPAs to restart. When the connections become LINE UP again, the BPAs have finished the reboot.
lshwconn <frame>
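One way (an assumption, not part of xCAT itself) to wait for the reboot is a small polling loop around lshwconn; frame1 is a hypothetical node name:

```shell
# Poll up to ~10 minutes until every connection line reports LINE UP.
for i in $(seq 1 60); do
    if lshwconn frame1 2>/dev/null | grep -v 'state=LINE UP' | grep -q .; then
        sleep 10    # some connection is not LINE UP yet; BPAs still rebooting
    else
        echo "frame1 connections are LINE UP"
        break
    fi
done
```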
3.3.2 Execute the manual ACDL
rflash <frame> --bpa_acdl
3.4 Verify that the frames are at the new power code level and that they are using the temp side as the current_power_on_side.
rinv <frame> firm
3.5 Reboot the service processor for the CECs
rpower <cec> resetsp
Wait 5-10 minutes for the FSPs to restart. When the connections become LINE UP again, the FSPs have finished the reboot.
lshwconn <cec>
3.6 Verify that the CECs are at the new firmware level and that they are using the temp side as the current_power_on_side.
rinv <cec> firm
3.7 Power on the CECs and bring up the Power 775 cluster.
Note: Before powering on the CECs, make sure all P775 software is synchronized with the CEC firmware. The Power 775 admin should check that the CNM/HFI software environment is updated on the EMS and that the HPC software is updated in the diskful and diskless images to properly work with the new firmware on the CECs.
rpower <cec> on
If the power code/firmware fails to be loaded or is stopped on purpose, refer to the section Recover_the_system_from_a_PP_situation_because_of_the_failed_firmware_update to recover the system.
1. Make sure the pending_power_on_side is temp:
rspconfig <cec> pending_power_on_side
If not, set the pending power on side to the T-side:
rspconfig <cec> pending_power_on_side=temp
2. Copy the P-side firmware to the T-side:
rflash <cec> --recover
The CECs will then run on the T-side with the original firmware from the P-side.
Check firmware level
Refer to the environment setup in the section 'Firmware upgrade for CEC on Power 775' to make sure the firmware version is correct.
Commit the firmware LIC
Run the rflash command with the --commit flag:
rflash <noderange> --commit
Notes:
When the --commit or --recover flag is used, if the noderange is a CEC for Power 775, the flag takes effect for the managed systems; if it is a frame for Power 775, it takes effect for the power subsystems only.
Before running the following steps, make sure the connections to both the primary FSP and the secondary FSP (or both the BPC A and the BPC B) are LINE UP.
lshwconn cec01
cec01: 40.17.1.1: sp=primary,ipadd=40.17.1.1,alt_ipadd=unavailable,state=LINE UP
cec01: 40.17.1.2: sp=secondary,ipadd=40.17.1.2,alt_ipadd=unavailable,state=LINE UP
The following steps can be used to recover the system from the Current Boot Side P/P situation, if the system is on the P/P side because of a failed firmware update. All the steps should be run:
(1) Check if the current power on side is perm (P-side):
rinv cec01 firm
If yes, go to step (2); otherwise, do NOT perform the following steps, because the current boot side is the T-side.
(2) Check if the pending power on side is T:
rspconfig cec01 pending_power_on_side
If not, set the pending power on side to T:
rspconfig cec01 pending_power_on_side=temp
(3) Remove the current connection:
rmhwconn cec01
(4) Recreate the connections:
mkhwconn cec01 -t
(5) Make sure the connection states are LINE UP:
lshwconn cec01
(It is required that the states are LINE UP; you may need to wait several minutes.)
(6) Recover the system:
rflash cec01 --recover
(7) Check the result:
rinv cec01 firm
The current power on side will now be temp (T-side).
xCAT DFM rcons supports the capability of opening a remote console to one or more LPARs. This section outlines this capability and the use of rcons to open consoles using xCAT DFM.
See the rcons man page for details and other options.
rcons <noderange>
By changing the nodelist specification, you can use the rcons command to open a console to the BPA or FSP ASM menu. It can also be used to open a console to an LPAR OS prompt. This support is useful for accessing the ASM menus and the console of an LPAR. Common uses for rcons include opening a console to monitor node installation or boot processing. rcons may also be used when other network access to an LPAR is not available, to determine the state of the OS on the LPAR.
Depending on the class of your systems, the dev user may or may not be enabled by default. P5 (Squadrons) systems have dev enabled with a dynamic password by default; P6 (Eclipz) and P7 (Apollo) have dev disabled. The celogin user is always enabled and has a dynamic password by default. You should access http://w3.pok.ibm.com/organization/prodeng/pw/ to enter the Password Request page and get the dynamic password. Then put the username 'celogin' and the password in the passwd table, or enter the username 'celogin' and the password of the CEC/Frame in the ppcdirect table. Before using the rspconfig command, make sure that the username 'celogin' and its password are valid.
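For example (a sketch; the key value, node name, and password placeholder are assumptions -- check your passwd and ppcdirect table schemas with tabdump):

```shell
# Store the dynamic celogin password as the default FSP/BPA credential:
chtab key=fsp passwd.username=celogin passwd.password=<dynamic_password>
# Or record it per service processor in the ppcdirect table:
chtab hcp=cec01 ppcdirect.username=celogin ppcdirect.password=<dynamic_password>
```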
Because this function is implemented through ASMI, users should make sure that enableASMI=yes is set in the site table. If not, run the following command:
chdef -t site enableASMI=yes
Then check the 'dev' and 'celogin1' state for the CEC/Frame:
rspconfig <noderange> dev
rspconfig <noderange> celogin1
If needed, run the following command to enable or disable the 'dev' and 'celogin1' accounts.
rspconfig <noderange> dev={enable|disable}
rspconfig <noderange> celogin1={enable|disable}
After this operation, users should set enableASMI=no in the site table:
chdef -t site enableASMI=no
For information on hardware discovery directly from the xCAT MN, see
XCAT_Power_775_Hardware_Management/#appendix-system-p-hardware-discovery-directly-from-xcat-mn
xCAT delivers the renergy command to manage the energy-related functions of System p hardware. Basically, the renergy command supports querying the power consumption, power capping, temperature, and CPU frequency of the hardware, and enabling/disabling power saving and power capping.
The renergy command can only operate against CEC or FSP objects. For the general attributes, the CEC objects can be managed by an HMC or managed directly through the FSP. But for the FFO functions, only a directly FSP-managed CEC/FSP is accepted.
Power consumption: query the average AC and DC consumption for a CEC. For certain types of CECs, the AC value is retrieved for the whole frame.
renergy CEC1 cappingstatus cappingmaxmin cappingvalue cappingsoftmin
renergy CEC1 ambienttemp exhausttemp
renergy CEC1 CPUspeed
renergy CEC1 savingstatus dsavingstatus fsavingstatus
If static power saving is turned on, the processor frequency and voltage will be dropped to a fixed value to save energy.
renergy CEC1 savingstatus=on
If dynamic power saving is turned on, the processor frequency and voltage will be dropped dynamically based on the core utilization. It supports two modes for the turned-on state: on-norm means normal, where the processor frequency cannot exceed the nominal value; on-maxp means maximum performance, where the processor frequency can exceed the nominal value.
renergy CEC1 dsavingstatus=on-norm
Set the CPU frequency to a fixed value to reduce power consumption.
renergy CEC1 fsavingstatus=on
Set the maximum power consumption for a CEC.
renergy CEC1 cappingstatus=on
renergy CEC1 cappingwatt=2500
For System p, the renergy command depends on the Energy Management Plug-in xCAT-pEnergy to communicate with the server. xCAT-pEnergy can be downloaded from the IBM web site: http://www.ibm.com/support/fixcentral/. (Other Software -> EM)
Product Group: Power
Product: Cluster Software
Cluster Software: System p Energy Management plug-in for xCAT (EM)
8203-E4A, 8204-E8A, 9125-F2A, 8233-E8B, 8236-E8C, 9125-F2C
Note: Not all the attributes are available for every type of hardware. Refer to the renergy man page for the support list for each hardware type.
Note: this section is only recommended for very large clusters. For most clusters it is much simpler to follow XCAT_System_p_Hardware_Management/#system-p-setup-with-hmc-discovery.
This chapter introduces how the xCAT MN can discover HMCs, System p frames with their BPAs, and CECs with their FSPs using the xCAT lsslp command. The System p hardware is discovered on the xCAT service network and then added to the xCAT database as node attributes.
Before performing hardware discovery, users should confirm the following database setup:
tabdump site
Make sure the following attributes in site table are set to match your xCAT cluster site environment:
domain
nameservers
ntpservers
dhcpinterfaces
chtab key=hmc passwd.username=hscroot passwd.password=abc123
Note: The username and password for xCAT to access an HMC can also be assigned directly to the HMC node object using the mkdef or chdef commands. This assignment is useful when a specific HMC has a username and/or password that is different from the default one specified in the passwd table. For example, to create an HMC node object and set a unique username or password for it:
mkdef -t node -o hmc1 groups=hmc,all nodetype=ppc hwtype=hmc mgt=hmc username=hscroot password=abc1234
or to change it if the HMC definition already exists:
chdef -t node -o hmc1 username=hscroot password=abc1234
The xCAT Management Node needs to be properly connected to the xCAT service network which is used with all HMCs, System P frames and CECs being used in the xCAT cluster. This service network should be located on a private subnet to allow the xCAT MN DHCP server to communicate with HMCs, BPAs (frame), and FSPs (CECs) in your cluster.
In a larger cluster where Service Nodes are being used for hardware management and OS deployment, then these Service Nodes also need to be connected to the private service network to communicate to each frame BPA and CEC FSP.
The hardware management function with an HMC connection is currently supported for System p hardware (P5, P6, P7) on xCAT 2.3.4 or later releases.
HMCs should be configured with static IP addresses on the HW service VLAN so that they can communicate with the xCAT MN. Because the xCAT MN runs the DHCP server on the service VLAN, the DHCP service on the HMCs should be turned off prior to performing the xCAT HW discovery function. (By default, the DHCP service is disabled for all network interfaces on an HMC.)
The DHCP service can be run from a different server that is connected to the xCAT service VLAN, instead of the xCAT MN. In this case, users need to configure the DHCP service manually on that DHCP server, and skip the step "Setup DHCP service on MN".
The following are the minimal steps required to set up the HMC network for static IP, and to enable the SLP and SSH ports using the HMC GUI. See the HMC website and documentation for more details.
The Frame and CEC should be configured to use dynamic IPs by default, so that the DHCP server can properly assign hardware IP addresses on the xCAT service VLAN. (If the administrator wants to use static IP addresses for the BPAs/FSPs, they must use the proper service VLAN subnet address range specified by the DHCP server.)
The xCAT administrator needs to make sure that BPA/FSP IP addresses and server node names are planned out and properly defined in the xCAT database and the DHCP environment. There should be no issues if this is a new xCAT System p cluster installation, where the frames and CECs are being specified in the xCAT database and HMC for the first time.
For existing xCAT 2.5 clusters set up with an HMC DHCP server environment, where the BPAs/FSPs are already known to the HMC and the xCAT DB, it is important to use the same existing BPA/FSP network IP addresses and server node names. This includes setting up the DHCP server's dynamic address ranges to match the current subnets used by the BPAs/FSPs.
If the service network requires changes to the BPA/FSP IP addresses, the administrator should plan to clean up the current BPA/FSP environment. This includes cleaning up both the HMC and the xCAT database for any IP address and server node name changes.
For the HMC, the administrator should plan to remove the existing frames and servers that will require new HW IP addresses or server hostnames working in the xCAT service VLAN. This will allow the xCAT mkhwconn command to reinitialize the frame and CECs using the xCAT DB information to make new HW connections to the HMC.
For the xCAT Management Node (MN), the administrator should review the xCAT database using the lsdef and tabdump commands to check for any existing HMC/frame/BPA/CEC/FSP node objects that require updates. The xCAT chdef command can be used to modify server node attributes, and the rmdef command can be used to remove HMC/frame/BPA/CEC/FSP node objects to get to a clean state. It is important that the xCAT administrator also clean up the Domain Name Service (DNS) and the /etc/hosts file to make sure the HMC/frame/server IP addresses and host names match the settings required for their xCAT cluster.
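As a rough sketch of this cleanup (the node name frame01 is only an example; substitute your own objects), inspecting, removing, and refreshing name resolution might look like:

```shell
# Inspect an existing frame object before deciding whether it needs cleanup
lsdef -t node -l frame01

# Remove a stale definition so it can be rediscovered cleanly
rmdef frame01

# After editing /etc/hosts, rebuild the DNS records from it
makedns -n
```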
This section describes the xCAT DB tables and commands used to work with xCAT HW discovery. It will properly define the xCAT support requirements for the HW service network, DHCP server, and how ip addresses are defined for the BPAs and FSPs.
All the FSPs and BPAs need to receive their dynamic IP addresses from the DHCP server. The first step is to create an xCAT network object in the xCAT DB, using the mkdef command, for the service VLAN used by the xCAT cluster.
Here is an example mkdef stanza for creating a network object:
vlan1:
objtype=network
dhcpserver=192.168.200.205
gateway=192.168.200.205
mask=255.255.255.0
mgtifname=en0
net=192.168.200.0
dynamicrange=192.168.200.1-192.168.200.224
In this example, the xCAT MN connects to the service vlan1 on network interface name en0. The "192.168.200.1-192.168.200.224" field indicates the dynamic ip range that is used by DHCP to give dynamic IP addresses to the BPA/FSPs on the service network. The IP address 192.168.200.205 is the DHCP server, which is also the xCAT MN.
The xCAT command makenetworks is executed on the MN when xCAT is installed and populates the xCAT networks table, but this command does not set the dynamicrange field. Use the following lsdef command to see if an entry for the service network has already been created:
lsdef -t network -l
If so, then you only need to set the dynamicrange attribute in the service network object using the xCAT chdef command.
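As an illustration, assuming the network object carries xCAT's default generated name (your object name may differ; check the lsdef output above), the dynamic range could be set with:

```shell
# Set the dynamic range on an existing network object.
# The object name 192_168_200_0-255_255_255_0 is xCAT's default naming
# convention and is an assumption here; use the name from your own DB.
chdef -t network -o 192_168_200_0-255_255_255_0 \
    dynamicrange=192.168.200.1-192.168.200.224
```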
For AIX clusters, there is a bootp service daemon on the xCAT MN that is used by default for AIX node installations. If the DHCP server for the service network is the xCAT management node, you will need to disable the bootp service and enable dhcpsd in rc.tcpip so that the dhcp service will start during system boot up.
Stop bootp from restarting by commenting out the bootps line in the /etc/inetd.conf file:
#bootps dgram udp wait root /usr/sbin/bootpd bootpd /etc/bootptab
Restart the inetd subsystem:
refresh -s inetd
Stop the running bootpd daemon by finding its process ID and killing it:
ps -ef | grep bootpd
kill <pid>
Start the DHCP server by uncommenting the dhcpsd entry in /etc/rc.tcpip:
start /usr/sbin/dhcpsd "$src_running"
Stop and restart the tcpip group
stopsrc -g tcpip
startsrc -g tcpip
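After restarting the tcpip group, you can confirm that the DHCP server subsystem came up:

```shell
# Verify the dhcpsd subsystem is active on the AIX MN
lssrc -s dhcpsd
```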
A static IP address must be configured for each network interface that the DHCP server on the xCAT Management Node uses to communicate on the service networks, so that the DHCP service can come up automatically after a reboot.
The following examples use eth0 as the DHCP network interface. On RHEL, edit /etc/sysconfig/network-scripts/ifcfg-eth0:
DEVICE=eth0
BOOTPROTO=static
HWADDR=00:14:5E:5F:20:90
IPADDR=192.168.200.205
NETMASK=255.255.255.0
ONBOOT=yes
On SLES, edit /etc/sysconfig/network/ifcfg-eth0:
DEVICE=eth0
BOOTPROTO=static
HWADDR=00:14:5E:5F:20:90
IPADDR=192.168.200.205
NETMASK=255.255.255.0
STARTMODE=onboot
On AIX, use mktcpip:
mktcpip -a 192.168.200.205 -i en0 -m 255.255.255.0
In the site table, the attribute dhcpinterfaces should be set to the network interfaces being used for hardware discovery. Assuming DHCP will be used for node installations, the network interface on the compute network should also be included here, but can be added at a later date.
chdef -t site clustersite dhcpinterfaces=en0
The xCAT command makedhcp can be used with the -n flag to create the DHCP service configuration file based on attributes found in the xCAT site and networks tables. In this configuration file, the dynamic address IP pool is created based on the dynamicrange field in the networks table.
makedhcp -n
If there are no definitions listed in the networks table and dhcpinterfaces is blank, the makedhcp command will try to generate a DHCP configuration for all active subnets found on the xCAT MN, even if no dynamic IP ranges are specified. Verify the DHCP configuration files on the xCAT MN to ensure that they contain only the networks you want.
cat /etc/dhcpd.conf # (Linux)
cat /etc/dhcpsd.cnf # (AIX)
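To spot-check that the dynamic range made it into the generated configuration, you can search for the service subnet, for example on Linux:

```shell
# Show the subnet stanza for the service network; the config path may be
# /etc/dhcp/dhcpd.conf on newer distributions
grep -A 4 "192.168.200.0" /etc/dhcpd.conf
```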
After physical installation and checkout of the Frames/CECs has been completed, power them on. All the FSPs and BPAs should get dynamic IP addresses from the DHCP server.
Note: If the frame was already powered on when DHCP was configured and started, you will have to restart the SLP daemon on each FSP before running lsslp:
$ killall slpd
$ netsSlp
Create skeleton definitions for the hardware components. They will be used by the lsslp command.
mkdef frame01-frame16 groups="all,frame" hcp=hmc
mkdef f[01-16]cec[01-12] groups="all,cec" hcp=hmc
mkdef f[01-16]fsp[01-12] groups="all,fsp"
Before doing the full lsslp discovery, you must specify a few vpd and ppc attributes in the skeleton definitions so that lsslp can associate the hardware components discovered on the network with the definitions in the xCAT database.
For the high end servers such as POWER 595/575 that exist in 24 inch frames, you only need to specify the Frame MTMS information, since the CECs (FSPs) will be automatically located by the BPA. For the System P low end servers that exist in 19 inch frames such as POWER 520, the CEC MTMS information must also be specified. To specify the VPD information, we will use lsslp to help us create a stanza file:
lsslp -m -s FRAME -z -i 192.168.200.205 > /tmp/frame.stanza
lsslp -m -s CEC -z -i 192.168.200.205 > /tmp/cec.stanza # only needed for low-end servers
For a high end server environment, the vpd table needs to be updated to include the Frame MTMS information, and the ppc table needs to be updated to include each CEC's parent, which is the Frame node that controls it.
For a low end server environment, the vpd table only needs to be updated to include the CEC MTMS information.
After collecting the Frame MTMS information for high end servers (or CEC MTMS information for low end servers) and assigning the proper node names in the stanza file, issue chdef to write them into the xCAT DB.
cat /stanza/file/Framepath | chdef -z
or
cat /stanza/file/CECpath | chdef -z
The vpd table looks like the following after the BPA/FSP MTMS information has been written into the xCAT DB.
For high end servers:
#node,serial,mtm,side,asset,comments,disable
"frame1","99200G1","9A00-100",,,,
... ...
"frame16","99410D1","9A00-100",,,,
Note: The frames' MTMS information can be obtained from the BPA stanza file.
For low end servers:
#node,serial,mtm,side,asset,comments,disable
"f1c1","100538P","8233-E8B",,,,
... ...
"f16c16","100496P","8233-E8B",,,,
For System P low end servers, the FSP MTMS information is found in the FSP stanza file created by the lsslp command.
For high end CECs working with a BPA (frame), the xCAT ppc table also needs to be updated to include the cage location information, the frame parent for each CEC, and the nodetype indicating the hardware type of the Frames and CECs. For low end CECs, the ppc table only needs the nodetype attribute for the CECs. The following commands write this information into the ppc table:
High end servers:
chdef frame1 nodetype=frame
.
chdef frame16 nodetype=frame
chdef f1c1 id=1 parent=frame1 nodetype=cec
chdef f1c2 id=2 parent=frame1 nodetype=cec
.
chdef f16c2 id=2 parent=frame16 nodetype=cec
Low end servers:
chdef f1c1 nodetype=cec
.
chdef f16c2 nodetype=cec
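Since nodetype and cage id follow a regular pattern, the per-node chdef commands above can be compressed with xCAT noderange syntax (the parent attribute still differs per frame, so it must be set frame by frame). A sketch, assuming the node names follow the frameN/fNcM convention used above:

```shell
# Set nodetype for all frames and CECs in one pass using noderanges
chdef frame[1-16] nodetype=frame
chdef f[1-16]c[1-2] nodetype=cec

# The cage id is the same for every "c1" and every "c2" CEC
chdef f[1-16]c1 id=1
chdef f[1-16]c2 id=2

# parent must still be set per frame, e.g. for frame1:
chdef f1c[1-2] parent=frame1
```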
The ppc table will look like:
#node,hcp,id,pprofile,parent,supernode,comments,disable
"f1c1",,"1",,"frame1a",,,
"f1c2",,"2",,"frame1a",,,
.
"f16c1",,"1",,"frame16a",,,
"f16c2",,"2",,"frame16a",,,
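After the chdef updates, the resulting table contents can be checked directly:

```shell
# Dump the vpd and ppc tables to confirm the MTMS, cage, and parent data
tabdump vpd
tabdump ppc
```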
The xCAT command lsslp is used to discover the HMCs/frames/CECs and to collect their hardware and network information from the DHCP server. It can then write the discovered information into the xCAT DB. It can generate output in different formats, including RAW, XML, and stanza format. We recommend using the -z flag to create stanza files, so the administrator can review the HW data before placing it in the xCAT DB. The lsslp command also supports the -w flag, which writes the HW discovery data directly into the xCAT DB if the administrator does not need to make any changes.
See man page of lsslp for details.
Note: Even if you work with xCAT Direct FSP Management, you still need to discover the HMC as described below and make the connections between the HMC and the hardware. The HMC will always be used for Service Focal Point and for Service Repair and Verify procedures. You always need to discover the Frames and CECs and make their connections to the xCAT management node or service node.
Issue lsslp to locate the HMC information and write it into an HMC stanza file. You will need to use the -m (multicast) flag with the later supported HMC V7R35x/V7R71x levels.
lsslp -m -s HMC -z -i 192.168.200.205 > /hmc/stanza/file
Review the HMC stanza file and make modifications if necessary.
You will need to include the username and password attributes being used by the target HMC node. Make sure that the HMC host name and ip address is resolvable in the xCAT cluster name resolution (/etc/hosts, DNS).
Write the HMC stanza information into xCAT DB with xCAT command mkdef.
cat /hmc/stanza/file | mkdef -z
Issue the lsslp command to gather the Frame information and write it into a Frame stanza file.
lsslp -s BPA -z -i 192.168.200.205 > /frame/stanza/file
Review the frame stanza file and make modifications if necessary. You may want to update the frame server hostnames and/or BPA ip addresses to match your planned xCAT configuration.
Write the Frame stanza information into the xCAT DB with the xCAT command mkdef.
cat /frame/stanza/file | mkdef -z
Issue lsslp to get CEC information and write into the CEC stanza file.
lsslp -s FSP -z -i 192.168.200.205 > /CEC/stanza/file
Review the CEC stanza file and make modifications if necessary. You may want to update the CEC hostnames and/or FSP ip addresses to match your planned xCAT configuration.
Write the CEC stanza information into xCAT DB with xCAT command mkdef.
cat /CEC/stanza/file | mkdef -z
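To confirm the newly created objects, you can list them by group (the skeleton definitions above placed the CECs in the cec group):

```shell
# List all node objects in the cec group with full attributes
lsdef -l cec
```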
You should only write directly into the xCAT DB if you are certain that the BPA/FSP server data reported by the lsslp command is correct. This is typically done by experienced xCAT administrators, or when adding new System P servers into an existing xCAT cluster. The lsslp -w flag will update existing xCAT DB data if the new HW discovery finds any BPA/FSP node conflicts. The lsslp -n flag locates only newly found System P BPA/FSP servers during HW discovery; it can be combined with the -z stanza flag or the -w flag, but will only report or add new BPA/FSP servers into the xCAT DB. The xCAT administrator always has the option to use the chdef command to add or modify attributes of the HMC/BPA/FSP node objects.
When the FSPs and BPAs are powered up for the first time, their MAC addresses are not known by the DHCP server or the administrator. We can only use a dynamic IP range in the DHCP configuration file, so each FSP or BPA gets a dynamic IP address. The address assigned to an FSP or BPA can change whenever its DHCP client restarts, so we recommend using a dynamic IP range large enough to avoid address reuse. Be aware that dynamic IP addresses increase the maintenance effort, because the BPA/FSP addresses are not guaranteed: if an FSP/BPA address changes during a reboot, the HMC and xCAT DFM connections are lost and the administrator must perform recovery steps. Even so, the dynamic DHCP address approach works well for most scenarios when a proper dynamic range is used on the cluster service network.
xCAT provides a way to make the dynamic IP addresses permanent and avoid the above issue. If the DHCP client MAC-address-to-IP-address mapping is recorded in the AIX DHCP configuration file /etc/dhcpsd.cnf, the RHEL lease file /var/lib/dhcpd/dhcpd.leases, or the SLES lease file /var/lib/dhcp/db/dhcp.leases, the BPA/FSP will consistently receive the same IP address from the DHCP server, and its address will not change across reboots.
Issue the makedhcp -a command to write the dynamic IP addresses, server host names, and MAC addresses of the BPAs and FSPs from the xCAT DB into the AIX DHCP configuration file /etc/dhcpsd.cnf, the RHEL lease file /var/lib/dhcpd/dhcpd.leases, or the SLES lease file /var/lib/dhcp/db/dhcp.leases. See the makedhcp man page for more details:
makedhcp -a
You can also run the lsslp command with the --makedhcp option to update the DHCP configuration with the dynamic IP address, server name, and MAC address of each BPA/FSP. This uses the information from the xCAT DB and updates the AIX DHCP file /etc/db_file.cr, the RHEL lease file /var/lib/dhcpd/dhcpd.leases, or the SLES lease file /var/lib/dhcp/db/dhcp.leases.
The following command will create an xCAT node definition for an HMC with a host name of hmc1. The groups, nodetype, mgt, username, and password attributes will be set.
mkdef -t node -o hmc1 groups=hmc,all nodetype=ppc hwtype=hmc mgt=hmc username=hscroot password=abc123
To change and add new groups:
chdef -t node -o hmc1 groups=hmc,rack1,all
To verify your data:
lsdef -l hmc1
If the xCAT Management Node is on the same service network as the HMC, you can discover the HMC and create an xCAT node definition for it automatically.
lsslp -w -s HMC
To check for the hmc name added to the nodelist:
tabdump nodelist
The above xCAT command lsslp discovers and writes the HMCs into xCAT database, but we still need to set HMCs' username and password.
chdef -t node -o <hmcname from lsslp> username=hscroot password=abc123
Change the hcp and mgt attributes of the Frames/CECs:
chdef frame hcp=hmc1 mgt=hmc
Have HMC establish connections to all of the frames:
mkhwconn frame -t
Verify the connections were made successfully:
lshwconn frame
frame14: connected
frame14: connected
If the BPA passwords are still the factory defaults, you must change them before running any other commands against them:
rspconfig frame general_passwd=general,<newpd>
rspconfig frame admin_passwd=admin,<newpd>
rspconfig frame HMC_passwd=,<newpd>
The definition of a node is stored in several tables of the xCAT database.
You can use the rscan command to discover the HCP and get the nodes managed by it. The discovered nodes can be stored into a stanza file. Then edit the stanza file to keep only the nodes you want to create, and use the mkdef command to create the node definitions.
Run the rscan command to gather the LPAR information. This command can be used to display the LPAR information in several formats and can also write the LPAR information directly to the xCAT database. In this example we will use the "-z" option to create a stanza file that contains the information gathered by rscan as well as some default values that could be used for the node definitions.
To write the stanza format output of rscan to a file called "node.stanza", run the following command. We assume, for our example, that the HMC name returned from lsslp was hmc1.
rscan -z hmc1 > node.stanza
This file can then be checked and modified as needed. For example you may need to add a different name for the node definition or add additional attributes and values.
Note: The stanza file will contain stanzas for things other than the LPARs. This information must also be defined in the xCAT database. The stanza file will repeat the same BPA information for multiple FSPs. It is not necessary to modify the non-LPAR stanzas in any way.
The stanza file will look something like the following.
Server-9117-MMA-SN10F6F3D:
objtype=node
nodetype=fsp
id=5
model=9118-575
serial=02013EB
hcp=hmc01
pprofile=
parent=Server-9458-10099201WM_A
groups=fsp,all
mgt=hmc
pnode1:
objtype=node
nodetype=lpar,osi
id=9
hcp=hmc1
pprofile=lpar9
parent=Server-9117-MMA-SN10F6F3D
groups=lpar,all
mgt=hmc
cons=hmc
pnode2:
objtype=node
nodetype=lpar,osi
id=7
hcp=hmc1
pprofile=lpar6
parent=Server-9117-MMA-SN10F6F3D
groups=lpar,all
mgt=hmc
cons=hmc
Note: The rscan command supports an option (-w) to automatically create node definitions in the xCAT database. To do this, the LPAR name gathered by rscan is used as the node name, and the command sets several default values. If you use the -w option, make sure the LPAR name you defined is the name you want used as your node name.
For a node which was defined correctly before, you can use the following commands to export the definition into the node.stanza, then edit the node.stanza file and then update the database with changes.
lsdef -z nodename > node.stanza
cat node.stanza | chdef -z
The information gathered by the rscan command with the -z option is in stanza format and can also be used to create xCAT node definitions by running the following command:
cat node.stanza | mkdef -z
Verify the data:
lsdef -t node -l all
For P7/IH, create a cluster config file. The customer creates a hardware/cluster configuration data file that contains the information enumerated below. The purpose of this file is for the customer to describe how all the discovered hardware components should be logically arranged, ordered, and configured. During the discovery phase (step 5), we first discover via SLP all of the raw hardware components (HMCs, BPAs, FSPs) on the service network, but this only gives us basic information about each component (MTMS, MAC, etc.). We do not know the physical arrangement of the components, nor the IPs/hostnames the customer wants for each one, so the customer provides that information in this file. You can think of it as a cluster plan or blueprint, used both to automate the configuration of the system and to verify the cluster configuration. The cluster config file also allows the customer to provide some basic information about the other HPC products so that a basic setup of them can be accomplished throughout the cluster. The format of the file is a typical stanza file format. See the cluster config file mini-design, Cluster_config_file, for more details, and the xcatsetup man page http://xcat.sourceforge.net/man8/xcatsetup.8.html for the exact format and keywords.
xcatsetup <cluster_config_file>
The following are limitations of HW discovery when working with xCAT 2.4 and above:
lsslp -s FSP -i 192.168.200.205 -t 5 -c 3000,3000,3000,3000,3000
See lsslp man page for the details.
For HMCs at the V7R350 and V7R340 release levels, we have experienced some HMC discovery issues with "lsslp -m" in certain layer 2/layer 3 Ethernet switch environments. In this case, the xCAT admin may have to manually create the HMC node objects using the xCAT command mkdef.
If you run the xCAT command lsslp with the "-w" flag to auto-discover BPAs/FSPs and create BPA/FSP nodes in the xCAT DB, there are some types of Frames/CECs whose user-defined BPA/FSP system names cannot be resolved by xCAT, because the node name created by lsslp is not consistent with the system name known by the HMC. This limitation will not block most xCAT functions. If system admins want to sync the user-defined system names used by the HMC into the xCAT DB, run rscan with the -u option to update the FSP/BPA node names in the xCAT database. The rscan -u command should only be executed after running the mkhwconn command.
Wiki: DFM_Service_Node_Hierarchy_support
Wiki: Hints_and_Tips_for_Large_Scale_Clusters
Wiki: Power_775_Cluster_Documentation
Wiki: Power_775_Cluster_Recovery
Wiki: Power_775_Cluster_on_MN
Wiki: Setting_Up_a_Linux_Hierarchical_Cluster
Wiki: Setting_Up_an_AIX_Hierarchical_Cluster
Wiki: Setup_HA_Mgmt_Node_With_DRBD_Pacemaker_Corosync
Wiki: Setup_HA_Mgmt_Node_With_RAID1_and_disks_move
Wiki: Setup_HA_Mgmt_Node_With_Shared_Data
Wiki: Setup_HA_Mgmt_Node_With_Shared_Disks
Wiki: Setup_for_Power_775_Cluster_on_xCAT_MN
Wiki: XCAT_AIX_Diskless_Nodes
Wiki: XCAT_System_p_Hardware_Management_for_HMC_Managed_Systems
Wiki: XCAT_pLinux_Clusters_775