Note: this is an xCAT design document, not an xCAT user document. If you are an xCAT user, you are welcome to glean information from this design, but be aware that it may not contain complete or up-to-date procedures.
Note: See [Multiple_Zone_Support2]
Two new customer requirements are covered by this design.
The first requirement is to be able to take an xCAT cluster managed by one xCAT Management Node and divide it into multiple zones. The nodes in each zone will share common root ssh keys. This allows the nodes in a zone to ssh to each other without a password, but not to nodes in any other zone. You might even call them secure zones.
Note: These zones share a common xCAT Management Node and database, including the site table, which defines the attributes of the entire cluster.
There will be no support for AIX.
The multiple zone support requires several enhancements to xCAT.
Currently, xCAT replaces the root ssh keys generated at install time on the service nodes (SN) and compute nodes (CN) with the root ssh keys from the Management Node (MN). It also replaces the ssh hostkeys on the SN and CN with a set of pre-generated hostkeys from the MN. Putting the MN public key in the authorized_keys file on the service nodes and compute nodes allows passwordless ssh from the Management Node to the service nodes and compute nodes. This setup also allows passwordless ssh between all compute nodes and service nodes. The pre-generated hostkeys make all nodes look the same to ssh, so you are never prompted for updates to known_hosts.
Having zones that cannot ssh without a password to nodes in other zones requires xCAT to generate a set of root ssh keys for each zone and install them on the compute nodes in that zone. In addition, the MN public key must still be put in the authorized_keys file on all the nodes in a non-hierarchical cluster, or the Service Node public key on all the nodes it services for hierarchical support.
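The key-distribution rule above can be sketched as follows. This is a hypothetical illustration only, not xCAT code; the function and data-structure names are invented for the example.

```python
# Hypothetical sketch (not xCAT code): choose the public keys that go into a
# node's authorized_keys file.  The MN (or, in a hierarchical cluster, the
# servicing SN) public key is always installed; a zone key is added only when
# the node is a member of a zone.
def authorized_keys_for(node, node_zone, zone_pubkeys, mgmt_pubkey):
    """node_zone: node name -> zone name; zone_pubkeys: zone name -> public key."""
    keys = [mgmt_pubkey]                  # MN/SN always keeps passwordless access
    zone = node_zone.get(node)
    if zone is not None:
        keys.append(zone_pubkeys[zone])   # nodes in the same zone share this key
    return keys

# cn1 belongs to zoneA and gets both keys; cn9 is unzoned and gets only the MN key.
print(authorized_keys_for("cn1", {"cn1": "zoneA"},
                          {"zoneA": "ssh-rsa AAA...zoneA"}, "ssh-rsa AAA...mn"))
```

Because the MN key is unconditionally included, the management node keeps passwordless access to every node regardless of zone membership, which matches the requirement above.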
We will still use the MN root ssh keys on any service nodes. Service nodes will not be allowed to be members of a zone. Service nodes must be able to ssh to each other without a password, especially to support service node pools.
We will still use the MN root ssh keys on any devices, switches, and hardware control points. All ssh access to these is done from the MN or SN, so they will not be part of any zone.
To support multiple zones, the following changes are proposed:
A new table, zone, will be created with the following attributes:

* zonename - the zone name (key)
* sshkeydir - directory containing the zone's root ssh RSA keys
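A minimal sketch of how the new table might be consulted, assuming a hypothetical lookup helper and an illustrative key-directory path (both invented for this example). Unzoned nodes fall back to the traditional /root/.ssh keys, so existing clusters are unaffected.

```python
# Hypothetical sketch of the zone table and its lookup (names illustrative).
ZONE_TABLE = {
    # zonename (key) -> attributes
    "zoneA": {"sshkeydir": "/etc/xcat/sshkeys/zoneA/.ssh"},
    "zoneB": {"sshkeydir": "/etc/xcat/sshkeys/zoneB/.ssh"},
}

def sshkeydir_for(node, node_zone, zone_table=ZONE_TABLE, default="/root/.ssh"):
    """Return the directory holding the root ssh keys to install on `node`."""
    zone = node_zone.get(node)
    if zone is None:
        return default                    # no zone: behave exactly as today
    return zone_table[zone]["sshkeydir"]
```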
For this implementation we are proposing the commands below. A move operation, I think, is very complex and out of the scope of 2.8.4 support; this can be debated.
Note: these commands will be packaged in the xCAT-server rpm. They must run on a Linux Management Node; there will be no support for AIX in this release. I think checking that the MN is Linux is enough for now when a command runs. We could also check whether any node in the noderange is AIX (mixed clusters).
mkzone will have the following interface:

mkzone <noderange> <zonename> | --defaultzone [-k <full path to the ssh private key>]

Note: -k is optional; either <zonename> or --defaultzone must be provided.

mkzone will be used to do the following:
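The argument rule in the synopsis (either a zonename or --defaultzone, with -k optional) can be checked as in this hypothetical sketch; validate_mkzone is invented for illustration and is not part of xCAT:

```python
# Hypothetical sketch: mkzone requires either a <zonename> or --defaultzone,
# while -k (path to an existing ssh private key) stays optional.
def validate_mkzone(zonename=None, defaultzone=False, keypath=None):
    if zonename is None and not defaultzone:
        raise ValueError("mkzone: supply a zonename or --defaultzone")
    return {"zonename": zonename, "defaultzone": defaultzone, "keypath": keypath}
```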
rmzone will be used to do the following:

rmzone -n <zonename>
chzone will be used to do the following:

chzone -n <zonename> [-k <full path to the ssh private key>] [-K] [-a <noderange>] [-r <noderange>]
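The -a (add) and -r (remove) noderange options amount to set updates on the zone's membership; a hypothetical sketch (the function name is invented for illustration):

```python
# Hypothetical sketch of chzone's -a (add) and -r (remove) noderange handling.
def update_zone_members(members, add=(), remove=()):
    members = set(members) | set(add)     # -a <noderange>: add nodes to the zone
    return members - set(remove)          # -r <noderange>: remove nodes from it
```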
lszone will be used to do the following:

lszone [-n <zonename>]
Note: nodels can display the zonename attribute to show which nodes are assigned to each zone.
This support affects several existing xCAT components:
Some of the issues discussed:
If a node is not defined in a zone, root ssh keys and passwords must work as they do today. This ensures that an xCAT upgrade does not disrupt an existing xCAT installation.
We would like to have all customers using a generated root ssh key. I think that, with this support documented, the mkzone command gives them the ability to switch from using the keys in root/.ssh to a newly generated key. They can define their zone as all the compute nodes in their cluster and use the --defaultzone option. This leaves the change under their control.
We would need a new document on setting up and managing this type of cluster. Hierarchy adds even more complexity.
Wiki: Multiple_SubCluster_support
Wiki: Multiple_Zone_Support2