[SSI-users] README.CVIP - updates for multi-interface scenarios
Brought to you by: brucewalker, rogertsang
From: Bruce W. <br...@ka...> - 2004-07-07 04:28:07
Updated readme that will be distributed in rc6 and be on the Web. Note that
this is specific to RH and we need someone to adapt it for Debian.

bruce

Configuring and Debugging CVIP/LVS on OpenSSI

There are 2 fundamental configurations for your cluster networking (with, of
course, some variations of those). Below we look first at configuration A,
the simple configuration where each node has only one ethernet interface.
Configuration B is where each node has an interface on an internal cluster
interconnect and some or all nodes have a second interface visible outside
the cluster.

Configuration A - Each node has only one ethernet interface.

Before setting up LVS/CVIP, look at the following pieces to understand the
networking setup. There are several utilities you can use to confirm your
configuration, and the output of some of these commands is useful if there
are difficulties.

a. "cat /etc/clustertab" will show you the ip address each node is using on
   the cluster interconnect.
b. On each node, run ifconfig and you should see the corresponding address
   associated with an eth device (say eth0).
c. Run "host <ipaddr>" for each node's ip address and you should get a name
   that matches /etc/nodename for that node (e.g. "onall cat /etc/nodename"
   should give you the unique names for each node in the cluster).
d. If nodes in the cluster are to PXE boot, each node's MAC address and ip
   address as in /etc/clustertab should also be in /etc/dhcpd.conf.
e. Each node should have a default route to enable it to talk to machines
   outside the cluster. "onnode -p # route" can be used to ensure each node
   has a route.

Configuring LVS/CVIP in this configuration is pretty straightforward.
However, it is not the same as on non-OpenSSI clusters and you SHOULD NOT
run ipvsadm except with the -L option. There is a writeup on ipvs/LVS/HA-LVS
in /usr/share/doc/openssi*/README.ipvs. A summary of the important points,
assuming you are on RH, is:

a. Pick an ip address for the cluster virtual ip address (CVIP); it must be
   on the same subnet as the ip address on the physical interface you want
   the traffic to come in on. You can make it the same address as that
   already on eth0, but if you do, you won't be able to fail over the
   address if that node fails (i.e. no HA-LVS capability).
b. /etc/clustername should be updated to reflect the DNS name assigned to
   the address selected in step a.
c. Edit /etc/cvip.conf to put in the CVIP and set up the director (or
   directors, if you want failover) and server nodes; the garp and sync
   interfaces will be eth0. A sample copy of the contents (with a CVIP
   address of a.b.c.d) is included below.
d. Make sure the ha-lvs service is enabled (using chkconfig).
e. Edit the "setport-weight" command in the /etc/init.d/ha-lvs script to
   designate the ports you want load levelled (I just do 1 to 80 so I get
   ssh, rsh, telnet, http).
f. A reboot is a good idea at this point to be safe, but I'm sure it is not
   absolutely necessary. During the reboot you should see the ha-lvs service
   started on all nodes, although it will exit after doing its thing on
   non-director nodes.

After you are back up, you can confirm the LVS/CVIP setup in several ways:

a. Do an "onall ifconfig" and you should see your director node with an
   eth0:xxx entry with the ip address of the CVIP. Other nodes should have a
   lo:xxx entry with the ip address of the CVIP.
b. As root, "cat /proc/cluster/lvs" and you should see the CVIP and the
   director node; you can also "cat /proc/cluster/ip_vs_portweight" to see
   what ports are to be distributed.
c. As root, run "ipvsadm -L" on the director node and you should see the
   ports that are being load levelled (the output has limited columns for
   hostname:service, so sometimes the service doesn't show up; run
   "ipvsadm -Ln" if necessary).
d. For some service/port you configured (like ssh), start some connections
   and check the "ipvsadm -L" output to see that the active connections
   column is spreading the load between the servers for the service.

Note that if the service (like sshd) is not running on a node, it won't show
up in the "ipvsadm -L" output and of course no connections will be sent to
that node for that service. To determine what services are running where,
you can execute "service --status-all" and check them out. Note that you can
"service ha-lvs stop" and "service ha-lvs start" to reset things, BUT there
is a current restriction that you must restart xinetd (and any other
services with ports you want load levelled) after you restart ha-lvs (we
have all forgotten to do that from time to time and of course nothing gets
load levelled).

Example Configuration file format for /etc/cvip.conf
----------------------------

<?xml version="1.0"?>
<cvips>
  <cvip>
    <ip_addr>a.b.c.d</ip_addr>
    <director_node>
      <node_num>1</node_num>
      <garp_interface>eth0</garp_interface>
      <sync_interface>eth0</sync_interface>
    </director_node>
    <director_node>
      <node_num>2</node_num>
      <garp_interface>eth0</garp_interface>
      <sync_interface>eth0</sync_interface>
    </director_node>
    <real_server_node>
      <node_num>1</node_num>
    </real_server_node>
    <real_server_node>
      <node_num>2</node_num>
    </real_server_node>
  </cvip>
</cvips>

The above configuration is for a two-node cluster with both nodes acting as
director and real server.

Configuration B - Each node has a cluster interconnect interface and some or
all nodes also have an interface visible outside the cluster.

Presumably the "external" interfaces will be on a different subnet than the
internal interfaces. Within this configuration there are 2 variations, which
are distinguished below: either the internal network is on a "private" IP
subnet (10.x.x.x, 192.168.x.x, etc.) or the internal network is public.
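The "private" ranges in question are the RFC 1918 blocks (10.0.0.0/8,
172.16.0.0/12, 192.168.0.0/16). As a quick illustration, here is a small
shell sketch of our own (not an OpenSSI utility) that classifies an
interconnect address, which tells you whether the LVS-NAT variation applies:

```shell
#!/bin/sh
# Classify an IPv4 address as RFC 1918 private or public.
# Illustrative sketch only; the function name is our own invention.
is_private() {
    case "$1" in
        10.*)                                   echo private ;;
        192.168.*)                              echo private ;;
        172.1[6-9].*|172.2[0-9].*|172.3[0-1].*) echo private ;;
        *)                                      echo public ;;
    esac
}

is_private 192.168.1.10   # private -> LVS-NAT variation applies
is_private 131.107.5.2    # public
```

Run it against each address in /etc/clustertab to see which variation of
Configuration B you are in.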
Before setting up LVS/CVIP, look at the following pieces to understand the
networking setup. There are several utilities you can use to confirm your
configuration, and the output of some of these commands is useful if there
are difficulties.

a. "cat /etc/clustertab" will show you the ip address each node is using on
   the cluster interconnect.
b. On each node, run ifconfig and you should see the corresponding address
   associated with an eth device (say eth0).
c. Run "host <ipaddr>" for each node's cluster interconnect ip address and
   you should get a name that matches /etc/nodename for that node (e.g.
   "onall cat /etc/nodename" should give you the unique names for each node
   in the cluster).
d. If nodes in the cluster are to PXE boot, each node's MAC address and ip
   address as in /etc/clustertab should also be in /etc/dhcpd.conf.
e. Each node should have a default route to enable it to talk to machines
   outside the cluster. "onnode -p # route" can be used to ensure each node
   has a route. If there are nodes which only have internal interfaces, they
   may need to route through another node in the cluster. In addition to
   ensuring their route is correct, the gateway node must have IP forwarding
   turned on.

The "external" interfaces (interfaces not used within the cluster but used
to communicate outside the cluster) may already be configured or may need to
be configured. They can be configured on a different subnet than the one
being used for the cluster interconnect; configuring them can be done via
redhat-config-network. Once they are configured, run "onall ifconfig" to see
that nodes with 2 interfaces have an eth0 and an eth1.

In the multiple-interface, multiple-subnet case, one must take routing into
account.
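Eyeballing route tables across many nodes is tedious. As an illustration,
the default gateway can be pulled out of "route -n" / "netstat -rn" style
output with a one-line awk filter (the helper name default_gw is our own,
not an OpenSSI command):

```shell
#!/bin/sh
# Extract the default gateway from "route -n"-style output, where the
# default route's destination column is 0.0.0.0. Sketch only.
default_gw() {
    awk '$1 == "0.0.0.0" { print $2 }'
}

# Canned example line; on a real node you would run:  route -n | default_gw
printf '0.0.0.0 192.168.1.1 0.0.0.0 UG 0 0 0 eth0\n' | default_gw
# -> 192.168.1.1
```

A node that prints nothing has no default route and will not be able to
reach the outside environment.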
If some of the cluster nodes only have an interface on the internal network
(whether private or not), they will need a route through one of the other
cluster nodes that does have an external interface so they can communicate
with the outside environment (and the gateway nodes must have IP forwarding
turned on). You can run "netstat -r" on each node to determine the route
table on that node. If the internal nodes are on a private network, you must
use LVS-NAT to allow those nodes to be servers, and you must set up NAT to
allow the internal nodes to communicate with the outside environment.

Configuring LVS/CVIP in this configuration is pretty straightforward.
However, it is not the same as on non-OpenSSI clusters and you SHOULD NOT
run ipvsadm except with the -L option. There is a writeup on ipvs/LVS/HA-LVS
in /usr/share/doc/openssi*/README.ipvs. A summary of the important points,
assuming you are on RH, is:

a. Pick an ip address for the cluster virtual ip address (CVIP); it must be
   on the same subnet as the ip address on the physical interface you want
   the traffic to come in on (in this case the external network, say eth1).
   It should be a different address than the existing eth1 address to allow
   for failover.
b. /etc/clustername should be updated to reflect the DNS name assigned to
   the address selected in step a.
c. Edit /etc/cvip.conf to put in the CVIP and set up the director (or
   directors, if you want failover) and server nodes; the garp interface
   must be eth1 (the interface CVIP traffic is to come in on) and the sync
   interface will be eth0 (the interface nodes in the cluster use to talk to
   each other). A sample copy of the contents (with a CVIP address of
   a.b.c.d) is included below.
d. Make sure the ha-lvs service is enabled (using chkconfig).
e. Edit the "setport-weight" command in the /etc/init.d/ha-lvs script to
   designate the ports you want load levelled (I just do 1 to 80 so I get
   ssh, rsh, telnet, http).
f. A reboot is a good idea at this point to be safe, but I'm sure it is not
   absolutely necessary. During the reboot you should see the ha-lvs service
   started on all nodes, although it will exit after doing its thing on
   non-director nodes.
g. If there are server nodes on an internal network that is not private,
   they must have a route back to the external environment and the gateway
   you specify for them must have IP forwarding on.
h. If there are server nodes on an internal network that is private, you
   must use LVS-NAT and set up NAT on the director nodes. In addition, each
   server node must have a route back to the external environment and the
   gateway you specify for them must have IP forwarding on.

After you are back up, you can confirm the LVS/CVIP setup in several ways:

a. Do an "onall ifconfig" and you should see your director node with an
   eth1:xxx entry with the ip address of the CVIP. Other nodes should have a
   lo:xxx entry with the ip address of the CVIP.
b. As root, "cat /proc/cluster/lvs" and you should see the CVIP and the
   director node; you can also "cat /proc/cluster/ip_vs_portweight" to see
   what ports are to be distributed.
c. As root, run "ipvsadm -L" on the director node and you should see the
   ports that are being load levelled (the output has limited columns for
   hostname:service, so sometimes the service doesn't show up; run
   "ipvsadm -Ln" if necessary).
d. For some service/port you configured (like ssh), start some connections
   and check the "ipvsadm -L" output to see that the active connections
   column is spreading the load between the servers for the service.
e. If you are using the director node as a gateway for some of the server
   nodes, you can check that it is doing IP forwarding by executing
   "cat /proc/sys/net/ipv4/ip_forward" and seeing a "1".

Note that if the service (like sshd) is not running on a node, it won't show
up in the "ipvsadm -L" output and of course no connections will be sent to
that node for that service.
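The IP-forwarding check can be scripted so it is easy to repeat on every
gateway node. This is a sketch of our own; FORWARD_FILE is parameterized
only so the logic can be exercised without touching a real /proc, but on a
gateway node it is /proc/sys/net/ipv4/ip_forward:

```shell
#!/bin/sh
# Report whether IP forwarding is enabled. On a real gateway node,
# FORWARD_FILE should be /proc/sys/net/ipv4/ip_forward (the default below);
# the override exists only so this sketch can be demonstrated safely.
FORWARD_FILE="${FORWARD_FILE:-/proc/sys/net/ipv4/ip_forward}"

check_forwarding() {
    if [ "$(cat "$FORWARD_FILE")" = "1" ]; then
        echo "IP forwarding on"
    else
        echo "IP forwarding OFF - internal nodes cannot route through this node"
    fi
}
```

To turn forwarding on persistently on RH, set net.ipv4.ip_forward = 1 in
/etc/sysctl.conf; "echo 1 > /proc/sys/net/ipv4/ip_forward" enables it until
the next reboot.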
To determine what services are running where, you can execute
"service --status-all" and check them out. Note that you can
"service ha-lvs stop" and "service ha-lvs start" to reset things, BUT there
is a current restriction that you must restart xinetd (and any other
services with ports you want load levelled) after you restart ha-lvs (we
have all forgotten to do that from time to time and of course nothing gets
load levelled). Check a-d above after any ha-lvs stop/start.

Example Configuration file format for /etc/cvip.conf
----------------------------

<?xml version="1.0"?>
<cvips>
  <cvip>
    <ip_addr>a.b.c.d</ip_addr>
    <director_node>
      <node_num>1</node_num>
      <garp_interface>eth1</garp_interface>
      <sync_interface>eth0</sync_interface>
    </director_node>
    <director_node>
      <node_num>2</node_num>
      <garp_interface>eth1</garp_interface>
      <sync_interface>eth0</sync_interface>
    </director_node>
    <real_server_node>
      <node_num>2</node_num>
    </real_server_node>
    <real_server_node>
      <node_num>3</node_num>
    </real_server_node>
  </cvip>
</cvips>

The above configuration is for a three-node cluster with nodes 1 and 2
acting as director and failover director and nodes 2 and 3 as the real
servers.
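If you maintain several clusters, hand-editing /etc/cvip.conf invites
copy-paste mistakes. The following is purely an illustrative generator
sketch, not a shipped OpenSSI tool: the CVIP value and node numbers are
placeholders, and OUT defaults to the current directory so the sketch is
safe to try (on a real cluster you would point it at /etc/cvip.conf):

```shell
#!/bin/sh
# Write a one-director, two-server cvip.conf in the format shown above.
# CVIP and the node numbers are placeholders; edit them for your cluster.
# OUT defaults to ./cvip.conf here; use OUT=/etc/cvip.conf on a real node.
CVIP="a.b.c.d"
OUT="${OUT:-cvip.conf}"

cat > "$OUT" <<EOF
<?xml version="1.0"?>
<cvips>
  <cvip>
    <ip_addr>$CVIP</ip_addr>
    <director_node>
      <node_num>1</node_num>
      <garp_interface>eth1</garp_interface>
      <sync_interface>eth0</sync_interface>
    </director_node>
    <real_server_node>
      <node_num>2</node_num>
    </real_server_node>
    <real_server_node>
      <node_num>3</node_num>
    </real_server_node>
  </cvip>
</cvips>
EOF
```

Remember that after changing /etc/cvip.conf you still need to restart
ha-lvs (and then xinetd, per the restriction noted above) for the change to
take effect.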