Re: [SSI] [Design] Cluster wide TCP and LVS
From: Bruce W. <br...@ka...> - 2002-04-29 22:13:55
Aneesh, I believe we can use a variant of the ipvsadm --sync without CLMS help and without keepalive, for now. I have a few questions about how the current HA LVS works. Based on the answers, I can comment on how we can/should get something at least as good on SSI/CI.

1. My understanding is that the primary replicates all the connect/disconnect requests to the secondary, and failing over just means sending out the gratuitous ARP on the secondary node. Is that right?

2. Is it multicast, so you can have multiple secondaries?

3. If so, how do you specify to the primary who the secondary is, and how do you tell the secondary that it is the secondary?

4. Can you populate a secondary after the primary has already begun?

5. Can you fail back to the original primary after it comes back from a crash?

Since we don't have the full keepalive, here are some suggestions.

A. Put the LVS config information in a file /etc/lvs.config, which would look something like:

      VIP1 primary secondary server:port, server:port, ...
      VIP2 primary secondary server:port, server:port, ...

   where primary, secondary and server are IP addresses.

B. Run a daemon on every node. The daemon reads the file and sets things up on that node accordingly using the ipvsadm command. It must be able to determine which IP addresses it has locally, which is possible. It can do the IP alias for the VIP. It also has the SIGCLUSTER handler: if it is currently a redirector it does the nodedown handling, etc., and if it is a secondary and the primary fails, the SIGCLUSTER code does the takeover. This way we don't really need keepalive.

One tricky part is for a random server to know which node is currently the redirector (the primary or the secondary), so it can add its onnode ldirectord entry:

      ipvsadm -a -t VIP:22 -r PIP1:22

My suggestion is to have the active director put its node number in /etc/lvs.VIP.active.

What do you think?

bruce

p.s. I'll comment later on the bind/connect stuff.
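The setup pass of the per-node daemon in suggestion B could look roughly like the sketch below. It parses the /etc/lvs.config format proposed above; the eth0:0 alias, the wlc scheduler, and the function name are illustrative assumptions, not part of the proposal. With DRYRUN=echo (the default here) it prints the ifconfig/ipvsadm calls instead of executing them, so it can be traced without root:

```shell
#!/bin/sh
# Sketch of the per-node daemon's setup pass from suggestion B.
# Config lines: VIP primary secondary server:port, server:port, ...
# DRYRUN=echo prints the commands instead of running them.
DRYRUN=${DRYRUN:-echo}

setup_from_config() {
    config=$1        # path to the lvs.config file
    myip=$2          # an address this node has determined it owns
    while read vip primary secondary servers; do
        case $vip in ''|\#*) continue ;; esac   # skip blanks/comments
        # Only the primary or secondary director sets up the service.
        [ "$myip" = "$primary" ] || [ "$myip" = "$secondary" ] || continue
        $DRYRUN ifconfig eth0:0 "$vip" up       # IP alias for the VIP
        for sp in $(echo "$servers" | tr -d ','); do
            port=${sp##*:}
            $DRYRUN ipvsadm -A -t "$vip:$port" -s wlc   # virtual service
            $DRYRUN ipvsadm -a -t "$vip:$port" -r "$sp" # real server
        done
    done < "$config"
}
```

A real run would create each VIP:port service only once before adding its real servers; the dry-run trace just prints a line per entry.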
> Hi,
> Following are the methods which I think we can use for a failover LVS.
> Please correct me if I am wrong.
>
> (1) Start ipvsadm in the synchronization mode.
> (2) Use the arping binary, which is available in all the distributions,
>     for gratuitous ARP,
>     OR
>     the send_arp binary distributed along with heartbeat.

not sure what the difference is. I would just put the code in the daemon
described above.

> (3) Have the node-event handling server in the LVS remove the entries
>     for the nodes that have gone down.
>
> I can have a server running on the LVS director node, and this server
> will be monitored by our keepalive subsystem. It will also have a
> signal handler for SIGCLUSTER, which takes care of step (3): when
> invoked, the handler will remove the entries pertaining to the node
> that went down.
>
> -aneesh
>
> On Fri, 2002-04-12 at 19:53, Bruce Walker wrote:
> > > Hi,
> > >
> > > Can I get some info regarding the work already done with respect
> > > to SSI?
> > >
> > > -aneesh
> >
> > Below are a bunch of notes on current status, thoughts, etc. The
> > key thing we need to do initially is provide some form of failover
> > for LVS (it may just be a short keepalive script).
> >
> > bruce
>
> Hi,
> Putting down some of the issues related to LVS and how to make use of
> LVS in cluster-wide TCP/IP.
>
> AIM: IP-based load balancing and a cluster-wide port space.
>
> We can use LVS for load balancing. For achieving high availability of
> LVS itself, there are two options:
> (1) Use the synchronization method with ipvsadm.
> (2) Write a CLMS key service for rebuilding the LVS routing table.
>
> Issues with the above two methods:
>
> PROBLEM WITH ipvsadm --sync:
> Decreased performance because of multicasting.
>
> PROBLEM WITH THE KEY SERVICE:
> As of now, only masters are marked as able to run key services. This
> forces the LVS director to run on a master, which increases the load
> on the master. It also prevents the cluster from having different
> CVIPs for different subnets.
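For reference, steps (1) and (2) above map onto real ipvsadm and iputils commands roughly as sketched below. The interface name and VIP are placeholder assumptions, and RUN=echo (the default here) traces the commands instead of executing them, since both tools need root and a live kernel:

```shell
#!/bin/sh
# Sketch of the sync-daemon plus gratuitous-ARP failover steps.
# RUN=echo (the default) prints the commands instead of running them.
RUN=${RUN:-echo}

start_sync() {
    role=$1    # "master" on the director, "backup" on each secondary
    # (1) replicate connection state to the secondaries over multicast
    $RUN ipvsadm --start-daemon "$role" --mcast-interface eth0
}

takeover_vip() {
    vip=$1
    # (2) claim the VIP and announce it with unsolicited (gratuitous)
    #     ARP, which iputils arping sends with -U
    $RUN ifconfig eth0:0 "$vip" up
    $RUN arping -U -I eth0 -c 3 "$vip"
}
```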
> Example:
>
>     |------ Node1          Node3 ------|
>     |          \            /          |
>     |           \          /           |
>     |           ----------             |
>     |           |        |             |
>     |           |   CI   |             |
>     |           |        |             |
>     |           ----------             |
>     |          /            \          |
>     |         /              \         |
>     |------ Node2          Node4 ------|
>     |                                  |
>
>      subnet 1                subnet 2
>
> Node1 is the primary master and Node2 the secondary master. Since the
> nodes are also part of different networks, we need a different CVIP
> for each subnet. Again, if a key service is used for rebuilding the
> LVS table, then the LVS director must run on the master or potential
> masters only. In the above case we need:
>
> CVIP1 -> LVS director should be on node 1 or 2
> CVIP2 -> LVS director should be on node 3 or 4
>
> (The second is not possible with a key service. I guess NSC allows
> CLMS to run on nodes other than the master, and I remember Kai saying
> that that part of the code has not yet been moved to SSI.)
>
> There should be a mechanism for the user-level daemon to know on
> which node the LVS table has been rebuilt, so that it can issue a
> gratuitous ARP for the CVIP with that node's IP. (Explained below.)
>
> In short, we currently cannot rebuild the LVS table using a key
> service, and even if we could (by taking the code from NSC that
> allows key services to run on nodes other than masters), there is the
> added complexity of the user-level daemon knowing which node is now
> acting as the LVS director. (We can assume that there will be only
> two masters, one primary and one secondary. :))
>
> REMOVING THE ENTRIES FOR NODES THAT GO DOWN:
> For this we will have one user-level daemon running on the LVS
> director. This daemon has two jobs:
>
> (1) Register a signal handler for SIGCLUSTER which removes the
>     entries corresponding to down nodes.
> (2) Send the gratuitous ARP messages when started.
>
> This daemon will be monitored by the keepalive subsystem and will be
> restarted on the LVS secondary.
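The two jobs of the director-side daemon could be sketched as below. SIGCLUSTER is CI/SSI-specific and cannot be trapped by name in a stock shell, so USR1 stands in for it here; the VIP, the /etc/lvs.nodemap file, and its "nodenum real-server:port" format are invented purely for illustration, and RUN=echo (the default) prints the ipvsadm/arping calls instead of running them:

```shell
#!/bin/sh
# Sketch of the director-side daemon's two jobs.  SIGCLUSTER, the
# nodemap file and the VIP are illustrative stand-ins, not real names.
RUN=${RUN:-echo}
VIP=10.0.0.100

# Job (1): drop every LVS entry whose real server was on a dead node.
remove_node_entries() {
    nodemap=$1       # file mapping node numbers to real servers
    downnode=$2      # node number reported down
    while read node sp; do
        [ "$node" = "$downnode" ] || continue
        $RUN ipvsadm -d -t "$VIP:${sp##*:}" -r "$sp"
    done < "$nodemap"
}

# Job (2): announce the VIP with gratuitous ARP when the daemon
# starts (and again after a takeover on the secondary).
announce_vip() {
    $RUN arping -U -I eth0 -c 3 "$VIP"
}

# In the real daemon this trap would be on SIGCLUSTER:
trap 'remove_node_entries /etc/lvs.nodemap "$(cat /var/run/lvs.downnode)"' USR1
```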
> We can use send_arp, distributed along with heartbeat, or arping,
> which is a standard utility from iputils, for sending the gratuitous
> ARPs.
>
> To be continued in the next mail. :)
>
> -aneesh
>
> _______________________________________________
> ssic-linux-devel mailing list
> ssi...@li...
> https://lists.sourceforge.net/lists/listinfo/ssic-linux-devel