Re: [SSI] [Design] Cluster wide TCP and LVS
From: Bruce W. <br...@ka...> - 2002-04-29 22:13:55
Aneesh, I believe we can use a variant of the ipvsadm --sync without CLMS help and without keepalive, for now. I have a few questions about how the current HA LVS works. Based on the answers, I can comment on how we can/should get something at least as good on SSI/CI.

1. My understanding is that the primary replicates all the connect/disconnect requests to the secondary, and failing over just means sending out the gratuitous ARP on the secondary node. Is that right?

2. Is it multicast, so you can have multiple secondaries?

3. If so, how do you specify to the primary who the secondary is, and how do you tell the secondary that it is the secondary?

4. Can you populate a secondary after the primary has already begun?

5. Can you fail back to the original primary after it comes back from a crash?

Since we don't have the full keepalive, here are some suggestions.

A. Put the LVS config information in a file /etc/lvs.config, which would look something like:

      VIP1 primary secondary server:port, server:port, ...
      VIP2 primary secondary server:port, server:port, ...

   where primary, secondary and server are IP addresses.

B. Run a daemon on every node. The daemon reads the file and sets things up on that node accordingly using the ipvsadm command. It must be able to determine which IP addresses it has locally, which is possible. It can do the IP alias for the VIP. It also has the SIGCLUSTER handler: if it is currently a redirector it does the nodedown handling, etc., and if it is a secondary and the primary fails, the SIGCLUSTER code does the takeover. This way we don't really need keepalive.

One tricky part is for a random server to know which node is currently the redirector (the primary or the secondary), so it can add its onnode ldirectord entry:

      ipvsadm -a -t VIP:22 -r PIP1:22

My suggestion is to have the active director put its node number in /etc/lvs.VIP.active.

What do you think?

bruce

p.s. I'll comment later on the bind/connect stuff.
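The setup pass of the per-node daemon in suggestion B could look roughly like the sketch below. It parses the /etc/lvs.config format proposed above; the eth0:0 alias, the wlc scheduler, and the function name are illustrative assumptions, not part of the proposal. With DRYRUN=echo (the default here) it prints the ifconfig/ipvsadm calls instead of executing them, so it can be traced without root:

```shell
#!/bin/sh
# Sketch of the per-node daemon's setup pass from suggestion B.
# Config lines: VIP primary secondary server:port, server:port, ...
# DRYRUN=echo prints the commands instead of running them.
DRYRUN=${DRYRUN:-echo}

setup_from_config() {
    config=$1        # path to the lvs.config file
    myip=$2          # an address this node has determined it owns
    while read vip primary secondary servers; do
        case $vip in ''|\#*) continue ;; esac   # skip blanks/comments
        # Only the primary or secondary director sets up the service.
        [ "$myip" = "$primary" ] || [ "$myip" = "$secondary" ] || continue
        $DRYRUN ifconfig eth0:0 "$vip" up       # IP alias for the VIP
        for sp in $(echo "$servers" | tr -d ','); do
            port=${sp##*:}
            $DRYRUN ipvsadm -A -t "$vip:$port" -s wlc   # virtual service
            $DRYRUN ipvsadm -a -t "$vip:$port" -r "$sp" # real server
        done
    done < "$config"
}
```

A real run would create each VIP:port service only once before adding its real servers; the dry-run trace just prints a line per entry.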
> Hi,
> Following are the methods which I think we can use for a failover LVS.
> Please correct me if I am wrong.
>
> (1) Start ipvsadm in the synchronization mode.
> (2) Use the arping binary, which is available in all the distributions,
>     for gratuitous ARP,
>     OR
>     the send_arp binary distributed along with heartbeat.

not sure what the difference is. I would just put the code in the daemon
described above.

> (3) Have the node-event handling server in the LVS remove the entries
>     for the nodes that have gone down.
>
> I can have a server running on the LVS director node, and this server
> will be monitored by our keepalive subsystem. It will also have a
> signal handler for SIGCLUSTER, which takes care of step (3): when
> invoked, the handler will remove the entries pertaining to the node
> that went down.
>
> -aneesh
>
> On Fri, 2002-04-12 at 19:53, Bruce Walker wrote:
> > > Hi,
> > >
> > > Can I get some info regarding the work already done with respect
> > > to SSI?
> > >
> > > -aneesh
> >
> > Below are a bunch of notes on current status, thoughts, etc. The
> > key thing we need to do initially is provide some form of failover
> > for LVS (it may just be a short keepalive script).
> >
> > bruce
>
> Hi,
> Putting down some of the issues related to LVS and how to make use of
> LVS in cluster-wide TCP/IP.
>
> AIM: IP-based load balancing and a cluster-wide port space.
>
> We can use LVS for load balancing. For achieving high availability of
> LVS itself, there are two options:
> (1) Use the synchronization method with ipvsadm.
> (2) Write a CLMS key service for rebuilding the LVS routing table.
>
> Issues with the above two methods:
>
> PROBLEM WITH ipvsadm --sync:
> Decreased performance because of multicasting.
>
> PROBLEM WITH THE KEY SERVICE:
> As of now, only masters are marked as able to run key services. This
> forces the LVS director to run on a master, which increases the load
> on the master. It also prevents the cluster from having different
> CVIPs for different subnets.
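For reference, steps (1) and (2) above map onto real ipvsadm and iputils commands roughly as sketched below. The interface name and VIP are placeholder assumptions, and RUN=echo (the default here) traces the commands instead of executing them, since both tools need root and a live kernel:

```shell
#!/bin/sh
# Sketch of the sync-daemon plus gratuitous-ARP failover steps.
# RUN=echo (the default) prints the commands instead of running them.
RUN=${RUN:-echo}

start_sync() {
    role=$1    # "master" on the director, "backup" on each secondary
    # (1) replicate connection state to the secondaries over multicast
    $RUN ipvsadm --start-daemon "$role" --mcast-interface eth0
}

takeover_vip() {
    vip=$1
    # (2) claim the VIP and announce it with unsolicited (gratuitous)
    #     ARP, which iputils arping sends with -U
    $RUN ifconfig eth0:0 "$vip" up
    $RUN arping -U -I eth0 -c 3 "$vip"
}
```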
> Example:
>
>     |------ Node1          Node3 ------|
>     |          \            /          |
>     |           \          /           |
>     |           ----------             |
>     |           |        |             |
>     |           |   CI   |             |
>     |           |        |             |
>     |           ----------             |
>     |          /            \          |
>     |         /              \         |
>     |------ Node2          Node4 ------|
>     |                                  |
>
>      subnet 1                subnet 2
>
> Node1 is the primary master and Node2 the secondary master. Since the
> nodes are also part of different networks, we need a different CVIP
> for each subnet. Again, if a key service is used for rebuilding the
> LVS table, then the LVS director must run on the master or potential
> masters only. In the above case we need:
>
> CVIP1 -> LVS director should be on node 1 or 2
> CVIP2 -> LVS director should be on node 3 or 4
>
> (The second is not possible with a key service. I guess NSC allows
> CLMS to run on nodes other than the master, and I remember Kai saying
> that that part of the code has not yet been moved to SSI.)
>
> There should be a mechanism for the user-level daemon to know on
> which node the LVS table has been rebuilt, so that it can issue a
> gratuitous ARP for the CVIP with that node's IP. (Explained below.)
>
> In short, we currently cannot rebuild the LVS table using a key
> service, and even if we could (by taking the code from NSC that
> allows key services to run on nodes other than masters), there is the
> added complexity of the user-level daemon knowing which node is now
> acting as the LVS director. (We can assume that there will be only
> two masters, one primary and one secondary. :))
>
> REMOVING THE ENTRIES FOR NODES THAT GO DOWN:
> For this we will have one user-level daemon running on the LVS
> director. This daemon has two jobs:
>
> (1) Register a signal handler for SIGCLUSTER which removes the
>     entries corresponding to down nodes.
> (2) Send the gratuitous ARP messages when started.
>
> This daemon will be monitored by the keepalive subsystem and will be
> restarted on the LVS secondary.
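The two jobs of the director-side daemon could be sketched as below. SIGCLUSTER is CI/SSI-specific and cannot be trapped by name in a stock shell, so USR1 stands in for it here; the VIP, the /etc/lvs.nodemap file, and its "nodenum real-server:port" format are invented purely for illustration, and RUN=echo (the default) prints the ipvsadm/arping calls instead of running them:

```shell
#!/bin/sh
# Sketch of the director-side daemon's two jobs.  SIGCLUSTER, the
# nodemap file and the VIP are illustrative stand-ins, not real names.
RUN=${RUN:-echo}
VIP=10.0.0.100

# Job (1): drop every LVS entry whose real server was on a dead node.
remove_node_entries() {
    nodemap=$1       # file mapping node numbers to real servers
    downnode=$2      # node number reported down
    while read node sp; do
        [ "$node" = "$downnode" ] || continue
        $RUN ipvsadm -d -t "$VIP:${sp##*:}" -r "$sp"
    done < "$nodemap"
}

# Job (2): announce the VIP with gratuitous ARP when the daemon
# starts (and again after a takeover on the secondary).
announce_vip() {
    $RUN arping -U -I eth0 -c 3 "$VIP"
}

# In the real daemon this trap would be on SIGCLUSTER:
trap 'remove_node_entries /etc/lvs.nodemap "$(cat /var/run/lvs.downnode)"' USR1
```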
> We can use send_arp, distributed along with heartbeat, or arping,
> which is a standard utility from iputils, for sending the gratuitous
> ARPs.
>
> To be continued in the next mail. :)
>
> -aneesh
>
> _______________________________________________
> ssic-linux-devel mailing list
> ssi...@li...
> https://lists.sourceforge.net/lists/listinfo/ssic-linux-devel