Re: [Keepalived-devel] keepalived partition handling and quorum
From: Howley, T. <tom...@hp...> - 2014-09-26 14:04:20
Anders: Thanks for the great reply, and thanks to Ryan for the follow-up clarifications. I think you have both confirmed what I originally thought, which is that I was trying to bring in the load-balancing config in the vain hope that this would somehow bring some quorum functionality into the VRRP component of keepalived. I should have said from the outset that I'm using keepalived in combination with haproxy, so I'm only interested in keepalived for managing a VIP.

So it seems that VRRP does not employ a concept of quorum. If a node gets isolated from the cluster, it will happily promote itself to Master (and send a gratuitous ARP). This was confirmed in a recent scenario where someone tested blocking all packets on a node that did not currently hold the VIP. Because the rule for blocking incoming packets was added just before the outgoing-packet rule, it gave that node enough time to promote itself to Master and send a gratuitous ARP. I understand that this is a somewhat artificial test (the ordering should probably be reversed to come closer to a cable pull), but it did give some insight into the behaviour.

So it sounded like this would solve it: "garp_master_delay in keepalived should take care of this; it keeps the master re-announcing themself to the network routers over and over again." Looking at the man page, though, it suggests that this is just a one-off delay before sending a single ARP after the transition to Master. Is that the actual effect of that setting, or is there an alternative one?

Thanks again,

Tom

-----Original Message-----
From: Ryan O'Hara [mailto:ro...@re...]
Sent: 26 September 2014 14:35
To: Anders Henke
Cc: kee...@li...
Subject: Re: [Keepalived-devel] keepalived partition handling and quorum

On Fri, Sep 26, 2014 at 10:51:05AM +0200, Anders Henke wrote:
> Hi Ryan,
>
> Thanks for correcting my sloppy description and use of language when trying to simplify things. Your original email was a good description of VRRP.
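For reference, the two GARP-related timing knobs under discussion live in the vrrp_instance block. A minimal sketch (instance name and values are illustrative; exact semantics should be checked against your version's keepalived.conf(5)):

```
vrrp_instance VI_1 {
    # ... interface, virtual_router_id, priority, virtual_ipaddress ...

    # One-off: seconds to wait after transitioning to MASTER before
    # sending a second set of gratuitous ARPs.
    garp_master_delay 10

    # Periodic: while MASTER, re-send gratuitous ARPs at this interval
    # in seconds (0 disables the periodic refresh).
    garp_master_refresh 60
}
```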
I didn't think it was sloppy at all.

Ryan

> Of course, VRRP is not specific to balancing; that sentence resulted from a misleading, shortened concatenation of the following:
> - VRRP ensures that an IP address is present on a network at some node.
> - The node operating that IP address does whatever it needs to do: routing/forwarding/load-balancing IP packets, failover of (near-)stateless services, ... In the context of keepalived, this usually is load balancing (it doesn't need to be that way; keepalived may be used without load balancing).
>
> And of course, the usage of garp_master_refresh isn't the preferred way. For some time now, keepalived has optionally supported the RFC-required usage of a virtual MAC address (use_vmac). However, there are some situations in real life where this falls short, and garp_master_refresh does address them.
>
> Best,
>
> Anders
>
> On 25.09.2014, Ryan O'Hara wrote:
> > On Thu, Sep 25, 2014 at 12:49:38PM +0200, Anders Henke wrote:
> > > On 24.09.2014, Howley, Tom wrote:
> > > > I'm relatively new to keepalived, so apologies in advance. I have a relatively simple keepalived setup with a single vrrp instance that is managing a single VIP across three nodes. The only point worth noting is that my config is identical across all three nodes, so the IP address is used in the original Master election. I'm wondering if VRRP has a concept of quorum handling. I basically want to avoid the scenario where a network partition (which could be isolated to just a multicast failure) results in two nodes of the cluster claiming to hold the VIP. For example, if a node that was Master becomes isolated, can I configure it to disassociate the VIP from itself?
> > >
> > > VRRP, extremely simplified:
> > > - Your balancer nodes announce their availability via multicast on your local network.
> > >   This availability message contains a router ID, a priority and a list of VIPs.
> >
> > s/availability/advertisement/
> >
> > > - Your balancer nodes listen to those announcements as well.
> > >   If they don't see an announcement with a higher priority than their own using the same router ID and (optionally) the same list of VIPs, they'll start serving those VIPs; otherwise they'll stop serving them.
> >
> > I'd refrain from calling these balancers. If you're using keepalived for VRRP and IPVS, then yes, they are balancers. But if you are trying to give an overview of VRRP, they may not be balancers. VRRP really has nothing to do with load balancing.
> >
> > > Some VRRP implementations also add an additional tie-breaker: when multiple nodes use the same parameters (router ID, VIPs, priority), the node with the highest IP becomes master.
> >
> > This is actually part of the RFC.
> >
> > > Just by design, it's an "if I don't see anyone else trying to do that job, I'll do it" idea, and there is no such thing as gaining a vote via some quorum mechanism or a sophisticated election algorithm between multiple nodes.
> > >
> > > Whenever your network partitions, you may end up with two or more of your balancer nodes claiming to hold the same VIP.
> > >
> > > This doesn't need to be a bad thing: probably two of your three balancer nodes will see each other and would gain a majority vote, but they might be on an isolated part of your network without any internet connectivity. So if "both" partitions of your network are active, incoming traffic might arrive at a non-redundant, but still internet-connected, balancer with hopefully some realservers behind it, serving incoming requests.
> > >
> > > Whenever your network re-unites, those balancer nodes will discover each other again; a single node will keep the VIP and the other ones will release it.
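A minimal sketch of the setup Tom describes: the same vrrp_instance on all three nodes, so the highest primary IP address breaks the election tie. Interface name, router ID and VIP below are placeholders:

```
vrrp_instance VI_1 {
    state BACKUP          # no node is preferred; let the election decide
    interface eth0        # placeholder interface name
    virtual_router_id 51  # must match on all three nodes
    priority 100          # identical everywhere, so the node with the
                          # highest primary IP wins the tie-break
    advert_int 1          # advertisement interval in seconds
    virtual_ipaddress {
        192.0.2.10/24     # the single managed VIP (example address)
    }
}
```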
> > >
> > > However, there are some pitfalls involved. For example, the master may try to announce itself to the network routers, but those initial gratuitous ARPs can be ignored by a too-busy router, so your network may continue sending traffic to your backup balancer node. As the backup balancer does de-configure the VIP from its local interfaces, that box will still receive, but also happily ignore, that incoming traffic, resulting in non-availability of your service.
> > > Configuring garp_master_delay in keepalived should take care of this; it keeps the master re-announcing themself to the network routers over and over again.
> > >
> > > > I have just tried adding some LVS config, specifying a pool of real servers, so that I now have a script that is invoked if either quorum is lost or regained. So I could possibly use that to do what I want, but it feels like I'm going down the road of hackery.
> > >
> > > Quorum in that context is way different from VRRP; it's the amount of available realservers, which is unrelated to the amount of available balancer nodes.
> > >
> > > VRRP just takes care that at least one balancer announces itself to the network and distributes incoming traffic to your realservers (or a sorry_server).
> >
> > VRRP does not load balance.
> >
> > Ryan
> >
> > > Quorum in keepalived is just an extra. When keepalived sees "enough" realserver capacity to serve requests, it'll trigger a script with a custom action. When the capacity drops below a threshold, keepalived will trigger a (different) script, doing some (different) custom action.
> > >
> > > What is quorum good for?
> > > - You may want to trigger custom actions whenever there are "too few" realservers available.
> > >   A quorum script can notify your monitoring system or trigger a deployment system to add more (virtual) realservers to your network.
> > >
> > > - You may want to announce your balancer not just via ARP, but via a dynamic routing protocol; for example, you may want to serve the same VIP from multiple data centers using anycast.
> > >   A quorum script can reconfigure your local BGP daemon, withdrawing or adding VIP announcements dynamically, ensuring that requests don't flood a balancer with too few available realservers.
> > >
> > >
> > > Anders
> > > --
> > > 1&1 Internet AG        Expert Systems Architect (IT Operations)
> > > Brauerstrasse 50       v://49.721.91374.0
> > > D-76135 Karlsruhe      f://49.721.91374.225
> > >
> > > Amtsgericht Montabaur HRB 6484
> > > Vorstand: Ralph Dommermuth, Frank Einhellinger, Robert Hoffmann, Andreas Hofmann, Markus Huhn, Hans-Henning Kettler, Uwe Lamnek, Jan Oetjen, Christian Würst
> > > Aufsichtsratsvorsitzender: Michael Scheeren
> > >
> > > _______________________________________________
> > > Keepalived-devel mailing list
> > > Kee...@li...
> > > https://lists.sourceforge.net/lists/listinfo/keepalived-devel
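The realserver-quorum mechanism Anders describes, wired to notification scripts (which could, per the anycast idea, reconfigure a local BGP daemon), might look roughly like the sketch below. Addresses, weights, thresholds and script paths are hypothetical:

```
virtual_server 192.0.2.10 80 {
    delay_loop 6
    lb_algo rr
    lb_kind NAT
    protocol TCP

    quorum 2        # minimum summed weight of healthy realservers
    hysteresis 1    # dampens flapping around the threshold

    # Hypothetical scripts: e.g. add/withdraw the VIP announcement in a
    # local BGP daemon, or notify a monitoring/deployment system.
    quorum_up   "/usr/local/bin/vip-announce.sh"
    quorum_down "/usr/local/bin/vip-withdraw.sh"

    real_server 10.0.0.11 80 {
        weight 1
        TCP_CHECK { connect_timeout 3 }
    }
    real_server 10.0.0.12 80 {
        weight 1
        TCP_CHECK { connect_timeout 3 }
    }
}
```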