Re: [Keepalived-devel] keepalived partition handling and quorum
From: Anders H. <and...@1u...> - 2014-09-26 08:51:20
Hi Ryan,

Thanks for correcting my sloppy description and use of language when trying to simplify things. Of course, VRRP is not specific to load balancing; the sentence was a misleading, concatenated and shortened version of the following:

- VRRP ensures that an IP address is present on a network at some node.
- The node operating that IP address does whatever it needs to do: routing/forwarding/load-balancing IP packets, failover of (near-)stateless services, ... In the context of keepalived, this usually is load balancing (it doesn't need to be that way; keepalived may be used without load balancing).

And of course, using garp_master_refresh isn't the preferred way. For some time now, keepalived has optionally supported the RFC-required use of a virtual MAC address (use_vmac). However, there are some real-life situations where this falls short, and garp_master_refresh does address them.

Best,
Anders

On 25.09.2014, Ryan O'Hara wrote:
> On Thu, Sep 25, 2014 at 12:49:38PM +0200, Anders Henke wrote:
> > On 24.09.2014, Howley, Tom wrote:
> > > I'm relatively new to keepalived, so apologies in advance. I have a relatively
> > > simple keepalived setup with a single vrrp instance that is managing a single
> > > VIP across three nodes. The only point worth noting is that my config is
> > > identical across all three nodes, so the IP address is used in the original
> > > master election. I'm wondering if VRRP has a concept of quorum handling. I
> > > basically want to avoid the scenario where a network partition (which could be
> > > isolated to just a multicast failure) results in two nodes of the cluster
> > > claiming to hold the VIP. For example, if a node that was master becomes
> > > isolated, can I configure it to disassociate the VIP from itself?
> >
> > VRRP, extremely simplified:
> > - Your balancer nodes announce their availability via multicast on your local network.
> >   This availability message contains a router ID, a priority and a list of VIPs.
>
> s/availability/advertisement/
>
> > - Your balancer nodes are listening to those advertisements as well.
> >   If they don't see an advertisement with a higher priority than their own using
> >   the same router ID and (optionally) the same list of VIPs, they'll start serving those
> >   VIPs; otherwise, they'll stop serving them.
>
> I'd refrain from calling these balancers. If you're using keepalived
> with VRRP and IPVS, then yes, they are balancers. But if you are
> trying to give an overview of VRRP, they may not be balancers. VRRP
> really has nothing to do with load balancing.
>
> > Some VRRP implementations also add an additional tie-breaker: when multiple nodes are using
> > the same parameters (router ID, VIPs, priority), the node with the highest IP becomes master.
>
> This is actually part of the RFC.
>
> > By design, it's an "if I don't see anyone else trying to do that job, I'll do it" approach; there is no such thing as gaining a vote through a quorum mechanism or a sophisticated election algorithm between multiple nodes.
> >
> > Whenever your network partitions, you may end up with two or more of your balancer nodes claiming to hold the same VIP.
> >
> > This doesn't need to be a bad thing: probably two of your three balancer nodes will see each other and would gain a majority vote, but they might be on an isolated part of your network without any internet connectivity. So if "both" partitions of your network are active, incoming traffic might arrive at a non-redundant, but still internet-connected balancer with hopefully some realservers behind it, serving incoming requests.
> >
> > Whenever your network re-unites, those balancer nodes will discover each other again; a single node will keep the VIP and the other ones will release it.
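For illustration, a minimal vrrp_instance that is identical on all three nodes could look like this (interface, router ID and addresses are placeholders, not taken from the actual setup); with equal priorities, the IP tie-breaker described above decides the election:

```
vrrp_instance VI_1 {           # placeholder instance name
    interface eth0             # assumption: VRRP runs on eth0
    virtual_router_id 51       # must match on all nodes
    priority 100               # identical on all nodes, so the node
                               # with the highest primary IP wins
    advert_int 1               # advertisement interval, in seconds
    # use_vmac                 # optional: RFC-style virtual MAC address
    # garp_master_refresh 60   # optional: re-send gratuitous ARPs every 60s
    virtual_ipaddress {
        192.0.2.10/24          # the VIP (documentation address)
    }
}
```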
> >
> > However, there are some pitfalls involved. For example, the master may try to announce itself to the network routers, but those initial gratuitous ARP replies may be ignored by a too-busy router, so your network may keep sending traffic to the backup balancer node. As the backup balancer has de-configured the VIP from its local interfaces, that box will still receive, but also happily ignore, that incoming traffic, resulting in non-availability of your service.
> > Configuring garp_master_refresh in keepalived should take care of this; it keeps the master re-announcing itself to the network routers over and over again.
>
> > > I have just tried adding some LVS config, specifying a pool of real servers, so
> > > that I now have a script that is invoked if either quorum is lost or regained.
> > > So I could possibly use that to do what I want, but it feels like I'm going down the
> > > road of hackery.
> >
> > Quorum in that context is very different from VRRP: it's the number of available realservers, which is unrelated to the number of available balancer nodes.
> >
> > VRRP just takes care that at least one balancer announces itself to the network and distributes incoming traffic to your realservers (or a sorry_server).
>
> VRRP does not load balance.
>
> Ryan
>
> > Quorum in keepalived is just an extra. When keepalived sees "enough" realserver capacity to serve requests, it triggers a script with a custom action. When the capacity drops below a threshold, keepalived triggers a (different) script, performing some (different) custom action.
> >
> > What is quorum good for?
> > - You may want to trigger custom actions whenever there are "too few" realservers available.
> >   A quorum script can notify your monitoring system or trigger a deployment system to
> >   add more (virtual) realservers to your network.
> >
> > - You may want to announce your balancer not just via ARP, but via a dynamic routing protocol;
> >   for example, you may want to serve the same VIP from multiple data centers using anycast.
> >   A quorum script can reconfigure your local BGP daemon, withdrawing or adding VIP announcements
> >   dynamically, ensuring that requests don't flood a balancer with too few available realservers.
> >
> > Anders

--
1&1 Internet AG          Expert Systems Architect (IT Operations)
Brauerstrasse 50         v://49.721.91374.0
D-76135 Karlsruhe        f://49.721.91374.225

Amtsgericht Montabaur HRB 6484
Vorstand: Ralph Dommermuth, Frank Einhellinger, Robert Hoffmann,
Andreas Hofmann, Markus Huhn, Hans-Henning Kettler, Uwe Lamnek,
Jan Oetjen, Christian Würst
Aufsichtsratsvorsitzender: Michael Scheeren
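P.S.: A minimal sketch of such a quorum hook, as a plain shell script. Everything here (script path, log location, messages) is an illustrative assumption, not a keepalived requirement; quorum_up/quorum_down simply run whatever command you configure.

```shell
#!/bin/sh
# Hypothetical notify hook, wired into a virtual_server block roughly as:
#   quorum_up   "/etc/keepalived/quorum-notify.sh up"
#   quorum_down "/etc/keepalived/quorum-notify.sh down"

QUORUM_LOG="${QUORUM_LOG:-/var/log/keepalived-quorum.log}"

quorum_notify() {
    case "$1" in
        up)
            # enough realservers again: safe to (re)announce the VIP,
            # e.g. tell the local BGP daemon to add the anycast route
            printf '%s quorum up: announcing VIP\n' "$(date -u)" >>"$QUORUM_LOG"
            ;;
        down)
            # capacity below the threshold: withdraw the announcement so
            # anycast traffic is routed to a better-provisioned site
            printf '%s quorum down: withdrawing VIP\n' "$(date -u)" >>"$QUORUM_LOG"
            ;;
        *)
            echo "usage: $0 up|down" >&2
            return 1
            ;;
    esac
}

# keepalived passes no arguments by itself; the state comes from the
# argument chosen in keepalived.conf above
if [ "$#" -gt 0 ]; then
    quorum_notify "$1"
fi
```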