Re: [Keepalived-devel] Split brain due to keepalived apparently freezing in the vrrp master

Try changing the LACP load-balancing mode; it sounds like you may not be distributing your traffic properly. Also consider using an alternate check method like tcptraceroute. Generally, if MySQL is listening, it's working. Also check whether the box is swapping or has high IO wait.
Another interesting option for your backups would be to stream the output inline to another box over ssh and do the compression there. I've done something similar, but in reverse, for backups, where I took files, ran bzip2 on them to stdout, and streamed them to files on a remote host over the initiating ssh sessions. It worked really well when doing multithreaded backups across 400 or more servers.
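
As a rough sketch of the streaming idea (not a tested recipe for your setup): tar writes an uncompressed archive to stdout, and the remote side runs bzip2, so the compression CPU cost lands on the backup box. The function name, the REMOTE_SHELL override, and the host and paths in the example are all hypothetical.

```shell
#!/bin/bash
# Stream an uncompressed tar archive to a remote host and compress it there.
# REMOTE_SHELL defaults to ssh; it is only a hook for swapping the transport.
# Note: remote_file must not contain single quotes, since it is quoted for
# the remote shell below.
stream_backup() {
    local src_dir=$1 remote_host=$2 remote_file=$3
    tar -C "$src_dir" -cf - . \
        | ${REMOTE_SHELL:-ssh} "$remote_host" "bzip2 -c > '$remote_file'"
}

# Example (hypothetical host and paths):
# stream_backup /var/backups/mysql backup1.example.com /srv/backups/db3.tar.bz2
```

The local box still pays for reading the data and pushing it over the network, so this only helps if compression is what is starving keepalived.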

Sent from my BlackBerry 10 smartphone.
  Original Message  
From: Sergio Roysen
Sent: Thursday, January 15, 2015 18:47
To: Frank Baalbergen
Cc: kee...@li...
Subject: Re: [Keepalived-devel] Split brain due to keepalived apparently freezing in the vrrp master

Thank you Frank,

These incidents usually happen on two types of occasions:

1) Backups:
IO storage activity is high, and so is CPU activity, since we use 
several cores to compress the resultant file.
These backups take about 1:30 hours. Network traffic during these times 
hovers around 3-6 Mbps.

2) Backups copied across the network to a backup server:
Here we do have high network traffic taking place for periods of 1:50 hours.
Average network traffic is 1-1.5 Gbps, on a 40 Gbps LACP link.

We have never been able to reproduce this behavior in any other group of 
servers.

All the servers in that group have plenty of HW resources to deal with 
the load that they get: We don't have any particular load issues on them.
We've never been alerted by any of our monitors, nor been unable to ssh 
into any one of these servers.

All of these are real servers, not virtual ones.

The mysql_checker script is really simple:

---

#!/bin/bash
MYSQL_PING_TIMEOUT=1
CHECKER_LOG=/dev/null

function log_info() {
NOW_IS=$(date "+%F %H:%M:%S")
echo "$NOW_IS ($$): $1" >> $CHECKER_LOG
}

if [ -f "/etc/keepalived/debug" ]
then
CHECKER_LOG='/var/log/keepalived/reader_vip_checks.log'
fi

CHECK_NUMBER=`cat /tmp/mysql_check.cnt`
CHECK_NUMBER=$((CHECK_NUMBER+1))

log_info "Check number $CHECK_NUMBER"

MYSQL_REPORTED=$(/usr/bin/timeout $MYSQL_PING_TIMEOUT /usr/bin/mysql \
    --defaults-file=/root/.my.cnf --host=127.0.0.1 --port=3306 -N \
    -e "SHOW VARIABLES LIKE 'read_only'" 2>&1)
RET_ADMIN=$?
if [[ $MYSQL_REPORTED != read_only*ON ]]
then
log_info "Cannot connect to MySQL or it is not in read_only mode"
RET_ADMIN=2
fi
log_info "check_mysql_health will return $RET_ADMIN"
echo -n $CHECK_NUMBER >/tmp/mysql_check.cnt
exit $RET_ADMIN

---
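
When the debug log is enabled, late checks can be found mechanically rather than by eye. The following is only a sketch: it assumes the "date time (pid): message" format written by log_info above and GNU date for the -d option, and the function name is made up for illustration.

```shell
#!/bin/bash
# Report consecutive "Check number" entries that are further apart than
# max_gap seconds. Reads the checker log on stdin.
find_check_gaps() {
    local max_gap=$1 prev_epoch="" prev_stamp="" day clock rest epoch
    while read -r day clock rest; do
        # Only the "Check number" line marks the start of a check run.
        [[ $rest == *"Check number"* ]] || continue
        epoch=$(date -d "$day $clock" +%s) || continue
        if [[ -n $prev_epoch ]] && (( epoch - prev_epoch > max_gap )); then
            echo "gap of $((epoch - prev_epoch))s between $prev_stamp and $day $clock"
        fi
        prev_epoch=$epoch
        prev_stamp="$day $clock"
    done
}

# Example: flag anything slower than the 5-second interval plus 1 second of slack.
# find_check_gaps 6 < /var/log/keepalived/reader_vip_checks.log
```

The timestamps of the reported gaps can then be lined up against the backup schedule.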

You will notice that our check intervals, at five seconds each, could be 
considered too long. The reason is that we used to get script timeouts 
from keepalived when we were using values of one or two seconds.
I suspect that both types of issues could be connected.
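
One way to test that suspicion would be to record how long each check really takes and compare the slow runs against the backup windows. A minimal sketch, assuming GNU date (for %N); the wrapper name and the timing log path are hypothetical:

```shell
#!/bin/bash
# Run a command, append its wall-clock duration in milliseconds to a log
# file, and pass the command's exit status through unchanged.
timed_run() {
    local log_file=$1; shift
    local start_ns end_ns rc
    start_ns=$(date +%s%N)
    "$@"
    rc=$?
    end_ns=$(date +%s%N)
    echo "$(date '+%F %T') ($$): '$*' took $(( (end_ns - start_ns) / 1000000 )) ms, exit $rc" >> "$log_file"
    return $rc
}

# Example: wrap the existing checker (log path is hypothetical)
# timed_run /var/log/keepalived/check_timing.log /usr/local/bin/mysql_checker
```

Because the exit status is preserved, a small wrapper script built on this could stand in for the checker without changing what keepalived sees.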

Here are some of the typical syslog entries for those errors:
---
Nov 20 01:31:52 db3 Keepalived_vrrp[5510]: VRRP_Script(force_failover_reader) timed out
Nov 20 01:31:52 db3 Keepalived_vrrp[5510]: Process [87701] didn't respond to SIGTERM
Nov 20 01:31:52 db3 Keepalived_vrrp[5510]: VRRP_Script(force_failover_reader) succeeded
Nov 20 01:43:09 db3 Keepalived_vrrp[5510]: Process [33209] didn't respond to SIGTERM
Nov 20 01:43:09 db3 Keepalived_vrrp[5510]: Process [33210] didn't respond to SIGTERM
---

Again, thanks for your help.

Sergio

On 2015-01-15 4:09 PM, Frank Baalbergen wrote:
> Hi Sergio,
>
> This is an interesting issue, I run the same keepalived version (Debian)
> and didn’t see this issue. I have some questions to have a better
> understanding of the problem.
>
> - What kind of activity do you mean with “usually at times of higher than
> usual activity on the server”, is that MySQL activity?
> - Is this reproducible when you manually increase the activity on db3?
> - Are there any load issues on the physical hardware that could block
> the complete virtual machine or are you still able to ssh/use the
> virtual machine?
> - Can you share /usr/local/bin/mysql_checker?
>
> - How is the port utilisation on the switch at high activity peaks?
>
> Regards,
>
> Frank
>
> On 15/01/15 21:46, "Sergio Roysen" <ser...@sh...> wrote:
>
>> We are having a recurrent issue in a group of three servers where one of
>> the backup nodes grabs the virtual ip address because the master fails
>> to send the vrrp advert with its priority for a few seconds and it also
>> fails to run in time the local checks.
>>
>> This results in a split brain scenario until the original master is able
>> to broadcast its status again.
>>
>> This is a recurring issue, happening several times a day, usually at
>> times of higher than usual activity on the server, but not necessarily
>> at periods of higher than usual network traffic.
>>
>> All the servers are running Ubuntu 12.04 (we've seen a similar behavior
>> in another group of servers running Ubuntu 14.04)
>> We are using keepalived version 1.2.13
>>
>> This is the keepalived configuration that we are using:
>> ---
>> global_defs {
>> notification_email {
>> xxx
>> }
>> notification_email_from xxx
>> smtp_server xxx
>> smtp_connect_timeout 30
>> }
>> vrrp_script mysql_died {
>> script "/usr/local/bin/mysql_checker"
>> interval 5
>> fall 6
>> rise 6
>> }
>> vrrp_script force_failover_reader {
>> script "test \! -f /etc/keepalived/trigger.reader.failover"
>> interval 5
>> }
>> vrrp_instance reader_vip_shard_0 {
>> interface bond0
>> state BACKUP
>> virtual_router_id 150
>> priority 101
>> advert_int 1
>> authentication {
>> auth_type PASS
>> auth_pass xxxxx
>> }
>> virtual_ipaddress {
>> 172.16.255.55
>> }
>> track_script {
>> mysql_died weight -40
>> force_failover_reader weight -50
>> }
>> notify "/usr/local/bin/notify_mysql_failover.sh"
>> }
>> ---
>>
>> This is the initial keepalived status at the beginning of the incident:
>> db3: Master. Priority 101
>> db2: Backup. Priority 61
>> db1: Backup. Priority 101
>>
>> What follows is all the data that we were able to capture during this
>> incident.
>>
>> tcpdump captured on db1 filtering all the adverts for 'vrid 150'. Notice
>> the gap of vrrp adverts from db3 beginning at 13:09:29.
>> ---
>> 13:09:28.205765 IP db3 > vrrp.mcast.net: VRRPv2, Advertisement, vrid
>> 150, prio 101, authtype simple, intvl 1s, length 20
>> 13:09:29.205840 IP db3 > vrrp.mcast.net: VRRPv2, Advertisement, vrid
>> 150, prio 101, authtype simple, intvl 1s, length 20
>> 13:09:32.811435 IP db1 > vrrp.mcast.net: VRRPv2, Advertisement, vrid
>> 150, prio 101, authtype simple, intvl 1s, length 20
>> 13:09:32.811530 IP db2 > vrrp.mcast.net: VRRPv2, Advertisement, vrid
>> 150, prio 61, authtype simple, intvl 1s, length 20
>> 13:09:32.811566 IP db1 > vrrp.mcast.net: VRRPv2, Advertisement, vrid
>> 150, prio 101, authtype simple, intvl 1s, length 20
>> 13:09:32.811626 IP db2 > vrrp.mcast.net: VRRPv2, Advertisement, vrid
>> 150, prio 61, authtype simple, intvl 1s, length 20
>> 13:09:32.811646 IP db1 > vrrp.mcast.net: VRRPv2, Advertisement, vrid
>> 150, prio 101, authtype simple, intvl 1s, length 20
>> 13:09:33.812474 IP db1 > vrrp.mcast.net: VRRPv2, Advertisement, vrid
>> 150, prio 101, authtype simple, intvl 1s, length 20
>> 13:09:34.001709 IP db3 > vrrp.mcast.net: VRRPv2, Advertisement, vrid
>> 150, prio 101, authtype simple, intvl 1s, length 20
>> 13:09:34.001751 IP db1 > vrrp.mcast.net: VRRPv2, Advertisement, vrid
>> 150, prio 101, authtype simple, intvl 1s, length 20
>> 13:09:34.001842 IP db3 > vrrp.mcast.net: VRRPv2, Advertisement, vrid
>> 150, prio 101, authtype simple, intvl 1s, length 20
>> 13:09:35.002100 IP db3 > vrrp.mcast.net: VRRPv2, Advertisement, vrid
>> 150, prio 101, authtype simple, intvl 1s, length 20
>> ---
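
The delay before db1 took over in this capture matches the VRRPv2 timers in RFC 3768, where a backup declares the master down after 3 * advert_int + (256 - priority) / 256 seconds. A small sketch of that arithmetic, assuming keepalived 1.2.13 follows the RFC here (the function name is made up for illustration):

```shell
#!/bin/bash
# Master_Down_Interval per RFC 3768 (VRRPv2):
#   Skew_Time            = (256 - priority) / 256 seconds
#   Master_Down_Interval = 3 * advert_int + Skew_Time
master_down_interval() {
    local advert_int=$1 priority=$2
    awk -v a="$advert_int" -v p="$priority" \
        'BEGIN { printf "%.3f\n", 3 * a + (256 - p) / 256 }'
}

master_down_interval 1 101   # db1: advert_int 1, priority 101
```

With advert_int 1 and priority 101 this gives about 3.605 s, which lines up with the gap between db3's last advert at 13:09:29.205840 and db1's first at 13:09:32.811435. So db1 appears to have behaved as specified; the open question is why db3 stopped sending for that long.
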
>>
>> We keep a log of the results of the mysql_checker script used by
>> keepalived.
>> These are the results on db3. There is an eight-second gap beginning at
>> 13:09:27. The interval between checks should be five seconds.
>> ---
>> 2015-01-15 13:09:17 (66891): Check number 14605
>> 2015-01-15 13:09:17 (66891): check_mysql_health will return 0
>> 2015-01-15 13:09:22 (66973): Check number 14606
>> 2015-01-15 13:09:22 (66973): check_mysql_health will return 0
>> 2015-01-15 13:09:27 (67024): Check number 14607
>> 2015-01-15 13:09:27 (67024): check_mysql_health will return 0
>> 2015-01-15 13:09:35 (67318): Check number 14608
>> 2015-01-15 13:09:35 (67318): check_mysql_health will return 0
>> 2015-01-15 13:09:40 (67427): Check number 14609
>> 2015-01-15 13:09:40 (67427): check_mysql_health will return 0
>> ---
>>
>> Keepalived entries in the syslog of db3 at the time of the incident:
>> ---
>> Jan 15 13:09:34 db3 Keepalived_vrrp[48241]:
>> VRRP_Instance(reader_vip_shard_0) Received lower prio advert, forcing
>> new election
>> Jan 15 13:09:34 db3 Keepalived_vrrp[48241]:
>> VRRP_Instance(reader_vip_shard_0) Received lower prio advert, forcing
>> new election
>> ---
>>
>> Keepalived entries in the syslog of db1 at the time of the incident:
>> ---
>> Jan 15 13:09:32 db1 Keepalived_vrrp[39988]:
>> VRRP_Instance(reader_vip_shard_0) Transition to MASTER STATE
>> Jan 15 13:09:32 db1 Keepalived_vrrp[39988]:
>> VRRP_Instance(reader_vip_shard_0) Received lower prio advert, forcing
>> new election
>> Jan 15 13:09:32 db1 Keepalived_vrrp[39988]:
>> VRRP_Instance(reader_vip_shard_0) Received lower prio advert, forcing
>> new election
>> Jan 15 13:09:33 db1 Keepalived_vrrp[39988]:
>> VRRP_Instance(reader_vip_shard_0) Entering MASTER STATE
>> Jan 15 13:09:33 db1 Keepalived_vrrp[39988]: Opening script file
>> /usr/local/bin/notify_mysql_failover.sh
>> Jan 15 13:09:34 db1 Keepalived_vrrp[39988]:
>> VRRP_Instance(reader_vip_shard_0) Received higher prio advert
>> Jan 15 13:09:34 db1 Keepalived_vrrp[39988]:
>> VRRP_Instance(reader_vip_shard_0) Entering BACKUP STATE
>> Jan 15 13:09:34 db1 Keepalived_vrrp[39988]: Opening script file
>> /usr/local/bin/notify_mysql_failover.sh
>> ---
>>
>> ---
>>
>> Sergio Roysen
>> Operations - Shopify
>>
>>
>>
>> --------------------------------------------------------------------------
>> ----
>> New Year. New Location. New Benefits. New Data Center in Ashburn, VA.
>> GigeNET is offering a free month of service with a new server in Ashburn.
>> Choose from 2 high performing configs, both with 100TB of bandwidth.
>> Higher redundancy.Lower latency.Increased capacity.Completely compliant.
>> http://p.sf.net/sfu/gigenet
>> _______________________________________________
>> Keepalived-devel mailing list
>> Kee...@li...
>> https://lists.sourceforge.net/lists/listinfo/keepalived-devel