[Keepalived-announce] keepalived + NIC bonding
From: John S. <jsk...@ra...> - 2015-07-04 02:04:05
Good Morning,
I'm wondering if anyone has advice on configuring keepalived together
with bonded Ethernet devices on a server. I've discovered a problem with
our implementation where, despite an IP address being applied to a bond
interface, that address is not receiving traffic. I'm still in the midst
of troubleshooting and suspect it may be ARP-related. Below I've pasted
our scenario with sanitized configs. In this case, IP address 10.1.1.102
does not respond to any traffic. Here's the important part of our
keepalived config:
vrrp_instance test {
    state BACKUP
    interface bond0
    virtual_router_id 254
    nopreempt
    priority 02
    advert_int 1
    virtual_ipaddress {
        10.1.1.101 dev bond0
        10.1.1.102 dev bond0
    }
}
In our case we've got quite a few NICs that participate in the bond0
interface. Its ifcfg file looks like this:
DEVICE="bond0"
BOOTPROTO="static"
GATEWAY="10.1.1.1"
IPADDR="10.1.1.100"
IPV6INIT=no
MTU="1500"
NETMASK="255.255.255.0"
NM_CONTROLLED=no
ONBOOT="yes"
TYPE="Ethernet"
BONDING_OPTS="mode=balance-alb miimon=100"
I believe that using adaptive load balancing (balance-alb) may be part
of the problem. The IPs get assigned to the expected interface:
10: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
    link/ether 14:fe:99:cb:73:43 brd ff:ff:ff:ff:ff:ff
    inet 10.1.1.100/24 brd 10.1.1.255 scope global bond0
    inet 10.1.1.101/24 brd 10.1.1.255 scope global bond0
    inet 10.1.1.102/24 brd 10.1.1.255 scope global bond0
    inet6 fe80::16fe:b5ff:fecb:7343/64 scope link
       valid_lft forever preferred_lft forever
But when looking at the ARP tables on our switches, we end up with
inconsistent IP-to-MAC mappings.
On the switch (switch1) that the active slave NIC is plugged into, we get
something like this:
Internet 10.1.1.100 32 14fe.99cb.7343 ARPA Vlan10
Internet 10.1.1.101 2 14fe.99cb.7343 ARPA Vlan10
Internet 10.1.1.102 2 14fe.9944.aa9e ARPA Vlan10
On the switch (switch2) that the inactive slave NIC is plugged into, we get
something like this:
Internet 10.1.1.100 33 14fe.99cb.7343 ARPA Vlan10
Internet 10.1.1.101 3 14fe.99cb.7343 ARPA Vlan10
Internet 10.1.1.102 3 14fe.99cb.7343 ARPA Vlan10
Notice that switch1 has a different MAC address for the troublesome IP
address. This is the MAC address of one of the slave interfaces, not that
of the bond0 interface.
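The comparison above can also be done mechanically. The following is only a minimal sketch, assuming the switch output has the six-column layout shown in the listings (protocol, address, age, hardware address, type, interface) and that every address on the bond is expected to map to the bond0 MAC. The function names are made up for illustration; the sample rows are copied from the two tables above.

```python
import re

# One row of the switch ARP listing, in the layout shown above:
# Protocol  Address  Age  Hardware-Addr  Type  Interface
ARP_ROW = re.compile(
    r"^\s*Internet\s+(\d+\.\d+\.\d+\.\d+)\s+\S+\s+"
    r"([0-9a-fA-F]{4}\.[0-9a-fA-F]{4}\.[0-9a-fA-F]{4})\s+ARPA\s+\S+"
)


def normalize_mac(dotted):
    """Convert '14fe.99cb.7343' to '14:fe:99:cb:73:43'."""
    digits = dotted.replace(".", "").lower()
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


def parse_arp_table(text):
    """Return {ip: mac} for every row that matches the assumed layout."""
    table = {}
    for line in text.splitlines():
        match = ARP_ROW.match(line)
        if match:
            table[match.group(1)] = normalize_mac(match.group(2))
    return table


def find_mismatches(tables, expected_mac):
    """List (switch, ip, mac) entries whose MAC differs from expected_mac."""
    expected = expected_mac.lower()
    return [
        (switch, ip, mac)
        for switch, table in sorted(tables.items())
        for ip, mac in sorted(table.items())
        if mac != expected
    ]


SWITCH1 = """\
Internet 10.1.1.100 32 14fe.99cb.7343 ARPA Vlan10
Internet 10.1.1.101 2 14fe.99cb.7343 ARPA Vlan10
Internet 10.1.1.102 2 14fe.9944.aa9e ARPA Vlan10
"""

SWITCH2 = """\
Internet 10.1.1.100 33 14fe.99cb.7343 ARPA Vlan10
Internet 10.1.1.101 3 14fe.99cb.7343 ARPA Vlan10
Internet 10.1.1.102 3 14fe.99cb.7343 ARPA Vlan10
"""

tables = {
    "switch1": parse_arp_table(SWITCH1),
    "switch2": parse_arp_table(SWITCH2),
}
# The expected MAC is the link/ether address of bond0 from the ip output.
mismatches = find_mismatches(tables, "14:fe:99:cb:73:43")
for switch, ip, mac in mismatches:
    print(f"{switch}: {ip} is mapped to {mac}")
```

With the sample rows, only the 10.1.1.102 entry on switch1 is reported.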
I'm struggling to put together a useful packet capture. All I end up
seeing is something like this:
13:03:11.270093 ARP, Request who-has 10.1.1.102 (ff:ff:ff:ff:ff:ff) tell 10.1.1.102, length 46
This single message is repeated about 10 times, in groups of 5 about 4
seconds apart.
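An ARP request in which the "who-has" address and the "tell" address are identical is a gratuitous ARP, i.e. the host announcing its own address. As a rough sketch, the captured line can be classified with a few lines of Python. The regular expression assumes the tcpdump text format shown above, and the function name is hypothetical.

```python
import re

# Matches the tcpdump ARP request format seen in the capture above.
ARP_REQUEST = re.compile(
    r"ARP, Request who-has (\S+)(?: \(([0-9a-fA-F:]+)\))? "
    r"tell ([^,\s]+), length (\d+)"
)


def parse_arp_request(line):
    """Return the fields of an ARP request line, or None if it does not match."""
    match = ARP_REQUEST.search(line)
    if match is None:
        return None
    target_ip, target_mac, sender_ip, length = match.groups()
    return {
        "target_ip": target_ip,
        "target_mac": target_mac,
        "sender_ip": sender_ip,
        "length": int(length),
        # Same sender and target IP means the host is announcing itself.
        "gratuitous": target_ip == sender_ip,
    }


CAPTURED = (
    "13:03:11.270093 ARP, Request who-has 10.1.1.102 "
    "(ff:ff:ff:ff:ff:ff) tell 10.1.1.102, length 46"
)
info = parse_arp_request(CAPTURED)
print(info)
```

The line as captured does not show which MAC address sent the announcement. Running tcpdump with the `-e` option prints the link-level header, which would reveal the Ethernet source address of each of these frames.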
I'm sure there's more information that can be gathered to assist with
troubleshooting; I'm all ears.
Secondly, is this the appropriate place to ask such a question? I wasn't
able to find another mailing list.
-- John