Thread: Re: [Keepalived-devel] Keepalived communication with kernel failing after some time
From: Ronie G. H. <ro...@ro...> - 2011-12-05 04:28:07
|
Hi,

I just upgraded our Gentoo to kernel 3.0.6 and I am having the same problem with keepalived 1.2.2 as reported by Rodrigo Severo on Wed, 2011-11-09.

After some debugging and research, I found out which kernel change is causing this problem. Here it goes:

kernel 2.6.29 healthcheckers strace:
/etc/init.d/keepalived restart && strace -p `cat /var/run/checkers.pid` -o /tmp/keepalived_k2.strace
=====
select(1024, [4 6], [], [], {0, 972990}) = 0 (Timeout)
gettimeofday({1323012772, 256053}, NULL) = 0
socket(PF_INET, SOCK_STREAM, IPPROTO_TCP) = 8
setsockopt(8, SOL_SOCKET, SO_LINGER, {onoff=1, linger=0}, 8) = 0
fcntl(8, F_GETFL) = 0x2 (flags O_RDWR)
fcntl(8, F_SETFL, O_RDWR|O_NONBLOCK) = 0
bind(8, {sa_family=AF_UNSPEC, sa_data="\0\0\0\0\0\0\0\0\0\0\0\0\0\0"}, 128) = 0
connect(8, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("10.0.0.1")}, 128) = -1 EINPROGRESS (Operation now in progress)
...
=====

kernel 3.0.6 healthcheckers strace:
/etc/init.d/keepalived restart && strace -p `cat /var/run/checkers.pid` -o /tmp/keepalived_k3.strace
=====
select(1024, [4 6], [], [], {0, 985079}) = 0 (Timeout)
gettimeofday({1323013791, 409065}, NULL) = 0
socket(PF_INET, SOCK_STREAM, IPPROTO_TCP) = 8
setsockopt(8, SOL_SOCKET, SO_LINGER, {onoff=1, linger=0}, 8) = 0
fcntl(8, F_GETFL) = 0x2 (flags O_RDWR)
fcntl(8, F_SETFL, O_RDWR|O_NONBLOCK) = 0
bind(8, {sa_family=AF_UNSPEC, sa_data="\0\0\0\0\0\0\0\0\0\0\0\0\0\0"}, 128) = -1 EAFNOSUPPORT (Address family not supported by protocol)
...
=====

Comparing the bind() calls in the two straces, Linux kernel 3.x rejects AF_UNSPEC.
I found some posts related to it:
http://comments.gmane.org/gmane.linux.network/205326

I tested it again with the following change in Linux kernel 3.0.6, and it worked:
net/ipv4/af_inet.c b/net/ipv4/af_inet.c
int inet_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len)
{
...
-	if (addr->sin_family != AF_INET) {
+	if (addr->sin_family != AF_INET && addr->sin_family != AF_UNSPEC) {
		err = -EAFNOSUPPORT;
		goto out;
	}
...
}

Would it be possible to change that AF_UNSPEC to AF_INET in keepalived 1.2.2 so that it works with Linux 3.0.x?

Thanks,
-- 
*Ronie Henrich*
*GIT - Sistemas Ltda* <http://www.git.com.br>
Novo Hamburgo, RS Brazil <http://maps.google.com/maps?q=%2CNovo+Hamburgo%2CRS%2CBrazil&hl=en>
*http://br.linkedin.com/in/roniegh*
See who we know in common <http://www.linkedin.com/e/wwk/11684020/>
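For readers who want to check their own kernel, the bind() behaviour seen in the two straces can be probed with a few lines of C. This is only a minimal sketch, not keepalived code: try_bind_family() is a hypothetical helper, and it passes sizeof(struct sockaddr_in) as the address length rather than the 128 bytes shown in the strace.

```c
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * try_bind_family() is a hypothetical helper, not part of keepalived.
 * It creates a TCP socket and binds it to an all-zero IPv4 address whose
 * family field is set to `family`.
 * Returns 0 if bind() succeeded, otherwise the errno of the failing call.
 */
int try_bind_family(int family)
{
	struct sockaddr_in addr;
	int fd, err = 0;

	fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
	if (fd < 0)
		return errno;

	memset(&addr, 0, sizeof(addr));	/* 0.0.0.0, port 0 */
	addr.sin_family = (sa_family_t)family;

	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
		err = errno;	/* saved before close() can overwrite it */

	close(fd);
	return err;
}
```

Going by the report above, try_bind_family(AF_UNSPEC) would be expected to return EAFNOSUPPORT on the 3.0.6 kernel and 0 on 2.6.29; other kernel versions may behave differently, so the result should be read as a probe rather than a guarantee.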
|
|
From: Vincent B. <be...@lu...> - 2011-12-05 09:25:15
Attachments:
bind-afunspec.patch
|
On Sun, 04 Dec 2011 23:00:43 -0500, Ronie Gilberto Henrich wrote:
> I just upgraded our Gentoo to kernel 3.0.6 and I am having the same
> problem with keepalived 1.2.2 as reported by Rodrigo Severo on Wed,
> 2011-11-09.
>
> After some debug and research, I found out which kernel changes are
> causing this problem. Here it goes:
>
> bind(8, {sa_family=AF_UNSPEC, sa_data=""}, 128) = -1 EAFNOSUPPORT
> (Address family not supported by protocol)
Hi!
Could you try this patch?
|
|
From: Ronie G. H. <ro...@ro...> - 2011-12-05 14:39:37
|
Hi,
After applying the bind-afunspec.patch, healthcheckers does not start anymore.
Is ipv4 sa_family being set to AF_INET at some point in healthcheckers?
keepalived/vrrp/vrrp.c sets ipv4 sin_family to AF_INET and ipv6 sin_family to AF_INET6:
=====
/* send VRRP packet */
static int
vrrp_send_pkt(vrrp_rt * vrrp)
{
	...
	/* Sending path */
	if (vrrp->family == AF_INET) {
		memset(&dst4, 0, sizeof(dst4));
		dst4.sin_family = AF_INET;
		dst4.sin_addr.s_addr = htonl(INADDR_VRRP_GROUP);
		msg.msg_name = &dst4;
		msg.msg_namelen = sizeof(dst4);
	} else if (vrrp->family == AF_INET6) {
		memset(&dst6, 0, sizeof(dst6));
		dst6.sin6_family = AF_INET6;
		dst6.sin6_port = htons(IPPROTO_VRRP);
		dst6.sin6_addr.s6_addr16[0] = htons(0xff02);
		dst6.sin6_addr.s6_addr16[7] = htons(0x12);
		msg.msg_name = &dst6;
		msg.msg_namelen = sizeof(dst6);
	}
	...
}
=====
Is sin_family being set in the same way as above for healthcheckers too?
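The pattern in the excerpt above, setting the family field explicitly before the sockaddr is handed to the kernel, can be sketched for a checker-style destination as follows. make_ipv4_dest() is a hypothetical helper made up for illustration; it is not the function keepalived uses, and it handles IPv4 only.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/*
 * make_ipv4_dest() is a hypothetical helper for illustration only.
 * Fills `out` with an IPv4 destination built from a dotted-quad string.
 * Returns 0 on success, -1 if `ip` is not a valid IPv4 address; in that
 * case `out` stays zeroed, so its family reads as AF_UNSPEC.
 */
int make_ipv4_dest(const char *ip, uint16_t port, struct sockaddr_in *out)
{
	memset(out, 0, sizeof(*out));
	if (inet_pton(AF_INET, ip, &out->sin_addr) != 1)
		return -1;
	out->sin_family = AF_INET;	/* set explicitly, never left at 0 */
	out->sin_port = htons(port);	/* network byte order */
	return 0;
}
```

A caller would typically check the return value before using the structure, since a zeroed sockaddr is exactly what shows up as AF_UNSPEC in the strace.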
|
|
From: Vincent B. <be...@lu...> - 2011-12-05 16:24:12
|
On Mon, 05 Dec 2011 09:39:20 -0500, Ronie Gilberto Henrich wrote:
> Hi,
>
> After applying the bind-afunspec.patch, healthcheckers does not start
> anymore.
>
> Is ipv4 sa_family being set to AF_INET at some point in
> healthcheckers?
Yes, it is done in inet_stosockaddr(). It is a bit odd that my patch prevents healthcheckers from starting, since it should only prevent bind(), which is not used in your case (unless you use the bindto directive? But in that case, it was not empty). Which healthchecker are you using, and with which options?
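The attached patch is not reproduced in this thread, so the following is only a sketch of the behaviour described above: skip bind() when no source address was configured, and call it otherwise. Both has_bind_addr() and bind_if_configured() are hypothetical names, and the sketch assumes an unset source address is represented by a zeroed sockaddr_storage (family AF_UNSPEC).

```c
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: was a source ("bindto") address configured? */
int has_bind_addr(const struct sockaddr_storage *bindto)
{
	return bindto->ss_family != AF_UNSPEC;
}

/*
 * Hypothetical helper: bind `fd` only when a source address was configured.
 * Returns 0 when bind() was skipped or succeeded, -1 when bind() failed
 * (errno is left as set by bind()).
 */
int bind_if_configured(int fd, const struct sockaddr_storage *bindto)
{
	socklen_t len;

	if (!has_bind_addr(bindto))
		return 0;	/* leave the local address to connect() */

	/* Pass the length that matches the address family. */
	len = (bindto->ss_family == AF_INET6) ? sizeof(struct sockaddr_in6)
					       : sizeof(struct sockaddr_in);
	return bind(fd, (const struct sockaddr *)bindto, len);
}
```

With a check like this, a checker that has no bindto setting never passes an AF_UNSPEC address to the kernel, which avoids the EAFNOSUPPORT result without requiring a kernel change.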
|
From: Ronie G. H. <ro...@ro...> - 2011-12-05 17:34:38
|
I am using http, misc, ssl, smtp and tcp.
Here is my conf for imap (tcp healthchecker):
=====
virtual_server 100.100.100.100 143 {
    delay_loop 60
    lb_algo wrr
    lb_kind NAT
    persistence_timeout 900
    protocol TCP
    real_server 10.0.0.1 143 {
        weight 1
        TCP_CHECK {
            connect_timeout 10
        }
    }
    real_server 10.0.0.2 143 {
        weight 1
        TCP_CHECK {
            connect_timeout 10
        }
    }
}
=====
Could this be the reason healthcheckers does not start if we prevent bind()?
http://pubs.opengroup.org/onlinepubs/007904875/functions/connect.html
"If the socket has not already been bound to a local address, connect() shall bind it to an address which, unless the socket's address family is AF_UNIX, is an unused local address."
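The implicit bind described in that passage can be observed directly. The sketch below is a standalone demo rather than keepalived code: implicit_bind_port() is a hypothetical function that connects to a throwaway listener on 127.0.0.1 without ever calling bind() on the client socket, then asks the kernel which local port it assigned.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical demo: returns the local port (host byte order) that
 * connect() assigned to an unbound client socket, or -1 on any error.
 */
int implicit_bind_port(void)
{
	struct sockaddr_in srv, local;
	socklen_t len = sizeof(srv);
	int lfd, cfd, port = -1;

	lfd = socket(AF_INET, SOCK_STREAM, 0);
	cfd = socket(AF_INET, SOCK_STREAM, 0);
	if (lfd < 0 || cfd < 0)
		goto out;

	/* Listener on loopback; port 0 lets the kernel pick a free port. */
	memset(&srv, 0, sizeof(srv));
	srv.sin_family = AF_INET;
	srv.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	if (bind(lfd, (struct sockaddr *)&srv, sizeof(srv)) < 0 ||
	    listen(lfd, 1) < 0 ||
	    getsockname(lfd, (struct sockaddr *)&srv, &len) < 0)
		goto out;

	/* No bind() on cfd: connect() chooses the local address and port. */
	if (connect(cfd, (struct sockaddr *)&srv, sizeof(srv)) < 0)
		goto out;

	len = sizeof(local);
	if (getsockname(cfd, (struct sockaddr *)&local, &len) == 0)
		port = ntohs(local.sin_port);
out:
	if (lfd >= 0)
		close(lfd);
	if (cfd >= 0)
		close(cfd);
	return port;
}
```

If the function returns a port number, the client socket was bound by connect() alone, which suggests that skipping an explicit bind() should not by itself stop a TCP check from connecting.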
|
|
From: Ronie G. H. <ro...@ro...> - 2011-12-07 01:37:15
|
Hi Vincent,
I applied bind-afunspec.patch incorrectly, which caused another patch (do-not-need-kernel-sources.patch) to be ignored.
After fixing that, I recompiled keepalived (with bind-afunspec.patch), and now healthcheckers is working again. Thanks a lot and sorry for my mistake!
While looking at the keepalived sources, I found 'if ((a1->sin_addr.s_addr == a1->sin_addr.s_addr) &&' in include/check_data.h.
Shouldn't it be 'if ((a1->sin_addr.s_addr == a2->sin_addr.s_addr) &&'?
include/check_data.h:
=====
	if ((a1->sin_addr.s_addr == a1->sin_addr.s_addr) &&
	    (a1->sin_port == a2->sin_port))
		return 1;
=====
Thanks again, I really appreciate your help,
|
|
From: Vincent B. <be...@lu...> - 2011-12-07 09:21:01
|
On Tue, 06 Dec 2011 20:36:35 -0500, Ronie Gilberto Henrich wrote:
> I applied the bind-afunspec.patch in a wrong way, resulting in
> another patch (do-not-need-kernel-sources.patch) to be ignored.
> After fixing it, I recompiled keepalived (with bind-afunspec.patch),
> and now healthcheckers is working again. Thanks a lot and sorry for
> my mistake!
I was a bit puzzled since I wasn't able to find where an uninitialized sockaddr could have been used!
> When I was looking at keepalived sources, I found 'if
> ((a1->sin_addr.s_addr == a1->sin_addr.s_addr) &&' in
> include/check_data.h
> Shouldn't it be 'if ((a1->sin_addr.s_addr == a2->sin_addr.s_addr) &&'
>
> include/check_data.h
> if ((a1->sin_addr.s_addr == a1->sin_addr.s_addr) &&
> (a1->sin_port == a2->sin_port))
> return 1;
Good catch. This bug would lead to a reload having no effect.
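The effect of the typo can be shown with a small standalone comparison. This is a sketch, not the code from include/check_data.h: the function names and the endpoint() helper are made up here, and same_endpoint_as_quoted() models what the quoted condition reduces to, since comparing a1's address with itself is always true.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>

/*
 * Effective behaviour of the quoted condition: the address test compares
 * a1 with itself and is always true, so only the ports are compared.
 */
int same_endpoint_as_quoted(const struct sockaddr_in *a1,
			    const struct sockaddr_in *a2)
{
	return a1->sin_port == a2->sin_port;
}

/* Proposed correction: compare a1 against a2 for both address and port. */
int same_endpoint_fixed(const struct sockaddr_in *a1,
			const struct sockaddr_in *a2)
{
	return a1->sin_addr.s_addr == a2->sin_addr.s_addr &&
	       a1->sin_port == a2->sin_port;
}

/* Hypothetical helper: build an IPv4 endpoint from a dotted-quad string. */
struct sockaddr_in endpoint(const char *ip, uint16_t port)
{
	struct sockaddr_in sa;

	memset(&sa, 0, sizeof(sa));
	sa.sin_family = AF_INET;
	sa.sin_port = htons(port);
	inet_pton(AF_INET, ip, &sa.sin_addr);
	return sa;
}
```

With the two real servers from the configuration posted earlier in the thread, 10.0.0.1:143 and 10.0.0.2:143, the quoted form reports them as equal while the corrected form does not. That is consistent with the remark above: a changed address on an unchanged port would go unnoticed.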
|
From: Ronie G. H. <ro...@ro...> - 2011-12-07 13:07:02
|
When would the next release (including bind-afunspec.patch) be available?
|
From: Alexandre C. <ac...@fr...> - 2011-12-07 13:31:50
|
On Wed, 2011-12-07 at 08:06 -0500, Ronie Gilberto Henrich wrote:
> When would the next release (including bind-afunspec.patch) be available?
The next release is scheduled for January.
Best regards,
Alexandre