
#561 Problems in DNS-SRV handling (multiple hosts in the answer)

ver devel
closed-fixed
modules (357)
5
2009-04-29
2009-03-10
Anonymous
No

I've done several tests with different scenarios:
>
> 1) including 4 hosts in the answer of the SRV reply message; the first and second hosts are unreachable, the third and fourth are reachable - the SIP message never reached the target (= host 3)
>
> 2) including 4 hosts in the answer of the SRV reply message; the SIP service is not running on the first and second host; the third and fourth have the service activated - the SIP message is transmitted to the second host after the fr_timeout of the TM module, but is not forwarded to the target (= host 3); it is retransmitted to host 2 until SIP timeout
>
> The conclusion is: when only a master and a slave are in the answer of the SRV reply message, it works fine. But as soon as a third host is listed, the service hangs on the second host and does not try to reach an alternative target (as listed in the SRV response).
>
> In the syslog I can only see entries like this (but no ERROR message):
>
> Mar 9 15:36:21 lennysrv /usr/local/sbin/kamailio[25041]: DBG:core:mk_proxy: doing DNS lookup...
> Mar 9 15:36:21 lennysrv /usr/local/sbin/kamailio[25041]: DBG:core:a2dns_node: storing kamailio2.test.loc:5080
> Mar 9 15:36:21 lennysrv /usr/local/sbin/kamailio[25041]: DBG:core:a2dns_node: storing kamailio3.test.loc:5060
> Mar 9 15:36:26 lennysrv /usr/local/sbin/kamailio[25043]: DBG:core:mk_proxy: doing DNS lookup...
> Mar 9 15:36:28 lennysrv /usr/local/sbin/kamailio[25047]: DBG:tm:is_3263_failure: dns-failover test: branch=0, last_recv=408, flags=1
> Mar 9 15:36:28 lennysrv /usr/local/sbin/kamailio[25047]: DBG:tm:t_should_relay_response: trying DNS-based failover
> Mar 9 15:36:28 lennysrv /usr/local/sbin/kamailio[25047]: DBG:tm:do_dns_failover: new destination available
> Mar 9 15:36:31 lennysrv /usr/local/sbin/kamailio[25047]: DBG:tm:is_3263_failure: dns-failover test: branch=1, last_recv=408, flags=1
> Mar 9 15:36:31 lennysrv /usr/local/sbin/kamailio[25047]: DBG:tm:t_should_relay_response: trying DNS-based failover
> Mar 9 15:36:31 lennysrv /usr/local/sbin/kamailio[25047]: DBG:tm:do_dns_failover: new destination available
> Mar 9 15:36:34 lennysrv /usr/local/sbin/kamailio[25047]: DBG:tm:is_3263_failure: dns-failover test: branch=2, last_recv=408, flags=1
> Mar 9 15:36:34 lennysrv /usr/local/sbin/kamailio[25047]: DBG:tm:t_should_relay_response: trying DNS-based failover
> Mar 9 15:36:34 lennysrv /usr/local/sbin/kamailio[25047]: DBG:tm:do_dns_failover: new destination available
> Mar 9 15:36:37 lennysrv /usr/local/sbin/kamailio[25047]: DBG:tm:is_3263_failure: dns-failover test: branch=3, last_recv=408, flags=1
> Mar 9 15:36:37 lennysrv /usr/local/sbin/kamailio[25047]: DBG:tm:t_should_relay_response: trying DNS-based failover
> Mar 9 15:36:37 lennysrv /usr/local/sbin/kamailio[25047]: DBG:tm:do_dns_failover: new destination available
> Mar 9 15:36:40 lennysrv /usr/local/sbin/kamailio[25047]: DBG:tm:is_3263_failure: dns-failover test: branch=4, last_recv=408, flags=1
> Mar 9 15:36:40 lennysrv /usr/local/sbin/kamailio[25047]: DBG:tm:t_should_relay_response: trying DNS-based failover
> Mar 9 15:36:40 lennysrv /usr/local/sbin/kamailio[25047]: DBG:tm:do_dns_failover: new destination available
> Mar 9 15:36:43 lennysrv /usr/local/sbin/kamailio[25045]: DBG:tm:is_3263_failure: dns-failover test: branch=5, last_recv=487, flags=1
>
> The syslog output seems to be okay, because according to the log kamailio performs a DNS failover on each branch. But in practice the message is only sent to the first and second targets.
>
>
> Do you have any other idea what could be wrong? I haven't found any specific parameters that could influence this behaviour of kamailio.

Discussion

  • Iñaki Baz Castillo

    Any comment on it? It seems an interesting bug.

     
  • Alex Hermann

    Alex Hermann - 2009-04-24

    I tried debugging this, but got entangled

    I applied a simple patch to enable more logging, and from the logs it seems kamailio is always trying to evaluate the same dns_node. I'll post the patch and logs to the mailing list, because attaching files in SF seems impossible *again*

     
  • Alex Hermann

    Alex Hermann - 2009-04-24

    Ok, as usual, the ideas only arrive after posting...

    Below is a patch which adds some more debugging and, as a bonus, actually fixes the issue for me in the last chunk. I hereby request that someone familiar with the code check whether I added the increment at the right spot. If it seems right, please apply the last chunk to SVN.

    Index: kamailio-speakup-1.4/resolve.c

    --- kamailio-speakup-1.4.orig/resolve.c 2009-04-23 16:29:12.000000000 +0200
    +++ kamailio-speakup-1.4/resolve.c 2009-04-24 12:29:34.000000000 +0200
    @@ -748,6 +748,8 @@
    n->vals[l].ival = get_srv(r)->port;
    n->vals[l].sval = p;
    memcpy( p, get_srv(r)->name, get_srv(r)->name_len );
    + LM_DBG("nodes: %p node: %p idx: %d vals: %p sval: %s ival: %d",
    + dn, n, l, n->vals[l], n->vals[l].sval, n->vals[l].ival);
    LM_DBG("storing %.*s:%d\n", get_srv(r)->name_len,p,n->vals[l].ival);
    p += get_srv(r)->name_len;
    *(p++) = 0;
    @@ -1289,6 +1291,9 @@
    n = *node;
    last_srv = NULL;
    he = 0;
    +
    + LM_DBG("nodes: %p node: %p idx: %d vals: %p sval: %s ival: %d",
    + node, *node, n->idx, n->vals[n->idx], n->vals[n->idx].sval, n->vals[n->idx].ival);

    do {
    switch (n->type) {
    @@ -1344,6 +1349,7 @@
    shm_free(last_srv);
    *node = 0;
    }
    + n->idx++;
    return he;
    break;
    default:
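
    To illustrate why the missing increment matters, here is a minimal, self-contained sketch. It is not the code from resolve.c: the struct layout, the function names (next_target, failover) and the reachable flag are assumptions made up for this example. It only models the idea reported above, namely that a failover step which does not advance the index keeps evaluating the same SRV entry.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for one SRV target in a cached answer. */
struct srv_target {
    const char *host;
    int port;
    int reachable;      /* simulation only: does the host answer? */
};

/* Hypothetical stand-in for a DNS node holding the ordered SRV targets. */
struct srv_node {
    struct srv_target *vals;
    size_t size;
    size_t idx;         /* index of the next entry to try */
};

/* Return the entry at the current index, or NULL when the list is exhausted.
 * With advance == 0 the index is never moved (models the reported bug);
 * with advance != 0 it is incremented after each use (models the fix). */
static const struct srv_target *next_target(struct srv_node *n, int advance)
{
    assert(n != NULL);
    if (n->idx >= n->size)
        return NULL;
    const struct srv_target *t = &n->vals[n->idx];
    if (advance)
        n->idx++;
    return t;
}

/* Try at most max_attempts targets. Return the position of the first
 * reachable host in the list, or -1 if none was reached. */
static int failover(struct srv_node *n, int advance, int max_attempts)
{
    for (int i = 0; i < max_attempts; i++) {
        const struct srv_target *t = next_target(n, advance);
        if (t == NULL)
            return -1;              /* no more destinations */
        if (t->reachable)
            return (int)(t - n->vals);
    }
    return -1;                      /* gave up, e.g. transaction timeout */
}
```

    In this model, a list of four hosts where only the third and fourth are reachable is resolved to the third host when the index is advanced, while without the increment every attempt lands on the same unreachable entry until the attempts run out.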

     
  • Henning Westerholt

    patch with whitespace fixes

     
  • Henning Westerholt

    • assigned_to: nobody --> henningw
    • status: open --> closed-fixed
     
  • Henning Westerholt

    Hi Alex,
    your patch was applied to the 1.5 branch. I have not found any problems so far in my tests.

    Thanks,
    Henning

     
