#557 Crash on force_tcp_conn_lifetime

1.8.x
closed-fixed
core (110)
9
2012-09-21
2012-09-18
saghul
No

After applying the patch proposed in issue #3546167, we are experiencing crashes with the following backtrace:

Program terminated with signal 11, Segmentation fault.
#0 force_tcp_conn_lifetime (rcv=0x8d1a814, timeout=3610) at tcp_main.c:1180
1180 tcp_main.c: No such file or directory.
in tcp_main.c
(gdb) bt full
#0 force_tcp_conn_lifetime (rcv=0x8d1a814, timeout=3610) at tcp_main.c:1180
con = 0x0
lifetime = 18146
#1 0xb70f5bc4 in update_contacts (_m=0x8d1a720, forced_binding=0x0, _d=0xaf4f3538 "\370\064O\257", _f=0x0, _s=0x0) at save.c:645
ci = <value optimized out>
e_max = <value optimized out>
tcp_check = 1
...

Apparently the connection is not found in the connections hash, so con is NULL when it is dereferenced. Unfortunately I'm not familiar enough with that part of the code to provide a patch. What I did notice, though, is that all the other functions that operate on the connections hash hold a lock, and this one doesn't. Could that explain the crash?
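
To illustrate the pattern in question, here is a minimal, self-contained sketch of a lookup-and-update that holds the lock for the whole operation and tolerates a missing entry. It is not the OpenSIPS code: struct conn, conn_table, conn_lock and the conn_* functions are hypothetical names, and a pthread mutex stands in for whatever locking primitive the proxy actually uses.

```c
#include <pthread.h>
#include <stddef.h>

#define CONN_TABLE_SIZE 16  /* hypothetical number of hash buckets */

/* Hypothetical, simplified connection record (not the OpenSIPS struct). */
struct conn {
    unsigned int id;
    unsigned int lifetime;
    struct conn *next;      /* next entry in the same bucket */
};

static struct conn *conn_table[CONN_TABLE_SIZE];
static pthread_mutex_t conn_lock = PTHREAD_MUTEX_INITIALIZER;

/* Insert a connection at the head of its bucket while holding the lock. */
void conn_add(struct conn *c)
{
    unsigned int h = c->id % CONN_TABLE_SIZE;

    pthread_mutex_lock(&conn_lock);
    c->next = conn_table[h];
    conn_table[h] = c;
    pthread_mutex_unlock(&conn_lock);
}

/* Unlink a connection by id; does nothing if it is not in the table. */
void conn_remove(unsigned int id)
{
    struct conn **pp;

    pthread_mutex_lock(&conn_lock);
    for (pp = &conn_table[id % CONN_TABLE_SIZE]; *pp; pp = &(*pp)->next) {
        if ((*pp)->id == id) {
            *pp = (*pp)->next;
            break;
        }
    }
    pthread_mutex_unlock(&conn_lock);
}

/*
 * Update the lifetime of the connection with the given id.
 * The lookup and the write happen under the same lock, and a missing
 * connection is reported instead of dereferenced.
 * Returns 0 on success, -1 if no such connection exists.
 */
int conn_set_lifetime(unsigned int id, unsigned int lifetime)
{
    struct conn *c;
    int ret = -1;

    pthread_mutex_lock(&conn_lock);
    for (c = conn_table[id % CONN_TABLE_SIZE]; c; c = c->next) {
        if (c->id == id) {
            c->lifetime = lifetime;
            ret = 0;
            break;
        }
    }
    pthread_mutex_unlock(&conn_lock);
    return ret;
}
```

In this sketch, a connection that was removed before the update arrives simply makes conn_set_lifetime() return -1, so the caller can log the condition rather than crash.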

Thanks,

Discussion

    • priority: 5 --> 9
    • assigned_to: nobody --> bogdan_iancu
     
  • saghul
    2012-09-19

    Since the proxy was crashing constantly, I modified the code to check for a NULL connection, as follows:

    void force_tcp_conn_lifetime(struct receive_info *rcv, unsigned int timeout)
    {
        struct tcp_connection* con = NULL;
        unsigned int lifetime = get_ticks() + timeout;

        con = tcpconn_id_hash[rcv->proto_reserved1];
        if (con) {
            con->lifetime = lifetime;
        } else {
            LM_CRIT("connection not found in force_tcp_conn_lifetime");
        }
    }

    I left it running for a few hours and this is the result:

    $ grep CRITICAL /var/log/syslog.1 |grep "force_tcp_conn_lifetime" | wc -l
    69

    So it seems to be quite easy to trigger this behavior, though I don't know the exact pattern.

    Please let me know if I can do anything else to help.

     
  • The fix was committed to SVN trunk, 1.8 and 1.7.

    Many thanks to Saul for helping in troubleshooting this.

    Regards,
    Bogdan

     
    • status: open --> closed-fixed