#546 opensips crash with TLS

1.7.x
closed-fixed
core (110)
5
2012-10-30
2012-07-31
Dragos Oancea
No

Hi all,

opensips 1.7.2 crashes when using TLS with create_dialog("Pp") in the routing script; these flags make it send OPTIONS (NAT ping) to both caller and callee during a dialog.

The relevant TLS-related lines in the routing script are:

tls_verify_server = 1
tls_verify_client = 0
tls_require_client_certificate = 0
tls_method = TLSv1
tls_certificate = "/etc/pki/CA/certs/x.crt"
tls_private_key = "/etc/pki/CA/private/x.key"
tls_ca_list = "/etc/pki/CA/certs/ca.crt"

listen = tls:X.X.X.X:5061
listen = tcp:X.X.X.X:5060

syslog:
http://pastebin.com/Kkdns7Cr

backtrace:

http://pastebin.com/7P4ADL9y

Apparently it crashes just after trying to send an OPTIONS or BYE to a device that is not there anymore (it is not on the socket opensips expects it to be on; opensips usually generates a "477 SendFailed" reply in situations like this).

Interestingly enough, if I add a UDP listener with "listen=udp:X.X.X.X:5060", it does not crash anymore.
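
For reference, a minimal sketch of the listener set with that workaround applied. The UDP line is the only addition to the configuration quoted above, and the addresses are placeholders as in the original:

```
listen = tls:X.X.X.X:5061
listen = tcp:X.X.X.X:5060
# added line: with a UDP listener present, the crash was no longer observed
listen = udp:X.X.X.X:5060
```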

Regards,
Dragos

Discussion

    • assigned_to: nobody --> vladut-paiu
    • status: open --> open-accepted
     
    • assigned_to: vladut-paiu --> bogdan_iancu
    • priority: 5 --> 7
     
  • Hi,

    I see 2 issues here: one is the crash itself (which seems to be a memory corruption); the second one is related to pinging, which seems not to choose the right interface (it selects a UDP one instead of a TLS one).

    I suggest first trying to identify the memory issue, and for this you need to recompile with memory debugging support (http://www.opensips.org/Resources/DocsTsMem, set memlog=6, memdump=1). Most probably the interface issue triggers some bogus memory operations.
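
    As a rough sketch, the two settings mentioned here would go in the global section of the config file, using exactly the values suggested. The comments reflect my understanding of the parameters and should be checked against the linked page; the rebuild with memory debugging is still required and is not shown:

    ```
    # log level used for memory debugging messages (value suggested in this ticket)
    memlog=6
    # log level used for the memory status dump (value suggested in this ticket)
    memdump=1
    ```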

    Regards,
    Bogdan

     
  • Probably related to 3522861

     
    • priority: 7 --> 5
     
  • Dragos Oancea
    2012-08-16

    Hi

    I think there is only one problem.
    If the OPTIONS were sent to the wrong interface, no reply would come back, and opensips would generate and send a BYE both ways. But in my case everything runs fine at first, with OPTIONS being sent to the right places; then something happens (memory corruption), and maybe the function that does the pinging is simply the first to access some unallocated memory.
    The crash can also happen when the callee or the caller sends a BYE.

    Some extra information:

    There are mobile devices behind NAT, running on TCP or TLS, involved in this whole scenario. So when a mobile device is not there anymore (it ran out of battery during a call, for example), the crash is most likely to happen.
    Also, I noticed that there is no problem if I listen only on TLS (not listening on TCP, not listening on UDP).
    But I need TCP, so I cannot disable it.

    Another gdb backtrace:

    http://pastebin.com/aXgABJtE

    I hope to replicate this in a controlled environment with memlog/memdump soon and let you know.

     
  • Dragos Oancea
    2012-08-29

    Hi

    It happened again. This time I had only two phones registered via TLS and I was just making a call between them.

    gdb:
    http://pastebin.com/1miGb7ct

    log (stderr):
    http://pastebin.com/bvDjLzPh

    Can someone confirm whether it is related to 3522861?

    Cheers,
    Dragos

     
  • Hi Dragos,

    Is this still happening with the latest code from the 1.8 SVN branch?

    Regards,
    Bogdan

     
  • Dragos Oancea
    2012-10-30

    Hi Bogdan,

    I am running 1.8.1 with the tls_init.c patch and the dlg_unref.patch (bug-id 3570495).
    I have had no crashes since.
    However, I cannot be 100% sure that the bug is gone, because I had to make some changes in the routing script.
    When it was crashing (and not patched) I had:
    modparam("dialog", "ping_interval", 20)
    modparam("tm", "fr_timer", 15)
    A few days after patching, I had to change these values (because opensips was dropping calls: the replies to OPTIONS from the clients were not arriving due to network issues) to:
    modparam("dialog", "ping_interval", 40)
    modparam("tm", "fr_timer", 30)
    So I ran for about 3 days with the old config on the patched version, then I changed these params, and it has been running like this since then (about a month, I think).
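
    To show how these pieces fit together, here is a minimal sketch of the relevant script parts. It is an illustration only, not the actual routing script from this setup: the route body is hypothetical and reduced to the dialog creation, and the module list is limited to the two modules whose parameters are quoted above.

    ```
    loadmodule "tm.so"
    loadmodule "dialog.so"

    # values in use after the change described above
    modparam("dialog", "ping_interval", 40)
    modparam("tm", "fr_timer", 30)

    route {
        # hypothetical, simplified logic: track the call as a dialog and
        # ping both caller ("P") and callee ("p") with in-dialog OPTIONS
        if (method == "INVITE") {
            create_dialog("Pp");
        }
        # the rest of the routing logic is omitted from this sketch
        t_relay();
    }
    ```

    With the earlier values (ping_interval 20, fr_timer 15) the same structure applies; only the two modparam lines differ.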

    #opensips -V
    version: opensips 1.8.1-tls (x86_64/linux)
    flags: STATS: Off, USE_IPV6, USE_TCP, USE_TLS, DISABLE_NAGLE, USE_MCAST, SHM_MEM, SHM_MMAP, PKG_MALLOC, F_MALLOC, FAST_LOCK-ADAPTIVE_WAIT
    ADAPTIVE_WAIT_LOOPS=1024, MAX_RECV_BUFFER_SIZE 262144, MAX_LISTEN 16, MAX_URI_SIZE 1024, BUF_SIZE 65535
    poll method support: poll, epoll_lt, epoll_et, sigio_rt, select.
    svnrevision: unknown
    @(#) $Id: main.c 8772 2012-03-08 11:16:13Z bogdan_iancu $
    main.c compiled on 11:49:51 Oct 15 2012 with gcc 4.4.6

    Regards,
    Dragos

     
    • status: open-accepted --> closed-fixed
     
  • OK, let's consider this fixed for the moment, since it does not crash anymore. If the problem pops up again, just open a new ticket.

    Regards,
    Bogdan