#639 Memory Leak RabbitMQ

Milestone: 1.10.x
Status: open-fixed
Labels: modules (454)
Priority: 5
Updated: 2013-05-10
Created: 2013-04-04
Creator: Digipigeon
Private: No

After upgrading from 1.8.2 to 1.9.x (latest), I am getting crashes of opensips. I have also confirmed the error on the trunk head. The log shows:

CRITICAL:core:qm_free: freeing already freed pointer, first free: rabbitmq_send.c: rmq_process(323) - aborting
CRITICAL:core:qm_free: freeing already freed pointer, first free: dlg_profile.c: destroy_linkers(610) - aborting

BT FULL

#0 0x00007fac6d858425 in raise () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#1 0x00007fac6d85bb8b in abort () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#2 0x0000000000503425 in qm_free (qm=<optimised out>, p=0x7fac0be20d80, file=<optimised out>, func=<optimised out>, line=<optimised out>) at mem/q_malloc.c:450
f = <optimised out>
size = <optimised out>
__FUNCTION__ = "qm_free"
#3 0x00007fac63049bb7 in rmq_process (rank=<optimised out>) at rabbitmq_send.c:323
__FUNCTION__ = "rmq_process"
#4 0x00000000004b585d in start_module_procs () at sr_module.c:585
m = 0x7fac6985a850
n = <optimised out>
l = <optimised out>
x = <optimised out>
__FUNCTION__ = "start_module_procs"
#5 0x0000000000414edc in main_loop () at main.c:818
i = <optimised out>
pid = <optimised out>
si = <optimised out>
startup_done = 0x0
chd_rank = 0
rc = <optimised out>
load_p = 0x0
#6 main (argc=<optimised out>, argv=<optimised out>) at main.c:1557
cfg_log_stderr = <optimised out>
cfg_stream = <optimised out>
c = <optimised out>
r = <optimised out>
tmp = 0x7fff847dcf81 ""
tmp_len = <optimised out>
port = <optimised out>
proto = <optimised out>
options = 0x5843d0 "f:cCm:M:b:l:n:N:rRvdDETSVhw:t:u:g:P:G:W:o:"
ret = -1
seed = 2612855874
rfd = -496847072
__FUNCTION__ = "main"

I believe that the problem is related to the rabbitmq module, as opensips does not appear to crash when I don't enable the module.
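
For readers unfamiliar with this kind of message: the CRITICAL line is emitted by qm_free (mem/q_malloc.c in the backtrace) when a pointer is freed a second time, and it names the place where the first free happened. The following is only a minimal sketch of that kind of check, not the actual q_malloc.c code. All names in it (dbg_hdr, dbg_malloc, dbg_free_at, dbg_free) are hypothetical, and it returns -1 where the log above shows an abort.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical bookkeeping header stored in front of each block. */
struct dbg_hdr {
    int freed;          /* non-zero once the block has been released */
    const char *file;   /* where the first free happened */
    const char *func;
    int line;
};

/* Allocate `size` bytes preceded by a bookkeeping header. */
void *dbg_malloc(size_t size)
{
    struct dbg_hdr *h = malloc(sizeof(*h) + size);

    if (h == NULL)
        return NULL;
    h->freed = 0;
    h->file = NULL;
    h->func = NULL;
    h->line = 0;
    return h + 1;       /* the caller sees the memory just past the header */
}

/*
 * Mark a block as freed. Returns 0 on success and -1 on a double free.
 * Assumption of this sketch: the memory is never handed back to the
 * system, so the header stays readable for the later check.
 */
int dbg_free_at(void *p, const char *file, const char *func, int line)
{
    struct dbg_hdr *h;

    if (p == NULL)
        return 0;
    h = (struct dbg_hdr *)p - 1;
    if (h->freed) {
        fprintf(stderr,
                "freeing already freed pointer, first free: %s: %s(%d)\n",
                h->file, h->func, h->line);
        return -1;      /* the log in this ticket shows an abort here */
    }
    h->freed = 1;
    h->file = file;     /* remember who freed the block first */
    h->func = func;
    h->line = line;
    return 0;
}

/* Convenience wrapper that records the call site automatically. */
#define dbg_free(p) dbg_free_at((p), __FILE__, __func__, __LINE__)
```

Calling dbg_free(p) twice on the same pointer returns -1 the second time and prints the location of the first call, which is the same kind of information as "first free: rabbitmq_send.c: rmq_process(323)" in the log above.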

Discussion

  • Razvan Crainea

    Razvan Crainea - 2013-04-04

    Hi!

    Does this happen at startup or later, at runtime? Do you see any errors before opensips displays the Critical warning? Also, can you confirm that the two Critical messages are from different runs? Finally, please confirm that after removing the rabbitmq module you are able to run your platform normally.

    Best regards,
    Răzvan

     
  • Razvan Crainea

    Razvan Crainea - 2013-04-04
    • assigned_to: nobody --> razvancrainea
     
  • Digipigeon

    Digipigeon - 2013-04-04

    Hi,

    The problem does not happen at start-up.
    I haven't noticed any other errors apart from what I have written.
    Those two messages were from the same run, they were output just before opensips crashed.
    At present I am 2 hours into running ver 1.9 (excluding rabbitmq), without any crashes or error messages. Previously the crash happened within the first 15 minutes, so I believe that it is stable, but I will update if this instance crashes.

    Regards Jonathan

     
  • Digipigeon

    Digipigeon - 2013-04-08

    I can confirm 4 full days of uptime without rabbitmq module enabled.
    No crashes; I am 99% sure that the rabbitmq module is the cause.

     
  • Razvan Crainea

    Razvan Crainea - 2013-04-09

    Hi, Jonathan!

    I have found a small bug in the rabbitmq module. I have attached a patch here; can you please apply it and see whether it fixes your bug?

    Best regards,
    Răzvan

     
  • Razvan Crainea

    Razvan Crainea - 2013-04-09
     
  • Digipigeon

    Digipigeon - 2013-04-09

    Hello,

    Unfortunately this did not solve the issue; it crashed again.

    Last informative log line:

    CRITICAL:core:qm_free: freeing already freed pointer, first free: rabbitmq_send.c: rmq_process(323) - aborting

    BT FULL:

    (gdb) bt full
    #0 0x00007f770cdc5425 in raise () from /lib/x86_64-linux-gnu/libc.so.6
    No symbol table info available.
    #1 0x00007f770cdc8b8b in abort () from /lib/x86_64-linux-gnu/libc.so.6
    No symbol table info available.
    #2 0x0000000000503425 in qm_free (qm=<optimised out>, p=0x7f76aaf1a990, file=<optimised out>, func=<optimised out>, line=<optimised out>) at mem/q_malloc.c:450
    f = <optimised out>
    size = <optimised out>
    __FUNCTION__ = "qm_free"
    #3 0x00007f77025b7597 in rmq_process (rank=<optimised out>) at rabbitmq_send.c:323
    __FUNCTION__ = "rmq_process"
    #4 0x00000000004b585d in start_module_procs () at sr_module.c:585
    m = 0x7f7708dc7850
    n = <optimised out>
    l = <optimised out>
    x = <optimised out>
    __FUNCTION__ = "start_module_procs"
    #5 0x0000000000414edc in main_loop () at main.c:818
    i = <optimised out>
    pid = <optimised out>
    si = <optimised out>
    startup_done = 0x0
    chd_rank = 0
    rc = <optimised out>
    load_p = 0x0
    #6 main (argc=<optimised out>, argv=<optimised out>) at main.c:1557
    cfg_log_stderr = <optimised out>
    cfg_stream = <optimised out>
    c = <optimised out>
    r = <optimised out>
    tmp = 0x7fff64ee8f81 ""
    tmp_len = <optimised out>
    port = <optimised out>
    proto = <optimised out>
    options = 0x5843d0 "f:cCm:M:b:l:n:N:rRvdDETSVhw:t:u:g:P:G:W:o:"
    ret = -1
    seed = 129474967
    rfd = -2118547680
    __FUNCTION__ = "main"

    Regards Jonathan

     
  • Anonymous

    Anonymous - 2013-04-15

    I'm having this exact same bug with the attached patch.

    The crash happens when I load the box up to 300 cps. RabbitMQ is called with raise_event in an event route. At low cps the crash doesn't seem to hit, or maybe I'm just not letting it run fast enough.

     

    Last edit: Anonymous 2014-07-30
  • Razvan Crainea

    Razvan Crainea - 2013-04-16

    Hi, Jonathan!

    Thanks to Brett, I was able to see how this bug behaves in a real environment.
    I have attached a new patch that was taken from a fresh 1.9 checkout. It includes both some new changes to the event_rabbitmq module and the previous ones. Please apply this one and let me know if it works.

    Best regards,
    Răzvan

     
  • Razvan Crainea

    Razvan Crainea - 2013-04-16

    Includes previous patch

     
  • Razvan Crainea

    Razvan Crainea - 2013-04-19
    • status: open --> open-fixed
     
  • Razvan Crainea

    Razvan Crainea - 2013-04-19

    Hi, Jonathan!

    Have you managed to test this? I will mark this as fixed and come back to it if you encounter any problems.

    Best regards,
    Răzvan

     
  • Digipigeon

    Digipigeon - 2013-04-23

    Sorry for the delay, another bug caused my testing of this one to be delayed.

    I have just had OpenSIPS running for 1 hour with rabbitmq successfully relaying CDR information and NOT crashing. Nice work :)

    I will update if anything changes

    Regards Jonathan

     
