#640 [DROUTING] Crash on use_next_gw/get_gw_by_id

Milestone: trunk
Status: closed-fixed
Labels: modules (454)
Priority: 5
Updated: 2013-05-27
Created: 2013-04-04
Creator: Ronald Cepres
Private: No

Drouting crashes when selecting the next gateway. I did a little investigation and, FWIW, the next gateway's carrier is disabled, but that carrier's only gateway is enabled. The backtrace of the core dump shows that the crash happened while comparing two strings in get_gw_by_id, called from use_next_gw. The strings being compared were apparently gateway ID strings.
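
For readers unfamiliar with this kind of lookup, here is a minimal sketch of a gateway search by ID over length-delimited strings. It is an illustration only, not the drouting source and not the fix: the types str_sketch and gw_sketch and the function find_gw_by_id are hypothetical names, and the linked-list layout is an assumption. The point it shows is that a pointer which is NULL or stale makes the comparison itself the crash site, so a defensive version checks pointers and lengths before comparing.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical length-delimited string; s need not be NUL-terminated. */
typedef struct {
    const char *s;
    int len;
} str_sketch;

/* Hypothetical gateway node in a singly linked list (assumed layout). */
typedef struct gw_sketch {
    str_sketch id;
    struct gw_sketch *next;
} gw_sketch;

/* Return the first gateway whose ID equals *id, or NULL if none matches.
 * Every pointer is checked before it is dereferenced or compared. */
static const gw_sketch *find_gw_by_id(const gw_sketch *head, const str_sketch *id)
{
    if (id == NULL || id->s == NULL || id->len <= 0)
        return NULL;

    for (const gw_sketch *gw = head; gw != NULL; gw = gw->next) {
        /* Skip entries with no ID buffer or a different length. */
        if (gw->id.s == NULL || gw->id.len != id->len)
            continue;
        /* Compare exactly len bytes; no NUL terminator is required. */
        if (memcmp(gw->id.s, id->s, (size_t)id->len) == 0)
            return gw;
    }
    return NULL;
}
```

These checks only guard against NULL pointers and bad lengths. A dangling pointer to freed or overwritten memory cannot be detected this way, so they would not by themselves rule out a crash like the one reported here.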

I attached the GDB btfull logs (with some sensitive info replaced by dummy text) for reference. Note that the 'don't optimize' flag was not set, so some of the values were optimized out. The crash happened randomly, so I can't actually reproduce it.

I'm using OpenSIPS 1.9, built from this source tarball: http://opensips.org/pub/opensips/latest/src/opensips-1.9.0_src.tar.gz

Discussion

  • Ronald Cepres
    2013-04-04

    GDB btfull logs (replaced some sensitive info with dummy text)

     
  • Hi Ronald,

    Is there a way to reproduce this bug (for example, starting from a certain set of gateways/carriers)? Your details on the setup are not really clear to me.

    Do you still have the core file? Could I ask you to extract more info from it?

    Regards,
    Bogdan

     
    • assigned_to: nobody --> bogdan_iancu
     
  • Ronald Cepres
    2013-04-12

    Hi Bogdan,

    Thanks for the reply.

    For the destinations during that call, we have 6 carriers set up: #C1, #C2, #C3, #C4, #C5, #C6.
    The crash happened when selecting the next gateway, moving from C1 to C2.
    C1 has gateways C1G1, C1G2, C1G3, C1G4, C1G5, C1G6, where C1G3 and C1G4 are disabled via MI. C2 has only one gateway, C2G1.

    The call failed for C1G1, C1G2, C1G6, C1G5 (in that order, weighted). The crash happened during the transition from C1G5 to C2G1.

    I still have the core file. What specific info would you like to extract?

     
  • Hi Ronald,

    So, when calling use_next_gw, C2G1 should be next in line.

    In frame 1, please print *id, gw, *gw and gw->id.
    Also, in the script (if you can reproduce it), do a print_avp(); before the use_next_gw().

    Thanks and regards,
    Bogdan

     
  • Ronald Cepres
    2013-05-16

    Hi Bogdan,

    I added a log line before the string comparison, like this:

    LM_CRIT("RONALD: id->s=%s | gw->id.s=%s | id->len=%d | gw->id.len=%d\n", id->s, gw->id.s, id->len, gw->id.len);
    if ( (id->len == gw->id.len) && strncmp(id->s, gw->id.s, id->len)==0 ) {
        return gw;
    }

    I think OpenSIPS now crashes while trying to print the log line I added, probably because it accesses an out-of-bounds memory address when printing the values.
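
    One thing worth ruling out with a log line like the one above: OpenSIPS str values carry a pointer plus a length and are not guaranteed to be NUL-terminated, so a plain "%s" may read past the end of the buffer. The sketch below shows a length-bounded alternative using "%.*s". It is a standalone illustration under stated assumptions: the type lstr and the helpers safe_len, safe_ptr and format_ids are hypothetical names, and snprintf stands in for the logging macro. It does not protect against a pointer that is non-NULL but invalid.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical length-delimited string; s need not be NUL-terminated. */
typedef struct {
    const char *s;
    int len;
} lstr;

/* A negative "%.*s" precision means "no limit", so clamp it to zero. */
static int safe_len(const lstr *v)
{
    return (v != NULL && v->s != NULL && v->len > 0) ? v->len : 0;
}

/* Never hand a NULL pointer to a string conversion. */
static const char *safe_ptr(const lstr *v)
{
    return (v != NULL && v->s != NULL) ? v->s : "";
}

/* Format both IDs, reading at most len bytes from each buffer.
 * Returns the snprintf result (length of the full formatted text). */
static int format_ids(char *out, size_t outsz, const lstr *id, const lstr *gw_id)
{
    assert(out != NULL && outsz > 0);
    return snprintf(out, outsz,
                    "id->s=%.*s | gw->id.s=%.*s | id->len=%d | gw->id.len=%d",
                    safe_len(id), safe_ptr(id),
                    safe_len(gw_id), safe_ptr(gw_id),
                    id != NULL ? id->len : -1,
                    gw_id != NULL ? gw_id->len : -1);
}
```

    If the bounded version still crashes, the pointer or length itself is most likely garbage, which would point at the gateway data rather than at the comparison.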

    I attached the log output of the process that crashed. Again, I modified the log only to remove sensitive info. If you want the raw logs and the new GDB output, I can send them to you privately.

    Thanks.

     
    Last edit: Ronald Cepres 2013-05-16
    Attachments
  • Please test the attached patch.

    Thanks and regards,
    Bogdan

     
    Attachments
    • status: open --> open-fixed
    • Group: 1.10.x --> 1.4.x
     
    • status: open-fixed --> closed-fixed
    • Group: 1.4.x --> trunk
     
  • The fix was pushed to the Git and SVN repos, so it is now part of the official code (see Git commit d329bc).

    Regards,
    Bogdan