#200 nat_traversal TM callback registration

1.7.x
closed-fixed
saghul
modules (179)
5
2012-01-19
2011-12-21
No

With 1.7.0, and still with 1.7.1, we saw CPU utilization increase over time.

We also encountered a seg fault, which led us to look closely at the TM REQIN callback list:

Core was generated by `/usr/local/sbin/opensips -P /var/run/opensips.pid'.
Program terminated with signal 11, Segmentation fault.
#0 0x00002b67e330a640 in ?? ()
(gdb) bt
#0 0x00002b67e330a640 in ?? ()
#1 0x00002b67d7f0820f in run_reqin_callbacks (trans=0x2b67e7759520, req=0x8c3248,
code=<value optimized out>) at t_hooks.c:248
#2 0x00002b67d7ef6265 in build_cell (p_msg=0x8c3248) at h_table.c:289
#3 0x00002b67d7f0fe91 in new_t (p_msg=0x8c3248) at t_lookup.c:973
#4 t_newtran (p_msg=0x8c3248) at t_lookup.c:1081
#5 0x00002b67d7f02e4b in t_relay_to (p_msg=0x2b67e7759520, proxy=0x2,
flags=-1059770624) at t_funcs.c:199
#6 0x00002b67d7f14261 in w_t_relay (p_msg=0x8c3248, proxy=0x0, flags=0x0) at tm.c:1129
#7 0x0000000000410970 in do_action (a=0x7f77d8, msg=0x8c3248) at action.c:1280
#8 0x0000000000414906 in run_action_list (a=<value optimized out>, msg=0x8c3248)
...

We didn't conclusively find the cause of the seg fault, but we did find the source of the CPU increase.

It appears that the nat_traversal module registers for TM reqin callbacks once per message rather than once during mod_init().

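This results in the callback list becoming huge (1.7M elements!) over time, and since the whole list is walked for every new transaction, the CPU cost per request grows with it.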

We moved the registration into mod_init(), and it seems to perform as it should.
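To illustrate the difference between the two registration patterns, here is a minimal, self-contained sketch. It is a toy model only and does not use the real TM API: `cb_list`, `register_cb`, `run_cbs`, `simulate_per_message` and `simulate_init_once` are hypothetical names made up for this example. It assumes registration simply adds a node to a linked list and that every node is invoked for each incoming request.

```c
/* Toy model of a request-in callback list. NOT the real OpenSIPS TM API;
 * every name below is hypothetical and exists only for illustration. */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef void (*reqin_cb)(void *param);

struct cb_node {
    reqin_cb cb;
    struct cb_node *next;
};

struct cb_list {
    struct cb_node *head;
    size_t len;
};

/* Registration adds one node per call; nothing de-duplicates entries. */
static int register_cb(struct cb_list *l, reqin_cb cb)
{
    struct cb_node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->cb = cb;
    n->next = l->head;
    l->head = n;
    l->len++;
    return 0;
}

/* Invoke every registered callback; cost is linear in the list length. */
static size_t run_cbs(const struct cb_list *l, void *param)
{
    size_t calls = 0;
    for (const struct cb_node *n = l->head; n != NULL; n = n->next) {
        n->cb(param);
        calls++;
    }
    return calls;
}

static void free_cbs(struct cb_list *l)
{
    while (l->head != NULL) {
        struct cb_node *next = l->head->next;
        free(l->head);
        l->head = next;
    }
    l->len = 0;
}

static void on_request(void *param) { (void)param; }

/* Problem pattern: register again while handling each message.
 * Returns the number of callbacks invoked for the last message. */
size_t simulate_per_message(size_t messages)
{
    struct cb_list l = { NULL, 0 };
    size_t last = 0;
    for (size_t i = 0; i < messages; i++) {
        if (register_cb(&l, on_request) != 0)
            break;
        last = run_cbs(&l, NULL);
    }
    free_cbs(&l);
    return last;
}

/* Fixed pattern: register once up front (as one would in mod_init()).
 * Returns the number of callbacks invoked for the last message. */
size_t simulate_init_once(size_t messages)
{
    struct cb_list l = { NULL, 0 };
    size_t last = 0;
    if (register_cb(&l, on_request) != 0)
        return 0;
    for (size_t i = 0; i < messages; i++)
        last = run_cbs(&l, NULL);
    assert(l.len == 1); /* the list never grows past the single entry */
    free_cbs(&l);
    return last;
}
```

In this model, `simulate_per_message(1000)` returns 1000 (the last message triggers a thousand callback invocations), while `simulate_init_once(1000)` returns 1. The real module's behavior may differ in detail, but the growth pattern is the one we observed.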

I'm including the modified module/nat_traversal.c file from 1.7.1.

Let me know if you have further questions, or if our fix doesn't seem correct.

Thanks much,

Alan Erringer, alan@pinger.com

Discussion

  • Alan Erringer

    Alan Erringer - 2011-12-21

    module/nat_traversal.c file from 1.7.1, mod_init() fix

     
  • saghul

    saghul - 2011-12-22

    Hi Alan,

    Thanks for the report! You are right. There are indeed some issues with nat_traversal which I'm working on, and that is one of them.

    A bigger fix is needed in order to work better with the dialog module. I expect to have it ready by early January.

    Kind regards,

    --
    Saúl Ibarra Corretgé
    AG Projects

     
  • Bogdan-Andrei Iancu

    • assigned_to: nobody --> saghul
     
  • saghul

    saghul - 2012-01-19
    • status: open --> open-fixed
     
  • saghul

    saghul - 2012-01-19

    This is now fixed together with some other issues as of r8669 in trunk and r8674 in the 1.7 branch.

     
  • saghul

    saghul - 2012-01-19
    • status: open-fixed --> closed-fixed
     
