#259 deadlock on pua presentity lock

Milestone: 1.6.x
Status: closed-invalid
Labels: modules (454)
Priority: 1
Updated: 2010-03-02
Created: 2010-02-09
Private: No

Hi,

when sending the PUBLISH messages to self, using:
modparam("pua_dialoginfo", "presence_server", "sip:a.b.c.d:5060") # send PUBLISH messages to self
I get a deadlock of the entire opensips daemon.

If I send the messages to somewhere else, the deadlock does not occur.

If I release the locks as per the attached patch, the deadlock doesn't occur either.
But according to the comments in the code, the lock is not meant to be freed just yet. So I'm not sure what my patch breaks instead. (I'm not even sure I'm supposed to publish to self, but it seems like it, as I want to generate notifies based on the state changes found in the publish.)

I'm running 1.6-svn (r6568)

Regards,
Walter Doekes
OSSO B.V.

Discussion

  • Walter Doekes

    Walter Doekes - 2010-02-09

    Patch to work around deadlock

     
  • Anca Vamanu

    Anca Vamanu - 2010-02-09

    Hi Walter,

    That is not the right fix, because that lock must actually be kept until the reply is received. I want to reproduce this myself to be able to investigate. I don't understand exactly in which case you get the deadlock. Do you subscribe to the dialog event for yourself? Is that it?

    Regards,
    Anca

     
  • Anca Vamanu

    Anca Vamanu - 2010-02-09
    • assigned_to: nobody --> anca_vamanu
    • status: open --> open-accepted
     
  • Walter Doekes

    Walter Doekes - 2010-02-12

    Hi Anca,

    Thanks for your quick reply.

    I've tried to reduce/declutter my config file to make the problem easily reproducible, but during this reduction the problem went away. I can hardly give you the full config file as it's a bit of a complex hack, and incomplete at that ;)

    Yes, what I want to do is implement BLF on OpenSIPS. To do this, I do basically this:

    route {
        if (uri == myself) {
            if ($si == "e.f.g.h") { # my IP
                if (method == "PUBLISH") {
                    handle_publish("sip:myself@myself.myself");
                    exit;
                }
            }
            if (method == "SUBSCRIBE") {
                handle_subscribe();
                exit;
            }
            if (method == "INVITE") {
                dialoginfo_set();
                record_route();
            }
            if ($si == "a.b.c.d") {
                $var(local) = "sip:" + $(hdr(X-PIDstAccount)[-1]) + "@anydomain";
                lookup("opensips_location", "", "$var(local)");
            } else {
                $rd = "a.b.c.d";
            }
            t_relay();
        }
    }

    (Add a bit of nat handling, registration handling and transaction handling.)

    Modules loaded are:

    loadmodule "xlog.so" # logging (xlog)
    loadmodule "sl.so" # stateless functions (sl_*)
    loadmodule "tm.so" # t_*: transactions in memory
    loadmodule "signaling.so" # send reply according to state

    loadmodule "rr.so" # record route
    loadmodule "nathelper.so" # fix_nated_*()

    loadmodule "textops.so" # append_hf
    loadmodule "uri.so" # has_totag

    loadmodule "db_mysql.so" # mysql
    loadmodule "auth.so" # digest auth
    loadmodule "auth_db.so" # digest auth
    loadmodule "group.so" # groups (for acls)
    loadmodule "permissions.so" # permissions (provide acls together with groups)

    loadmodule "usrloc.so" # user location

    loadmodule "registrar.so" # lookup/save/registered

    loadmodule "dialog.so" # ...
    loadmodule "presence.so" # handle SUBSCRIBE events
    loadmodule "presence_dialoginfo.so" # handle SUBSCRIBE events for dialoginfo
    loadmodule "presence_xml.so" # handle SUBSCRIBE events for dialoginfo
    loadmodule "pua.so" # ...
    loadmodule "pua_dialoginfo.so" # ...

    ...
    modparam("pua_dialoginfo", "presence_server", "sip:e.f.g.h:5060") # send PUBLISH messages to self

    Now, it's quite possible that I'm doing things wrong. My Grandstream test phone has not responded at all to the OpenSIPS NOTIFYs sent by handle_publish().

    I'll try re-adding and reorganising my config file to get back the complete behaviour I want (with or without the deadlock). In the meantime you can consider this report INVALID/WORKSFORME and I can file a new one if the problem reappears.

    Regards,
    Walter

     
  • Walter Doekes

    Walter Doekes - 2010-02-24

    Furthermore, I can add that I used children=1.

    (Having children as 1 also led me to believe that setting $var()s in startup_route could be used as process-wide constants, which they cannot. This should clear up some more of the odd issues I was having. How would you feel about a process_startup_route? I cannot believe I'm the only one who likes constants at the top of the file.)

     
  • Walter Doekes

    Walter Doekes - 2010-02-24
    • priority: 5 --> 1
     
  • Anca Vamanu

    Anca Vamanu - 2010-02-24
    • status: open-accepted --> closed-invalid
     
  • Anca Vamanu

    Anca Vamanu - 2010-02-24

    Hi Walter,

    The reason you get a deadlock is exactly that you use only one child. The process can block while trying to send a new PUBLISH request, and then there is no other process to handle the reply to the previous one and release the lock. So it is compulsory to have more than one child when using the pua module.

    Regards,
    Anca
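
    As a minimal sketch of what this advice could look like in the config discussed in this thread: the fragment below raises the global children parameter above 1 so that a second worker process is available to handle the reply and release the lock. The value 4 is an assumption chosen for illustration, not a recommendation taken from this ticket; the presence_server address is the placeholder used earlier in the thread.

    ```
    # Global parameter: number of worker processes to fork.
    # With children=1 the only worker can block while sending a PUBLISH to
    # itself, leaving nobody to process the reply that releases the pua lock.
    # The value 4 is an illustrative assumption; anything above 1 avoids the
    # single-process case described above.
    children=4

    loadmodule "pua.so"
    loadmodule "pua_dialoginfo.so"

    # Placeholder address from this thread: PUBLISH messages are sent to self.
    modparam("pua_dialoginfo", "presence_server", "sip:e.f.g.h:5060")
    ```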

     
  • Walter Doekes

    Walter Doekes - 2010-03-01

    Okay, thank you for that information.

    Isn't this something that should be documented? Or is it already and have I simply missed it?

    Regards,
    Walter

     
  • Walter Doekes

    Walter Doekes - 2010-03-01
    • status: closed-invalid --> open-invalid
     
  • Anca Vamanu

    Anca Vamanu - 2010-03-02
    • status: open-invalid --> closed-invalid
     
  • Anca Vamanu

    Anca Vamanu - 2010-03-02

    Thank you, Walter. I updated the documentation.

    Regards,
    Anca

     