Learn how easy it is to sync an existing GitHub or Google Code repo to a SourceForge project! See Demo

Close

#145 Messages not Delivered

closed-fixed
nobody
None
5
2011-01-18
2011-01-12
Rob Winch
No

Environment

* Pidgin 2.7.9 with Sipe from master today (I have seen similar issues since the early days of SIPE)
* Ubuntu 10.10 32bit
* Friend is using Communicator 2007 client

I am not sure I fully understand this issue, but I am going to describe it as well as I do understand it. Below is one scenario that I am able to reproduce consistently.

1) I start pidgin and login to SIPE
2) Dustin Singleton starts communicator and logs in
3) Dustin IMs me - this succeeds
4) I message Dustin - this succeeds
5) We both leave the conversation open
6) Dustin unplugs his network cable (he has wireless disabled) and then plugs it back in. NOTE: I never see that he went offline
7) After Dustin signs in he IMs me - this succeeds
8) I try to respond and get an error stating that the message cannot be delivered
9) I restart Pidgin and IM him - this succeeds
10) He IMs me - this succeeds

I have attached two different logs one for 1-8 and the second set is for the remaining logs.

I have seen other similar issues, but the trend seems to be:

* someone gets disconnected/reconnected (either me or the friend). Typically this will happen when moving from a desk to a meeting room.
* the conversation window is left open on at least one end
* I have seen these similar issues with me using SIPE and the friend being on the Communicator 2007 client (both windows and mac), SIPE on Ubuntu

Problems seem to be:

* Cannot IM the other user and get a message indicating that I cannot message them
* Messages from the friend that are sent to me don't make it to me and no error is displayed to the friend
* If the friend disconnects I do not see it change in my window. This may just be because we don't wait long enough though.

To solve it:

* Sometimes I cannot send a message (and I get an error stating it cannot be delievered), the friend is indicated as online, and the only way to fix it is if they restart
* Sometimes I cannot send a message (and I get an error stating it cannot be delievered), the friend is indicated as online, and I restart and things work

I realize this bug is pretty vague, but this is about all the information I have been able to gather and it wasn't until recently that I was able to reproduce anything so that I could gather logs. I will be happy to provide any additional information that I am able to.

PS: Thank you for all the excellent work with SIPE :)

Discussion

  • Rob Winch
    Rob Winch
    2011-01-12

    Two files containing the pidgin debug logs for steps 1-8 and 9-10 of the bug description.

     
    Attachments
  • Stefan Becker
    Stefan Becker
    2011-01-12

    - step 7) we get an INVITE with a record route provided by OCS:

    (10:52:18) sipe:
    MESSAGE START <<<<<<<<<< SIP - 2011-01-12T16:52:18.437005Z
    INVITE sip:10.160.21.152:53193;transport=tls;ms-opaque=d3470f2e1d;ms-received-cid=149D6F00 SIP/2.0
    Record-Route: <sip:ocs.abcdef.net:5061;transport=tls;ms-fe=grdocsfe00.abcdef.net;ms-role-rs-from;lr;ms-route-sig=dzqi2d4fNHmN1fJH6RnOiv6AFGyrXfGk_rmGiZ4wAA>;ms-rrsig=dzNCA9i8o_NVN1fGi5oyBUEPgYfeXfGk_rmGiZ4wAA;tag=9A7CA2437657DD6973AE644551224485
    Record-Route: <sip:ocs.abcdef.net:5061;transport=tls;ms-fe=grdocsfe01.abcdef.net;ms-role-rs-to;lr>;tag=8C86D3A25FD33C6D2EEF513034163E70
    Call-ID: 7d7182bf6f0d48abaca2db00c418bbeb

    - SIPE recognizes this as an exisiting conversation:

    (10:52:18) sipe: parsing address out of "Singleton,Dustin"<sip:Dustin.Singleton@abcdef.com>;tag=923a6e52c0;epid=02ed1891d3
    (10:52:18) sipe: got sip:Dustin.Singleton@abcdef.com
    (10:52:18) sipe: sipe_dialog_find who='sip:Dustin.Singleton@abcdef.com'
    (10:52:18) sipe: process_incoming_invite, session already has dialog!

    - SIPE parses the record route information:

    (10:52:18) sipe: sipe_dialog_parse_routes: route sip:ocs.abcdef.net:5061;transport=tls;ms-fe=grdocsfe00.abcdef.net;ms-role-rs-from;lr;ms-route-sig=dzqi2d4fNHmN1fJH6RnOiv6AFGyrXfGk_rmGiZ4wAA
    (10:52:18) sipe: sipe_dialog_parse_routes: route sip:ocs.abcdef.net:5061;transport=tls;ms-fe=grdocsfe01.abcdef.net;ms-role-rs-to;lr

    - SIPE sends our OK to the INVITE by using a stripped-down version of the original INVITE message. This includes the record route and is accepted by OCS:

    (10:52:18) sipe:
    MESSAGE START >>>>>>>>>> SIP - 2011-01-12T16:52:18.440717Z
    SIP/2.0 200 OK
    Record-Route: <sip:ocs.abcdef.net:5061;transport=tls;ms-fe=grdocsfe00.abcdef.net;ms-role-rs-from;lr;ms-route-sig=dzqi2d4fNHmN1fJH6RnOiv6AFGyrXfGk_rmGiZ4wAA>;ms-rrsig=dzNCA9i8o_NVN1fGi5oyBUEPgYfeXfGk_rmGiZ4wAA;tag=9A7CA2437657DD6973AE644551224485
    Record-Route: <sip:ocs.abcdef.net:5061;transport=tls;ms-fe=grdocsfe01.abcdef.net;ms-role-rs-to;lr>;tag=8C86D3A25FD33C6D2EEF513034163E70
    Call-ID: 7d7182bf6f0d48abaca2db00c418bbeb

    - step 8) SIPE sends MESSAGE using the parsed record route (NOTE the DIFFERENT Call-ID, this one is from the INVITE in step 3!):

    (10:52:24) sipe:
    MESSAGE START >>>>>>>>>> SIP - 2011-01-12T16:52:24.835293Z
    MESSAGE sip:Dustin.Singleton@abcdef.com;opaque=user:epid:W20KpJcExlyuFPb9FjNwvgAA;gruu SIP/2.0
    Route: <sip:ocs.abcdef.net:5061;transport=tls;ms-fe=grdocsfe00.abcdef.net;ms-role-rs-from;lr;ms-route-sig=dzqi2d4fNHmN1fJH6RnOiv6AFGyrXfGk_rmGiZ4wAA>
    Call-ID: be1fd4af83fd420785be57c3d4dc34c8
    Route: <sip:ocs.abcdef.net:5061;transport=tls;ms-fe=grdocsfe01.abcdef.net;ms-role-rs-to;lr>

    - OCS rejects this message with "malformed route header":

    (10:52:24) sipe:
    MESSAGE START <<<<<<<<<< SIP - 2011-01-12T16:52:24.857000Z
    SIP/2.0 400 Malformed route header
    Authentication-Info: NTLM rspauth="01000000A097D1605579FCBE64000000", srand="91735097", snum="22", opaque="D969687D", qop="auth", targetname="grdocsfe00.abcdef.net", realm="SIP Communications Service"
    From: <sip:Rob.Winch@abcdef.com>;tag=1039825375;epid=f847053fb6a1
    To: <sip:Dustin.Singleton@abcdef.com>;tag=809b4b5452;epid=02ed1891d3
    Call-ID: be1fd4af83fd420785be57c3d4dc34c8
    CSeq: 2 MESSAGE
    Via: SIP/2.0/tls 10.184.60.40:53193;received=10.160.21.152;ms-received-port=53193;ms-received-cid=149D6F00
    ms-diagnostics: 2;reason="See response code and reason phrase";source="grdocsfe00.abcdef.net";HRESULT="C3E93C2E(SIPPROXY_E_ROUTE_SIGNATURE_INVALID)"
    (10:52:24) sipe: process_input_message: msg->response(400),msg->method(MESSAGE)
    (10:52:24) sipe: process_input_message: we have a transaction callback
    (10:52:24) sipe: parsing address out of <sip:Dustin.Singleton@abcdef.com>;tag=809b4b5452;epid=02ed1891d3
    (10:52:24) sipe: got sip:Dustin.Singleton@abcdef.com
    (10:52:24) sipe: sipe_dialog_find who='sip:Dustin.Singleton@abcdef.com'
    (10:52:24) sipe: process_message_response: MESSAGE response >= 400

    IMHO the problem is that we don't update the Call-ID of the existing IM session when we accept the new INVITE. So our MESSAGE is send out with updated Route: but old Call-ID:, which OCS probably no longer recognizes or considers invalid.

     
  • Stefan Becker
    Stefan Becker
    2011-01-12

    Please try if commit e61ef48 fixes your problem.

     
  • Rob Winch
    Rob Winch
    2011-01-13

    Logs after using commit e61ef48

     
  • Rob Winch
    Rob Winch
    2011-01-13

    This did not seem to fix the issue. I have attached a new set of logs with a suffix of the commit. Please note that I did slightly change the order of what was done in that I was the first person to IM this time. I apologize for this, but did not realise it until after I had done it and do not wanted to minimise bothering Dustin. Besides which this would still be a problem even though it is a different flow.

    Again thanks for the excellent support with SIPE and please let me know if there is any more information I can get for you.

     
  • Stefan Becker
    Stefan Becker
    2011-01-13

    The error message is still the same: source="grdocsfe01.abcdef.net";HRESULT="C3E93C2E(SIPPROXY_E_ROUTE_SIGNATURE_INVALID)"

    - In the INVITE we get:
    Record-Route: <sip:ocs.abcdef.net:5061;transport=tls;ms-fe=grdocsfe01.abcdef.net;ms-role-rs-to;ms-role-rs-from;lr;ms-route-sig=aaCwvpUp69vwg1LIV8NyBH1OoyEu8NP_2okyxNLQAA>;tag=8C86D3A25FD33C6D2EEF513034163E70

    - SIPE parses that info:
    (08:01:22) sipe: sipe_dialog_parse_routes: route sip:ocs.abcdef.net:5061;transport=tls;ms-fe=grdocsfe01.abcdef.net;ms-role-rs-to;ms-role-rs-from;lr;ms-route-sig=aaCwvpUp69vwg1LIV8NyBH1OoyEu8NP_2okyxNLQAA

    - SIPE sends that info in the outgoing messages:
    Route: <sip:ocs.abcdef.net:5061;transport=tls;ms-fe=grdocsfe01.abcdef.net;ms-role-rs-to;ms-role-rs-from;lr;ms-route-sig=aaCwvpUp69vwg1LIV8NyBH1OoyEu8NP_2okyxNLQAA>

    but OCS rejects it with the above error message.

    - Educated guess: ms-route-sig= might need to be recalculated, but I couldn't find any information in [MS-SIPRE] how to do that.
    - The other possibility might be that SIPE needs to add "Supported: ms-dialog-route-set-update" and process 430 responses (those would then be sent by OCS if I understand [MS-SIPRE 4.6] correctly)

    As the documentation doesn't tell anything you'll have to run M$ Communicator on your end too and provide the Communicator-UCCP-Api.log, i.e. the one with the SIP messages, generated from step 1-8. (see FAQ on the wiki: https://sourceforge.net/apps/mediawiki/sipe/index.php?title=FAQ\)

     
  • Stefan Becker
    Stefan Becker
    2011-01-14

    Please try if commit b9909a7 fixes your problem.

    I changed to sipe_dialog_parse_route() to no longer parse out only parts of Record-Route: headers. The logs you provided showed that only routes with an additional parameter (;tag=...) seem to cause the problem. This information got lost in the original implementation.

     
  • Rob Winch
    Rob Winch
    2011-01-14

    The second commit did not appear to fix the issue. I have reattached logs, this time reproduced with Tim instead of Dustin. I will probably be able to get the logs from Communicator next week. It has been a bit tricky finding two people available at the same time to do this for me (I only have Linux).

     
  • Rob Winch
    Rob Winch
    2011-01-14

    Logs after using commit b9909a7

     
  • Rob Winch
    Rob Winch
    2011-01-14

    I probably should have clarified that since I only have Linux I cannot use Communicator to get the logs

     
  • Stefan Becker
    Stefan Becker
    2011-01-15

    It is still the same error message. With the latest code we clone the route as-is:

    incoming (from OCS):
    Record-Route: <sip:ocs.abcdef.net:5061;transport=tls;ms-fe=grdocsfe01.abcdef.net;ms-role-rs-from;lr;ms-route-sig=abQmxjVN-sjrVwFPZ1wbpJGUbXgvQnad0AkyxNLQAA>;ms-rrsig=ab0aEDdsd2boGadz8I734j05ICGLQnad0AkyxNLQAA;tag=8C86D3A25FD33C6D2EEF513034163E70
    Record-Route: <sip:ocs.abcdef.net:5061;transport=tls;ms-fe=grdocsfe00.abcdef.net;ms-role-rs-to;lr>;tag=9A7CA2437657DD6973AE644551224485

    outgoing (from SIPE):
    Route: <sip:ocs.abcdef.net:5061;transport=tls;ms-fe=grdocsfe01.abcdef.net;ms-role-rs-from;lr;ms-route-sig=abQmxjVN-sjrVwFPZ1wbpJGUbXgvQnad0AkyxNLQAA>;ms-rrsig=ab0aEDdsd2boGadz8I734j05ICGLQnad0AkyxNLQAA;tag=8C86D3A25FD33C6D2EEF513034163E70
    Route: <sip:ocs.abcdef.net:5061;transport=tls;ms-fe=grdocsfe00.abcdef.net;ms-role-rs-to;lr>;tag=9A7CA2437657DD6973AE644551224485

    the first proxy still complains:
    ms-diagnostics: 2;reason="See response code and reason phrase";source="grdocsfe01.abcdef.net";HRESULT="C3E93C2E(SIPPROXY_E_ROUTE_SIGNATURE_INVALID)"

    It will be interesting to see the M$ Communicator logs. I wouldn't be surprised to see the same error, as the proxy seems to be confused, but then some undocumented error recovery from the Communicator side.

     
  • Rob Winch
    Rob Winch
    2011-01-17

    MS Communicator Client logs performing steps 1-8

     
  • Rob Winch
    Rob Winch
    2011-01-17

    I was able to get two people to produce the logs for me. Dustin is the user that disconnects his network connection and Micah is the user that would normally fail with sipe. They did not perform steps 9-10 since the response succeeds immediately. Dustin's logs are prefixed with Dustin and Micah's are prefixed with Micah. Please let me know if there is anything else I can do to help with this issue. Once again thanks again for your excellent support.

    PS: Its no big deal since it is an easy google search, but you might consider adding the link on How to enable logs for Communicator 2007 (http://support.microsoft.com/kb/2379086) to the FAQ. This might help since you need to have certain permissions to enable logging. I would have added it myself, but it doesn't appear I am allowed to edit the page (and understandably so) despite being logged in.

     
  • Stefan Becker
    Stefan Becker
    2011-01-18

    Thanks for the logs again. I've identified at least one thing that is different and I'm going to try to put that into code.

    I've also just been able to reproduce the problem with our OCS and a colleagues M$ Communicator. So I'll be able to verify the change faster.

    I've updated the wiki with the KB article URL. The article doesn't say anything about special rights, though, and I'm not aware that you need special rights to change the settings of the Office Communicator client. But maybe that is once again something that differs between the Windows (security) environments.

     
  • Rob Winch
    Rob Winch
    2011-01-18

    This is good news. Thank you for your prompt and excellent support with this issue.

    According to the article... "To enable tracing in Windows Vista, the user must be added to the Performance Log Users group. After the users are added, the users must log off and then log back on for the setting to take effect." I believe you are correct that the behaviour differs between Windows (security) environments. I was required to add the user to that group when I was trying to simulate the issue with a VM using Windows 7. However, I'm pretty sure that the people that ran the simulation did not need to do this when using Windows 7. Perhaps they were already added to that group. Regardless adding a link to the article probably doesn't hurt anything.

    Once again thank you for all your hard work :)

     
  • Stefan Becker
    Stefan Becker
    2011-01-18

    Please try if commit f901a41 fixes your problem.

    I have been able to reproduce the situation once with the new code and now it worked OK. I believe your problem should be fixed.

     
  • Rob Winch
    Rob Winch
    2011-01-18

    Fixed! Thanks again :)

    I am having another issue related to losing a connection. Would you prefer to continue in this bug or log another? The issue is if someone disconnects they still appear online (I believe this is normal for a short period). If I IM them, I do not get any indication the message couldn't be delivered. I would expect that it give an indication and then show the buddy as signed off.

     
  • Stefan Becker
    Stefan Becker
    2011-01-18

    • status: open --> closed-fixed
     
  • Stefan Becker
    Stefan Becker
    2011-01-18

    Please open another error once you can provide a test description and a log showing that SIPE does something incorrectly.

    PLEASE REMEMBER: the OCS backend doesn't always know that a client is offline, e.g. in your case when you yank the network cable. In that case SIPE sends a message to OCS and only if OCS timeouts then SIPE will receive an error back. The same is true for the presence update: only once OCS determines that a user is offline, then it will provide SIPE with a presence update for the user.

    Closing this bug.