
#953 recv_timeout not triggered when periodic signal interrupts select() on Linux

v1.0 (example)
closed-fixed
None
5
2013-12-02
2013-11-28
peclik
No

From time to time, sending a SOAP request hangs indefinitely even though recv_timeout is set to a non-zero value.

Observed on:

  • Linux client, connection over Internet to other host
  • gSoap 2.8.9
  • tcp_timeout set to 15 seconds
  • application has installed a periodic signal with a 5-second period

Cause:

  • gSoap's frecv() calls tcp_select() with the proper timeout
  • tcp_select() is interrupted by the signal before the timeout expires, which results in an EINTR error code
  • frecv() calls tcp_select() again with the full timeout, in an endless loop

Proposed solutions:

  • On Linux, select() updates the timeout structure to hold the time not slept, i.e. the time remaining (this is Linux-specific, see man select). The next select() call should use this remaining time as its timeout, not the full recv_timeout.
  • To solve the problem on other platforms, gSoap should itself decrease recv_timeout between select() calls.
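
A rough sketch of the first, Linux-only idea, assuming a plain select()-based wait (wait_readable_linux() and demo_linux_wait() are hypothetical names, not gSoap functions). The timeval is initialised once and not reset after EINTR, so on Linux each retry waits only for the time that is left. As far as I can tell, the Linux select(2) man page also says the timeout becomes undefined after an error return, so this is an illustration of the idea rather than something to rely on.

```c
#include <errno.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Linux-only sketch: let select() shrink the timeval across EINTR retries.
 * Returns >0 if fd is readable, 0 on timeout, -1 on an error other than EINTR.
 */
static int wait_readable_linux(int fd, long timeout_ms)
{
    struct timeval tv;

    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    for (;;) {
        fd_set rfds;
        int r;

        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        /* tv is deliberately NOT reset here: on Linux it holds the time left. */
        r = select(fd + 1, &rfds, NULL, NULL, &tv);
        if (r >= 0 || errno != EINTR)
            return r;
    }
}

/* Usage sketch on a pipe: idle pipe times out, pipe with data is readable. */
static int demo_linux_wait(int make_readable)
{
    int p[2], r;

    if (pipe(p) != 0)
        return -1;
    if (make_readable && write(p[1], "x", 1) != 1)
        r = -1;
    else
        r = wait_readable_linux(p[0], 100);
    close(p[0]);
    close(p[1]);
    return r;
}
```

On platforms where select() leaves the timeval untouched, this loop behaves exactly like the endless loop from the Cause section, which is why the second bullet is needed for portability.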

The problem can probably arise in other places in the code where recv_timeout and send_timeout are used.

Discussion

  • peclik

    peclik - 2013-11-28

    Notes:

    • The behavior is occasional, probably because it depends on an occasional connection problem that makes the wait last longer than the signal period.
    • The problematic select is on line 4425 of stdsoap2.cpp.
    • 'tcp_timeout set to 15 seconds' in the description above should be 'recv_timeout set to 15 seconds'
     

    Last edit: peclik 2013-11-28
  • Robert van Engelen

    • status: open --> open-accepted
    • assigned_to: Robert van Engelen
     
  • Robert van Engelen

    Hm, reusing the timeout value after select() is not portable.

    BSD select():
         If timeout is a non-nil pointer, it specifies a maximum interval to wait
         for the selection to complete.  If timeout is a nil pointer, the select
         blocks indefinitely.  To effect a poll, the timeout argument should be
         non-nil, pointing to a zero-valued timeval structure.  Timeout is not
         changed by select(), and may be reused on subsequent calls, however it is
         good style to re-initialize it before each invocation of select().
    

    The 2.8.17 change is:

    /* Max number of EINTR while poll/select on a socket */
    /* Each EINTR can lengthen the I/O blocking time by at most one second */
    #define SOAP_MAXEINTR (10)
    

    Therefore, the tcp_select() function in gsoap 2.8.17 will be changed to allow EINTR only a limited number of times per poll/select before a fatal EINTR error is raised. This can lengthen the poll/select blocking time only by a limited duration, which will be 10 seconds by default.

     
  • Robert van Engelen

    • status: open-accepted --> closed-fixed
     
  • peclik

    peclik - 2013-12-02

    I'm afraid the MAXEINTR solution is not robust enough. Consider EINTR occurring every 20 ms: then it will not be possible to send even one SOAP request with 10 attempts. On the other hand, EINTR can occur with different periods in different parts of an application (say 20 ms in one case and 1000 ms in another), so it's not possible to set SOAP_MAXEINTR to a meaningful value. Also, the user may not have control over EINTR because of 3rd-party libraries, and it's not always feasible to disable EINTR.

    Also, I think this new behavior can break existing applications: until now the call waited a very long time, but from 2.8.17 on it can time out prematurely because of EINTR.

    I think the only robust and portable solution is to account for the timeout in tcp_select() itself, i.e. store the current time before calling select(), call select(), and then subtract the time spent from the timeout (probably adding new members to the soap structure: bool firstSelectCall and the actual timeout).

     
