#585 Broken Pipe in SockSend (rxsock)

closed
nobody
5
2012-08-14
2009-03-21
Mike Protts
No

I've come across a problem with rexx terminating with 'Broken pipe' when using SockSend if the client closes the connection part way through sending (I'd used SockSelect to check that the send socket was available).

The problem seems to be that when the client closes the connection while the send is active (in this case curl, terminated with Ctrl-C), a SIGPIPE is generated (about 90% of the time) and rexx then terminates immediately. The sample code below is called in a long-running loop to send hundreds of MB, so it is easy to trigger.

I have written and tested a simple patch (below) to rexutils/rxsock.c, but I'd be happier if someone else could check that I've not been too simplistic. The patch simply sets the SIGPIPE action to ignore. I've not been able to identify any particular problem with ignoring SIGPIPE, but there may be circumstances that call for saving the current action and restoring it after the C socket call. The patch code is trivial and can be treated as public domain.

Cheers
Mike

sample code:

use arg data
receive_sock.0 = 0
send_sock.0 = 1
send_sock.1 = self~socket()
err_sock.0 = 1
err_sock.1 = self~socket()
rc = SockSelect('RECEIVE_SOCK.', 'SEND_SOCK.', 'ERR_SOCK.', 3)
Select
  When (rc = 0) then do
    Say 'Send Timeout'
  end
  When (rc < 0) then do
    say 'SockSelect' rc errno SockPSock_Errno() SockSock_Errno()
  end
  otherwise do
    if (err_sock.0 = 1) then do
      say 'Error on' err_sock.1
      return 99
    end
    rc = SockSend(send_sock.1, data)
    if (rc > 0) then do
      rc = 0
    end
    else do
      say 'send error' rc errno SockPSock_Errno() SockSock_Errno()
    end
  end
end
===============

Patch:

*** ooRexx-3.2.0/rexutils/rxsock.c Sun Oct 7 23:04:11 2007
--- ooRexx-3.2.0.mike/rexutils/rxsock.c Sat Mar 21 18:49:32 2009
***************
*** 74,79 ****
--- 74,80 ----
  #include <ctype.h>
  #include <setjmp.h>
+ #include <signal.h>

  /*------------------------------------------------------------------
   * Windows includes
***************
*** 922,927 ****
--- 923,929 ----
  int i;
  ULONG ulRc;
  RexxFunctionHandler *pRxFunc;
+ struct sigaction new_action, old_action;

  #ifdef WIN32
  WORD wVersionRequested;
  WSADATA wsaData;
***************
*** 965,970 ****
--- 967,977 ----
  /*---------------------------------------------------------------
   * call function
   *---------------------------------------------------------------*/
+ new_action.sa_handler = SIG_IGN;
+ sigemptyset (&new_action.sa_mask);
+ new_action.sa_flags = 0;
+ sigaction (SIGPIPE, &new_action, NULL);
+
  ulRc = pRxFunc(name,argc,argv,qName,retStr);

  /*---------------------------------------------------------------

==========

Discussion

  • Mike Protts

    Mike Protts - 2009-03-21

    patch to ignore sigpipe in rxsock.c

     
  • Mike Protts

    Mike Protts - 2009-03-22

    A bit more info:

    Using ooRexx 3.2.0 on Linux (Intel Debian and Puppy). I don't have a Windows box to test on immediately, but I'll try to do so.

    The main thread runs a listen loop, and passes each connection to an async client thread, and goes back to listening. I use a 3 second timeout on a select for the listen, so the accept will not block. The client thread then performs the accept/receive followed by multiple sends (if the data to send is more than the 32767 byte buffer it is split).

    It seems that the broken pipe is mainly triggered if the main thread has timed out at least once on the select. I can provide the complete code if needed, but I can't post it publicly at the moment.
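
The splitting of large payloads described above (data larger than the 32767 byte buffer is sent in pieces) can be sketched in C as a loop over send(). This is an illustration only: the helper name send_in_chunks and the chunk parameter are assumptions, and the actual server code in this report is written in Rexx.

```c
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: send len bytes from data in pieces of at most
 * chunk bytes each. Returns the total number of bytes sent, or -1 as
 * soon as a send() fails (errno is then set by send(), e.g. EPIPE). */
ssize_t send_in_chunks(int fd, const char *data, size_t len, size_t chunk)
{
    size_t offset = 0;

    if (chunk == 0)
        return -1;

    while (offset < len) {
        size_t want = len - offset;
        ssize_t sent;

        if (want > chunk)
            want = chunk;

        sent = send(fd, data + offset, want, 0);
        if (sent < 0)
            return -1;

        /* send() may accept fewer bytes than requested. */
        offset += (size_t)sent;
    }
    return (ssize_t)offset;
}
```

A call such as send_in_chunks(fd, data, len, 32767) would mirror the buffer size mentioned above. Each individual send() in the loop is a point where a closed connection can raise SIGPIPE, which fits the observation that long transfers trigger the problem easily.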

     
  • Mike Protts

    Mike Protts - 2009-04-01

    simple web server, run as rexx swj.rexx PORT DOCROOT

     
  • Mike Protts

    Mike Protts - 2009-04-01

    I don't seem to have this problem with version 4, only 3.2.

    I've uploaded my test file. To recreate the SIGPIPE problem (with ooRexx 3.2), start the web server (rexx swj.rexx 8888 testdir) and then begin a download of a large file (you may need to try a few times). I used curl http://server:8888/largefile.tgz and then cancelled with Ctrl-C. Rexx terminates with the Broken Pipe message.

    As there has been a lot of change in the sockets and signal handling, I suspect this has been fixed in 4.0.

    Mike

     
  • Mark Miesfeld

    Mark Miesfeld - 2009-04-01

    Mike,

    I'll run your test under 3.2.0 and see if I can reproduce the problem. If I can reproduce it on 3.2.0 and then cannot reproduce it under 4.0.0, that will be good confirmation that it is fixed in 4.0.0.

     
  • Mark Miesfeld

    Mark Miesfeld - 2009-04-25

    Mike,

    I keep getting errors running your server on Linux using 3.2.0:

    Request 1 GET /ooRexx-4.0.0-4459.i386.fedora10.rpm HTTP/1.1
    User-Agent: curl/7.15.5 (i686-redhat-linux-gnu) libcurl/7.15.5 OpenSSL/0.9.8b zlib/1.2.3 libidn/0.6.5
    Host: localhost:8888
    Accept: */*

    20090424 17:15:15.243875 4 /web/ooRexx-4.0.0-4459.i386.fedora10.rpm started
    HTTP/1.1 200 OK
    Content-Length:
    Content-Type: application/octet-stream
    Connection: keep-alive

    175 - buffsize = self~content_length()-self~read()
    224 - rc = self~getdata()
    REX0041E: Error 41 running /work.ooRexx/web/swj.rexx line 175: Bad arithmetic conversion
    REX0404E: Error 41.1: Nonnumeric value ("") used in arithmetic operation

    I'd really like to close this, but since I can't really claim to have tested it ...

    If you, as the opener of the bug, are satisfied that what you found under 3.2.0 is fixed in 4.0.0, then I'm good with that. Just set the resolution to fixed, add a note saying it seems to be fixed in 4.0.0, and close it. If for some reason SourceForge doesn't let you change the resolution, just put in a note saying you're satisfied it is fixed and I'll close it.

    Thanks a lot for your contributions.

     
  • Mike Protts

    Mike Protts - 2009-04-25

    The error running the program is probably due to the read being for a missing file (the code is used as a teaching aid, so there are plenty of missing pieces of validation etc.). The server starts with two parameters, the port, and the 'docroot', which is a directory. The full URL including file name must be used by a client (with the exception of DOCROOT/index.html for a request with no path at all).

    I cannot recreate the bug in 4.0, although I can recreate easily in 3.2, so I am happy for this to be closed as fixed in 4.0.

    Thanks
    Mike

     
