From: Paul D. F. <pdf...@ku...> - 2005-10-12 11:45:26
I've been writing an old-fashioned(?) single-threaded message-passing server with long-term client connections in Jython, using the Java sockets with aSocket.setSoTimeout(1) and BufferedReader.ready() [I'm not sure how that ready() call works, but it seems to avoid a timeout?]. It would have been nice to have had your stuff in Jython at the start instead (and maybe to use Twisted etc.). I'll probably be moving to a select-type model with Java NIO if this server sees more development down the road (and take a harder look at your stuff then).

I'll throw my two cents in to try to be helpful with more of the obvious, though I may be missing the point, not being an expert on sockets and not fully understanding all aspects of the problem you mention.

The obvious question to me is: when you run your test suite once, and then close the application to shut down the JVM, can you then run it again within a new JVM with no errors? Or is this, as the reply to the poster in the first link you reference suggested, something to do with an expectation of how sockets work at a low level (in which case, closing the JVM might have no effect)? That reply was:

"The TCP/IP spec indicates that a socket will remain in TIME_WAIT for twice the maximum segment lifetime (MSL) before being finally closed. This is regardless of whether all the ACKs have been received, and is designed to protect against reuse of sequence numbers when there may still be segments in the network that contain them. The MSL is 2 minutes, so the total time in TIME_WAIT would be four minutes."

Being sure the JVM is shut down may not be as trivial as it seems. When I was developing the initial code on my machine, using Eclipse 3.0 with JVM 1.4 under Linux and launching the code from within Eclipse, I found I actually had to shut down (and restart) Eclipse to get the listening socket (the one used for accept) closed, back when I was first making lots of mistakes with closing both sockets and windows.
I guess Eclipse and the applications it ran shared a JVM somehow? Or, more likely, closing Eclipse ensured that any JVM instances it started got completely shut down? Restarting was time-consuming and annoying, so for a while I just kept bumping up the port number in my code by one by hand with each new test. :-)

Anyway, if your test suite can run again once you are sure the JVM it ran in is really, truly shut down, then it seems to me the sockets aren't getting properly closed in your test suite after the first run, because obviously the JVM can close them properly when it exits. Maybe there is a missing .close() in your test code, and the socket is then not garbage collected right away (because Jython can't guarantee when that happens)?

This may be paranoid and overkill and obvious, but here is the code I use for closing the Java sockets:

    #SERVER SIDE:
    # Close the listening socket first so no new clients are accepted.
    if not self.listenSocket.isClosed():
        self.listenSocket.close()
    # Then shut down both directions of each client socket and close it.
    for clientConnection in self.clientConnections:
        if not clientConnection.clientSocket.isClosed():
            clientConnection.clientSocket.shutdownInput()
            clientConnection.clientSocket.shutdownOutput()
            clientConnection.clientSocket.close()

    #CLIENT SIDE:
    if self.clientSocket:
        if not self.clientSocket.isClosed():
            self.clientSocket.shutdownInput()
            self.clientSocket.shutdownOutput()
            self.clientSocket.close()

A close() is probably good enough, but it still seemed that shutting down the input and output before closing was the cautious thing to do. Perhaps you could try that in your tests? Anyway, it's probably too simple a solution.

Still, if it is indeed a missing close(), perhaps you could also try running the GC manually with a System.gc() after your test suite to see what happens? If the problem goes away when a GC forces finalization of any open sockets, maybe something isn't getting closed. Or maybe, as you imply, this is just a bug in the JVM under Windows and it is just not responding completely to a close?
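To make the "close everything, then see whether the port can be bound again" check concrete, here is a minimal sketch in plain Python using the standard socket module rather than the Java classes. The helper names (close_quietly, port_is_free, run_once) are made up for illustration and are not from anyone's actual test suite. It deliberately does not set SO_REUSEADDR, so a lingering socket on the port shows up as a failed bind.

```python
import socket


def close_quietly(sock):
    """Shut down both directions, then close. Errors from sockets that
    are already closed or were never connected are ignored."""
    try:
        sock.shutdown(socket.SHUT_RDWR)
    except OSError:
        pass  # not connected, or already shut down / closed
    sock.close()


def port_is_free(port, host="127.0.0.1"):
    """Return True if a fresh socket (no SO_REUSEADDR) can bind to the port."""
    probe = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        probe.bind((host, port))
        return True
    except OSError:
        return False
    finally:
        probe.close()


def run_once():
    """Open a listener and one client connection, close everything, and
    return (port, free_while_open, free_after_close)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    client = socket.create_connection(("127.0.0.1", port))
    accepted, _ = listener.accept()

    free_while_open = port_is_free(port)

    # The client closes first here, so any TIME_WAIT state should land on
    # the client's ephemeral port rather than on the server port.
    for s in (client, accepted, listener):
        close_quietly(s)

    free_after_close = port_is_free(port)
    return port, free_while_open, free_after_close


if __name__ == "__main__":
    print(run_once())
```

If free_after_close comes back False, something on that port is still open or lingering. Which side closes first matters: the side that closes first is the one that ends up holding TIME_WAIT, so reversing the close order in run_once may change the result on some platforms.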
I'm running Linux (Debian unstable, 2.6.10 kernel, i686), and I am using the latest from the Jython CVS, set up with JRE 1.4 and JRE 1.5, so I could maybe help test on Linux if it was easy to try your code (though this week is fairly hectic with a big deliverable on Monday, so I can't promise a speedy turnaround).

--Paul Fernhout

Alan Kennedy wrote:
> [Alan]
>
>>> The problem arises on the second and subsequent runs of the test
>>> suite: for some reason, many of the sockets created leave open
>>> sockets in either the TIME_WAIT or CLOSE_WAIT state.
>
>
> [Richie Hindle]
>
>> I imagine I'm teaching my grandmother to suck eggs here, but I'll mention
>> this anyway: the usual fix for this sort of thing is to set SO_REUSEADDR
>> on any socket with which you use bind().
>
>
> Yep, sucked those eggs dry already ;-)
>
> I tried all possible combinations of client and server socket options,
> i.e. SO_REUSEADDR, SO_LINGER, TCP_NODELAY, etc.
>
> But thanks for trying anyway :-)
>
> Cheers,
>
> Alan.