From: Geoffrey T. <gta...@na...> - 2004-02-11 14:17:01
|
I have no FreeBSD experience but I do have a couple of ideas.

Ian Maurer wrote:
> Basically Webware just stops accepting requests. The process is still
> running with the state of 'lockf' (important?) but it just doesn't
> respond.
>
> Usually Webware just stops answering and the 'verbose' output
> shows the last request but no response is given:
>
> 363 2004-02-08 13:31:54 /WK/Context/Example
>
> (no 363 response)
>
> I am getting the following message at the end of the output LOG
> I am keeping:
>
> StreamOut Error: (54, 'Connection reset by peer')

This isn't _necessarily_ an error. It can happen when a client presses the Stop button in their browser while a servlet is still processing the request, if the servlet is using flush() to send partial responses. Try running the "PushServlet" example servlet, then press Stop in your browser before the page is done rendering. You'll probably get the same message, and it's not an error.

>
> Which I believe comes from the ThreadedAppServer.py module at line
> 446.
>
> I am really at a loss for where to start chasing down this problem.
> I have been running Webware for over 2 years without any problems,
> so I guess I am a little bit spoiled.
>
> Any thoughts or suggestions? Any more information needed?

I'm wondering if the server grinds to a halt all at once, or if the threads get locked up one by one until they are all wedged (which has happened to me through no fault of WebKit -- a 3rd party library was locking up occasionally, and as soon as all of the threads in the pool were wedged, the appserver was dead).

You can help figure that out by putting this in ThreadedAppServer.py at the top of RequestHandler.handleRequest():

    print '%5i thread is %s' % (self._number, threading.currentThread().getName())

By examining the messages this produces you should be able to figure out if the pool of available threads is getting smaller and smaller as your appserver runs. Ordinarily it seems to cycle through the threads in round-robin fashion (at least this is how it apparently works on Windows; I'm not sure about other OS's) so it's easy to tell when a thread gets wedged.

In AppServer.config you may want to set StartServerThreads, MaxServerThreads, and MinServerThreads to the same value so there's no confusion about how large your thread pool is.

Good luck!

- Geoff
|
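The "wedged one by one" pattern Geoff describes can be illustrated outside WebKit. The sketch below is Python 3 (the snippets in this thread are Python 2) and only shows the idea of recording which worker threads are inside a request and for how long. `BusyTracker`, `enter`, `leave`, `stuck`, and `demo` are hypothetical names made up for this illustration; none of them are part of Webware.

```python
import threading
import time


class BusyTracker:
    """Hypothetical helper: remembers which threads are inside a request."""

    def __init__(self):
        self._lock = threading.Lock()
        self._started = {}  # thread name -> time the current request began

    def enter(self):
        with self._lock:
            self._started[threading.current_thread().name] = time.monotonic()

    def leave(self):
        with self._lock:
            self._started.pop(threading.current_thread().name, None)

    def stuck(self, older_than):
        """Names of threads that have been busy longer than `older_than` seconds."""
        now = time.monotonic()
        with self._lock:
            return sorted(name for name, t0 in self._started.items()
                          if now - t0 > older_than)


def demo():
    tracker = BusyTracker()
    entered = threading.Event()
    release = threading.Event()

    def healthy():
        tracker.enter()
        tracker.leave()

    def wedged():
        tracker.enter()
        entered.set()
        release.wait()  # stands in for a library call that never returns
        tracker.leave()

    workers = [threading.Thread(target=healthy, name="worker-1"),
               threading.Thread(target=wedged, name="worker-2")]
    for w in workers:
        w.start()
    workers[0].join()
    entered.wait()
    time.sleep(0.1)  # give the wedged request time to look old
    report = tracker.stuck(older_than=0.05)
    release.set()    # unblock the wedged thread so the demo can exit
    workers[1].join()
    return report
```

Calling `tracker.stuck(...)` periodically from a monitoring thread would show the list of busy threads growing if the pool is being eaten one thread at a time, which is the same information the print statement above gives by eye.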
From: Geoffrey T. <gta...@na...> - 2004-02-12 21:16:39
|
Ian Maurer wrote:
> Hello All,
>
> Thanks for all of your help so far. I did indeed have a mutex
> problem in my code that caused some deadlock in my code. I replaced
> the fcntl.flock calls with the lock in the threading module and that
> seemed to do the trick.
>
> Now that I have that problem out of the way, I am pretty confident
> that there is some sort of socket problem with that code in the
> flush method of the TASASStreamOut class (or at least that's where
> the socket problem is showing up).

Is this problem still causing your appserver to lock up? Or is it just resulting in unusual messages in your logfile that you want to track down?

My understanding of the code is that if the client is no longer listening, flush() will cause the "StreamOut Error" warning message to appear but will otherwise just be a no-op. Your servlet should go on processing, blissfully unaware that the client has disconnected.

(It might make more sense for flush() to "raise EndResponse" at that point to keep the servlet from doing unnecessary additional processing. That would be a useful option to add to flush perhaps. It would also be useful for it to log a less alarming-looking error message.)

- Geoff
|
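The option Geoff floats, having flush() tell the caller that the client has gone away instead of silently carrying on, could look roughly like the Python 3 sketch below. It is not Webware code: `send_all`, `ClientDisconnected`, and the `raise_on_disconnect` flag are hypothetical names, and `ClientDisconnected` merely stands in for the "raise EndResponse" idea.

```python
import errno


class ClientDisconnected(Exception):
    """Hypothetical stand-in for raising EndResponse when the peer is gone."""


def send_all(sock, data, chunk_size=8192, raise_on_disconnect=False):
    """Send `data` in chunks and return the number of bytes actually sent.

    On EPIPE or ECONNRESET the loop stops. With raise_on_disconnect=True the
    caller is notified so it can stop producing output nobody will read.
    """
    sent = 0
    while sent < len(data):
        try:
            sent += sock.send(data[sent:sent + chunk_size])
        except OSError as e:
            if e.errno in (errno.EPIPE, errno.ECONNRESET):
                if raise_on_disconnect:
                    raise ClientDisconnected(sent) from e
                break  # give up quietly; the client is no longer listening
            raise  # any other socket error is unexpected, so let it surface
    return sent
```

Treating a connection reset the same way as a broken pipe also avoids the alarming-looking log line for what is, as noted above, often just a user pressing Stop.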
From: Ian M. <ian...@ya...> - 2004-02-12 21:21:45
|
> > Now that I have that problem out of the way, I am pretty confident
> > that there is some sort of socket problem with that code in the
> > flush method of the TASASStreamOut class (or at least that's where
> > the socket problem is showing up).
>
> Is this problem still causing your appserver to lock up? Or is it just
> resulting in unusual messages in your logfile that you want to track
> down?

I am sorry I wasn't clearer. The server is indeed locked up. It won't respond to any more requests and it won't even shut down normally.
|
From: Ian M. <ian...@ya...> - 2004-02-11 15:04:21
|
> I'm wondering if the server grinds to a halt all at once, or if the
> threads get locked up one by one until they are all wedged (which
> has happened to me through no fault of WebKit -- a 3rd party library
> was locking up occasionally, and as soon as all of the threads in
> the pool were wedged, the appserver was dead).

That is it. So I need to do some debugging on my code, which is a relief since I am pretty sure it will be fixable and not an unfixable limitation of my host machine's networking capabilities.

If anyone has any tips or experience as to why an application ported from Windows and Linux would now lock up on FreeBSD, I would appreciate any suggestions since I am new to FreeBSD. Otherwise, I think I have enough info to track down this issue on my own.

Thank you very much for your help...

Ian Maurer
|
From: Ian B. <ia...@co...> - 2004-02-11 18:09:50
|
Ian Maurer wrote:
>> I'm wondering if the server grinds to a halt all at once, or if the
>> threads get locked up one by one until they are all wedged (which
>> has happened to me through no fault of WebKit -- a 3rd party library
>> was locking up occasionally, and as soon as all of the threads in
>> the pool were wedged, the appserver was dead).
>
> That is it. So I need to do some debugging on my code, which is a
> relief since I am pretty sure it will be fixable and not an unfixable
> limitation of my host machine's networking capabilities.
>
> If anyone has any tips or experience as to why an application ported
> from Windows and Linux would now lock up on FreeBSD, I would
> appreciate any suggestions since I am new to FreeBSD. Otherwise, I
> think I have enough info to track down this issue on my own.

Some people have had problems with threads on FreeBSD before, for specific versions. I think it had something to do with the socket handling. I thought we applied some fix related to this, but maybe we never did, or maybe it wasn't sufficient.

Ian
|
From: Aaron H. <aaron@MetroNY.com> - 2004-02-11 19:12:05
|
I had an issue on FreeBSD where the postgres (pyPGSQL) connections were being dropped for some reason. I think that I had to update the python version installed and the postgres driver.

-Aaron

Ian Bicking wrote:
> Ian Maurer wrote:
>
>>> I'm wondering if the server grinds to a halt all at once, or if the
>>> threads get locked up one by one until they are all wedged (which
>>> has happened to me through no fault of WebKit -- a 3rd party library
>>> was locking up occasionally, and as soon as all of the threads in
>>> the pool were wedged, the appserver was dead).
>>
>> That is it. So I need to do some debugging on my code, which is a
>> relief since I am pretty sure it will be fixable and not an unfixable
>> limitation of my host machine's networking capabilities.
>>
>> If anyone has any tips or experience as to why an application ported
>> from Windows and Linux would now lock up on FreeBSD, I would
>> appreciate any suggestions since I am new to FreeBSD. Otherwise, I
>> think I have enough info to track down this issue on my own.
>
> Some people have had problems with threads on FreeBSD before, for
> specific versions. I think it had something to do with the socket
> handling. I thought we applied some fix related to this, but maybe we
> never did, or maybe it wasn't sufficient.
>
> Ian

--
-Aaron
http://www.MetroNY.com/

"I don't know what's wrong with my television set. I was getting C-Span and the Home Shopping Network on the same station. I actually bought a congressman." - Bruce Baum
|
From: Ian M. <ian...@ya...> - 2004-02-12 21:03:48
|
Hello All,

Thanks for all of your help so far. I did indeed have a mutex problem in my code that caused some deadlock in my code. I replaced the fcntl.flock calls with the lock in the threading module and that seemed to do the trick.

Now that I have that problem out of the way, I am pretty confident that there is some sort of socket problem with that code in the flush method of the TASASStreamOut class (or at least that's where the socket problem is showing up).

I also figured out that I could replicate the problem by stopping a request using Internet Explorer (of course :). So now that I can actually replicate the problem, it should be only a matter of time to figure out the source of the problem.

So, I started looking at putting some print statements in the code since I don't know how else to tackle debugging a web server...

Here is my modified code for debugging. I added 1 debug statement and made debug a variable for holding the time and a random int:

    def flush(self):
        from time import time
        from sys import stdout
        from random import randint
        debug=(time(), randint(1,100000))
        result = ASStreamOut.flush(self)
        if result: ##a true return value means we can send
            reslen = len(self._buffer)
            if debug: print debug, "TASASStreamout is sending %s bytes" % reslen
            sent = 0
            while sent < reslen:
                try:
                    sent = sent + self._socket.send(self._buffer[sent:sent+8192])
                except socket.error, e:
                    if e[0]==errno.EPIPE: #broken pipe
                        pass
                    else:
                        print "StreamOut Error: ", e
                    break
            if debug: print debug, "TASASStreamout has sent %s bytes" % sent
            self.pop(sent)
        stdout.flush()

Here are the results... Request 1 and 2 are successful requests. Number 3 is the broken one.

    Creating 5 threads.....
    Ready (0.67 seconds after launch)
    1 thread is Thread-3
    1 2004-02-12 12:45:03 /WK/M/Page
    (1076618708.494272, 32645) TASASStreamout is sending 20567 bytes
    (1076618708.494272, 32645) TASASStreamout has sent 20567 bytes
    (1076618708.496495, 59979) TASASStreamout is sending 0 bytes
    (1076618708.496495, 59979) TASASStreamout has sent 0 bytes
    1 5.01 secs /WK/M/Page
    2 thread is Thread-4
    2 2004-02-12 12:45:57 /WK/M/Page
    (1076618757.3941059, 65872) TASASStreamout is sending 20697 bytes
    (1076618757.3941059, 65872) TASASStreamout has sent 20697 bytes
    (1076618757.396776, 79609) TASASStreamout is sending 0 bytes
    (1076618757.396776, 79609) TASASStreamout has sent 0 bytes
    2 0.12 secs /WK/M/Page
    3 thread is Thread-5
    3 2004-02-12 12:46:06 /WK/M/Page
    (1076618768.8915739, 41809) TASASStreamout is sending 15068 bytes
    StreamOut Error: (54, 'Connection reset by peer')
    (1076618768.8915739, 41809) TASASStreamout has sent 8192 bytes
    (1076618768.8936629, 69418) TASASStreamout is sending 6876 bytes
    (1076618768.8936629, 69418) TASASStreamout has sent 0 bytes

As you can see, the flush command is being called twice (the random integer shows that). I am, of course, unsure if this is a cause, a result, or just what's supposed to happen.

If anyone with more experience with socket programming has any ideas then I would be glad to hear them.

thanks again for your help,
ian
|
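For readers wondering what the fix at the top of this message (swapping fcntl.flock calls for a lock from the threading module) might look like, here is a minimal Python 3 sketch. Ian's actual code is not shown in the thread, so everything here is an assumption: `next_id`, `_counter_lock`, and `demo` are hypothetical names, and the shared counter merely stands in for whatever resource his servlets were guarding. The general point is that an advisory file lock is normally a tool for coordinating separate processes, while threads inside a single appserver process can share an in-memory lock.

```python
import threading

_counter_lock = threading.Lock()  # one lock shared by every worker thread
_counter = 0                      # stand-in for the guarded resource


def next_id():
    """Return a unique, increasing id; the lock serialises the update."""
    global _counter
    with _counter_lock:  # always released, even if the body raises
        _counter += 1
        return _counter


def demo(workers=8, per_worker=1000):
    """Hammer next_id() from several threads and collect every id handed out."""
    results = []
    results_lock = threading.Lock()

    def work():
        mine = [next_id() for _ in range(per_worker)]
        with results_lock:
            results.extend(mine)

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Using the lock as a context manager means it is released on every exit path, which removes one common way for a single thread to leave the lock held and wedge the rest of the pool.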