Learn how easy it is to sync an existing GitHub or Google Code repo to a SourceForge project! See Demo

Close

#82 Transaction Monitoring and Control

closed-accepted
WebKit (45)
5
2007-11-25
2007-10-03
Steve Schwarz
No

Hi,
I've been running a local patch for a couple weeks now with following features:

1 - Added two methods to ThreadedAppServer: startRequest() and endRequest() methods. These methods can be used to implement page statistics and/or click tracking by session. Default implementation supports feature 2 below.

2 - Allow real time monitoring of the requests being processed by each thread. A new ThreadControl page has been added to the Admin context. Setting 'TrackThreads' to True in AppServer.config enables this feature. If the supplied thread2.py module is loaded requests can be cancelled via this page.

3 - Support automatically cancelling requests taking more than a configurable amount of time. Setting 'CancelLongRequests' to the number of seconds after which a request should be cancelled in the AppServer.config file enables this feature. This feature also requires 'TrackThreads' to be enabled.

The patch is relatively small, if there is interest in looking at these changes and moving them into the mainline I'd be happy to forward them to anyone interested. Of course, comments and suggestions are very welcome.

The performance penalty for feature 1 is the cost of some accessors into the current transaction, a couple time() calls and two function calls. I would guess these changes should have a negligible impact on performance.

Features 2 and 3 use a new class which is basically a locked dictionary. So the cost is two additional mutex lock/unlocks for each transaction. If WebKit didn't support adding/removing threads dynamically that mutex could be removed and the additional overhead is some dictionary accesses.

I'd be happy to get feedback on these enhancements.

Thanks,
Steve

Discussion

  • Steve Schwarz
    Steve Schwarz
    2007-10-03

    WebKit Thread Mods 2007-10-02

     
    • assigned_to: nobody --> cito
     
    • status: open --> closed-accepted
     
  • Logged In: YES
    user_id=193957
    Originator: NO

    Thanks a lot, Steve. This has now been implemented in r7097 and will go into Webware 0.9.5. I only changed some implementation details. For instance, I am not using any locks. Since we have only one writer, and since setattr operations are atomic in Python, you can actually do without them. I have also added an improved version of your ThreadControl servlet. Let me know if you think this implementation is ok or if you find any bugs or problems.

     
  • Logged In: YES
    user_id=193957
    Originator: NO

    I just noticed that I have only implemented features 1 and 2, and completely forgot about your CancelLongRequests option for canceling long-runners automatically. Reopened this ticket as a reminder to implement that, too.

     
    • status: closed-accepted --> open-accepted
     
  • Logged In: YES
    user_id=193957
    Originator: NO

    In r7107 has now been implemented in r7107. Again, let me know if there is still something missing or not working for you.

     
    • status: open-accepted --> closed-accepted
     
  • Steve Schwarz
    Steve Schwarz
    2007-11-25

    Logged In: YES
    user_id=1111832
    Originator: YES

    Thanks for promoting this code. I'll take a look at it this week.

    My concern with removing the mutexes is the situation where the list of active requests is being created for a request (i.e. for the admin screen) at the same time the number of threads is being reduced. I believe I had the map iterator raise an exception when the map changed underneath it in that situation.

     
  • Logged In: YES
    user_id=193957
    Originator: NO

    I don't think that the changing map iteration is a problem since I'm using the items() or values() of the map only.

    However, now that you say it, I see a different problem. There is a minuscule chance that while the list of active threads is iterated, one of these threads finishes by itself and processes another request. In this case, the new request will be canceled instead of the one we actually wanted to cancel. I think both of our implementations have this problem. To solve this, we may need a different kind of lock used inside the threadloop(). I'll examine this end of this week. Maybe you can make some suggestions until then already.

     
  • Steve Schwarz
    Steve Schwarz
    2007-11-27

    Logged In: YES
    user_id=1111832
    Originator: YES

    Christoph: "However, now that you say it, I see a different problem. There is a
    minuscule chance that while the list of active threads is iterated, one of
    these threads finishes by itself and processes another request. In this
    case, the new request will be canceled instead of the one we actually
    wanted to cancel. I think both of our implementations have this problem. To
    solve this, we may need a different kind of lock used inside the
    threadloop()."

    I only just started looking at the code. One thing that comes to mind is to have the exception that is passed into ThreadWorker.abort() contain the requestId being cancelled. Then that code could compare request ids just before issuing the async thread cancel.

    Another idea I played with was having long running tasks and/or cancelled tasks give a traceback when they are cancelled. That way you can see what the page was working on when it was killed - the client doesn't have to get the traceback though. In my original version it did that, the version I supplied in this ticket catches that specific exception and just logged a message. WDYT?

     
  • Logged In: YES
    user_id=193957
    Originator: NO

    Concerning the first problem, checking the thread again just before it's canceled would decrease the (already small) possibility of a wrong cancelation, but the chance would still exist. I've now checked in an air-tight solution in r7112 which should not cost performance and makes sure that always the right request is canceled.

    Concerning your second idea: Currently requests are aborted with an ConnectionAbortedError. Actually this exception is used in the case where the client aborted the connection and the response could not be sent to the client any more.

    I think you're right and a different exception should be used for aborting requests, e.g. a new class RequestAbortedError. This should be handled just like any Exception, i.e. a traceback is logged and the client gets either the traceback or an error message (depending on the ShowDebugInfoOnErrors setting) if the servlet does not catch this. If you agree, I will implement it that way and we can close this issue.

    Also, what do you think about the default setting of MaxRequestTime = 300. Is this ok? it corresponds with the Apache default timeout value. Or should we set it to None (disable the feature) by default?

     
  • Logged In: YES
    user_id=193957
    Originator: NO

    Concerning your second idea, in r7118 different (http service unavailable) exceptions are now used to abort requests.

    Please let me know if there are still issues with the current solution.