From: Ian B. <ia...@co...> - 2004-05-21 20:16:20
John Dickinson wrote:
> Some code is below. It is used to take a bunch (1000s) of files and
> merge them into one big file. I hope it helps. The hardest part of
> writing it was debugging the worker thread. If everything doesn't go
> just right, then you end up with a bunch of threads out there in
> memory taking up cycles.
>
> I do have one (related) problem with doing this. In another (similar)
> application, I tried to use this technique to prevent server
> timeouts. The problem was that my worker thread was trying to eval a
> huge (~40MB) dictionary. This eval seemed to block everything in the
> entire Webware process. Now if I read the Python documentation
> correctly, that happens because Python's threads are simulated in the
> interpreter and each operation is viewed as atomic. The eval was
> taking a long time and would not allow any other thread in the
> process to do its thing. Is this correct? If not, what is happening?
> If so, what other way can I do this?

Python uses OS threads, so other threads do continue operating. However, there is a global interpreter lock (GIL) that is held during many operations, and you may have run into that. It usually doesn't cause a problem, but it depends on what you are doing. Imports and some kinds of code evaluation, for instance, hold the lock and block the entire process.

Generally I would suggest putting these workers in their own processes -- if not in deployment, then at least for development, to make them easier to debug (i.e., make each worker usable as a command-line program). A process generally gives you much more control -- you can kill it off, run it as a different user, you aren't caught by the GIL, etc. This works well with queuing jobs in a database. I was also looking at remoteD recently (http://www.neurokode.com/Projects/remoted.php), and it looks like a very easy way to do interprocess queuing without a database.

--
Ian Bicking / ia...@co... / http://blog.ianbicking.org
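For readers of the archive: the pattern John describes -- a worker thread that pulls filenames off a queue and merges them into one big file -- might look roughly like this. This is a minimal sketch of the technique, not John's actual code (his was not preserved in this message); the filenames and sentinel convention are my own.

```python
import threading
import queue

# Create a couple of tiny input files so the sketch is self-contained;
# in John's case there were thousands of real files.
for name, text in [("a.txt", "alpha\n"), ("b.txt", "beta\n")]:
    with open(name, "w") as f:
        f.write(text)

def merge_worker(jobs, output_path):
    """Append each queued file's contents to one big output file;
    stop when the None sentinel is received."""
    with open(output_path, "w") as out:
        while True:
            name = jobs.get()
            if name is None:  # sentinel: no more work
                break
            with open(name) as f:
                out.write(f.read())

jobs = queue.Queue()
worker = threading.Thread(target=merge_worker, args=(jobs, "merged.txt"))
worker.start()

for name in ["a.txt", "b.txt"]:
    jobs.put(name)
jobs.put(None)   # tell the worker to finish
worker.join()    # wait, so no stray thread is left taking up cycles
```

Joining the thread before exiting is what avoids the "threads out there in memory" problem John mentions debugging.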
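Moving the slow eval into its own process, as suggested, might be sketched like this. The ~40MB dictionary is simulated with a small literal, and I use ast.literal_eval rather than eval since the text is a data literal; the child parses it while the parent's threads stay responsive.

```python
import subprocess
import sys
import ast

# Stand-in for the huge (~40MB) dictionary text that was blocking
# the whole Webware process when eval'd in a thread.
big_literal = "{'a': 1, 'b': 2}"

# Parse it in a child process; the GIL of the child cannot block the
# parent. The child prints the repr of the parsed object on stdout.
child_code = (
    "import sys, ast\n"
    "data = ast.literal_eval(sys.stdin.read())\n"
    "sys.stdout.write(repr(data))\n"
)
result = subprocess.run(
    [sys.executable, "-c", child_code],
    input=big_literal, capture_output=True, text=True, check=True,
)
data = ast.literal_eval(result.stdout)
```

For genuinely large data, a binary serialization over a pipe or a temp file would beat repr round-tripping, but the process boundary is the point: the parent can also kill the child if it takes too long.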
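A database job queue of the kind mentioned can be as simple as one table that the web process inserts into and a separate worker process polls. A sketch, with invented table and column names (an in-memory SQLite database here purely for illustration; a real setup would use a shared database file or server):

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # real code: a shared database
conn.execute(
    "CREATE TABLE IF NOT EXISTS jobs ("
    "  id INTEGER PRIMARY KEY,"
    "  filename TEXT,"
    "  status TEXT DEFAULT 'pending')"
)

def enqueue(filename):
    """Called by the web process: record the job and return at once,
    so the request never waits on the slow work (no server timeout)."""
    conn.execute("INSERT INTO jobs (filename) VALUES (?)", (filename,))
    conn.commit()

def claim_next():
    """Called by the worker process: claim one pending job, if any."""
    row = conn.execute(
        "SELECT id, filename FROM jobs WHERE status = 'pending' LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    conn.execute("UPDATE jobs SET status = 'running' WHERE id = ?", (row[0],))
    conn.commit()
    return row

enqueue("a.txt")
job = claim_next()
```

Because the queue lives in the database, the worker is an ordinary command-line program you can debug, restart, or run as a different user, which is exactly the control a thread inside the server process doesn't give you.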