From: Matt F. <ma...@da...> - 2004-02-18 18:42:18
|
Hello.

We've run into an interesting problem using ZODB 3.3a2 and Webware Webkit's threaded application server (http://webware.sourceforge.net) on Windows (but not Linux).

The symptom that alerted us to the problem was that object attributes sometimes -- but inconsistently -- failed to be persisted. (No, this problem has nothing to do with mutables. Yes, we're sure.)

First (for the ZODB people), a quick summary of how the Webkit application server works: the server maintains a pool of threads to service URI requests. Each request is assigned an available, running thread. The app server does one of two things:

1) instantiates a new servlet instance and passes the request to be serviced
- or -
2) it finds an already-running instance of the relevant servlet and passes it the request to be serviced

In either case, these threads eventually return a response string to the http client.

The crucial subtlety is that a thread does NOT destroy/garbage-collect its servlet instance after servicing the request. Rather, the servlet instance is "put to sleep" and kept around in case another request is for the same servlet. If the application is taking requests for many different servlets, the servlet instances do slip in and out of the pool as different requests come in, but there is no guarantee that a servlet instance will be brand new with every request.

The crux: it appears that Webkit's threading model exposes a ZODB problem when a Webkit servlet transacts with a ZODB under Windows.

Changes to persistent objects are made and committed within servlet code. If you watch the _p_changed attribute before and after running get_transaction().commit(), we assume it should always be 1 before the commit() and 0 after the commit. Under our Windows test case, sometimes _p_changed is 1 following the commit(), suggesting that get_transaction() somehow isn't getting the right transaction, and therefore nothing is actually being committed after all.

The exact same Webkit code does not exhibit this problem running under Linux.

We have identified two workarounds on Windows:

1) limit the Webkit app server to one thread; this solves the problem but neatly renders the app server useless for production purposes
2) add a seemingly superfluous get_transaction().abort() call just before the servlet is put to sleep (and before the database connection is closed) by the app server; apparently, if you make this call (even when you've made no changes) a side effect is precluding whatever circumstances are leading to the real problem

Here are some more details on what we've tested:

Windows:
- python 2.3.3
- ZODB 3.3a2 using standard FileStorage with BTrees
- NOT using ZEO
- Webware 0.8

Linux:
- python 2.3.1
- ZODB 3.3a2 using standard FileStorage with BTrees
- NOT using ZEO
- Webware 0.8

We also noticed that the object returned by get_transaction() has an _id. If we watch this value as servlets transact with ZODB, we've noticed that the ids are frequently re-used. Perhaps this is relevant?

If anyone would like more information on tracking down this bug, we'd love to help. We're afraid that we're probably near the limits of our expertise, but can help with debug print statement output or whatever.

Thanks! |
From: Tim P. <ti...@zo...> - 2004-02-19 03:28:07
|
[Matt Feifarek, w/ some ugly thread symptoms seen only on Windows]

Matt, what does this print for you if you run it on your Linux box?

"""
import thread, time

def worker():
    print thread.get_ident()

for i in range(10):
    thread.start_new_thread(worker, ())
    time.sleep(1)
"""

A typical run on Windows prints the same number 10 times. If it typically prints 10 distinct numbers for you, I'm betting that's a clue.

There are no deliberate ways in which Python thread semantics differ across Linux and Windows. There are huge pragmatic differences, though, due to the way the OSes implement and schedule their native threads. The one most often "to blame" when platform-specific behavior is seen is that Windows is typically much more willing to switch threads frequently than Linux. But Windows is also eager to reuse internal handles ASAP (thread.get_ident() is the return value from the Win32 API GetCurrentThreadId() call on Windows, and is used by ZODB as a key in a dict to map the current thread to its current transaction), while I *expect* Linux is not (thread.get_ident() is the return value from the pthreads pthread_self() on Linux, and Linux is an oddball among Unixes in confusing thread ids with process ids -- most Unixes don't reuse pids for a loooong time, but Windows reuses tids ASAP).

I don't know that this relates to the problem you're seeing -- just trying to get more evidence now, and have been nervous about the thread-id -> transaction dict since I first saw it.

If you let a thread die with changes in progress (neither commit nor abort on its then-current transaction), it looks all but certain to me that ZODB's tid->transaction dict will get confused by the next thread Windows creates (which is all but certain to have the same tid as the thread that just died). |
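The failure mode described in the last paragraph above can be sketched with a toy model. This is not ZODB code: `ToyTransaction`, `get_toy_transaction`, `thread_dies` and the tid value 1716 are hypothetical stand-ins, and the thread id is passed in by hand so that id reuse is simulated rather than left to the OS. It is written for Python 3.

```python
# Toy model only, NOT ZODB code. It imitates a registry that maps a thread id
# to that thread's current transaction, as described in the message above.

class ToyTransaction:
    def __init__(self):
        self.objects = []           # objects registered as modified

    def register(self, obj):
        self.objects.append(obj)

    def finish(self):
        # Stands in for commit() or abort(): pending registrations are dropped.
        self.objects = []


_transactions = {}                  # hypothetical tid -> transaction mapping


def get_toy_transaction(tid):
    """Return the transaction stored for tid, creating one if none exists."""
    if tid not in _transactions:
        _transactions[tid] = ToyTransaction()
    return _transactions[tid]


def thread_dies(tid, finished):
    """Simulate a thread ending, with or without finishing its transaction."""
    if finished:
        get_toy_transaction(tid).finish()


# A first thread gets tid 1716, modifies something, and dies without
# calling commit or abort.
get_toy_transaction(1716).register("object changed by first thread")
thread_dies(1716, finished=False)

# The OS hands the same tid to a brand-new thread, which inherits stale state.
inherited = get_toy_transaction(1716).objects
print(inherited)
```

If `thread_dies` is called with `finished=True` instead, the new thread starts with an empty list, which is one way to read why the "superfluous" abort() in the original report appeared to help.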
From: Kapil T. <ha...@ob...> - 2004-02-19 04:34:47
|
just tossing out an idea.. the default zodb semantics is per thread semantics. or at least tries to be which is problematic for many scenarios.. perhaps the local transaction shane implemented would be of use here.. ie. if you call setLocalTransaction upon start of request processing and commit that txn at the end of processing, via _p_jar.getTransaction().commit() hth, -kapil On Wed, 2004-02-18 at 22:22, Tim Peters wrote: > [Matt Feifarek, w/ some ugly thread symptoms seen only on Windows] > > Matt, what does this print for you if you run it on your Linux box? > > """ > import thread, time > > def worker(): > print thread.get_ident() > > for i in range(10): > thread.start_new_thread(worker, ()) > time.sleep(1) > """ > > A typical run on Windows prints the same number 10 times. If it typically prints 10 distinct numbers for you, I'm betting that's a clue. > > There are no deliberate ways in which Python thread semantics differ across Linux and Windows. There are huge pragmatic differences, though, due to the way the OSes implement and schedule their native threads. The one most often "to blame" when platform-specific behavior is seen is that Windows is typically much more willing to switch threads frequently than Linux. But Windows is also eager to reuse internal handles ASAP (thread.get_ident() is the return value from the Win32 API GetCurrentThreadId() call on Windows, and is used by ZODB as a key in a dict to map the current thread to its current transaction), while I *expect* Linux is not (thread.get_ident() is the return value from the pthreads pthread_self() on Linux, and Linux is an oddball among Unixes in confusing thread ids with process ids -- most Unixes don't reuse pids for a loooong time, but Windows reuses tids ASAP). > > I don't know that this relates to the problem you're seeing -- just trying to get more evidence now, and have been nervous about the thread-id -> transaction dict since I first saw it. 
> > If you let a thread die with changes in progress (neither commit nor abort on its then-current transaction), it looks all but certain to me that ZODB's tid->transaction dict will get confused by the next thread Windows creates (which is all but certain to have the same tid as the thread that just died). |
From: Christian R. R. <ki...@as...> - 2004-02-19 12:14:59
|
On Wed, Feb 18, 2004 at 10:22:13PM -0500, Tim Peters wrote: > """ > import thread, time > > def worker(): > print thread.get_ident() > > for i in range(10): > thread.start_new_thread(worker, ()) > time.sleep(1) > """ FWIW: 1026 1026 2050 2050 3074 3074 4098 4098 5122 5122 6146 6146 7170 7170 8194 8194 9218 9218 10242 10242 Linux manonegra 2.4.21k7 #8 Tue Jun 17 18:05:29 BRST 2003 i686 unknown -rwxr-xr-x 1 root root 1153784 Apr 8 2003 /lib/libc-2.2.5.so* Take care, -- Christian Robottom Reis | http://async.com.br/~kiko/ | [+55 16] 261 2331 |
From: Tim P. <ti...@zo...> - 2004-02-19 18:09:11
|
Running:

>> """
>> import thread, time
>>
>> def worker():
>>     print thread.get_ident()
>>
>> for i in range(10):
>>     thread.start_new_thread(worker, ())
>>     time.sleep(1)
>> """

Christian reported:

> FWIW:
>
> 1026
> 1026
> 2050
> 2050
> 3074
> 3074
> 4098
> 4098
> 5122
> 5122
> 6146
> 6146
> 7170
> 7170
> 8194
> 8194
> 9218
> 9218
> 10242
> 10242
>
> Linux manonegra 2.4.21k7 #8 Tue Jun 17 18:05:29 BRST 2003 i686 unknown
>
> -rwxr-xr-x 1 root root 1153784 Apr 8 2003 /lib/libc-2.2.5.so*

Really?! The test program prints 10 lines of output. Do you really get 20 lines? Ah ... OK, you must have run this interactively, from a Python shell. Then the return value from thread.start_new_thread(worker, ()) gets displayed too, and that's the same as what thread.get_ident() returns for the thread. Had me going there <wink>.

Anyway, as expected, Linux doesn't reuse tids often, but Windows reuses them as soon as it can. Typical output for me on Windows (running from an interactive shell too, to match the style above):

-762973
-762973
-762973
-762973
-762973
-762973
-762973
-762973
-762973
-762973
-762973
-762973
-762973
-762973
-762973
-762973
-762973
-762973
-762973
-762973

Alas, it wasn't clear in the original report whether threads *are* getting created and destroyed, or whether the threads in the pool hang around forever. If the latter, it doesn't matter how often tids get reused; whether on Linux or Windows, no two threads *simultaneously* alive will return the same tid. |
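For anyone repeating the experiment on a current interpreter, here is a rough Python 3 adaptation of the test script. It is an adaptation, not the script from the thread: the old `thread` module is replaced by `threading`, the ids are collected in a list instead of printed one by one, and `join()` replaces the one-second sleep as the way to let each thread finish before the next one starts. The helper name `collect_thread_ids` is made up for this sketch.

```python
import threading


def collect_thread_ids(count=10):
    """Start `count` short-lived threads one after another; return their ids."""
    seen = []

    def worker():
        seen.append(threading.get_ident())

    for _ in range(count):
        t = threading.Thread(target=worker)
        t.start()
        t.join()        # each thread is finished before the next is created
    return seen


ids = collect_thread_ids()
print(len(ids), "threads,", len(set(ids)), "distinct ids")
```

How many distinct ids show up depends on the platform and the threading implementation, which is exactly the difference under discussion, so no particular count should be expected.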
From: Ian B. <ia...@co...> - 2004-02-19 18:27:36
|
Tim Peters wrote:
> Alas, it wasn't clear in the original report whether threads *are* getting
> created and destroyed, or whether the threads in the pool hang around
> forever. If the latter, it doesn't matter how often tids get reused;
> whether on Linux or Windows, no two threads *simultaneously* alive will
> return the same tid.

If it helps, Webware preallocates its threads, and reuses them indefinitely.

I'm not sure if this is the case if you are running scheduled tasks in Webware, though I believe that's implemented with a single preallocated thread as well. Maybe Matt or someone can confirm whether scheduled tasks are part of the application, or whether their use is a predictor of the problem.

Ian |
From: Matt F. <ma...@da...> - 2004-02-19 20:05:40
|
Ian Bicking wrote:
> If it helps, Webware preallocates its threads, and reuses them
> indefinitely.
>
> I'm not sure if this is the case if you are running scheduled tasks in
> Webware, though I believe that's implemented with a single
> preallocated thread as well. Maybe Matt or someone can confirm whether
> scheduled tasks are part of the application, or whether their use is a
> predictor of the problem.
>
We are not using scheduled tasks at all. I suppose that the session cleanup sweeper in WebKit is a scheduled task, but we're not putting any ZODB stuff in session, so it shouldn't matter. |
From: Matt F. <ma...@da...> - 2004-02-19 19:26:41
|
On Thu, Feb 19, 2004 at 01:03:00PM -0500, Tim Peters wrote:
| Alas, it wasn't clear in the original report whether threads *are* getting
| created and destroyed, or whether the threads in the pool hang around
| forever. If the latter, it doesn't matter how often tids get reused;
| whether on Linux or Windows, no two threads *simultaneously* alive will
| return the same tid.

I'm pretty sure that they're hanging around forever. I'll look at the appserver source and see if I can be completely sure.

Yes, no two threads can ever have the same tid, but what about when a servlet runs, doesn't .commit() or .abort(), and 10 minutes later, it runs again?

Same tid, different objects, different circumstances, maybe this time there's a change to commit? |
From: Tim P. <ti...@zo...> - 2004-02-19 20:06:48
|
[Matt Feifarek] > Yes, no two threads can ever have the same tid, but what about when a > servlet runs, doesn't .commit() or .abort(), and 10 minutes later, it > runs again? > > Same tid, different objects, different circumstances, maybe this time > there's a change to commit? I can't guess. It depends on (at least) three things I don't know: 1) When the servlet runs again 10 minutes later, is it running in the same thread it ran in before, or in a different thread? 2) If the answer to #1 is "the same thread", did the servlet happen to run in any other threads between the consecutive times you know it ran in this single thread? 3) *Would* a commit() or abort(), right before the servlet was put to sleep, have done something, had commit() or abort() been called? Since you originally said the problem went away if you: > 2) add a seemingly superfluous get_transaction().abort() call just > before the servlet is put to sleep (and before the database > connection is closed) by the app server; then the best guess I can make is that the answer to #3 is "yes", the answer to #1 is "different threads", and then things are screwed up for reasons Shane explained. If you, e.g., mutate a persistent object P, bound to something.someattr, then merely rebinding something.someattr to None doesn't change that the transaction still has a change to P pending -- you need to commit that change, or abort it, and from the thread that made that mutation. There's still no reason in this scenario for why it *can't* fail on Linux too, but reason to believe it would fail more frequently on Windows (and possibly much more frequently on Windows). |
From: Shane H. <sh...@zo...> - 2004-02-19 18:54:51
|
On Wed, 18 Feb 2004, Matt Feifarek wrote: > The crucial subtlety is that a thread does NOT destroy/garbage-collect > its servlet instance after servicing the request. Rather, the servlet > instance is "put to sleep" and kept around in case another request is > for the same servlet. If the application is taking requests for many > different servlets, the servlet instances do slip in and out of the pool > as different requests come in, but there is no guarantee that a servlet > instance will be brand new with every request. It sounds like a servlet can re-awaken in a different thread from the one in which it was put to sleep. If that happens, changes made before sleeping get registered with a different transaction than the transaction being committed. The new transaction is unaware of the change. Later, you might accidentally commit or abort the original transaction. What a mess! If the above guess is correct, you would benefit from the experimental support for binding transactions to connections (like Kapil mentioned.) Your servlets span threads; therefore, transactions should be bound to connections rather than threads. The API is still experimental because connection-bound transactions really ought to be set up at the time you open the database, not when you open the connection. The API is simple enough that we haven't had a strong reason to fix it, though. Shane |
From: Matt F. <ma...@da...> - 2004-02-19 19:15:51
|
On Thu, Feb 19, 2004 at 01:48:48PM -0500, Shane Hathaway wrote: | | It sounds like a servlet can re-awaken in a different thread from the one | in which it was put to sleep. If that happens, changes made before | sleeping get registered with a different transaction than the transaction | being committed. The new transaction is unaware of the change. Later, | you might accidentally commit or abort the original transaction. What a | mess! I don't believe that this is the case. I think that a servlet instance stays in a thread, and is destroyed if the AppServer needs the thread for another servlet. | If the above guess is correct, you would benefit from the experimental | support for binding transactions to connections (like Kapil mentioned.) | Your servlets span threads; therefore, transactions should be bound to | connections rather than threads. This may still be worth exploring, however. Thanks. |
From: Christian R. R. <ki...@as...> - 2004-02-19 19:01:35
|
On Thu, Feb 19, 2004 at 01:48:48PM -0500, Shane Hathaway wrote: > If the above guess is correct, you would benefit from the experimental > support for binding transactions to connections (like Kapil mentioned.) As a data point, this "experimental" support in my experience works *really* well, and we've got tens of people hammering all day creating and changing objects (the largest BTree has just gone over 100,000 items and growing 1000 objects/week). > The API is still experimental because connection-bound transactions really > ought to be set up at the time you open the database, not when you open > the connection. The API is simple enough that we haven't had a strong > reason to fix it, though. Because I use them via IndexedCatalog's Shelf, it appears to us as if the setting is bound to the database itself, but you're right; mixing local transactions with non-local ones may be a problem I hadn't considered yet. Take care, -- Christian Robottom Reis | http://async.com.br/~kiko/ | [+55 16] 261 2331 |
From: Tim P. <ti...@zo...> - 2004-02-19 19:14:37
|
[Matt Feifarek] >> The crucial subtlety is that a thread does NOT >> destroy/garbage-collect its servlet instance after servicing the >> request. Rather, the servlet instance is "put to sleep" and kept >> around in case another request is for the same servlet. If the >> application is taking requests for many different servlets, the >> servlet instances do slip in and out of the pool as different >> requests come in, but there is no guarantee that a servlet instance >> will be brand new with every request. [Shane Hathaway] > It sounds like a servlet can re-awaken in a different thread from the > one in which it was put to sleep. If that happens, changes made > before sleeping get registered with a different transaction than the > transaction being committed. The new transaction is unaware of the > change. Later, you might accidentally commit or abort the original > transaction. What a mess! It's a decent theory, except for the lack of an obvious reason for why they see these failures on Windows but not Linux (it's possible that, because Windows is more willing to switch threads frequently, it's more *likely* for a servlet to re-awake in a different thread on Windows -- provided that's possible at all, which I don't think we know yet). > If the above guess is correct, you would benefit from the experimental > support for binding transactions to connections (like Kapil > mentioned.) Your servlets span threads; therefore, transactions > should be bound to connections rather than threads. > > The API is still experimental because connection-bound transactions > really ought to be set up at the time you open the database, not when > you open the connection. The API is simple enough that we haven't > had a strong reason to fix it, though. Another random <wink> thing to try: when a servlet re-awakes, call get_transaction().begin() |
From: Matt F. <ma...@da...> - 2004-02-19 22:25:35
|
Shane Hathaway wrote:

>It sounds like a servlet can re-awaken in a different thread from the one
>in which it was put to sleep. If that happens, changes made before
>sleeping get registered with a different transaction than the transaction
>being committed. The new transaction is unaware of the change. Later,
>you might accidentally commit or abort the original transaction. What a
>mess!
>
>
I'm not sure I understand; no changes are EVER made before sleeping without a commit. But here's a better re-cap of what I DO understand:

1. A servlet either exists, or is instantiated
2. As a part of its building of state, it gets a database connection, and then a handle on an object in the db
3. If no changes are made to the object, no get_transaction().commit() is run (we know if changes are made, with good certainty)
4. If changes ARE made, we do call get_transaction().commit()
5. Before the servlet "shuts down" we assign the references to the object, to the root, to the connection all to None (but we do NOT delattr() them...)
6. The servlet is removed back to the pool, waiting to be deployed into a different thread (there should be no object references or root or connection references at this point)

I don't understand how there could be any transaction confusion. How could one "transaction" stay around from one servlet lifecycle to the next? Why would it stay in the thread if it was never used or after it was committed? I have seen that connection instances are re-used when you get a "new" one... but only one of two things can happen: no changes have been made (therefore no "transaction", right?) OR changes are made and IMMEDIATELY after, we commit() the transaction.

Somewhere in there lies my misunderstanding: do "transactions" stay around, even after they are committed? And are there "transactions" that exist even if no data has been changed? If there are, so what? Why does code have to treat them so lightly if it didn't actually do anything?
>If the above guess is correct, you would benefit from the experimental >support for binding transactions to connections (like Kapil mentioned.) >Your servlets span threads; therefore, transactions should be bound to >connections rather than threads. > > This seems reasonable. But I'd still like to try and understand the nature of transactions. Thanks again. This whole thread has been extremely helpful! |
From: Jeremy H. <je...@zo...> - 2004-02-19 22:35:08
|
On Thu, 2004-02-19 at 17:24, Matt Feifarek wrote: > 1. A servlet either exists, or is instantiated > 2. As a part of it's building of state, it gets a database connection, > and then a handle on an object in the db > 3. If no changes are made to the object, no get_transaction().commit() > is run > (we know if changes are made, with good certainty) You can ask for certain whether changes were made and avoid any question of your own certainty. > 4. If changes ARE made, we do call get_transaction().commit() > 5. Before the servlet "shuts down" we assign the references to the > object, to the root, to the connection all to None (but we do NOT > delattr() them...) You should actually close the connection -- call the close() method. Just before you close the connection, add some variant of this code assert not get_transaction()._objects The _objects attribute stores all the objects registered with the transaction. If you haven't modified anything, the list will be empty. If you have modified something, the assertion will fail. > 6. The servlet is removed back to the pool, waiting to be deployed into > a different thread > (there should be no object references or root or connection > references at this point) How does the servlet get its original reference to the connection? I assume it calls the open() method on the database at the start of each request. (Just checking, because I haven't seen any code.) > I don't understand how there could be any transaction confusion. How > could one "transaction" stay around from one servlet lifecycle to the > next? Why would it stay in the thread if it was never used or after it > was commited? I have seen that connection instances are re-used when you > get a "new" one... but only one of two things can happen: no changes > have been made (therefore no "transaction", right?) OR changes are made > and IMMEDIATELY after, we commit() the transaction. 
> > Somewhere in there lies my misunderstanding: do "transactions" stay > around, even after they are committed? And are there "transactions" that > exist even if no data has been changed? If there are, so what? Why does > code have to treat them so lightly if it didn't actually do anything? Transactions begin implicitly but are terminated explicitly. When 1) a thread calls get_transaction() or whenever a persistent object is modified and 2) a transaction does not already exist, a new transaction is created. Note that the mechanism in the modification case is for the connection to call get_transaction(). If a transaction already exists when you call get_transaction(), it is used. The transaction is normally associated with a thread. If a thread begins a transaction but does not finish it, then subsequent calls to get_transaction() will return the old, in-progress transaction. Jeremy |
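The "begin implicitly, terminate explicitly" rule can be illustrated with a small toy, written for Python 3. Everything here is a hypothetical stand-in rather than ZODB's implementation: `ToyTransaction`, `get_toy_transaction` and `finish_toy_transaction` are invented names, and the toy simply discards a finished transaction, which is an assumption made to keep the example short.

```python
import threading


class ToyTransaction:
    """Stand-in for a transaction; not the real ZODB class."""

    def __init__(self):
        self.registered = []


_current = {}    # hypothetical: thread id -> in-progress ToyTransaction


def get_toy_transaction():
    """Return this thread's in-progress transaction, starting one if needed."""
    tid = threading.get_ident()
    if tid not in _current:
        _current[tid] = ToyTransaction()     # implicit begin
    return _current[tid]


def finish_toy_transaction():
    """Explicit end; stands in for commit() or abort()."""
    _current.pop(threading.get_ident(), None)


first = get_toy_transaction()
first.registered.append("modified object")

second = get_toy_transaction()       # nothing ended it, so it is the same one
same_before_finish = second is first

finish_toy_transaction()
third = get_toy_transaction()        # a fresh transaction begins implicitly
fresh_after_finish = third is not first and third.registered == []

print(same_before_finish, fresh_after_finish)
```

The point of the toy is the middle step: until something ends the transaction, every later lookup from the same thread gets the old one back, registrations included.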
From: Christian R. R. <ki...@as...> - 2004-02-19 22:44:25
|
On Thu, Feb 19, 2004 at 05:25:43PM -0500, Jeremy Hylton wrote: > > 4. If changes ARE made, we do call get_transaction().commit() > > 5. Before the servlet "shuts down" we assign the references to the > > object, to the root, to the connection all to None (but we do NOT > > delattr() them...) > > You should actually close the connection -- call the close() method. > > Just before you close the connection, add some variant of this code > > assert not get_transaction()._objects > > The _objects attribute stores all the objects registered with the > transaction. If you haven't modified anything, the list will be empty. > If you have modified something, the assertion will fail. A note that may come in handy, which is an exception to what Jeremy states above: when a subtransaction is committed, this list is emptied (see Transaction.py:commit for details), so if you are using subtransactions, that can trick you into thinking nothing was changed, when in fact it was commit(1)-ed away. I don't know how to check for objects modified and commit(1)-ed, but I suspect there is a [potentially non-trivial] way. Take care, -- Christian Robottom Reis | http://async.com.br/~kiko/ | [+55 16] 261 2331 |
From: Shane H. <sh...@zo...> - 2004-02-19 22:50:49
|
On Thu, 19 Feb 2004, Matt Feifarek wrote: > I'm not sure I understand, no changes are EVER made before sleeping > without a commit. But here's a better re-cap of what I DO understand: > > 1. A servlet either exists, or is instantiated > 2. As a part of it's building of state, it gets a database connection, > and then a handle on an object in the db > 3. If no changes are made to the object, no get_transaction().commit() > is run > (we know if changes are made, with good certainty) > 4. If changes ARE made, we do call get_transaction().commit() As a sanity check, you might consider calling get_transaction().commit() every time. If there have been no changes, it does no harm. Alternatively, you can call get_transaction().abort() if no changes were intended. A third alternative is to assert that get_transaction()._objects is empty when you made no changes (although that's not a public API.) > 5. Before the servlet "shuts down" we assign the references to the > object, to the root, to the connection all to None (but we do NOT > delattr() them...) > 6. The servlet is removed back to the pool, waiting to be deployed into > a different thread > (there should be no object references or root or connection > references at this point) Since transactions are bound to threads, the transaction stays, along with anything registered in the transaction. If an exception occurs at any point in this process, do you abort the transaction in a "finally" clause? If you don't, the transaction has no way to know that it should be aborted. > I don't understand how there could be any transaction confusion. How > could one "transaction" stay around from one servlet lifecycle to the > next? Why would it stay in the thread if it was never used or after it > was commited? I have seen that connection instances are re-used when you > get a "new" one... but only one of two things can happen: no changes > have been made (therefore no "transaction", right?) 
OR changes are made > and IMMEDIATELY after, we commit() the transaction. > > Somewhere in there lies my misunderstanding: do "transactions" stay > around, even after they are committed? And are there "transactions" that > exist even if no data has been changed? If there are, so what? Why does > code have to treat them so lightly if it didn't actually do anything? Transactions containing no objects are harmless, that's correct. It might help to watch calls to Transaction.register(). I learned a lot by adding print statements there. Shane |
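The "abort in a finally clause" suggestion might look roughly like the following Python 3 sketch. `ToyTransaction` only records which method was called and stands in for whatever get_transaction() returns; `handle_request` and the `work` callable are hypothetical names, not Webware or ZODB API.

```python
class ToyTransaction:
    """Records calls only; a stand-in for a real transaction object."""

    def __init__(self):
        self.calls = []

    def commit(self):
        self.calls.append("commit")

    def abort(self):
        self.calls.append("abort")


def handle_request(txn, work):
    """Run work(); commit on success, abort if anything goes wrong."""
    finished = False
    try:
        result = work()
        txn.commit()
        finished = True
        return result
    finally:
        if not finished:
            # Reached when work() or commit() raised: never leave the
            # transaction dangling for the next request on this thread.
            txn.abort()


ok_txn = ToyTransaction()
handle_request(ok_txn, lambda: "response")


def failing():
    raise ValueError("servlet blew up")


bad_txn = ToyTransaction()
try:
    handle_request(bad_txn, failing)
except ValueError:
    pass

print(ok_txn.calls, bad_txn.calls)
```

Either way the request ends with exactly one commit or abort, so nothing registered during the request can leak into the next one.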
From: Matt F. <ma...@da...> - 2004-02-19 19:23:51
|
On Wed, Feb 18, 2004 at 10:22:13PM -0500, Tim Peters wrote: | [Matt Feifarek, w/ some ugly thread symptoms seen only on Windows] | | Matt, what does this print for you if you run it on your Linux box? | | """ | import thread, time | | def worker(): | print thread.get_ident() | | for i in range(10): | thread.start_new_thread(worker, ()) | time.sleep(1) | """ On linux: 1026 2050 3074 4098 5122 6146 7170 8194 9218 10242 On Winders: (don't know why I'm getting doubles) (Note, this windows box is python 2.2.1 not 2.3.2... but I'm away from that box today) 1716 1716 1780 1780 684 684 1120 1120 964 964 848 848 1364 1364 576 576 376 376 764 764 | If you let a thread die with changes in progress (neither commit nor abort on its then-current transaction), it looks all but certain to me that ZODB's tid->transaction dict will get confused by the next thread Windows creates (which is all but certain to have the same tid as the thread that just died). So is it "best practice" to do abort() even if no changes are made? Is there even a transaction if no changes are made? Thanks. I'll followup on the other windows box later today. |
From: Jeremy H. <je...@zo...> - 2004-02-19 19:45:35
|
On Thu, 2004-02-19 at 14:18, Matt Feifarek wrote: > So is it "best practice" to do abort() even if no changes are made? Is > there even a transaction if no changes are made? I'm not sure I understand exactly how you're managing Webkit threads and ZODB transactions, so it's hard to be confident with the advice I'm giving. If you've got some code you can point us at, that would be great. ZODB associates a Transaction object with a thread. Modified objects register themselves with their thread's current transaction. If you modify an object in between calling commit() and closing the connection, you will have problems. The next request to run in that thread will pick up a transaction that is already populated with objects from a closed connection. I thought we had done something to raise an exception when a closed connection was involved in a transaction, but I can't find any code to do that. (Shane, if you're listening, do you remember?) One solution is to find out what code is modifying objects but not committing them and fixing it. I think that's got to be the root cause of the problem. Another possibility is to call ZODB.Transaction.free_transaction() when you close the connection. That will delete the Transaction object that holds the registrations from modified objects. This approach feels more like a band-aid than a fix. Jeremy |
From: Shane H. <sh...@zo...> - 2004-02-19 20:15:29
|
On Thu, 19 Feb 2004, Jeremy Hylton wrote: > I thought we had done something to raise an exception when a closed > connection was involved in a transaction, but I can't find any code to > do that. (Shane, if you're listening, do you remember?) ZODB raises an exception if you try to load ("unghostify") an object from a closed connection. A closed connection ought to do something similar when you try to store, but Connection.register() currently just lets it go. It seems like an easy fix. Shane |
From: Jeremy H. <je...@zo...> - 2004-02-19 20:55:56
|
On Thu, 2004-02-19 at 15:09, Shane Hathaway wrote: > On Thu, 19 Feb 2004, Jeremy Hylton wrote: > > > I thought we had done something to raise an exception when a closed > > connection was involved in a transaction, but I can't find any code to > > do that. (Shane, if you're listening, do you remember?) > > ZODB raises an exception if you try to load ("unghostify") an object from > a closed connection. A closed connection ought to do something similar > when you try to store, but Connection.register() currently just lets it > go. It seems like an easy fix. I was thinking about the case where an object is registered with a transaction and its connection is closed. The connection, in general, doesn't have a good way to ask about the transaction. If the connection is closed, at least there will be an error at commit time. If the connection is closed and re-opened, I think it will just do something strange. Jeremy |
From: Dieter M. <di...@ha...> - 2004-02-19 20:10:56
|
Matt Feifarek wrote at 2004-2-18 13:41 -0500:
> ...
>2) add a seemingly superfluous get_transaction().abort() call just
>before the servlet is put to sleep (and before the database connection
>is closed) by the app server; apparently, if you make this call (even
>when you've made no changes) a side effect is precluding whatever
>circumstances are leading to the real problem

It is a very good idea to always finish a request with a call to either
"commit" or "abort". Otherwise, the next request can get a connection
with objects in an undefined state. If it gets a different thread id
(than the previous request that used this connection), a commit of this
(new) transaction will not be able to make objects with "_p_changed=1"
persistent (as they are registered with a different transaction).

--
Dieter
|
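The advice above, always ending a request with commit or abort, can be sketched as a wrapper around the request handler. This is a toy model under stated assumptions: ToyTransaction, ToyRecord, toy_get_transaction() and run_request() are hypothetical names made up for illustration, not Webware or ZODB APIs. The wrapper calls abort() unconditionally once the handler returns, which does nothing if the handler already committed and otherwise discards whatever was left registered.

```python
import threading


class ToyTransaction:
    """Toy per-thread transaction; not the real ZODB class."""

    def __init__(self):
        self.registered = []

    def register(self, obj):
        if obj not in self.registered:
            self.registered.append(obj)

    def commit(self):
        for obj in self.registered:
            obj.saved_value = obj.value
            obj.changed = False
        self.registered = []

    def abort(self):
        for obj in self.registered:
            obj.value = obj.saved_value  # roll back to the last saved state
            obj.changed = False
        self.registered = []


_per_thread = threading.local()


def toy_get_transaction():
    txn = getattr(_per_thread, "txn", None)
    if txn is None:
        txn = _per_thread.txn = ToyTransaction()
    return txn


class ToyRecord:
    """Hypothetical persistent record that registers itself when changed."""

    def __init__(self, value):
        self.value = value
        self.saved_value = value
        self.changed = False

    def set(self, value):
        self.value = value
        self.changed = True
        toy_get_transaction().register(self)


def run_request(handler, record):
    """Run one request and always end the thread's transaction afterwards."""
    try:
        return handler(record)
    finally:
        # Harmless when the handler already committed; otherwise it discards
        # whatever the handler left behind.
        toy_get_transaction().abort()


def tidy_handler(record):
    record.set(record.value + 1)
    toy_get_transaction().commit()


def sloppy_handler(record):
    record.set(record.value + 1)
    toy_get_transaction().commit()
    record.set(-1)  # stray change that is never committed


record = ToyRecord(0)
run_request(tidy_handler, record)
run_request(sloppy_handler, record)
```

Putting the call in a finally clause means the thread's transaction is emptied even when the handler raises, so the next request on that thread starts clean.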
From: Christian R. R. <ki...@as...> - 2004-02-20 17:43:19
|
On Fri, Feb 20, 2004 at 10:58:58AM -0500, Casey Duncan wrote:
> > As a sanity check, you might consider calling
> > get_transaction().commit() every time. If there have been no changes,
> > it does no harm. Alternatively, you can call
> > get_transaction().abort() if no changes were intended. A third
> > alternative is to assert that get_transaction()._objects is empty when
> > you made no changes (although that's not a public API.)
>
> Hmm, maybe a get_transaction().ismodified() method that returned True if
> there were changes made in the transaction would be useful as a public
> API?
>
> Or, a crazier idea, maybe the transaction object should be iterable (or
> have a method that returns an iterator) of the modified objects in the
> transaction?

Well, I raised this issue a bit back, and right now I'm curious: how do
you find out what objects were modified in subtransactions? If that's
easy to discover, then adding API (is_modified() or whatever) is the
trivial part.

Take care,
--
Christian Robottom Reis | http://async.com.br/~kiko/ | [+55 16] 261 2331
|
From: Dieter M. <di...@ha...> - 2004-02-21 17:56:57
|
Christian Robottom Reis wrote at 2004-2-20 14:36 -0300:
> ...
>> Hmm, maybe a get_transaction().ismodified() method that returned True if
>> there were changes made in the transaction would be useful as a public
>> API?
>>
>> Or, a crazier idea, maybe the transaction object should be iterable (or
>> have a method that returns an iterator) of the modified objects in the
>> transaction?
>
>Well, I raised this issue a bit back, and right now I'm curious: how do
>you find out what objects were modified in subtransactions? If that's
>easy to discover, then adding API (is_modified() or whatever) is the
>trivial part.

The transaction is able to detect whether there is work to do in a final
commit/abort. Thus, it could provide this information via an API method.

*BUT* I doubt that any application should need this information. The
application should instead call "commit" or "abort" (whatever is more
appropriate). It would be unnecessary complexity (and overhead) to make
the calls conditional on whether or not "commit/abort" would really do
something.

Keep it as simple as possible!

--
Dieter
|
From: Casey D. <ca...@zo...> - 2004-02-20 20:57:35
|
On Thu, 19 Feb 2004 17:44:38 -0500 (EST) Shane Hathaway <sh...@zo...> wrote:
> On Thu, 19 Feb 2004, Matt Feifarek wrote:
>
> > I'm not sure I understand, no changes are EVER made before sleeping
> > without a commit. But here's a better re-cap of what I DO
> > understand:
> >
> > 1. A servlet either exists, or is instantiated
> > 2. As a part of its building of state, it gets a database
> >    connection, and then a handle on an object in the db
> > 3. If no changes are made to the object, no
> >    get_transaction().commit() is run
> >    (we know if changes are made, with good certainty)
> > 4. If changes ARE made, we do call get_transaction().commit()
>
> As a sanity check, you might consider calling
> get_transaction().commit() every time. If there have been no changes,
> it does no harm. Alternatively, you can call
> get_transaction().abort() if no changes were intended. A third
> alternative is to assert that get_transaction()._objects is empty when
> you made no changes (although that's not a public API.)

Hmm, maybe a get_transaction().ismodified() method that returned True if
there were changes made in the transaction would be useful as a public
API?

Or, a crazier idea, maybe the transaction object should be iterable (or
have a method that returns an iterator) of the modified objects in the
transaction?

-Casey
|
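The two ideas floated above, a "has anything changed" query and an iterable transaction, might look roughly like the following. Everything here is hypothetical: ToyTransaction is a stand-in written for this sketch, and is_modified() and the iterator are the proposed API, not anything the thread says ZODB provides. The sketch also ignores subtransactions, which is the open question raised elsewhere in the thread.

```python
class ToyTransaction:
    """Toy transaction exposing the proposed (hypothetical) public queries."""

    def __init__(self):
        self._objects = []

    def register(self, obj):
        # Compare by identity so equal-but-distinct objects are both tracked.
        if not any(existing is obj for existing in self._objects):
            self._objects.append(obj)

    def is_modified(self):
        """Return True if any object has registered a change."""
        return bool(self._objects)

    def __iter__(self):
        # Iterate over a copy so callers cannot disturb the registrations.
        return iter(list(self._objects))

    def commit(self):
        self._objects = []


txn = ToyTransaction()
before = txn.is_modified()

page = {"title": "home"}
txn.register(page)
during = txn.is_modified()
modified = list(txn)

txn.commit()
after = txn.is_modified()
```

A caller could then assert `not txn.is_modified()` at the end of a read-only request instead of reaching into the private `_objects` attribute.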