Thread: [SQLObject] Cache Lock Problems
SQLObject is a Python ORM.
Brought to you by: ianbicking, phd
From: Luke O. <lu...@me...> - 2003-04-14 20:06:14
Attachments:
Cache.py
(Using SQLObject 0.3-CVS)

Hello all -

Wow. Had a big long email written out, went back to debug a little more after lunch, and found the bug(s). Still would like a way to _completely_ turn off caching, including weakref'ing. But it'd only be useful for us on days like today, when we're suffering from a bug there and would prefer to have all but one of us just avoid it. :)

Symptoms: We've been having SQLObject completely hang the python interpreter on a regular basis; after some digging it turned out to be during Cache lookups. More specifically, if you stored an object, then destroyed it (purge in Cache), then tried to do a lookup for that id in the cache again (expecting failure), it would hang. Having found the problem, it's likely this behavior could happen at other points, but that's the easy way to duplicate. :)

Fix: Cache.py is missing a lot of lock statements. I've attached a fixed version with "## LDO" comments; they are all in get() or put().

- Luke
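[Editor's note: the single-thread hang Luke describes can be reproduced with a toy model. The sketch below is illustrative only, not SQLObject's actual Cache code: on a miss, get() grabs a non-reentrant lock and expects a later put() to release it, so a purge followed by an unanswered miss leaves the lock held.]

```python
import threading

class MissLockingCache:
    """Toy model of the reported hang -- not SQLObject's real Cache."""

    def __init__(self):
        self.lock = threading.Lock()     # non-reentrant
        self.data = {}

    def get(self, key):
        if key in self.data:
            return self.data[key]        # hit: no locking involved
        self.lock.acquire()              # miss: hold lock until put()
        return None

    def put(self, key, value):
        self.data[key] = value
        self.lock.release()

    def purge(self, key):
        del self.data[key]               # drops the entry, ignores the lock

cache = MissLockingCache()
assert cache.get(1) is None      # miss: lock acquired
cache.put(1, "obj")              # stored: lock released
assert cache.get(1) == "obj"     # hit
cache.purge(1)
assert cache.get(1) is None      # miss again; a caller expecting a plain
                                 # failure never calls put() ...
assert cache.lock.locked()       # ... so the lock stays held, and the next
                                 # miss (e.g. cache.get(2)) blocks forever,
                                 # even in a single thread.
```

This matches the symptom: the hang needs no second thread, only a miss whose lock is never released.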
From: Ian B. <ia...@co...> - 2003-04-14 21:55:47
On Mon, 2003-04-14 at 14:52, Luke Opperman wrote:
> Symptoms: We've been having SQLObject completely hang the
> python interpreter on a regular basis; after some digging
> it turned out to be during Cache lookups. More
> specifically, if you stored an object, then destroyed it
> (purge in Cache), then tried to do a lookup for that id in
> the cache again (expecting failure), it would hang. Having
> found the problem, it's likely this behavior could happen
> at other points, but that's the easy way to duplicate. :)

Hmmm... purge, delete, etc. aren't well thought out yet. I.e., they don't ensure consistency.

> Fix: Cache.py is missing a lot of lock statements. I've
> attached a fixed version with "## LDO" comments; they are
> all in get() or put().

Actually there shouldn't be locks there -- if None is returned from get(), the caller is responsible for creating an object and putting it into the cache with put(). Otherwise you can't be sure that two objects won't be created. I have a feeling if there's a problem, it's in SQLObject.__new__, where maybe it isn't careful enough about this sequence of calls. Anyway, I'll look at this more closely later. Taxes due tomorrow :(

Ian
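[Editor's note: the protocol Ian describes can be sketched as a "creation lock" cache. Names here are illustrative, not SQLObject's real API: a hit needs no lock; a miss returns None with the lock held, and the caller's put() both stores the object and releases, so two threads cannot create the same object twice.]

```python
import threading

class CreationCache:
    """Sketch of the get/fail/put protocol (hypothetical names)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}

    def get(self, key):
        if key in self._data:
            return self._data[key]       # fast path: no lock on a hit
        self._lock.acquire()             # serialize would-be creators
        if key in self._data:            # re-check: another thread won
            obj = self._data[key]
            self._lock.release()
            return obj
        return None                      # caller MUST now call put(key, obj)

    def put(self, key, obj):
        self._data[key] = obj
        self._lock.release()

cache = CreationCache()
assert cache.get("a") is None    # miss: the creation lock is now held
cache.put("a", 1)                # store the new object and release
assert cache.get("a") == 1       # hit: no locking on this path
assert not cache._lock.locked()
```

The caller can't know whether locking is needed, which is exactly Ian's point: only a miss requires it, so the Cache instance manages the lock itself.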
From: Luke O. <lu...@me...> - 2003-04-14 22:22:06
Attachments:
test_cache.py
Quoting Ian Bicking <ia...@co...>:
> On Mon, 2003-04-14 at 14:52, Luke Opperman wrote:
> ...
> Hmmm... purge, delete, etc. aren't well thought out yet. I.e.,
> they don't ensure consistency.

Hmm. What consistency should happen here? The problem we identified was _not_ a thread/race-condition, but a single-thread issue that can be identified completely separate from SQLObject. I've attached a test script I used, that does the insert/purge/retrieve steps I mentioned last time, directly on a CacheSet.

> Actually there shouldn't be locks there -- if None is returned
> from get(), the caller is responsible for creating an object and
> putting it into the cache with put(). Otherwise you can't be sure
> that two objects won't be created.

I'm confused. Are you saying it is the caller's responsibility to lock a get/fail/put cycle? I can understand that, in which case an alternate fix to Cache.py is to remove all locking. Doesn't change the fact that locking is very broken in the currently released file, singlethreaded or not. :)

Or is the intention that Cache is expected to handle the locking, but by the caller executing get and put in a specific order (this is more like the code appears to be...)? Not sure I agree with this plan, and I'd definitely prefer it was more explicit (CacheSetInst.createLock.acquire()/release() by the caller?).

Ok, I've blabbered enough. It appears the second one is what you were trying to implement, and I see your point about ensuring two copies aren't created if you don't have a lock in some way around the whole get/fail/put cycle... from my look it seems SQLObject.__new__ is doing the right thing then, so I'm feeling less comfy...

> I have a feeling if there's a problem, it's in
> SQLObject.__new__, where maybe it isn't careful enough about this
> sequence of calls. Anyway, I'll look at this more closely later.
> Taxes due tomorrow :(

Enjoy,
- Luke
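[Editor's note: the more explicit alternative Luke floats -- the caller bracketing the whole get/fail/put cycle itself -- would look roughly like this. createLock and the method names are hypothetical, modeled only on his "CacheSetInst.createLock.acquire()/release()" suggestion.]

```python
import threading

class CacheSet:
    """Sketch of caller-managed locking; not SQLObject's actual design."""

    def __init__(self):
        self.createLock = threading.Lock()
        self._data = {}

    def get(self, key):
        return self._data.get(key)      # never touches the lock

    def put(self, key, obj):
        self._data[key] = obj           # never touches the lock

cache = CacheSet()
cache.createLock.acquire()              # caller takes the lock explicitly
try:
    obj = cache.get(1)
    if obj is None:
        obj = object()                  # stand-in for real instantiation
        cache.put(1, obj)
finally:
    cache.createLock.release()          # released even if creation raised
assert cache.get(1) is obj
```

The cost of this style, as Ian notes in his reply, is that the caller must pay for the lock even on a hit, where no locking is actually required.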
From: Ian B. <ia...@co...> - 2003-04-16 21:10:44
On Mon, 2003-04-14 at 17:08, Luke Opperman wrote:
> Quoting Ian Bicking <ia...@co...>:
> > On Mon, 2003-04-14 at 14:52, Luke Opperman wrote:
> > ...
> > Hmmm... purge, delete, etc. aren't well thought out yet. I.e.,
> > they don't ensure consistency.
>
> Hmm. What consistency should happen here? The problem we identified
> was _not_ a thread/race-condition, but a single-thread issue that can
> be identified completely separate from SQLObject. I've attached a
> test script I used, that does the insert/purge/retrieve steps I
> mentioned last time, directly on a CacheSet.

Yes, the whole destroy/purge thing is broken. That's why I didn't document it :) I'll have to take some time to read it over more closely and probably reimplement it.

> > Actually there shouldn't be locks there -- if None is returned
> > from get(), the caller is responsible for creating an object and
> > putting it into the cache with put(). Otherwise you can't be sure
> > that two objects won't be created.
>
> I'm confused. Are you saying it is the caller's responsibility to lock
> a get/fail/put cycle? I can understand that, in which case an
> alternate fix to Cache.py is to remove all locking. Doesn't change
> the fact that locking is very broken in the currently released file,
> singlethreaded or not. :)

No, there's no lock required if the object is already in the cache. There's only a lock required if the cache lookup fails. That's why the caller doesn't do the locking; the Cache instance does.

> Or is the intention that Cache is expected to handle the locking, but
> by the caller executing get and put in a specific order (this is more
> like the code appears to be...)? Not sure I agree with this plan, and
> I'd definitely prefer it was more explicit
> (CacheSetInst.createLock.acquire()/release() by the caller?).

Yes, that's it. But like I said, the caller doesn't know if a lock is required.

Ian
From: Ian B. <ia...@co...> - 2003-04-17 03:08:20
Okay, I see the problem:

    def testPurge1(self):
        x = CacheSet()
        y = Something()
        obj = x.get(1, y.__class__)
        self.assertEqual(obj, None)
        x.put(1, y.__class__, y)
        j = x.get(1, y.__class__)
        self.assertEqual(j, y)
        x.purge(1, y.__class__)
        j = x.get(1, y.__class__)
        print "Got %r" % j

So when you do that last .get, you've locked the cache and you don't do a put to unlock it. So that's an error in how you're using CacheSet/CacheFactory. I've changed it slightly so that you can use try:finally: to protect against this, which SQLObject instances now do. I suspect some error was happening during instantiation of some object, and that's what was causing the bug.

Ian
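[Editor's note: the try:finally: discipline Ian describes can be sketched like so. The names here -- including finish() as the "give up without storing" call -- are hypothetical, standing in for whatever SQLObject's corrected code actually does.]

```python
import threading

class Cache:
    """Minimal sketch of the corrected usage (hypothetical API)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}

    def get(self, key):
        if key in self._data:
            return self._data[key]
        self._lock.acquire()             # held until put() or finish()
        return None

    def put(self, key, obj):
        self._data[key] = obj
        self._lock.release()

    def finish(self, key):
        self._lock.release()             # give up: release without storing

cache = Cache()
obj = cache.get(1)
if obj is None:
    stored = False
    try:
        obj = object()                   # stand-in for real instantiation
        cache.put(1, obj)
        stored = True
    finally:
        if not stored:
            cache.finish(1)              # creation failed: still unlock

assert cache.get(1) is obj               # later lookups no longer hang
assert not cache._lock.locked()
```

If instantiation raises, the finally clause releases the lock instead of leaving it held, which is exactly the failure mode that was hanging the interpreter.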