From: Gavin_King/Cirrus%<CI...@ci...> - 2002-03-02 06:50:58
> Here's an alternate design...

I have some questions:

> Maintain (potentially) multiple versions of objects in the cache, each
> timestamped with the commit time of the writing session. The session
> enters the object into the cache when the session commits. Sessions
> can also enter objects at load time (on a cache miss) with the
> timestamp set to the session start time.

Q1.
The "commit time" would at best be a time "sometime after commit", unless
we did something awful like synchronizing around the commit(). So how would
you know which version actually represented the latest version in the
following:
(suppose isolation level is read committed or less)

transaction 1 updates foo
transaction 2 updates foo
transaction 1 commits
transaction 2 commits
transaction 2 grabs its timestamp
transaction 1 grabs its timestamp

> Then any read simply takes (a version of) an object from the cache if
> the object's timestamp is less than the session start time. Of course,
> it must take the newest such instance (older than the session).

Q2.
I think I see part of your reasoning for "less than the session start
time", but could you please elaborate....

Q3. We still need to do the stale-check, right?

As I said to Doug in a private email, I've gone cold on the idea of doing
stale-checking using anything other than version numbers. The performance
cost is too high.... (I will elaborate on *this* later)

> - there is no synchronization, except to maintain the integrity of the
> data structure (i.e., no lock counts, etc.)

If by "synchronization" you mean blocking other threads, any proposed
solution _must_ obey this, as far as I'm concerned.
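The multi-version cache quoted in this message could be sketched roughly as follows. This is only an illustration under stated assumptions: the class `VersionedCache`, its methods, and the use of plain `long` values as timestamps are hypothetical and are not part of Hibernate. It shows the read rule only (take the newest version whose timestamp is strictly less than the session start time) and does nothing about the timestamp-ordering problem raised in Q1.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical multi-version cache; all names are illustrative, not Hibernate API.
final class VersionedCache<K, V> {
    // For each key, the cached versions ordered by the timestamp they were entered with.
    private final Map<K, ConcurrentNavigableMap<Long, V>> versions = new ConcurrentHashMap<>();

    // Enter a version stamped with the writer's commit time, or with the
    // session start time when the object was loaded on a cache miss.
    void put(K key, long timestamp, V value) {
        versions.computeIfAbsent(key, k -> new ConcurrentSkipListMap<>()).put(timestamp, value);
    }

    // Return the newest version strictly older than the session start,
    // or null when no such version is cached (a miss).
    V get(K key, long sessionStart) {
        ConcurrentNavigableMap<Long, V> byTime = versions.get(key);
        if (byTime == null) {
            return null;
        }
        Map.Entry<Long, V> newestOlder = byTime.lowerEntry(sessionStart);
        return newestOlder == null ? null : newestOlder.getValue();
    }

    public static void main(String[] args) {
        VersionedCache<String, String> cache = new VersionedCache<>();
        cache.put("foo", 10L, "v1");
        cache.put("foo", 20L, "v2");
        System.out.println(cache.get("foo", 15L)); // prints "v1"
        System.out.println(cache.get("foo", 25L)); // prints "v2"
    }
}
```

The concurrent maps keep the structure consistent without lock counts or other bookkeeping, which is the property the last quoted paragraph asks for. Old versions are never evicted in this sketch.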
From: Gavin_King/Cirrus%<CI...@ci...> - 2002-03-02 07:37:56
> > transaction 1 updates foo
> > transaction 2 updates foo
> > transaction 1 commits
> > transaction 2 commits
> > transaction 2 grabs its timestamp
> > transaction 1 grabs its timestamp

> You wouldn't know (but it's no worse than the alternative, timestamp
> at update time, is it?). In fact, even if you synchronized, you
> wouldn't know: the database could order these transactions either way.
> It's an example of why I've been queasy about this cache all along.

I believe my proposal avoids this problem by

1. Only caching stuff that came direct from a select. (important)
2. Never caching an instance until we know all updating transactions are
   finished

You pointed out the problem that the select could return stale data, but
it can never return data that was stale as of the start of the transaction.
If we restrict it to only cache stuff that no one has _begun_ to modify at
any time _after_ the _start_ of the transaction, it can't possibly add
stale data to the cache, right?

> Well, if you are trying to get serializable transactions, you can't
> see any modifications made after you've begun. If we allow reading
> data written after session start, we've entered non-ACID territory.

Cool. That's an improvement on what I had in mind.

I think it's possible to merge the good points of these proposals. So
I'm very happy we are having this discussion.
From: Doug C. <de...@fl...> - 2002-03-02 08:15:10
>> wouldn't know: the database could order these transactions either way.
>> It's an example of why I've been queasy about this cache all along.

> I believe my proposal avoids this problem by
> 1. Only caching stuff that came direct from a select. (important)

It does seem safer that way, but...

> 2. Never caching an instance until we know all updating transactions are
> finished

you can never know. The database has a lot of latitude, within the
isolation level model, to reorder things.

> You pointed out the problem that the select could return stale data, but
> it can never return data that was stale as of the start of the transaction.
> If we restrict it to only cache stuff that no one has _begun_ to modify at
> any time _after_ the _start_ of the transaction, it can't possibly add
> stale data to the cache, right?

Well, we could also break transaction isolation in another way:

T1 starts
T1 loads obj1 v1
T1 caches obj1 v1
T1 completes
T2 starts (obj1 v1)
T2 reads obj1 v1 from cache
T3 starts (obj1 v1)
T3 updates obj1 v2
T3 commits obj1 v2
T2 updates obj2 (based on obj1 v1) <<< error
T2 commits

The database didn't "see" T2 read obj1 since it came from the cache.
So the database can't know that T2 and T3 interfere, and T2 must be
rolled back.

> I think it's possible to merge the good points of these proposals. So
> I'm very happy we are having this discussion.

Agreed. I think version columns are the safest route.

e
From: Gavin_King/Cirrus%<CI...@ci...> - 2002-03-02 07:47:59
> If we restrict it to only cache stuff that no one has _begun_ to modify at
> any time _after_ the _start_ of the transaction, it can't possibly add
> stale data to the cache, right?

Uurrrghhhh, that was badly put. There are two rules:

1. if some session started updating an instance, you can't cache it
2. if some session finished updating the instance _after_ the start of
   the current session, you can't cache it
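The two rules above could be tracked with a small per-instance record, as in the sketch below. It is a hedged illustration, not the proposal's actual implementation: the class `CacheAdmission`, its methods, and the `long` logical timestamps are all assumptions. Rule 1 is read here as "an update has started and not yet finished", and a tie between an update's finish time and the session start is treated conservatively as "can't cache".

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical bookkeeping for the two caching rules; names are illustrative only.
final class CacheAdmission<K> {
    private static final class State {
        int updatersInFlight;                      // sessions that began an update and have not finished
        long lastUpdateFinished = Long.MIN_VALUE;  // logical time the most recent update finished
    }

    private final Map<K, State> states = new ConcurrentHashMap<>();

    // Called when a session begins updating the instance.
    void updateStarted(K key) {
        states.compute(key, (k, s) -> {
            State next = (s == null) ? new State() : s;
            next.updatersInFlight++;
            return next;
        });
    }

    // Called when that session commits or rolls back.
    void updateFinished(K key, long now) {
        states.compute(key, (k, s) -> {
            State next = (s == null) ? new State() : s;
            if (next.updatersInFlight > 0) {
                next.updatersInFlight--;
            }
            next.lastUpdateFinished = Math.max(next.lastUpdateFinished, now);
            return next;
        });
    }

    // Rule 1: no update in flight. Rule 2: no update finished at or after the session start.
    boolean mayCache(K key, long sessionStart) {
        boolean[] allowed = {true};
        states.computeIfPresent(key, (k, s) -> {
            allowed[0] = s.updatersInFlight == 0 && s.lastUpdateFinished < sessionStart;
            return s;
        });
        return allowed[0];
    }

    public static void main(String[] args) {
        CacheAdmission<String> admission = new CacheAdmission<>();
        admission.updateStarted("foo");
        System.out.println(admission.mayCache("foo", 5L));  // prints "false" (rule 1)
        admission.updateFinished("foo", 10L);
        System.out.println(admission.mayCache("foo", 5L));  // prints "false" (rule 2)
        System.out.println(admission.mayCache("foo", 11L)); // prints "true"
    }
}
```

This only sees updates made through the same JVM, so it does not remove the need for a stale-check at update time.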
From: Gavin_King/Cirrus%<CI...@ci...> - 2002-03-02 08:29:11
> The database didn't "see" T2 read obj1 since it came from the cache.
> So the database can't know that T2 and T3 interfere, and T2 must be
> rolled back.

Yes, I'm aware of this one. That's why we certainly need to check version
numbers when we update.
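A version-number check at update time amounts to a compare-and-set on the row's version. The sketch below models it in memory, under the assumption that the real thing would be an SQL statement of roughly the form `UPDATE ... SET version = version + 1 WHERE id = ? AND version = ?` that is considered stale when it matches no row. `VersionedStore` and `Row` are hypothetical names, not Hibernate classes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory stand-in for a table with a version column.
final class VersionedStore {
    static final class Row {
        final int version;
        final String value;

        Row(int version, String value) {
            this.version = version;
            this.value = value;
        }
    }

    private final Map<String, Row> rows = new ConcurrentHashMap<>();

    void insert(String id, String value) {
        rows.put(id, new Row(1, value));
    }

    Row read(String id) {
        return rows.get(id);
    }

    // Succeeds only if the row still carries the version the caller read;
    // a false result means the caller's copy was stale.
    boolean update(String id, int expectedVersion, String newValue) {
        Row current = rows.get(id);
        if (current == null || current.version != expectedVersion) {
            return false;
        }
        // Atomic swap: fails if another writer replaced the row in the meantime.
        return rows.replace(id, current, new Row(expectedVersion + 1, newValue));
    }

    public static void main(String[] args) {
        VersionedStore store = new VersionedStore();
        store.insert("obj1", "a");
        int seen = store.read("obj1").version;
        System.out.println(store.update("obj1", seen, "b")); // prints "true"
        System.out.println(store.update("obj1", seen, "c")); // prints "false", version moved on
    }
}
```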
From: Doug C. <de...@fl...> - 2002-03-02 19:19:43
>> The database didn't "see" T2 read obj1 since it came from the cache.
>> So the database can't know that T2 and T3 interfere, and T2 must be
>> rolled back.

> Yes, I'm aware of this one. That's why we certainly need to check version
> numbers when we update.

But no other transaction wrote obj2, so its version number is OK. Only
the versions of objects it depends on have changed (if indeed they are
versioned at all). Perhaps this is why you say the isolation level will
be weakened?

e
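The point can be replayed in a few lines. In the sketch below, T2 has read obj1 and obj2, T3 then commits a new version of obj1, and a check on obj2 alone still passes. The second check, which compares every version the session read, is shown only to make the missed dependency visible; it is an assumption added for illustration and is not something proposed in this thread. All names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical single-threaded replay of the T2/T3 interleaving.
final class DependencyCheckDemo {
    // Current committed version of each object, standing in for the database.
    final Map<String, Integer> committed = new HashMap<>();
    // Versions this session observed when it read each object (possibly from the cache).
    final Map<String, Integer> readSet = new HashMap<>();

    // The session reads an object and remembers the version it saw.
    void read(String id) {
        readSet.put(id, committed.get(id));
    }

    // Some other transaction commits a new version of the object.
    void commitByOther(String id) {
        committed.merge(id, 1, Integer::sum);
    }

    // The check discussed above: only the object being written is compared.
    boolean writeCheckPasses(String id) {
        return Objects.equals(committed.get(id), readSet.get(id));
    }

    // Illustration only: compare every object the session depended on.
    boolean readSetStillCurrent() {
        for (Map.Entry<String, Integer> e : readSet.entrySet()) {
            if (!Objects.equals(e.getValue(), committed.get(e.getKey()))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        DependencyCheckDemo t2 = new DependencyCheckDemo();
        t2.committed.put("obj1", 1);
        t2.committed.put("obj2", 1);
        t2.read("obj1");          // T2 reads obj1 v1 (from cache)
        t2.read("obj2");
        t2.commitByOther("obj1"); // T3 commits obj1 v2
        System.out.println(t2.writeCheckPasses("obj2")); // prints "true": conflict not detected
        System.out.println(t2.readSetStillCurrent());    // prints "false": obj1 has changed
    }
}
```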
From: Doug C. <de...@fl...> - 2002-03-02 07:17:12
>> Maintain (potentially) multiple versions of objects in the cache, each
>> timestamped with the commit time of the writing session. The session
>> enters the object into the cache when the session commits. Sessions
>> can also enter objects at load time (on a cache miss) with the
>> timestamp set to the session start time.

> Q1.
> The "commit time" would at best be a time "sometime after commit", unless
> we did something awful like synchronizing around the commit(). So how would
> you know which version actually represented the latest version in the
> following:
> (suppose isolation level is read committed or less)

> transaction 1 updates foo
> transaction 2 updates foo
> transaction 1 commits
> transaction 2 commits
> transaction 2 grabs its timestamp
> transaction 1 grabs its timestamp

You wouldn't know (but it's no worse than the alternative, timestamp
at update time, is it?). In fact, even if you synchronized, you
wouldn't know: the database could order these transactions either way.
It's an example of why I've been queasy about this cache all along.

[Version info could be used to force (adjust) the order of the
timestamps.]

>> Then any read simply takes (a version of) an object from the cache if
>> the object's timestamp is less than the session start time. Of course,
>> it must take the newest such instance (older than the session).

> Q2.
> I think I see part of your reasoning for "less than the session start
> time", but could you please elaborate....

Well, if you are trying to get serializable transactions, you can't
see any modifications made after you've begun. If we allow reading
data written after session start, we've entered non-ACID territory.

> Q3. We still need to do the stale-check, right?

I believe so, both because of Q1, and to protect against people trying
to use this across multiple JVMs.

>> - there is no synchronization, except to maintain the integrity of the
>> data structure (i.e., no lock counts, etc.)

> If by "synchronization" you mean blocking other threads, any proposed
> solution _must_ obey this, as far as I'm concerned.

Yes, I mean both thread synchronization, and cache "book keeping" or
locking to redirect loads to the database.

e
From: Paul S. <pau...@ne...> - 2002-03-02 19:50:45
Is it possible? There's mention of optimistic locking stuff, but not the
plain old vanilla pessimistic variety. Does Hibernate handle this? Even
just by asking the underlying database to do it?

Regards,
PaulS.