From: Gavin_King/Cirrus%<CI...@ci...> - 2002-03-02 02:00:36
You are absolutely spot on in most of your points, Doug. But it was something I did need to suck it and see....

> I liked your shortcut method of dirty checking, also. It was a big
> performance win, I suspect, and I wonder if you are trading it away
> for a feature (caching) of dubious value for some types of collections.

Yeah, the shortcut method stays (at least as an optimisation).

> I hope you will keep "deep lazy collections" in mind in the design.
> Here are some things to consider...

Yeah, it's probably a bigger performance boost than caching for some applications.

> Where are cloning and diffing necessary?
> Is it possible for an application to avoid it (e.g., by not assigning
> a collection to more than one persistent property, and/or by not
> caching the collection)?

I will need to keep working through this to really answer the question. But what I know already is that fully diffing the object graphs is not going to solve the problems I thought it would solve, so it's off the table for now. After playing around with the code for a while, I realised that radical changes won't be the answer.

What I *am* trying to do is implement stale checking for nonversioned objects. For versioned objects, what you do is this:

    UPDATE some_table SET foo='foo' WHERE id=20002 AND version=69

For non-versioned objects, it would have to be:

    UPDATE some_table SET foo='foo' WHERE id=20002 AND foo='old value'

This is useful for more than caching. It's also useful for optimistic locking, which is something I would like to provide better support for in the future. Interestingly, it seems to be a *huge* negative performance hit.

> I have watched other object database projects founder on caching
> implementation problems. There is *no* easy way to do it. In order to
> support multiple transactions you *must* have multiple versions of
> objects materialized. ODMG pays lip service to this; JDO defines
> various kinds of object equality to support it.

Yeah, I have said earlier that one of the main aims of the project is to *not* implement a database in RAM on top of the database on disk. That would be the path to endless concurrency bugs, non-scalability and low performance.

> One of the reasons I was drawn to Hibernate was its simplicity, and
> its dependence on the JDBC and database layers for the capabilities
> they provide, such as transactions. JDBC and the database also (can
> and sometimes do) provide caching. I liked the fact that Hibernate
> "bit the bullet" and supported materialization of an object
> independently in several transactions (sessions). It seems to me that
> there are only two caching strategies consistent with this: (a) let
> the database do it, and/or (b) a separate cache per session.

What I have in mind is something *very* moderate. Basically, my notion of caching is to allow sessions to use state loaded by a previous session as long as everyone is only trying to _read_ the instance. As soon as some session wants to update it, we mark the instance uncacheable (in the cache) and force a return to the "normal" behaviour of Hibernate. Sessions which got the object from the cache will need to update it using a stale-checking update (above). Once we know the outcome of all updating sessions, we can mark the instance as cacheable again. This approach avoids redefining the transaction isolation level from how it's defined by the database and still delegates all locking to the DB. Well, actually it would weaken serializable isolation, but I *think* it leaves repeatable read untouched.
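To make that a bit more concrete, here is a very rough sketch in Java of what I have in mind. All of the names (CacheEntry, ReadWriteCache, StaleCheckingUpdater, beginUpdate/endUpdate) are invented purely for illustration; none of this is actual Hibernate code, and a real implementation would obviously have to deal with eviction, clustering, refreshing the cached state after commit, and so on.

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;
    import java.util.HashMap;
    import java.util.Map;

    class CacheEntry {
        final Object state;          // snapshot of the state loaded by some session
        boolean cacheable = true;    // false while any session intends to write
        int pendingWriters = 0;      // updating sessions whose outcome is not yet known

        CacheEntry(Object state) { this.state = state; }
    }

    class ReadWriteCache {
        private final Map<Object, CacheEntry> entries = new HashMap<Object, CacheEntry>();

        // readers may use the cached state only while the entry is still cacheable
        synchronized Object get(Object id) {
            CacheEntry e = entries.get(id);
            return (e != null && e.cacheable) ? e.state : null;
        }

        // a session that wants to update marks the instance uncacheable,
        // forcing everyone back to "normal" (straight to the database) behaviour
        synchronized void beginUpdate(Object id) {
            CacheEntry e = entries.get(id);
            if (e != null) {
                e.cacheable = false;
                e.pendingWriters++;
            }
        }

        // once the outcome of all updating sessions is known, the instance may be
        // cached again; here we simply drop the stale snapshot so the next load
        // repopulates the cache with fresh state
        synchronized void endUpdate(Object id) {
            CacheEntry e = entries.get(id);
            if (e != null && --e.pendingWriters == 0) {
                entries.remove(id);
            }
        }
    }

    class StaleCheckingUpdater {
        // a session that read its state from the cache must update with a
        // stale-checking WHERE clause; an update count of 0 means the row has
        // changed underneath us and the cached state was stale
        static void updateFoo(Connection con, long id, String oldFoo, String newFoo)
                throws SQLException {
            PreparedStatement ps = con.prepareStatement(
                    "UPDATE some_table SET foo=? WHERE id=? AND foo=?");
            try {
                ps.setString(1, newFoo);
                ps.setLong(2, id);
                ps.setString(3, oldFoo);
                if (ps.executeUpdate() == 0) {
                    throw new SQLException("stale state detected for id=" + id);
                }
            } finally {
                ps.close();
            }
        }
    }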
Do you see any obvious problems with this approach that make it a waste of time?