From: Harry E. <ha...@tr...> - 2004-06-16 04:03:38
I have a simple design for distributed cache management. It has a couple of holes, but it doesn't seem like a bad first pass. I would like to get feedback from the group to see what others think.

CacheManager methods (don't worry about local vs. remote right now):

-lock(Object o, transactionId) : Lock
  Checks the version of the object and attempts to create a lock object for it. The version is either some kind of change counter or some kind of time stamp of the last update time in the datastore.
  --o The XORM-managed object to lock
  --transactionId A relatively unique id for the transaction obtaining the lock. Used to allow the same transaction to acquire a lock on the same object more than once.
  --returns If the object is already locked, returns the special LOCK_FAILED Lock object. (An overloaded version of the method might allow the caller to specify waiting until the lock is obtained.) If a lock is granted, returns a Lock object with status LOCK_SUCCESS and a valid lock expiration time. If a lock is granted but the object is stale, returns a Lock object with status LOCK_REFRESH and a valid lock expiration time; this means the caller has obtained a lock on the object, but needs to refresh the object from the datastore.

-commit(Object o, transactionId) : Lock
  Verifies that this object has a lock held by the transactionId. If so, it updates the version in the CacheManager to match the version held by the object. This method should be called AFTER applying a change to the datastore, but before committing the changes to the datastore. It also depends on the object's version having been updated to reflect the (not yet committed) datastore changes. Regardless of outcome, this method results in releasing the lock on this object in the CacheManager.
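The five statuses named above could be modeled as a small enum. This is only a sketch with names of my own choosing; LockStatus is not an existing XORM type:

```java
// Hypothetical status codes for the CacheManager's Lock results.
// The constant names mirror the ones described in the text above.
public enum LockStatus {
    LOCK_FAILED,    // object already locked by another transaction
    LOCK_SUCCESS,   // lock granted; cached copy is current
    LOCK_REFRESH,   // lock granted, but caller must refresh from the datastore
    COMMIT_SUCCESS, // version recorded and lock released
    COMMIT_FAILURE  // lock missing, expired, or held by another transaction
}
```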
  --o The XORM-managed object that a lock was previously obtained for
  --transactionId The transactionId that was used to obtain the lock for o
  --returns If there is a lock for this object, the transactionId matches, and the lock has not expired, returns COMMIT_SUCCESS. Otherwise (the lock does not exist, is not for this transactionId, or has expired), returns COMMIT_FAILURE. The proper action on a COMMIT_FAILURE might be a retry, or a rollback.

-check(Object o) : boolean
  Checks whether the object is the most current version of the object seen by the CacheManager.
  --o The object to check
  --returns True if this version of the object is the most recent seen by the CacheManager, otherwise false.

-refreshLock(Object o, transactionId) : Lock
  Checks for a valid lock on the object and, if all is good, extends the expiration of the Lock. Might not be needed (could just be rolled into the lock method).
  --returns Values as per the lock() method.

The concept behind all this is that the CacheManager object does not talk to the db, but simply deals with objects as it sees them. Given a compliant local implementation of XORM, all objects would be seen whenever a transaction was involved, so there shouldn't be a race condition issue.
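A minimal single-JVM sketch of the contract above might look like this. Everything here is an assumption of mine (the class and field names, the 30-second lease, identifying objects by a String key and a numeric change counter rather than a real XORM-managed object); a real distributed version would sit behind a remote facade:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory sketch of the CacheManager contract described
// above; not real XORM code.
public class CacheManagerSketch {
    static final long LEASE_MILLIS = 30_000; // assumed lock lifetime

    public enum Status { LOCK_FAILED, LOCK_SUCCESS, LOCK_REFRESH, COMMIT_SUCCESS, COMMIT_FAILURE }

    // Per-object bookkeeping: last committed version plus current lock holder.
    static class Entry {
        long version;   // change counter, bumped on commit
        Long txnId;     // null when the object is seen but not locked
        long expires;   // lock expiration (ms since epoch)
    }

    private final Map<String, Entry> entries = new HashMap<>();

    // Grant a lock, allowing the same transaction to re-acquire it.
    public synchronized Status lock(String key, long callerVersion, long txnId) {
        Entry e = entries.computeIfAbsent(key, k -> new Entry());
        long now = System.currentTimeMillis();
        if (e.txnId != null && e.expires > now && e.txnId != txnId) {
            return Status.LOCK_FAILED;          // someone else holds a live lock
        }
        e.txnId = txnId;                        // grant (or re-grant) the lock
        e.expires = now + LEASE_MILLIS;
        // A stale caller gets the lock but must refresh from the datastore.
        return callerVersion < e.version ? Status.LOCK_REFRESH : Status.LOCK_SUCCESS;
    }

    // Record the new version and release the lock.
    public synchronized Status commit(String key, long newVersion, long txnId) {
        Entry e = entries.get(key);
        long now = System.currentTimeMillis();
        if (e == null || e.txnId == null || e.txnId != txnId || e.expires <= now) {
            if (e != null && e.txnId != null && e.expires <= now) {
                e.txnId = null;                 // lazy removal of an expired lock
            }
            return Status.COMMIT_FAILURE;
        }
        e.version = newVersion;                 // match the version the datastore will hold
        e.txnId = null;                         // lock is released regardless of outcome
        return Status.COMMIT_SUCCESS;
    }

    // Is the caller's copy the most recent version this manager has seen?
    public synchronized boolean check(String key, long callerVersion) {
        Entry e = entries.get(key);
        return e == null || callerVersion >= e.version;
    }
}
```

Note the lock() path never blocks: a contended lock fails immediately, matching the non-waiting variant described above.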
A simple sequence diagram follows:

 XORM                    CacheManager                 DataStore
 ----                    ------------                 ---------
  |                           |                           |
  |--startTrans()--+          |                           |
  |<---------------+          |                           |
  |                           |                           |
  |--lock(o, tId)------------>|                           |
  |<-------------Lock object--|                           |
  |                           |                           |
  |--refreshObject(o) (if stale)------------------------->|
  |<--------------------------------------refreshedObject-|
  |                           |                           |
  |--make changes--+          |                           |
  |<---------------+          |                           |
  |                           |                           |
  |--commitTrans() (begin)-+  |                           |
  |<-----------------------+  |                           |
  |                           |                           |
  |--makeUpdates (don't commit)-------------------------->|
  |<-----------------------------------------------success|
  |                           |                           |
  |--commit(o, tId)---------->|                           |
  |<-----------------success--|                           |
  |                           |                           |
  |--commitTransaction----------------------------------->|
  |<-----------------------------------------------success|
  |                           |                           |
  |--commitTrans() (end)-+    |                           |
  |<---------------------+    |                           |

I think the whole scheme might work, if you assume that no one operates outside of it. I don't know if that is a reasonable caveat or not. Obviously, it is possible for the distributed cache manager to provide methods for locking or committing more than one object at the same time. Also, it assumes that the distributed cache manager would most likely be "remote" from the XORM instance using it, so the "local" methods I showed should be assumed to be local facades to a remote distributed cache manager. It is probably obvious (but worth stating) that all XORM instances dealing with a particular datastore would use the same CacheManager, possibly with CacheManager replication for failover. Persistent connections and good communications design should keep the data overhead relatively low.
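Seen from the XORM side, the sequence above boils down to a fixed call ordering. The sketch below is hypothetical: the collaborators are trivial fakes that only record which calls were made, since the ordering is the point:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical walk-through of the sequence diagram's happy path.
// The CacheManager and DataStore here are in-memory fakes that log calls.
public class TransactionFlow {
    static final List<String> calls = new ArrayList<>();

    // fake CacheManager: always grants the lock but reports a stale copy
    static String lock(String o, long tId)   { calls.add("lock");   return "LOCK_REFRESH"; }
    static String commit(String o, long tId) { calls.add("commit"); return "COMMIT_SUCCESS"; }

    // fake DataStore
    static void refreshObject(String o)      { calls.add("refreshObject"); }
    static void makeUpdates(String o)        { calls.add("makeUpdates"); }
    static void commitTransaction()          { calls.add("commitTransaction"); }

    // The ordering from the diagram: lock, refresh if stale, change the
    // object, write uncommitted updates, commit in the CacheManager,
    // then commit in the datastore.
    static void run(String o, long tId) {
        if (lock(o, tId).equals("LOCK_REFRESH")) {
            refreshObject(o);            // refreshObject(o) (if stale)
        }
        // ... make changes to o here ...
        makeUpdates(o);                  // makeUpdates (don't commit)
        if (commit(o, tId).equals("COMMIT_SUCCESS")) {
            commitTransaction();         // only then commit the datastore txn
        }
    }
}
```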
The Lock object as it is internally held by the CacheManager might look something like this:

Lock
----
class   : Class
id      : objectPrimaryKey
version : Version (int, timestamp, etc.)
txnId   : TransactionId (int, long, etc.)
expires : Time (long, Date, etc.)

For objects that have been seen but are not currently locked, txnId and expires would be null. For objects that are locked, all values would be non-null. Locks can be removed lazily, by checking a lock when needed and, if it has expired, nulling txnId and expires.

This might be way off base, but it seemed like a good first pass. I am not trying to reinvent the wheel here, but it seems like something like this might solve most of the "coordinated" or "distributed" cache management issues we see with XORM right now.

Thoughts / comments / questions?

Harry
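One possible Java shape for that record, with the lazy expiry baked in. The field names follow the layout above; the class name and type choices are my own:

```java
import java.util.Date;

// Hypothetical holder for the Lock record sketched above.
public class LockEntry {
    Class<?> clazz;   // class of the cached object
    Object   id;      // object primary key
    long     version; // change counter or last-update timestamp
    Long     txnId;   // null when the object is seen but not locked
    Date     expires; // null when not locked

    // Lazy removal: whenever a lock is consulted, an expired one
    // clears itself by nulling txnId and expires.
    boolean isLocked(long now) {
        if (txnId != null && expires != null && expires.getTime() <= now) {
            txnId = null;
            expires = null;
        }
        return txnId != null;
    }
}
```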