From: Wes B. <we...@ca...> - 2004-06-18 06:15:26
This looks really good, thanks for the writeup. Two quick comments:

1) This isn't cache management, it's lock management. I'm guessing the
existing CacheManager implementation would be the component that
interacts with this, though.

2) We might want to see if the javax.transaction.xa interfaces map well
to this in terms of a standardized API. I'll look at the Javadoc and
see if I can find a fit.

Cheers,
Wes

Harry Evans wrote:
> I have a simple design for distributed cache management. It has a
> couple of holes, but doesn't seem like a bad first pass at it. I
> would like to get feedback from the group to see what others think.
>
> CacheManager methods (don't worry about local vs. remote right now):
>
> -lock(Object o, transactionId) : Lock
> Checks the version of the object and attempts to create a lock object
> for it. The version is either some kind of change counter or some
> kind of time stamp of the last update time in the datastore.
> --o
> The XORM-managed object to lock.
> --transactionId
> A relatively unique id for the transaction obtaining the lock. Used
> to allow the same transaction to acquire a lock on the same object
> more than once.
> --returns
> If the object is already locked, returns the special LOCK_FAILED Lock
> object. (An overloaded version of the method might allow the caller
> to specify waiting until the lock is obtained.)
> If a lock is granted, returns a Lock object with status LOCK_SUCCESS
> and a valid lock expiration time.
> If a lock is granted but the object is stale, returns a Lock object
> with status LOCK_REFRESH and a valid lock expiration time. This means
> the caller has obtained a lock on the object, but needs to refresh
> the object from the datastore.
>
> -commit(Object o, transactionId) : Lock
> Verifies that this object has a lock held by the transactionId. If
> so, it updates the version in the CacheManager to match the version
> held by the object.
> This method should be called AFTER a change to the datastore, but
> before committing that change to the datastore. It also depends on
> the version of the object having been updated to reflect the changes
> to the datastore, even though they are not yet committed. Regardless
> of outcome, this method releases the lock on this object in the
> CacheManager.
> --o
> The XORM-managed object that a lock was previously obtained for.
> --transactionId
> The transactionId that was used to obtain the lock for o.
> --returns
> If there is a lock for this object, the transactionId matches, and
> the lock has not expired, returns COMMIT_SUCCESS.
> Otherwise, if the lock does not exist, is not for this transactionId,
> or has expired, returns COMMIT_FAILURE. The proper action on a
> COMMIT_FAILURE might be a retry, or a rollback.
>
> -check(Object o) : boolean
> Checks whether the object is the most current version of the object
> seen by the CacheManager.
> --o
> The object to check.
> --returns
> Returns true if this version of the object is the most recent seen by
> the CacheManager, otherwise false.
>
> -refreshLock(Object o, transactionId) : Lock
> Checks for a valid lock on the object and, if all is good, extends
> the expiration of the Lock. Might not be needed (could just be rolled
> into the lock method).
> --returns
> Values as per the lock() method.
>
> The concept behind all this is that the CacheManager does not talk to
> the db, but simply deals with objects as it sees them. Given a
> compliant local implementation of XORM, all objects would be seen
> whenever a transaction was involved, so there shouldn't be a race
> condition issue.
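[Editor's note: as a rough illustration, the four methods above could be rendered as a Java interface along these lines. All names, signatures, and the LockResult shape here are guesses for illustration, not XORM's actual API.]

```java
import java.util.Date;

// Hypothetical sketch of the proposed lock-manager API; nothing here
// is real XORM code.
interface DistributedCacheManager {
    int LOCK_FAILED  = 0;  // object already locked by another transaction
    int LOCK_SUCCESS = 1;  // lock granted, cached copy is current
    int LOCK_REFRESH = 2;  // lock granted, but caller must re-read the object

    int COMMIT_SUCCESS = 3;
    int COMMIT_FAILURE = 4;  // no lock, wrong transactionId, or lock expired

    // What lock()/refreshLock() hand back to the caller.
    final class LockResult {
        public final int status;       // one of the LOCK_* constants
        public final Date expiration;  // null when status == LOCK_FAILED
        public LockResult(int status, Date expiration) {
            this.status = status;
            this.expiration = expiration;
        }
    }

    // Re-entrant for the same transactionId, as described above.
    LockResult lock(Object o, long transactionId);

    // Call after updating the datastore but before committing there;
    // releases the lock whatever the outcome.
    int commit(Object o, long transactionId);

    // Is this copy the newest version the manager has seen?
    boolean check(Object o);

    // Extend the expiration of a still-valid lock.
    LockResult refreshLock(Object o, long transactionId);
}
```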
> A simple sequence diagram follows:
>
> XORM                            CacheManager                DataStore
> ----                            ------------                ---------
>  |                                    |                         |
>  |--startTrans()--+                   |                         |
>  |                |                   |                         |
>  |<---------------+                   |                         |
>  |                                    |                         |
>  |---lock(o, tId)-------------------->|                         |
>  |<----------Lock object--------------|                         |
>  |                                    |                         |
>  |---refreshObject(o) (if stale)------------------------------->|
>  |<--------------------------------------refreshedObject--------|
>  |                                    |                         |
>  |--make changes--+                   |                         |
>  |                |                   |                         |
>  |<---------------+                   |                         |
>  |                                    |                         |
>  |--commitTrans() (begin)--+          |                         |
>  |                         |          |                         |
>  |<------------------------+          |                         |
>  |                                    |                         |
>  |---makeUpdates (don't commit)--------------------------------->|
>  |<----------------------------------------------success--------|
>  |                                    |                         |
>  |---commit(o, tId)------------------>|                         |
>  |<------------success----------------|                         |
>  |                                    |                         |
>  |---commitTransaction------------------------------------------>|
>  |<----------------------------------------------success--------|
>  |                                    |                         |
>  |--commitTrans() (end)--+            |                         |
>  |                       |            |                         |
>  |<----------------------+            |                         |
>  |                                    |                         |
>
> I think the whole scheme might work, if you assume that no one
> operates outside of it. I don't know whether that is a reasonable
> caveat or not. Obviously, it is possible for the distributed cache
> manager to provide methods for locking or committing more than one
> object at the same time. Also, it assumes that the distributed cache
> manager would most likely be "remote" from the XORM instance using
> it, so the "local" methods I showed should be assumed to be local
> facades to a remote distributed cache manager. It is probably obvious
> (but worth stating) that all XORM instances dealing with a particular
> datastore would use the same CacheManager, possibly with CacheManager
> replication for failover. Persistent connections and good
> communications design should keep the data overhead relatively low.
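[Editor's note: the lock/commit exchange in the diagram can be exercised with a toy in-memory lock manager. This is a sketch of the semantics described above (re-entrant locks, versions updated on commit, LOCK_REFRESH for stale copies); expiration is omitted, and every name here is hypothetical.]

```java
import java.util.HashMap;
import java.util.Map;

// Toy single-JVM stand-in for the distributed CacheManager, for
// illustration only. Real code would be a remote service with
// expiring locks.
class ToyLockManager {
    static final int LOCK_FAILED = 0, LOCK_SUCCESS = 1, LOCK_REFRESH = 2;
    static final int COMMIT_SUCCESS = 3, COMMIT_FAILURE = 4;

    private final Map<String, Long> holder = new HashMap<>();     // id -> txn holding the lock
    private final Map<String, Integer> version = new HashMap<>(); // id -> last committed version

    // Attempt to lock; re-entrant for the same transaction, as the
    // proposal requires.
    int lock(String id, int callerVersion, long txnId) {
        Long h = holder.get(id);
        if (h != null && h != txnId) {
            return LOCK_FAILED;                  // held by another transaction
        }
        holder.put(id, txnId);
        version.putIfAbsent(id, callerVersion);  // first sighting of this object
        // Stale copy: lock granted, but caller must re-read from the datastore.
        return callerVersion < version.get(id) ? LOCK_REFRESH : LOCK_SUCCESS;
    }

    // Record the new version and release the lock, per the commit()
    // description (expiry checks omitted in this sketch).
    int commit(String id, int newVersion, long txnId) {
        Long h = holder.get(id);
        if (h == null || h != txnId) {
            return COMMIT_FAILURE;               // no lock, or not this txn's lock
        }
        holder.remove(id);                       // release the lock
        version.put(id, newVersion);
        return COMMIT_SUCCESS;
    }
}
```

Walking through the diagram: transaction 100 locks and commits an object, after which transaction 200, which was refused earlier and still holds the old copy, gets LOCK_REFRESH and must re-read before making changes.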
> The Lock object, as it is internally held by the CacheManager, might
> look something like this:
>
> Lock
> ----
> class   : Class
> id      : objectPrimaryKey
> version : Version (int, timestamp, etc.)
> txnId   : TransactionId (int, long, etc.)
> expires : Time (long, Date, etc.)
>
> For objects that have been seen but are not currently locked, txnId
> and expires would be null. For objects that are locked, all values
> would be non-null. Locks can be removed lazily: when a lock is
> checked and found to have expired, its txnId and expires are nulled.
>
> This might be way off base, but it seemed like a good first pass. I
> am not trying to reinvent the wheel here, but it seems like something
> like this might solve most of the "coordinated" or "distributed"
> cache management issues we see with XORM right now.
>
> Thoughts / comments / questions?
>
> Harry
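[Editor's note: the record above, including the lazy expiration described, could be sketched as a plain Java class. Field types follow the suggestions in the text; none of this is real XORM code.]

```java
import java.util.Date;

// Illustrative sketch of the Lock record the CacheManager might hold.
class Lock {
    Class<?> clazz;   // class of the managed object
    Object   id;      // the object's primary key
    long     version; // change counter or last-update timestamp
    Long     txnId;   // null when the object is known but unlocked
    Date     expires; // null when unlocked

    // Lazy cleanup: an expired lock is treated as released, and its
    // txnId/expires fields are nulled on the spot.
    boolean isHeld(Date now) {
        if (txnId == null) {
            return false;                  // never locked, or already released
        }
        if (expires != null && expires.before(now)) {
            txnId = null;                  // expired: null out lazily
            expires = null;
            return false;
        }
        return true;
    }
}
```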