From: Gavin_King/Cirrus%<CI...@ci...> - 2002-09-04 15:21:14
|
Hi Christoph, sorry about the slow ping time. It takes a while to collect thoughts and write a response to some of these things.....

>>> Yesterday I was thinking about implementing a distributed cache for Hibernate. I want each node to have its own cache, but if one node writes to its db the data should be invalidated in all caches. You once mentioned that Hibernate would need a transaction-aware distributed cache to support distributed caching. I don't get why this is necessary. Can you tell me what kind of problems you think I will run into when trying to implement such a beast, and where you think I could start? I was thinking about using JCS as the cache and, when a session is committed, just invalidating all written objects in the cache. <<<

If you have a look over cirrus.hibernate.ReadWriteCache you'll see that there's some interesting logic that ensures transaction isolation is preserved. A cache entry carries around with it:

(0) the cached data item, if the item is fresh
(1) the time it was cached
(2) a lock count, if any transactions are currently attempting to update the item
(3) the time at which all locks had been released, for a stale item

All transactions lock an item before attempting to update it and unlock it after transaction completion, i.e. the item has a lifecycle like this:

                             lock
                         <---------
  -----> fresh -----> locked -----> stale
    put         lock          release

(Actually the item may be locked and released multiple times while in the "locked" state, until the lock count hits zero, but the difficulty of representing that surpasses my minimal ASCII-art skills.)

A transaction may read an item of data from the cache if the transaction start time is AFTER the time at which the item was cached. (If not, the transaction must go to the database to see what state the database thinks that transaction should see.)

A transaction may put an item into the cache if

(a) there is no item in the cache for that id,
OR
(b) the item is not fresh, AND
(c) the item in the cache with that id is unlocked, AND
(d) the time it was unlocked is BEFORE the transaction start time.

So what all this means is that when doing a put, when locking, and when releasing, the transaction has to grab the current cache entry, modify it, and put it back in the cache _as_an_atomic_operation_.

If you look at ReadWriteCache, atomicity is enforced by making each of these methods synchronized (a rare use of synchronized blocks in Hibernate). However, in a distributed environment you would need some other kind of method of synchronizing access from multiple servers.

I imagine you would implement this using something like the following:

* Create a new implementation of CacheConcurrencyStrategy - DistributedCacheConcurrencyStrategy.
* DistributedCacheConcurrencyStrategy would delegate its functionality to ReadWriteCache, which in turn delegates to JCSCache (which must be a distributed JCS cache, so all servers see the same lock counts and timestamps).
* Implement a LockServer process that would sit somewhere on the network and hand out very-short-duration locks on a particular id.
* DistributedCacheConcurrencyStrategy would use the LockServer to synchronize access to the JCS cache between multiple servers.

Locks would be expired on the same timescale as the cache timeout (which is assumed in all this to be much longer than the transaction timeout) to allow for misbehaving processes, server failures, etc.

Of course, any kind of distributed synchronization has a *very* major impact upon system scalability.
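As a concrete illustration of the bookkeeping and rules described above, here is a minimal sketch in Java. The class and field names are purely illustrative and this is not the actual cirrus.hibernate.ReadWriteCache source; it only shows how the four pieces of per-entry state and the read/put/lock/release rules fit together within a single JVM.

    import java.io.Serializable;
    import java.util.HashMap;
    import java.util.Map;

    // Illustrative sketch of the rules described above; names and fields are
    // hypothetical, not the real cirrus.hibernate.ReadWriteCache implementation.
    public class ReadWriteCacheSketch {

        // One entry per id, carrying the bookkeeping listed as (0)-(3) above.
        private static class Entry {
            Object value;          // (0) the cached data item; null once locked/stale
            long freshTimestamp;   // (1) the time it was cached
            int lockCount;         // (2) transactions currently updating the item
            long unlockTimestamp;  // (3) when the last lock was released (stale items)

            boolean isFresh() { return value != null && lockCount == 0; }
        }

        private final Map<Serializable, Entry> cache = new HashMap<>();

        // A transaction may read only data that was cached before it started.
        public synchronized Object get(Serializable id, long txStartTime) {
            Entry e = cache.get(id);
            if (e != null && e.isFresh() && e.freshTimestamp < txStartTime) {
                return e.value;
            }
            return null; // caller must go to the database instead
        }

        // A transaction may put only if there is no entry, or the entry is stale,
        // unlocked, and was unlocked before the transaction started.
        public synchronized boolean put(Serializable id, Object value, long txStartTime) {
            Entry e = cache.get(id);
            boolean allowed = e == null
                    || (!e.isFresh() && e.lockCount == 0 && e.unlockTimestamp < txStartTime);
            if (!allowed) {
                return false;
            }
            Entry fresh = new Entry();
            fresh.value = value;
            fresh.freshTimestamp = System.currentTimeMillis();
            cache.put(id, fresh);
            return true;
        }

        // Lock before updating: the item stops being fresh and the lockers are counted.
        public synchronized void lock(Serializable id) {
            Entry e = cache.get(id);
            if (e == null) {
                e = new Entry();
                cache.put(id, e);
            }
            e.value = null; // no longer fresh
            e.lockCount++;
        }

        // Release after transaction completion; remember when the last lock went away.
        public synchronized void release(Serializable id) {
            Entry e = cache.get(id);
            if (e != null && --e.lockCount == 0) {
                e.unlockTimestamp = System.currentTimeMillis();
            }
        }
    }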
I think this would be a very *fun* kind of thing to implement and would be practical for some systems. It would also be a great demonstration of the flexibility of our approach, because clearly this is exactly the kind of thing that Hibernate was never meant to be good for!

:)

Gavin |
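To make the distributed outline above concrete, here is a minimal sketch with entirely hypothetical names (neither Hibernate nor JCS ships a LockServer). It only shows how a cluster-wide lock could take over the role that the JVM-local synchronized block plays in the previous sketch, with each lock expiring on its own so a crashed server cannot pin an entry forever.

    import java.io.Serializable;

    // Hypothetical sketch of the DistributedCacheConcurrencyStrategy outline above;
    // none of these types exist in Hibernate or JCS.
    public class DistributedReadWriteCacheSketch {

        /** Hands out short-lived, self-expiring locks on a single cache key. */
        public interface LockServer {
            /** Returns a token, or null if the lock cannot be granted in time. */
            LockToken acquire(Serializable id, long timeoutMillis);
            void release(LockToken token);
        }

        /** Opaque handle for a held lock; the expiry covers crashed holders. */
        public static final class LockToken {
            public final Serializable id;
            public final long expiresAtMillis;
            public LockToken(Serializable id, long expiresAtMillis) {
                this.id = id;
                this.expiresAtMillis = expiresAtMillis;
            }
        }

        private final LockServer lockServer;
        private final ReadWriteCacheSketch cache; // would delegate to a distributed JCS region

        public DistributedReadWriteCacheSketch(LockServer lockServer, ReadWriteCacheSketch cache) {
            this.lockServer = lockServer;
            this.cache = cache;
        }

        // Each read-modify-write of a cache entry runs under a cluster-wide lock,
        // playing the role that synchronized plays in a single JVM; lock() and
        // release() would be wrapped in exactly the same way as put().
        public boolean put(Serializable id, Object value, long txStartTime) {
            LockToken token = lockServer.acquire(id, 500); // arbitrary short timeout
            if (token == null) {
                return false; // could not coordinate with the cluster, so skip caching
            }
            try {
                return cache.put(id, value, txStartTime);
            } finally {
                lockServer.release(token);
            }
        }
    }

Every cache access now involves a round trip to the lock service, which is where the scalability cost Gavin mentions comes from.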
From: Gavin K. <ga...@ap...> - 2002-09-05 02:30:55
|
> I don't understand why in a read-only cache (what Christoph is trying to
> achieve) you need transaction-aware distributed caching.

For a read-only cache, we can just use a JCS distributed cache.

P.S. In my discussions of LockServers, I assumed that everyone would realise that the LockServer doesn't become a central point of failure. In case some people didn't realise; well, it doesn't. Failure of the LockServer would mean that servers were no longer able to access the cache. It would not stop them reading data directly from the database.

The "lock" doesn't represent a lock upon a data item. It represents a lock upon the *cached* data item. No transaction may ever block another transaction inside the persistence layer. The database itself manages concurrency issues. |
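A small sketch of the failure semantics described above, reusing the hypothetical LockServer types from the earlier sketch: if the lock cannot be obtained (for example because the LockServer is down), the cache is simply bypassed and the item is read straight from the database, so no transaction ever blocks inside the persistence layer.

    import java.io.Serializable;

    // Hypothetical names throughout; reuses the LockServer and cache sketches above.
    public class CacheBypassSketch {

        /** Stand-in for "load this item directly from the database". */
        public interface DatabaseLoader {
            Object load(Serializable id);
        }

        private final DistributedReadWriteCacheSketch.LockServer lockServer;
        private final ReadWriteCacheSketch cache;
        private final DatabaseLoader database;

        public CacheBypassSketch(DistributedReadWriteCacheSketch.LockServer lockServer,
                                 ReadWriteCacheSketch cache, DatabaseLoader database) {
            this.lockServer = lockServer;
            this.cache = cache;
            this.database = database;
        }

        public Object load(Serializable id, long txStartTime) {
            DistributedReadWriteCacheSketch.LockToken token = lockServer.acquire(id, 500);
            if (token != null) {
                try {
                    Object cached = cache.get(id, txStartTime);
                    if (cached != null) {
                        return cached; // served from the cluster-wide cache
                    }
                } finally {
                    lockServer.release(token);
                }
            }
            // LockServer unreachable or cache miss: the database is always available.
            return database.load(id);
        }
    }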
From: Christian M. <vc...@cl...> - 2002-09-04 15:50:25
|
I must be missing something here.... I don't understand why in a read-only cache (what Christoph is trying to achieve) you need transaction-aware distributed caching. As I understand read-only cache behaviour, when you read an object from the database, you cache it, and when you update/delete an object, you flush it from the cache, right?

Gavin, could you please show me an example where transaction isolation could be broken without a transaction-aware distributed cache?

Regards
Christian Meunier

----- Original Message -----
From: <Gavin_King/Cirrus%CI...@ci...>
To: "Christoph Sturm" <ch...@mc...>
Cc: <hib...@li...>
Sent: Wednesday, September 04, 2002 5:03 PM
Subject: [Hibernate] Re: [Hibernate-devel] distributed caching
|
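For reference, a minimal sketch of the read-only, invalidate-on-write behaviour Christian describes; the names are illustrative, and in a cluster the map below would be a distributed JCS region so that the removal is seen by every node.

    import java.io.Serializable;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Illustrative sketch of "read -> cache it, update/delete -> flush it".
    public class InvalidatingCacheSketch {

        private final Map<Serializable, Object> cache = new ConcurrentHashMap<>();

        /** After loading from the database, remember the object. */
        public void cacheAfterRead(Serializable id, Object loaded) {
            cache.put(id, loaded);
        }

        /** Returns the cached object, or null, meaning "go to the database". */
        public Object get(Serializable id) {
            return cache.get(id);
        }

        /** On update or delete, flush the stale entry from the cache. */
        public void invalidateAfterWrite(Serializable id) {
            cache.remove(id);
        }
    }

As the rest of the thread makes clear, this simple scheme covers the read-only case; the read-write cache Christoph actually wants needs the timestamp and lock bookkeeping Gavin describes in order to preserve transaction isolation.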
From: Christoph S. <ch...@mc...> - 2002-09-04 15:59:43
|
Hey Christian!

What I am trying to achieve is a distributed read-write cache. I didn't explicitly mention it in my post, but from the context (I was mentioning writes) it could be guessed. I think Gavin understood me better because we had a conversation about distributed read-write caches before.

Sorry if I wasn't precise enough.

regards
christoph

----- Original Message -----
From: "Christian Meunier" <vc...@cl...>
To: "Christoph Sturm" <ch...@mc...>; <Gavin_King/Cirrus%CI...@ci...>
Cc: <hib...@li...>
Sent: Wednesday, September 04, 2002 5:50 PM
Subject: Re: [Hibernate] Re: [Hibernate-devel] distributed caching
|
From: Christian M. <vc...@cl...> - 2002-09-04 16:13:16
|
The way you described the cache behaviour you wanted ("I want each node to have its own cache, but if one node writes to its db the data should be invalidated in all caches") led me to think we were talking about a read-only cache ;) Now it's clear ;)

Thanks
Chris

----- Original Message -----
From: "Christoph Sturm" <ch...@mc...>
To: "Christian Meunier" <vc...@cl...>; <Gavin_King/Cirrus%CI...@ci...>
Cc: <hib...@li...>
Sent: Wednesday, September 04, 2002 5:57 PM
Subject: Re: [Hibernate] Re: [Hibernate-devel] distributed caching |