Kevin,
 
I've been working through the DBCache design.  My present inclination is that it is best suited to serve as a transactional manager of page
memory.  I have modified the design to include exclusive write locks on regions of a page, which would support the jdbm
requirement for managing physical rows.  (A physical row which spans multiple pages would require a lock on the appropriate
region of each page.)  The use of exclusive write locks on regions of a page also supports the existing logical to physical row id
translation pages since we can lock and update individual slots on a page.  I hope to send along a UML model in a day or so which
lays out the data structures to support this design.
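
To make the region locking concrete before the UML arrives, here is a rough sketch of the surface I have in mind (every name
below is a hypothetical placeholder, not the final design):

    // Hypothetical sketch: shared read locks at page granularity,
    // exclusive write locks at region granularity.
    public interface PageLockManager {

        /** Shared read lock on an entire page image. */
        void readLock(long txId, long pageId) throws LockConflictException;

        /**
         * Exclusive write lock on a region of a page.  Fails if the region
         * overlaps a region already write-locked by another transaction.
         */
        void writeLock(long txId, long pageId, int offset, int length)
            throws LockConflictException;

        /** Release all locks held by a transaction at commit or abort. */
        void releaseAll(long txId);
    }

A physical row which starts near the end of page 12 and runs onto page 13 would then take two write locks:
writeLock(tx, 12, off, PAGE_SIZE - off) and writeLock(tx, 13, 0, remainder).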
 
Given this design, DBCache will appear as an undifferentiated linear address space supporting shared read locks (on pages) and
concurrent exclusive write locks on regions of pages with transactional processing.  jdbm can then impose its existing data structure
on that linear address space.  Ideally we will even be able to preserve the existing file structure - that would be quite something!
In order to support transactional processing in jdbm we will have to modify the jdbm API to provide a record-oriented transaction
API (based on RecordManager).
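
As a first cut, the API change might be no more than transaction demarcation layered over the existing record operations.  This
is a sketch only -- the method names are guesses, not a commitment -- assuming the jdbm.RecordManager interface as it stands:

    import java.io.IOException;
    import jdbm.RecordManager;

    // Hypothetical: a transactional variant of the RecordManager API.
    // The existing insert/fetch/update/delete operations keep their
    // signatures; we add explicit transaction boundaries on top.
    public interface TransactionalRecordManager extends RecordManager {

        /** Begin a transaction and return its identifier. */
        long beginTransaction();

        /** Make all updates within the transaction durable. */
        void commit(long txId) throws IOException;

        /** Roll back all updates made within the transaction. */
        void abort(long txId) throws IOException;
    }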
 
The goal of object locking is more difficult.  While DBCache would support locking at the page image level, there are still design
issues at the object layer for shared read-locks on objects combined with exclusive write locks on copies of objects (if we use the
same locking scheme at the object layer).  It is certainly feasible to develop a protocol either in a common base class for persistent
objects (I know your objections to this!) or an interface which persistent objects would implement (it might be difficult for an
application developer to get the interface semantics correct).  Another option is BCE (byte code engineering), as you have mentioned
before.  The purpose of object locking would be to facilitate shared access to read-only objects within the JVM heap and exclusive
access to an updatable copy of an object within a Tx.  This provides transaction isolation at the object layer together with
a savings in the heap, since read-only objects may be shared across transactions.  The alternative is to have distinct instances of
each persistent object within each transaction, in which case DBCache will provide transaction isolation by rejecting requests that
would share a write lock.  That alternative will fall out naturally, so we can then look at the added functionality of a locking
protocol for the object layer.
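
For the interface option, the protocol I am picturing looks roughly like this.  It is a sketch of the semantics only; the names
are invented:

    // Hypothetical protocol for persistent objects under object locking.
    // A read-only instance is shared across transactions on the heap; a
    // transaction that wants to write asks for a private, mutable copy.
    public interface Lockable {

        /** True if this instance is the shared, read-only image. */
        boolean isReadOnly();

        /**
         * Return a private, mutable copy registered against the given
         * transaction.  The shared instance is untouched until commit.
         */
        Lockable writableCopy(long txId);
    }

The difficulty I flagged above lives exactly here: an application developer has to route every mutation through writableCopy()
rather than writing on the shared instance.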
 
My own view is that we need to work incrementally.  To that end I am currently exploring the design and a prototype for DBCache.  I
hope to send along a UML model shortly.  Once that is working, I would like to re-factor the RecordManager APIs and their current
implementation to support transactions.  With that in hand, I believe that we will be in a good position to consider object locking.
 
-bryan
-----Original Message-----
From: jdbm-developer-admin@lists.sourceforge.net [mailto:jdbm-developer-admin@lists.sourceforge.net] On Behalf Of Kevin Day
Sent: Monday, January 30, 2006 8:24 PM
To: JDBM Developer listserv
Subject: re: [Jdbm-developer] page locking, row locking.

Bryan-
 
I think that working with pages in the dbcache may prove not to be the way to go...  Nothing about the way that people actually use jdbm has to do with pages - they all deal directly with rows (or objects reconstructed from those rows).  I think that trying to manage locking at the page level creates an artificial barrier, when we can go directly to the row level and work there...
 
I'm actually quite partial to Alex's idea of ditching the page cache altogether and using the dbcache for managing individual rows...  As rows are evicted from dbcache, the underlying page manager can do whatever it wants with them (e.g. write rows in place, write them to a new page, optimize, etc...).
 
There would still be a need for a page cache in the underlying page manager, but this would be for read optimization purposes only - not transaction management.  In other words, the dbCache would operate transactionally at a level above the page manager, so the page manager would not need to do any transaction handling itself - rollback on transaction failure would be managed entirely at the row level in the dbCache.
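
Roughly the division of labor I'm picturing (hypothetical names, just to sketch it)...

    import java.io.IOException;

    // Hypothetical: dbCache holds rows and handles transactions; the page
    // manager below it only decides where evicted row bytes end up.
    public interface PageManager {

        /** Called by dbCache when a committed row is evicted; the page
         *  manager may write it in place, move it to a new page, etc. */
        void rowEvicted(long rowId, byte[] data) throws IOException;

        /** Read a row back in, through the read-optimizing page cache. */
        byte[] fetchRow(long rowId) throws IOException;
    }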
 
If we go down that path, then we have a very big question to ask about the Isolation requirement of ACID...
 
If we strictly enforce isolation, then we must, by definition, create an object per transaction (which will require one copy of each row per transaction, plus the 'current' copy).  The alternative is to ditch isolation in favor of application-level consistency semantics (e.g. synchronization at the object level to ensure that simultaneous changes cannot force an object into an invalid state).
 
At the current time, I don't see a good way around this tradeoff - maybe there is one I'm just not thinking of...
 
 
For what it's worth, the documented dbCache design does allow for multiple copies of a given page to exist in memory.  The safe itself would only contain one version of the page - the 'current' version.  Basically, you have to be able to reconstruct the state of the cache, with all transactions rolled back, by restoring from the safe.
 
- K
 
 
> All,

I've been doing some modeling and some prototyping in order to understand
the DBCache design a little bit better.  DBCache used page locking and
therefore never had more than one image of a page in the cache.  This means
that: (a) the pageId was a unique key for the cache -- and we need a unique
key in order to support an MRU for the cache; and (b) an original image of a
page was no longer available once a transaction had written on that page --
even if the transaction had not yet committed.

Imagine that we have two transactions, one of which is read only.  Here is
one possible serialization of events:

Tx1 begins (readOnly).

Tx1 reads the state of page12.

Tx2 begins.

Tx2 updates page10 (copy on write converts an original of page10 into a
copy).

Tx1 reads page10 and is aborted since page10 is not an original.

Tx2 commits.

This shows how a transaction which starts earlier can be aborted by an
update to a page which it has not yet read.  Some stores allow such reads
and expose the original state of the store to transactions which began
before the writer commits.  This could be done by maintaining the original
image of page10 during the copy-on-write.  Any transaction wanting to read
page10 can still do so until Tx2 commits.  I am not clear on the effect of
Tx2 committing.  Does it result in Tx1 seeing the version of page10 from
when Tx1 started, or the updated version of page10 which was committed by
Tx2?  Presumably the latter.

I think that we might want to create or research such serializations of
events in multiple transactions in order to make some decisions about the
policies we want to support for jdbm.  Those cases can be readily captured as
unit tests for validation.
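
As a strawman, the serialization above might be captured along these lines.  The DBCache calls and the exception are invented
for the example, and the test pins down the abort-the-reader policy described above, not necessarily the policy we will choose:

    // JUnit-style sketch (method body for a junit.framework.TestCase
    // subclass); DBCache, begin, read, write, newPageImage and
    // TransactionAbortedException are all invented for illustration.
    public void testReaderAbortedByUncommittedWriter() throws Exception {
        DBCache cache = new DBCache();
        long tx1 = cache.begin(true);           // Tx1 begins (readOnly)
        cache.read(tx1, 12);                    // Tx1 reads page12
        long tx2 = cache.begin(false);          // Tx2 begins
        cache.write(tx2, 10, newPageImage());   // copy-on-write on page10
        try {
            cache.read(tx1, 10);                // page10 is no longer an original
            fail("expected Tx1 to be aborted");
        } catch (TransactionAbortedException expected) {
            // under a snapshot policy this read would instead return
            // the pre-Tx2 image of page10 and Tx1 would proceed
        }
        cache.commit(tx2);                      // Tx2 commits
    }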

I have also been thinking about strategies to support row locking.  My
inclination is to expose the page image as an nio ByteBuffer wrapping the
byte[].  This makes it possible to expose an immutable view of the data.  It
also makes it possible to expose only a region of the image for a page.  In
the context of jdbm, and with some additional state for row locks on the
page, this could be used to request only the region of a page corresponding
to a physical row.  This would effect a read (or write) lock on that part of
the page.  Since a physical row can span more than one page in jdbm, row
locking in DBCache would be in terms of an offset and length on a page
image, so jdbm would create a row lock on the appropriate region of each
page that was spanned by the physical record.
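
Concretely, the region view costs nothing but a wrapper.  This bit is the
standard java.nio API; the lock bookkeeping around it is elided:

    import java.nio.ByteBuffer;

    // Expose a read-only, zero-copy view of one region of a page image.
    // The caller can see and read only [offset, offset + length).
    static ByteBuffer regionView(byte[] pageImage, int offset, int length) {
        ByteBuffer buf = ByteBuffer.wrap(pageImage);
        buf.position(offset);
        buf.limit(offset + length);
        return buf.slice().asReadOnlyBuffer();
    }

For a write lock on the region we would hand back the same slice without
the asReadOnlyBuffer() call, so writes stay confined to the locked region.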

I've done a little thinking about the data structure for maintaining such
row locks.  I am inclined to start out with an insertion sort on the offset
into the page and a constraint test during the insert to verify that the
offset + length does not exceed the offset of the next row lock on the page.
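
In code the constraint is just a neighbor check at the insertion point.
A sketch, with each lock represented as an (offset, length) pair for
brevity rather than a real lock record:

    import java.util.List;

    // Insert a row lock into a per-page list kept sorted by offset,
    // rejecting it if [offset, offset + length) overlaps a neighbor.
    static boolean insertRowLock(List<int[]> locks, int offset, int length) {
        int i = 0;
        while (i < locks.size() && locks.get(i)[0] < offset) {
            i++;                                // find insertion point
        }
        if (i > 0 && locks.get(i - 1)[0] + locks.get(i - 1)[1] > offset) {
            return false;                       // overlaps predecessor
        }
        if (i < locks.size() && offset + length > locks.get(i)[0]) {
            return false;                       // overlaps successor
        }
        locks.add(i, new int[] { offset, length });
        return true;
    }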

It seems to me that both allowing page reads of originals which have been
written on by uncommitted transactions and allowing row locking will require
that we allow up to 2C page images in the cache, where C is the capacity
of the safe.  DBCache as published permitted only a single image per page
in the cache.

Cheers,

-bryan

