I'm afraid that I did not see a strategy laid out in enough detail to judge.
Did I miss something? The notion of an analog of DBCache in which objects live in the safe and cache might work.
However, one of the things which has become evident in prototyping is that the safe is designed for sequential writes of fixed size records. I tried a design in which the writes on the safe only contained the region of a page which was actually updated (this often corresponds to a physical row, unless the physical row spans more than one page) and rejected it for several reasons. First, since the safe is oriented towards sustained sequential I/O, I am not sure that there would be a performance gain from only logging updated regions. Second, during recovery you can not simply retrieve the page from the safe, since all you have is a part of the page. You have to go to the DB and do a random access to get the BFIM (before image) and then merge in the updated region from the safe. That means that we are sacrificing fast restart. Third, when you are making many changes to the same page, logging those changes as distinct entries on the safe will actually take more I/O than just logging the entire page. Finally, the design of the safe management algorithms is entirely oriented towards fixed size records in the safe. We would need a new design, thought through in depth, for the safe if it only stored serialized objects (or parts of a page).
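To make the fixed-size, sequential constraint concrete, here is a rough sketch of the write pattern the safe is designed around. Everything here (PageSafe, logPage) is invented for illustration - it is not DBCache or jdbm code:

import java.io.IOException;
import java.io.RandomAccessFile;

// Sketch only: a safe that appends whole, fixed-size page images in
// strictly sequential slots. A real safe record would also carry the
// pageId etc.; this just shows the I/O pattern.
public class PageSafe {
    private final RandomAccessFile file;
    private final int pageSize;
    private long nextSlot = 0; // next sequential slot in the safe

    public PageSafe(RandomAccessFile file, int pageSize) {
        this.file = file;
        this.pageSize = pageSize;
    }

    // Every record is exactly one page image -- no partial-page entries.
    public synchronized void logPage(byte[] pageImage) throws IOException {
        if (pageImage.length != pageSize) {
            throw new IllegalArgumentException("safe records are fixed size");
        }
        file.seek(nextSlot * (long) pageSize); // always the next slot: pure sequential I/O
        file.write(pageImage);
        nextSlot++;
    }
}

Note that on recovery the whole page can be read back from a known slot in a single access - which is exactly what we lose if the safe holds only updated regions.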
In general, I feel that we are better off moving incrementally from a design which has been "tested". We can build that, critique it, and look for further improvements. There are more than enough gotchas in doing just that. Call it risk management.
I went down this path as well (I had a nifty locking mechanism based on a bitmap, etc...), but after hearing Alex's idea, I changed my mind... What are your comments on his strategy? BTW - Alex's approach is also conducive to leaving the current file structure in place - what's even better, it allows for a completely orthogonal paging optimization strategy for the underlying store.
I've been working through the DBCache design. My present inclination is that it is best suited to act as a transactional manager of page memory. I have modified the design to include exclusive write locks on regions of a page, which would serve to support the jdbm requirement for managing physical rows. (A physical row which spans multiple pages would require a lock on the appropriate region of each page.) The use of exclusive write locks on regions of a page also supports the existing logical-to-physical row id translation pages, since we can lock and update individual slots on a page. I hope to send along a UML model in a day or so which lays out the data structures to support this design.
Given this design, DBCache will appear as an undifferentiated linear address space supporting shared read locks (on pages) and concurrent exclusive write locks on regions of pages, with transactional processing. jdbm can then impose its existing data structure on that linear address space. Ideally we will even be able to preserve the existing file structure - that would be quite something!
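To give a feel for what that would look like to jdbm, here is a sketch of the locking surface I have in mind. None of these names exist anywhere yet - they are purely illustrative:

import java.nio.ByteBuffer;

// Illustrative only -- not existing DBCache or jdbm API.
public interface TxPageStore {
    // Shared read lock on a whole page; returns an immutable view.
    ByteBuffer readPage(long txId, long pageId);

    // Exclusive write lock on [offset, offset + length) of a page.
    ByteBuffer writeRegion(long txId, long pageId, int offset, int length);

    void commit(long txId);
    void abort(long txId);
}

A physical row spanning two pages would simply translate into two writeRegion calls, one per page.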
In order to support transactional processing in jdbm we will have to modify the jdbm API to provide a record-oriented transaction (based on RecordManager).
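As a first cut, the record-oriented transaction might look something like the following. The method names are speculative - this is not the current RecordManager API:

import java.io.IOException;

// Speculative sketch of a transactional record API for jdbm.
public interface TxRecordManager {
    long begin() throws IOException;   // start a transaction, returns its id
    long insert(long txId, byte[] record) throws IOException;
    byte[] fetch(long txId, long recid) throws IOException;
    void update(long txId, long recid, byte[] record) throws IOException;
    void delete(long txId, long recid) throws IOException;
    void commit(long txId) throws IOException;
    void abort(long txId) throws IOException;
}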
The goal of object locking is more difficult. While DBCache would support locking at the page image level, there are still design issues at the object layer for shared read locks on objects combined with exclusive write locks on copies of objects (if we use the locking scheme at the object layer). It is certainly feasible to develop a protocol either in a common base class for persistent objects (I know your objections to this!) or an interface which persistent objects would implement (it might be difficult to get the interface semantics correct as an application developer). Another option is BCE (byte code engineering), as you have mentioned before. The purpose of object locking would be to facilitate shared access to read-only objects within the JVM heap and exclusive access to an updatable copy of an object within a Tx. This provides for safe transaction isolation at the object layer together with a savings in the heap, since read-only objects may be shared across transactions. The alternative is to have distinct instances of each persistent object within each transaction, in which case DBCache will provide transaction isolation by rejecting requests to share write locks. That alternative will fall out naturally, so we can then look at the added functionality of a locking protocol for the object layer.
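For illustration only, the interface variant might look something like this (jdbm has no such interface today, and the names are made up):

// Hypothetical protocol a persistent object could implement in order to
// participate in object-level locking. Purely a sketch.
public interface Lockable {
    // Called when the object is published as a shared, read-only instance.
    void freeze();

    // Returns an updatable copy that is private to the given transaction.
    Lockable copyForWrite(long txId);
}

The base-class variant would bundle the same two operations into an abstract superclass, and BCE would weave them in after the fact.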
My view is that we need to work incrementally. To that end I am currently exploring the design and a prototype for DBCache. I hope to send along a UML model shortly. Once that is working, I would like to re-factor the RecordManager APIs and their current implementation to support transactions. With that in hand, I believe that we will be in a good position to consider object locking.
On Behalf Of Kevin Day
Sent: Monday, January 30, 2006 8:24
To: JDBM Developer listserv
Subject: re: [Jdbm-developer] page locking, row locking
I think that working with pages in the dbcache may prove not to be the way to go... Nothing about the way that people actually use jdbm has to do with pages - they all deal directly with rows (or objects reconstructed from those rows). I think that trying to manage locking at the page level creates an artificial barrier, when we can go directly to the row level and work there... I'm actually quite partial to Alex's idea of ditching the page cache altogether and using the dbcache for managing individual rows... As rows are evicted from dbcache, the underlying page manager can do whatever it wants with them (i.e. write rows in place, write them to a new page, etc.).
There would still be a need for a page cache in the underlying page manager, but this would be mostly for read optimization - not transaction management (in other words, the dbCache would operate transactionally at a level above the page manager, so the page manager would not need to do any transaction handling itself - rollback on transaction failure would be managed entirely at the row level in the dbCache).
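To sketch the layering I'm describing (all of these names are hypothetical):

// The dbCache is the transactional layer and works in rows; the page
// manager below it never sees transactions at all. Sketch only.
interface RowCache {
    byte[] read(long txId, long rowId);
    void write(long txId, long rowId, byte[] row);
    void commit(long txId);
    void rollback(long txId); // undone entirely at the row level
}

interface PageManager {
    void storeRow(long rowId, byte[] row); // free to place the row on any page
    byte[] loadRow(long rowId);            // may use a page cache purely for reads
}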
If we go down that path, then we have a very big
question to ask about the Isolation requirement of ACID...
If we strictly enforce isolation, then we must, by
definition, create an object per transaction (which will require one
copy of each row per transaction, plus the 'current' copy).
The alternative is to ditch isolation in favor of
application level consistency semantics (e.g. synchronization at
the object level to ensure that simultaneous changes can not force
an object into an invalid state).
At the current
time, I don't see a good way around this tradeoff - maybe there is
one I'm just not thinking of...
For what it's worth, the documented dbCache design does allow for multiple
copies of a given page to exist in memory. The safe itself
would only contain one version of the page - the 'current'
version. Basically, you have to be able to reconstruct the
state of the cache, with all transactions rolled back, by
restoring from the safe.
I've been doing some modeling and some prototyping in order to understand the DBCache design a little bit better. DBCache used page-level locking and therefore never had more than one image of a page in the cache. This means that: (a) the pageId was a unique key for the cache -- and we need a unique key in order to support an MRU for the cache; and (b) an original image of a page was no longer available once a transaction had written on that page -- even if the transaction had not committed.
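Point (a) is easy to see in code: the standard LinkedHashMap in access order gives an MRU cache keyed by pageId, but only because the pageId is unique. A quick sketch (illustrative, not prototype code):

import java.util.LinkedHashMap;
import java.util.Map;

// MRU cache of page images keyed by pageId. Works only while the
// pageId is a unique key -- i.e., one image per page in the cache.
public class PageMru extends LinkedHashMap<Long, byte[]> {
    private final int capacity;

    public PageMru(int capacity) {
        super(16, 0.75f, true); // true => iteration in access order
        this.capacity = capacity;
    }

    protected boolean removeEldestEntry(Map.Entry<Long, byte[]> eldest) {
        return size() > capacity; // evict the least recently used page
    }
}

As soon as we allow two images of the same page (an original plus an updated copy), the pageId alone no longer works as the cache key.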
Imagine that we have two transactions, one of which is read only. Here is one possible serialization of events:

Tx1 begins (readOnly).
Tx1 reads the state of some pages (but not yet page10).
Tx2 updates page10 (copy on write converts an original of page10 into an updated image).
Tx1 reads page10 and is aborted since page10 is not an original.
This shows how a transaction which starts earlier can be aborted by an update on a page which it has not read yet. Some stores avoid this by continuing to expose the original state of the store to transactions which have begun before the writer commits. This could be done by maintaining the original image of page10 during the copy-on-write. Any transaction wanting to read page10 could still do so until Tx2 commits. I am not clear on the effect of Tx2 committing. Does it result in Tx1 seeing the version of page10 from when Tx1 started, or the updated version of page10 which was committed by Tx2 -- presumably the former.
I think that we might want to create or research such serializations of events in multiple transactions in order to make some decisions about the policies we want to support for jdbm. Those cases can be readily captured as test cases.
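For example, the serialization above could be captured roughly like this (the Store interface is invented for the example; none of this is existing API):

// Sketch of the Tx1/Tx2 serialization as a test case. Everything here
// is hypothetical; it only pins down the policy we want to discuss.
public class SerializationCases {

    interface Store {
        byte[] readPage(long txId, long pageId);            // throws if the policy forbids the read
        void updatePage(long txId, long pageId, byte[] b);  // copy-on-write update
    }

    // Tx1 (read-only) begins first; Tx2 updates page10; Tx1 then reads page10.
    static boolean earlyReaderIsAborted(Store store) {
        long tx1 = 1, tx2 = 2;
        store.readPage(tx1, 5);                    // Tx1 reads a page other than page10
        store.updatePage(tx2, 10, new byte[8192]); // Tx2 writes page10 (uncommitted)
        try {
            store.readPage(tx1, 10);               // strict policy: no original left to read
            return false;                          // the policy allowed the read
        } catch (IllegalStateException e) {
            return true;                           // Tx1 aborted, as in the scenario above
        }
    }
}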
I have also been thinking about strategies to support row locking. My inclination is to expose the page image as an nio ByteBuffer wrapping the page data. This makes it possible to expose an immutable view of the data. It also makes it possible to expose only a region of the image for a page. In the context of jdbm, and with some additional state for row locks on the page, this could be used to request only the region of a page corresponding to a physical row. This would effect a read (or write) lock on that part of the page. Since a physical row can span more than one page in jdbm, row locking in DBCache would be in terms of an offset and length on a page image, so jdbm would create a row lock on the appropriate region of each page that was spanned by the physical record.
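The nio API already gives us both views without copying the data. For example:

import java.nio.ByteBuffer;

// Immutable views of a page image, or of just one region of it.
public class PageViews {

    // Read-only view of the entire page image (no copy is made).
    static ByteBuffer readOnlyView(ByteBuffer page) {
        return page.asReadOnlyBuffer();
    }

    // Read-only view of just [offset, offset + length) of the page image.
    static ByteBuffer regionView(ByteBuffer page, int offset, int length) {
        ByteBuffer dup = page.duplicate(); // independent position and limit
        dup.position(offset);
        dup.limit(offset + length);
        return dup.slice().asReadOnlyBuffer(); // zero-based view of the region
    }
}

A write lock on a region would hand back the same slice, just without the read-only wrapper.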
I've done a little thinking about the data structure for maintaining such row locks. I am inclined to start out with an insertion sort on the offset into the page, and a constraint test during the insert to verify that the offset + length does not extend into the next row lock on the page.
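A minimal sketch of that structure, assuming nothing beyond what is described above (the names are invented):

import java.util.ArrayList;
import java.util.List;

// Row locks on one page, kept sorted by offset. An insert fails if the
// requested region overlaps an existing lock. Sketch only.
public class PageRowLocks {

    private static final class Lock {
        final int offset, length;
        Lock(int offset, int length) { this.offset = offset; this.length = length; }
    }

    private final List<Lock> locks = new ArrayList<Lock>();

    public synchronized boolean tryLock(int offset, int length) {
        int i = 0;
        while (i < locks.size() && locks.get(i).offset < offset) i++;
        // the new lock must not extend into the next lock on the page...
        if (i < locks.size() && offset + length > locks.get(i).offset) return false;
        // ...and the previous lock must not extend into the new one
        if (i > 0) {
            Lock prev = locks.get(i - 1);
            if (prev.offset + prev.length > offset) return false;
        }
        locks.add(i, new Lock(offset, length));
        return true;
    }
}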
It seems to me that both allowing page reads of originals which have been written on by uncommitted transactions and allowing row locking will require that we allow up to 2C page images into the cache, where C is the capacity of the safe. DBCache as published only permitted a single image for a page.