I'm not sure that I understand the distinction between the "row" and the "object" level for locking.  The
physical row is the space on one or more pages where a serialized object is written.  The logical row is
the slot in the translation page for a given OID.  The object is the transient Java instance from either
object creation or de-serialization of an existing row.  We also have serialized objects which might
not be written onto a page image yet -- buffering serialized objects is especially useful for clustering
since you can use more batch oriented allocation strategies for laying down the serialized objects
onto pages (one of the hotspots in jdbm today).
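 
To make the three levels concrete, here is a rough sketch (the names and the
layout are invented for illustration -- this is not the actual jdbm API):

import java.io.*;
import java.util.*;

class RecordLevelsSketch {
    // logical row: OID -> packed physical location, via the translation table
    private final Map<Long, Long> translation = new HashMap<>();

    // serialized objects not yet laid down on a page image, buffered so a
    // batch-oriented allocator can cluster them later
    private final Map<Long, byte[]> unplaced = new HashMap<>();

    Object fetch(long oid) throws IOException, ClassNotFoundException {
        byte[] serialized = unplaced.get(oid);        // still buffered?
        if (serialized == null) {
            long location = translation.get(oid);     // logical row: translation slot
            serialized = readPhysicalRow(location);   // physical row: bytes on page(s)
        }
        // object level: the transient Java instance, via de-serialization
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(serialized))) {
            return in.readObject();
        }
    }

    private byte[] readPhysicalRow(long location) {
        throw new UnsupportedOperationException("stub: read bytes from page(s)");
    }
}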
 
-bryan
-----Original Message-----
From: jdbm-developer-admin@lists.sourceforge.net [mailto:jdbm-developer-admin@lists.sourceforge.net] On Behalf Of Kevin Day
Sent: Wednesday, January 18, 2006 10:10 AM
To: JDBM Developer listserv
Subject: re[9]: [Jdbm-developer] DBCache discussion

I want to follow up on something that Bryan mentioned - it didn't really sink in until now...
 
If we write all changed records to *new* pages in the database, and we allocate new pages on a per-transaction basis, what impact will that have on the overall system behavior?
 
I'm wondering if we need to do something special with the translation pages (maybe do slot level locking on just those pages)...  This has some interesting implications:
 
1.  Grabbing a lock on a translation slot is effectively the same as locking the row that slot points to
2.  Rows that are used in the same transaction will tend to be clustered onto similar pages
3.  If more than one transaction is working with the same page, the before version of the page is always in the DB for easy retrieval.  A page with no live records on it would only be released to the free list after all transactions that depend on its "before" data are processed.
 
Committing a transaction would be achieved by updating the translation table.  Rollback would be achieved by deleting the transaction's uncommitted pages from the DB (if they were ever written to the DB).
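 
Roughly, in code -- all names here are hypothetical, just a sketch of the slot-level locking and the translation-table commit, not a design:

import java.util.Collection;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

class TranslationTableSketch {
    private final Map<Long, Long> slots = new ConcurrentHashMap<>();   // oid -> page location
    private final Map<Long, ReentrantLock> slotLocks = new ConcurrentHashMap<>();

    // Grabbing the slot lock is effectively locking the row it points to (#1).
    Lock lockFor(long oid) {
        return slotLocks.computeIfAbsent(oid, k -> new ReentrantLock());
    }

    // Commit: point the slots at the transaction's freshly written pages.
    void commit(Map<Long, Long> newLocations) {
        for (Map.Entry<Long, Long> e : newLocations.entrySet()) {
            Lock lock = lockFor(e.getKey());
            lock.lock();
            try {
                slots.put(e.getKey(), e.getValue());
            } finally {
                lock.unlock();
            }
        }
    }

    // Rollback: the table never saw the changes; just free the new pages.
    void rollback(Collection<Long> uncommittedPages, FreeList freeList) {
        for (long page : uncommittedPages) {
            freeList.release(page);
        }
    }
}

interface FreeList {
    void release(long pageId);
}

Note that the translation table itself never needs any undo information -- uncommitted changes only ever live on the new pages.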
 
The only concurrency bottleneck that I can see here is in the manipulation of the page headers to insert the new pages - but that's fairly minor...
 
With this strategy, I don't think it's necessary to mess with locking at the object level - we just have to lock at the row level.  There may be implications for the object cache here (so a parallel object locking system may be necessary as well).
 
hmmmmm - that feels a whole lot better to me than trying to merge dirty pages...
 
- K
 
 
 
> Bryan-
 
Failing a transaction just because it is trying to write to a page that happens to contain data that another transaction has worked on (even though there is no conflict between the two) seems like a bad idea to me.  It basically creates a failure situation that the application has absolutely no way of protecting against using higher level synchronization mechanisms...
 
Maybe this is OK, but I know I'd drive myself nuts trying to figure out why I had transactions failing for no apparent reason.
 
I also think that this is not at all conducive to management of the translation pages...
 
 
 
I've been doing some noodling on all of this, and I was wondering if any of you can contradict the following suppositions:
 
I put forward that:
 
1.  Page locking is not going to be sufficient to handle multiple transactions (especially if they are long running)
2.  Some sort of byte level conflict detection scheme is going to be required (possibly a bitmap associated with each page in the cache? see the sketch after this list)
3.  If we are using pages for cache management (I'm not even 100% certain that we actually have to do this), then there must be some mechanism for merging non-conflicting changes from a series of transactions
4.  If a transaction attempts to *update* a region of a page that has already been updated by another transaction that committed while the first transaction was running, then the update should fail.
5.  If a transaction attempts to update a region of a page that has already been updated by another transaction that has *not* committed yet, then the update will not fail - but the transaction that commits last will fail on commit.
6.  Once a transaction starts, its view of the database should remain fixed (i.e. changes made by other transactions should not be reflected in any reads once a given transaction begins).
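 
For #2, the per-page change bitmap could be as simple as this (a minimal sketch, assuming one bit per byte on the page; the names are made up):

import java.util.BitSet;

class PageChangeMap {
    private final BitSet changed;          // one bit per byte on the page

    PageChangeMap(int pageSize) {
        changed = new BitSet(pageSize);
    }

    // Record that this transaction updated bytes [offset, offset + length).
    void markChanged(int offset, int length) {
        changed.set(offset, offset + length);
    }

    // Suppositions 4/5: a commit fails only if some changed region overlaps a
    // region changed by another (already or concurrently committing) transaction.
    boolean conflictsWith(PageChangeMap other) {
        return changed.intersects(other.changed);
    }
}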
 
 
On #3, does it even make sense to manage dirty pages in the cache?  As an alternative, should the cache be tracking changed regions of pages only?
 
- K
 
 
 
 
>Kevin,

I will check the paper today.  I believe that in this case the page in
the database is marked as "dirty" and the other transaction would fail,
e.g., access to the original page (now on the per-transaction log file)
would not be performed.

In terms of supporting per-object locking, one thing that we can do is
to break down a page into its component objects and then re-serialize
them onto new pages based on the writer (even during a read).  This
would provide a means to migrate from page locks to object locks and
an untouched object could still be accessed based on either its byte[]
or its deserialized form.  This is clearly not a full strategy, but
I've been thinking of something similar for clustering during writes.
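 
The break-down step might look roughly like this -- note that the on-page
record layout assumed here (oid, length, bytes) is invented purely for
illustration:

import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

class PageExploder {
    // Break a page down into its component (oid -> byte[]) records so that
    // locking and re-serialization can happen per object rather than per page.
    static Map<Long, byte[]> explode(byte[] page, int headerSize) {
        Map<Long, byte[]> records = new LinkedHashMap<>();
        ByteBuffer buf = ByteBuffer.wrap(page);
        int pos = headerSize;
        while (pos + 12 <= page.length) {
            long oid = buf.getLong(pos);
            int length = buf.getInt(pos + 8);
            if (length == 0) break;                  // end of the used space
            byte[] record = Arrays.copyOfRange(page, pos + 12, pos + 12 + length);
            records.put(oid, record);                // untouched objects keep their byte[]
            pos += 12 + length;
        }
        return records;
    }
}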

-bryan

-----Original Message-----
From: jdbm-developer-admin@lists.sourceforge.net
To: JDBM Developer listserv
Sent: 1/17/2006 8:09 PM
Subject: re[6]: [Jdbm-developer] DBCache discussion

Bryan-

Here's my issue with BFIM - if *other* transactions need to make use of
a page that has been updated in the DB (but not committed), then they
have to access the long running transaction's log file to obtain the
pre-changed version.

I don't have a solid enough understanding of the implications here...
It's quite possible that performance for reading an uncached BFIM page
from the transaction's log will be just as fast as reading from the DB -
but there's definitely going to be a lot more bookkeeping involved with
tracking whether the current committed version of a given page is in the
DB, or is in an in-process transaction's log...
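 
That bookkeeping might reduce to a registry of pages whose committed image
currently lives on some transaction's log rather than in the DB.  A minimal
sketch (hypothetical names, assuming the BFIM scheme described above):

import java.io.IOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class PageResolver {
    interface PageSource {
        byte[] read(long pageId) throws IOException;
    }

    private final PageSource db;
    // pages whose before image (BFIM) currently lives on a transaction's log
    private final Map<Long, PageSource> bfimOwner = new ConcurrentHashMap<>();

    PageResolver(PageSource db) {
        this.db = db;
    }

    // Readers in *other* transactions must see the pre-changed version, which
    // may be on the long-running transaction's log rather than in the DB.
    byte[] readCommitted(long pageId) throws IOException {
        PageSource log = bfimOwner.get(pageId);
        return (log != null) ? log.read(pageId) : db.read(pageId);
    }
}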

You are right, though, that the commit would be much, much faster with
BFIM....

- K


   > Fair enough.  I believe that the writing of updated pages from long
running transactions to the database vs the log might be a bias in
favor of a successful commit of the long running transaction.  For
example, on a successful commit the store becomes stable immediately
since the contents of the safe and the database are consistent with
the final state of the long-running transaction.  In contrast, the
rollback time for a failed long-running transaction becomes significant
since the BFIM pages on the per-transaction log file have to be paged
back into the database.

If you accept this reading, then the choice of the BFIM logging strategy
for very long transactions is designed to favor fast commits for
successful transactions at the expense of more expensive rollback of a
failed very long transaction.

If you examine page 506, the text contrasts the "shadow" and "logging"
concepts.  The BFIM (logging) never has more than one image of the same
page in the database (as distinguished from the safe or log file) while
the shadow concept results in two copies of the page in the database. I
think the critique developed in this section is probably the author's
motivation for the proposed strategy (BFIM logging).

-bryan

-----Original Message-----
From: jdbm-developer-admin@lists.sourceforge.net
To: JDBM Developer listserv
Sent: 1/17/2006 7:01 PM
Subject: re[4]: [Jdbm-developer] DBCache discussion

Bryan-

Comments below...

- K


  
> Kevin,

I will have to read up on this topic (record-level locking).  However
I think that transactions may be selected for abort even if they might
commit first.  The best example is the long-running transaction, e.g.,
some data load.  It should succeed even at the expense of short running
transactions which could (except for locking, dead-lock, etc.
strategies) commit first.


Presumably, the short running transactions would have already committed
during the run of the long transaction.  If they have already committed,
then you have no choice but to fail the long transaction commit (rolling
back the already-committed short transactions would violate the
Durability contract).  So, the only way to make the long running
transaction be guaranteed to commit is to explicitly lock all rows that
it has updated and to fail any updates from other transactions.  I guess
I'm fine with that - it's really a question of whether you look for
conflicts during the transaction (i.e. during update() ) or after the
transaction (i.e. during commit() ).  Having it fail early would be
desirable from the application's point of view, so that makes sense to
me.
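 
Failing early could be as simple as taking a per-row write lock inside
update().  A minimal sketch (hypothetical names; locks held until
commit/rollback):

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class EarlyConflictDetector {
    // row oid -> id of the transaction currently holding the write lock
    private final Map<Long, Long> writeLocks = new ConcurrentHashMap<>();

    // Called from update(): fail fast if another transaction holds the row.
    void lockForUpdate(long oid, long txId) {
        Long holder = writeLocks.putIfAbsent(oid, txId);
        if (holder != null && holder != txId) {
            throw new IllegalStateException(
                "row " + oid + " is locked by transaction " + holder);
        }
    }

    // Called from commit()/rollback(): release everything this tx holds.
    void releaseAll(long txId) {
        writeLocks.values().removeIf(holder -> holder == txId);
    }
}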



(1) follows from ACID, right?  yup

(2) yes on long-lived transactions -- and without the kinds of buffering
problems which are currently being discussed in this list.

(3,4) see my caveat above.

(5) yes - concurrent transactions.

I think that DBCache handles all of this except (3,4) which deal with
object-level locking.



The DB Cache paper I read does not really handle #2 either...  they show
that the concept of the cache's ring buffer doesn't conflict with long
transaction strategies, but I have serious doubts about the particular
strategy they describe in the paper.  I'm still not understanding why
they would want to ever write un-committed changed pages into the DB -
all that does is make transaction isolation, and restart from a crash
during a failed transaction, a complete nightmare.  This is probably a
reflection of my lack of familiarity with the strategy, but I really
want to explore how this is going to work before we just assume that the
DB Cache implementation is the best way to go...

-bryan

-----Original Message-----
From: jdbm-developer-admin@lists.sourceforge.net
To: JDBM Developer listserv
Sent: 1/17/2006 3:04 PM
Subject: re[2]: [Jdbm-developer] DBCache discussion

Bryan-

General ramblings:

In my mind, I'm thinking about the following as a list of requirements
for multiple transaction support (in addition to the regular ACID
requirements):

1.  Once a transaction begins, the view of the data from that
transaction's perspective is guaranteed not to change (due to actions
taken by other transactions).
2.  Transactions should be able to be long (many hundreds of thousands of
record changes)
3.  If multiple transactions attempt to change the same object (row) at
the same time, the transaction that commits last will fail with a thrown
exception
4.  If multiple transactions attempt to change objects on the same page
at the same time (and those objects are not overlapping), the
transactions should both commit without error
5.  Where possible, transactions should be able to run and commit
asynchronously
6.  Others???


If page locking is occurring, then I think that the page locking should
only happen during the actual commit (not as soon as a given transaction
uses the page).  Otherwise, you have potential for nasty race conditions
that are impossible for the application to manage (you can't synchronize
on an object when the object has an implied lock due to locking on a
*different* object that happens to be stored on the same page).

This may imply synchronous commits (or asynchronous commits if there is
no overlap of affected pages, but synchronous if there is overlap?)...

Another point to consider is the possibility of using asymmetric page
operations.  Read operations occur at the page level, write operations
occur at a sub-page level.  Locks would then occur at the sub-page
level.

In this scheme, each page in a transaction would have to capture
information about which byte ranges have actually been changed...  Some
sort of versioning of page content also becomes necessary to detect
conflicts during later commits.
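 
Capturing the changed byte ranges plus a page version might look like this
(a minimal sketch, one instance per page per transaction; names are made
up):

import java.util.ArrayList;
import java.util.List;

class TxPageState {
    final long pageId;
    final long versionSeen;                  // page version when the tx first read it
    final List<int[]> changedRanges = new ArrayList<>();   // {offset, length} pairs

    TxPageState(long pageId, long versionSeen) {
        this.pageId = pageId;
        this.versionSeen = versionSeen;
    }

    void recordWrite(int offset, int length) {
        changedRanges.add(new int[] { offset, length });
    }

    // At commit: if the page has moved on since we read it, only fail when
    // the concurrently committed byte ranges actually overlap ours.
    boolean conflicts(long currentVersion, List<int[]> committedSinceRead) {
        if (currentVersion == versionSeen) return false;   // nothing changed under us
        for (int[] ours : changedRanges)
            for (int[] theirs : committedSinceRead)
                if (ours[0] < theirs[0] + theirs[1] && theirs[0] < ours[0] + ours[1])
                    return true;                           // overlapping ranges
        return false;
    }
}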

This, however, implies a decent amount of coordination between the page
manager and the record manager, which leads into Alex's comments on the
matter.


What do you guys think?  What are some other strategies for attacking
the issue of simultaneous transactions?

- K




 >
Ok.  Let's get some references on the problem and do some more reading.

I believe that the design-now vs design-later question revolves around
whether or not there is, in fact, an interaction between the "segment"
API (treats the contents of the segment (pages) as untyped data, but
reserves some bits on each page for the DBCache header) and the record
API with record-level locking.

If these are isolatable, then DBCache can be implemented without regard
to the record locking strategy.  This seems to be the tread in recent
store architectures.

-bryan

-----Original Message-----
From: Alex Boisvert
To: Thompson, Bryan B.
Cc: Kevin Day; JDBM Developer listserv
Sent: 1/17/2006 2:00 PM
Subject: Re: [Jdbm-developer] DBCache discussion

Thompson, Bryan B. wrote:
> Overall my thinking on row/page/segment locking is that we need to get
> engaged in a new transaction engine, which will be of direct benefit.
> With that in hand we can consider row locking strategies.  I would
> rather duplicate DBCache first and then examine row locking solutions.

My sentiment is that we should consider object-level locking (or
versioning) head-first.   I anticipate it would be difficult and
wasteful to retrofit a DBCache implementation with such a concept because
it necessitates a departure from page-level management where you don't
have to think much about object relocation to a system where allocation
and indexing concerns are fundamental to achieving high performance and
high levels of concurrency.

alex

