I realize that a temporal database is not for everyone.  What I would like to do is to factor out the
allocation strategy so that WORM and RW stores can be realized using the same transactional
engine.
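To make that concrete, the kind of seam I have in mind would look roughly like the following. This is a hypothetical sketch only -- none of these names exist in jdbm today -- but it shows how a RW allocator (reuse freed slots) and a WORM/temporal allocator (always append, keep old versions) could sit behind the same transactional engine.

import java.io.IOException;

// Hypothetical sketch -- not existing jdbm code.  The transactional engine
// would talk only to this interface and never care whether slots are reused
// (RW) or always appended (WORM/temporal).
public interface AllocationStrategy {

    /** Allocate space for a new record and return its physical row id. */
    long allocate(int length) throws IOException;

    /**
     * Store a new version of an existing record.  A RW strategy may overwrite
     * in place and return the same physical row id; a WORM/temporal strategy
     * would allocate a new slot, leave the old version intact, and return the
     * new physical row id.
     */
    long update(long physicalRowId, byte[] data) throws IOException;

    /** Release a record.  A WORM strategy might only record the deletion. */
    void free(long physicalRowId) throws IOException;
}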
 
-----Original Message-----
From: Kevin Day [mailto:kevin@trumpetinc.com]
Sent: Friday, January 13, 2006 9:30 AM
To: Thompson, Bryan B.; JDBM Developer listserv
Subject: re[4]: [Jdbm-developer] Some higher level thoughts for jdbm 2

Bryan-
 
I agree that providing multi-index support is not part of the core jdbm implementation, but it probably should be an optional/separate package - my strong suspicion is that a huge number of developers would benefit from it.  This is really about providing a higher level of abstraction that gives users a usable object store built on top of the jdbm engine.
 
 
Rolling back object state is a tricky business - the details of the persistence layer are supposed to be isolated as much as possible from the business logic, and requiring an application to "know" which objects have been rolled back may not be palatable.  One solution that comes to mind is to create all objects with a per-transaction scope.  Once the transaction is closed (or rolled back/failed/etc...), the object references would no longer be valid - this is similar to what JDO does.  This has some implications for how objects stored in jdbm are used in application code (object references would have to *always* be transient, and an application would not be able to use regular Java synchronization semantics to manage access to the objects).
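To sketch what I mean by per-transaction scope (the Transaction type and its methods below are purely hypothetical -- this is not the current jdbm API), application code would look something like this, and the fetched reference would simply stop being valid once the transaction ends:

import java.io.IOException;

// Hypothetical sketch only -- jdbm has no Transaction type like this today.
interface Transaction {
    Object fetch(long recid) throws IOException;
    void update(long recid, Object obj) throws IOException;
    void commit() throws IOException;
    void rollback();
}

class TransactionScopeExample {

    // 'Account' stands in for any application object stored in jdbm.
    static class Account implements java.io.Serializable {
        long balance;
    }

    static void credit(Transaction txn, long recid, long amount) throws IOException {
        try {
            Account a = (Account) txn.fetch(recid);   // valid only inside txn
            a.balance += amount;
            txn.update(recid, a);
            txn.commit();
        } catch (IOException e) {
            txn.rollback();                           // 'a' must not be reused now
            throw e;
        }
    }
}

Note that under this model the application could never stash 'a' in a long-lived field or synchronize on it across transactions.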
 
I've started thinking about how to handle transactional control over the translation pages, and I don't have any good suggestions - I'll keep noodling on it...
 
Use of a temporal data strategy (pretty close to WORM behavior) would make jdbm considerably less attractive for my projects (and probably many other developers).  There may be advantages to this strategy (I think I'd have to have them spelled out explicitly), but without a mechanism for controlling file growth, I view it as a non-starter...  If there are significant advantages to this strategy, then perhaps there is a hybrid approach - use a temporal implementation but allow some sort of packing algorithm to keep the file size down?
 
 
- Kevin
 
 
> Common practice is to not implement object locking.  DBCache will roll back a transaction if it requires a concurrent
write on the same page as another transaction.  This is especially problematic for the pages containing the logical to
physical row id mapping.  My thought there was to allocate new logical ids from a translation page dedicated to a given
transaction, but that will not help with updating the translation page entry mapping a logical row id to a different
physical row id.
 
Some commercial OODBMS systems of which I am aware handle locking at the segment level (a segment being some number of
pages) and provide a many-reader, one-writer strategy in which the read state of the segment remains consistent until (a)
the readers have closed their transactions and (b) the writer has committed its transaction.  At that point the new state
of the store is made available to the next reader.
 
These are important design decisions.  Another one which is related is whether to support a "temporal database" in which
old records are not overwritten and the historical consistent states of the store can always be accessed.  Persistent
allocation policies such as this interact with transactional semantics and locking in interesting ways.  We need to figure
out which set of policies has the "right" behavior.
 
I agree that the BTree needs a transient listener mechanism so that cursors over the btree can remain synchronized with
structural changes to the btree.  This is near the top of my list of features to be implemented.  We should also support
btree removal -- currently the code does not clear out the nodes of the btree.
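As a strawman (hypothetical interface, not existing jdbm code), the transient listener could be as simple as:

// Strawman only -- jdbm's BTree has no such listener today.  A browser/cursor
// could register one of these and repair its position when the tree changes
// underneath it, instead of failing unpredictably.
public interface BTreeListener {

    /** A key/value pair was inserted into the btree. */
    void entryInserted(Object key, Object value);

    /** A key/value pair was removed from the btree. */
    void entryRemoved(Object key);

    /** A page split, merge, or rebalance changed the structure of the tree. */
    void structureChanged();
}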
 
I think that indices which are synchronized with the state of records fall outside of what I see as the boundaries of jdbm.
I think that this is the purview of an application framework which utilizes jdbm.
 
I believe that the common practice for rolling back object state is to release all objects whose state is now invalid and
to start a new transaction in which you read the state of persistent objects from the store.
 
-bryan
 
-----Original Message-----
From:  jdbm-developer-admin@lists.sourceforge.net  [mailto:jdbm-developer-admin@lists.sourceforge.net] On Behalf Of Kevin  Day
Sent: Thursday, January 12, 2006 6:19 PM
To: JDBM  Developer listserv
Subject: re[2]: [Jdbm-developer] Some higher level  thoughts for jdbm 2


Alex-
 

OK - my thinking and yours are in line.  The vetoable change listener is really centered around enforcing business rules (basically, these would be value constraints) - throwing a runtime exception for this kind of thing would make sense.
 
 
 
One other line of thought on object transactions:

If we cancel a transaction, what mechanism is there to roll back the object state (and not just the state of the data in the database)?
 
In JDO, they actually re-write the byte code to allow for resetting object state in a failed transaction, but I don't think that jdbm wants to go there (that starts to void the "lightweight" aspect)...
 
This has always been the sticking point in my mind when the concept of rolling back transactions has been raised - I can easily see rolling back the data file, but trying to do this at the object level may not be possible without code injection (or some nasty reflection mechanisms, which would throw out the entire current jdbm serialization mechanism).  Or maybe there is a way using reflection to completely replace the state of a given object with that of another object...  hmmmm....
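Just to convince myself it is even possible, the reflection approach would look roughly like this -- a sketch only, glossing over final fields, security managers, and object identity:

import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

// Sketch of the "replace object state via reflection" idea: copy every
// instance field (walking up the class hierarchy) from 'source' into
// 'target'.  Not a recommendation -- just a feasibility check.
public final class StateCopier {

    public static void copyState(Object target, Object source)
            throws IllegalAccessException {
        if (target.getClass() != source.getClass()) {
            throw new IllegalArgumentException("objects must be of the same class");
        }
        for (Class<?> c = target.getClass(); c != null; c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                int mod = f.getModifiers();
                if (Modifier.isStatic(mod) || Modifier.isTransient(mod)) {
                    continue;   // skip class-level and non-persistent state
                }
                f.setAccessible(true);
                f.set(target, f.get(source));
            }
        }
    }

    private StateCopier() {
    }
}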
 
 
 
My particular applications have never had a need for concurrent modifications (the current synchronization code works fine for us, but we aren't going for performance speed records, either), so I haven't really considered the impact of all of this on reverse indexes.  If you have multiple threads concurrently modifying indexes like I'm talking about, then you have a nightmare on your hands - a change to a given object could cause 8 or 10 structural changes to a collection, and those changes could easily conflict with changes made by another thread...
 
In my mind, I've always envisioned a locking mechanism that allows many reads, but only one write at a given time...  The reads would all query the state of the data store *before* the write transaction started.  From my reading it sounds like this is how MVCC works.
 
What was not clear from my reading was whether it was possible for multiple update transactions to occur simultaneously...
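Just to have something concrete to poke at, here is a bare-bones sketch of the many-readers / one-writer idea ('Snapshot' is a stand-in for whatever immutable view of the store we would actually build; this is not a proposal for the jdbm API):

import java.util.concurrent.locks.ReentrantLock;

// Bare-bones sketch: readers always see the last committed snapshot and never
// block; a single writer at a time builds the next snapshot and publishes it
// atomically on commit.
public class SingleWriterStore<Snapshot> {

    private volatile Snapshot committed;            // state visible to readers
    private final ReentrantLock writeLock = new ReentrantLock();

    public SingleWriterStore(Snapshot initial) {
        this.committed = initial;
    }

    /** Readers get whatever was committed last; they never block. */
    public Snapshot read() {
        return committed;
    }

    /** Only one writer may be active; a second writer waits here. */
    public Snapshot beginWrite() {
        writeLock.lock();
        return committed;                           // base the new state on this
    }

    /** Publish the new state; readers that arrive after this point see it. */
    public void commitWrite(Snapshot next) {
        committed = next;
        writeLock.unlock();
    }

    /** Abandon the write; the committed state is untouched. */
    public void abortWrite() {
        writeLock.unlock();
    }
}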
 
If simultaneous updates are a requirement, then I suppose one option would be to construct a sync/locking mechanism at the object level, and have updates proceed until they hit a lock.  The problem with that is that it is almost impossible to prevent race conditions...
 
 
I suppose I need to grab a hold of the dbCache paper and read up on how they are approaching the problem of simultaneous update transactions, because there is nothing at all obvious to me about how to make such a beast work.
 
 
Cheers all,
 
- K
 
 
  
> Kevin Day wrote:
> Yes - I was thinking of adding a listener interface to the BTree to
> accomplish #1.  I think we need something like this anyway, because the
> current tuple iterator interface fails unpredictably if the state of the
> BTree changes at any point after the iterator is created.

The way I see it, BTree is a specific case of a collection.  We may
want to have listeners on any collection (btree, hash, bag, list...)

> #2 is a very tricky wicket - because we are working with cohesive
> objects and not rows with fields, the level of atomic change is quite
> coarse relative to a regular table based database.  We currently have no
> mechanism for updating an object once it has been created, so I don't
> see a decent way of handling the situation where two simultaneous
> processes work on the same object, but in isolation from each other -
> and still provide a mechanism for conflict resolution - especially if
> two changes to the same object are compatible with each other.

Yes, it's tricky and complex.  Not easy at all in fact.  But that's what
you need if you really want to process concurrent transactions on the
same data structures, e.g. processing multiple orders at the same time.

> I suppose the object caches could be created thread local, but then you
> have two copies of the same object flying around...

Yes, that's what I had in mind.  I don't know of a better solution, unless
you know the transaction is read-only, in which case you can share
objects between transactions (threads).
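Something like this is what I have in mind for per-thread caches -- a sketch only, ignoring eviction, write-back and cache size limits:

import java.util.HashMap;
import java.util.Map;

// Sketch of a per-thread (per-transaction) object cache: each thread gets its
// own recid -> object map, so two concurrent transactions never share a
// mutable instance.
public class ThreadLocalObjectCache {

    private final ThreadLocal<Map<Long, Object>> cache =
            new ThreadLocal<Map<Long, Object>>() {
                protected Map<Long, Object> initialValue() {
                    return new HashMap<Long, Object>();
                }
            };

    public Object get(long recid) {
        return cache.get().get(Long.valueOf(recid));
    }

    public void put(long recid, Object obj) {
        cache.get().put(Long.valueOf(recid), obj);
    }

    /** Call when the transaction ends so stale objects cannot leak across. */
    public void clear() {
        cache.get().clear();
    }
}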

> With respect to your suggestion of vetoable listeners:  I'm trying to
> wrap my head around this...  The changes that would be vetoed would be
> the changes to the object itself (not just its indexes).  If a veto
> occurred, then the application would have to throw out all copies of its
> current instance of the object and re-retrieve it from the data store.
> This seems to me like a huge burden on the user of the API (basically,
> every call to the record manager could throw a ChangeVetoedException
> that the application would have to properly handle).

A listener could veto the addition, update or removal of an object from
a collection to support domain rules, e.g. a customer may have a
maximum of 10 outstanding orders, totaling no more than $100K.

The veto hopefully happens before any changes are done to the data
structures, such that the application may be able to recover from it.
Integrity checks are always performed before commit and therefore would
throw an exception.
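As a sketch (these listener types do not exist in jdbm -- they only illustrate the idea, and Customer/Order are minimal stand-ins), the order-limit rule could be expressed like this:

// Illustration only -- hypothetical listener API, not existing jdbm code.
interface VetoableCollectionListener {
    /** Throw ChangeVetoedException to reject an insertion before the collection is modified. */
    void beforeInsert(Object key, Object value);
}

class ChangeVetoedException extends RuntimeException {
    ChangeVetoedException(String msg) {
        super(msg);
    }
}

class Customer {
    int outstandingOrderCount;
    long outstandingTotalDollars;
}

class Order {
    long amountDollars;
}

// Domain rule: at most 10 outstanding orders, totaling no more than $100K.
class OutstandingOrderRule implements VetoableCollectionListener {

    private final Customer customer;

    OutstandingOrderRule(Customer customer) {
        this.customer = customer;
    }

    public void beforeInsert(Object key, Object value) {
        Order order = (Order) value;
        if (customer.outstandingOrderCount >= 10
                || customer.outstandingTotalDollars + order.amountDollars > 100000L) {
            throw new ChangeVetoedException("order limit exceeded for customer");
        }
    }
}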

It's now common practice for these exceptions to be runtime exceptions
(as in Spring's JDBC framework).  The application is able to catch them
if desired and may recover from them if it has the necessary logic to do
so.  Nonetheless, a transaction should never commit if one of its
integrity checks fails.  If the exception is not caught, the transaction
should be rolled back.

We could offer mechanisms to do early (explicit) integrity checks, but
generally speaking, the application should be able to perform the checks
before doing updates.  Again, this is a common practice in
well-designed transaction systems.

Makes sense?

alex
