Thread: RE: [Jdbm-general] Sharing objects


I agree that the core should be kept small. However, I think it can be made
smaller. When I say a simple persistence mechanism can be built from a small
core, I mean that the existing functionality can be developed on top of a
core implementation that operates simply on byte arrays.

Although RecordManager does operate on byte arrays or objects, BTree and HTree
cannot operate efficiently on byte arrays. If they did, it would help pave
the way for other kinds of persistence models. For now, what I would like is an
efficient mapping of byte array keys to values. I don't want the overhead of
object serialization because I don't need it.
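
One detail such a mapping has to handle: Java's byte[] uses identity-based equals() and hashCode(), so a store keyed on byte arrays needs to compare keys by content. The following is a minimal sketch of that idea using an in-memory TreeMap; the class name ByteArrayKeyMap and its methods are hypothetical and are not part of JDBM.

```java
import java.util.Comparator;
import java.util.TreeMap;

// Hypothetical illustration only; not part of JDBM.
class ByteArrayKeyMap {
    // Lexicographic order, treating each byte as unsigned (0..255).
    static final Comparator<byte[]> UNSIGNED_LEXICOGRAPHIC = (a, b) -> {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        // A shorter array that is a prefix of the other sorts first.
        return a.length - b.length;
    };

    private final TreeMap<byte[], byte[]> entries =
            new TreeMap<>(UNSIGNED_LEXICOGRAPHIC);

    // Copies are stored so later changes to the caller's arrays
    // cannot corrupt the ordering.
    void put(byte[] key, byte[] value) {
        entries.put(key.clone(), value.clone());
    }

    // Returns a copy of the stored value, or null if the key is absent.
    byte[] get(byte[] key) {
        byte[] value = entries.get(key);
        return value == null ? null : value.clone();
    }

    public static void main(String[] args) {
        ByteArrayKeyMap map = new ByteArrayKeyMap();
        map.put(new byte[] {1, 2}, new byte[] {42});
        // A different array instance with the same content finds the entry.
        System.out.println(map.get(new byte[] {1, 2}) != null);
    }
}
```

No serialization is involved at this level; an object layer could encode keys and values to byte[] before calling put().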

What I see in version 0.11 is three layers. JDBMHashtable -> HTree ->
RecordManager. JDBMHashtable offers the most convenience, and most users
should start there. More advanced users will look to the BTree or even the
RecordManager directly. Things that I'd like to try out:

1: More use of interfaces. For example, with a RecordManager interface,
caching layers can be designed as wrappers instead of built directly into
it. In addition, different kinds of RecordManagers could be plugged into the
trees.
2: Use of JDK1.2 collections interfaces where appropriate. You've also
mentioned this.
3: Common interface for BTree and HTree.
4: Split BTree and HTree layers into two. One operates on byte arrays, and
the other operates on objects.
5: Explore sharing object references among tree nodes.
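
To make item 1 concrete, here is a rough sketch of a caching layer written as a wrapper around a record-store interface. The names ByteRecordStore, InMemoryStore and CachingStore are made up for illustration and do not reflect the actual RecordManager API in version 0.11; the cache is also unbounded, which a real one would not be.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical interface; NOT JDBM's actual RecordManager API.
interface ByteRecordStore {
    long insert(byte[] data);
    byte[] fetch(long recid);
    void update(long recid, byte[] data);
}

// Stand-in for a disk-backed store so the sketch is self-contained.
class InMemoryStore implements ByteRecordStore {
    private final Map<Long, byte[]> records = new HashMap<>();
    private long nextId = 1;
    int fetchCount = 0; // counts how often the underlying store is read

    public long insert(byte[] data) {
        long id = nextId++;
        records.put(id, data.clone());
        return id;
    }

    public byte[] fetch(long recid) {
        fetchCount++;
        byte[] data = records.get(recid);
        return data == null ? null : data.clone();
    }

    public void update(long recid, byte[] data) {
        records.put(recid, data.clone());
    }
}

// Caching as a wrapper: it implements the same interface and delegates,
// so it can be layered over any other ByteRecordStore.
class CachingStore implements ByteRecordStore {
    private final ByteRecordStore delegate;
    private final Map<Long, byte[]> cache = new HashMap<>();

    CachingStore(ByteRecordStore delegate) {
        this.delegate = delegate;
    }

    public long insert(byte[] data) {
        long id = delegate.insert(data);
        cache.put(id, data.clone());
        return id;
    }

    public byte[] fetch(long recid) {
        byte[] cached = cache.get(recid);
        if (cached == null) {
            cached = delegate.fetch(recid);
            if (cached == null) {
                return null;
            }
            cache.put(recid, cached);
        }
        return cached.clone();
    }

    public void update(long recid, byte[] data) {
        delegate.update(recid, data); // write through to the delegate
        cache.put(recid, data.clone());
    }
}
```

A tree written against ByteRecordStore would not need to know whether it was handed an InMemoryStore, a CachingStore, or some other implementation.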

As JDBM becomes more popular it will inevitably grow larger. A cleaner, more
flexible foundation will make this growth much easier. The current
version number is very low, which indicates to me that you haven't yet
locked into any designs.

-----Original Message-----
From: Alex Boisvert [mailto:boi...@in...]
Sent: Monday, June 25, 2001 08:54 PM
To: Brian O'Neill
Cc: jdb...@li...
Subject: Re: [Jdbm-general] Sharing objects

Brian O'Neill wrote:

> I think JDBM is a great little persistence engine with tons of potential.
> It's nice to see a project like this being actively developed. I do see
> some things that can be improved.

I'm certainly open to improvements, either to the existing core or 
adding extra functionality.

Something I've articulated in the past is that I'd like to see the core 
of JDBM remain small, so that it remains an interesting choice for small 
projects that only want a simple persistence engine.  However, we can 
bundle a number of optional helpers/utilities around it to make it 
attractive to those who need a few more features.

For example, the support for (or emulating) the Java2 Collection classes 
has been discussed here in the past and I still believe it would make a 
nice addition.

> When I save a key or value into a JDBM table, the object is serialized in
> its own stream. If this object has a reference to another shared object, the
> original object graph is not preserved across hash table entries. The
> approach that JDBM uses to serialize also has space overhead from all the
> stream headers and class info written to each record.

You're right.  Currently, the object is serialized in its own stream 
(along with its complete object graph) and converted into a byte[] for 
storage in the RecordManager.
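
The effect on shared references can be reproduced with plain java.io serialization, independent of JDBM. The sketch below uses hypothetical Person and Address classes: serializing two objects in separate streams gives each its own copy of the shared object, while serializing them together in one stream keeps the shared reference.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Illustration with made-up classes; uses only standard java.io calls.
class SharedReferenceDemo {
    static class Address implements Serializable {
        private static final long serialVersionUID = 1L;
        final String city;
        Address(String city) { this.city = city; }
    }

    static class Person implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final Address home;
        Person(String name, Address home) { this.name = name; this.home = home; }
    }

    // Serializes one object graph into its own stream.
    static byte[] toBytes(Object o) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
                out.writeObject(o);
            }
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static Object fromBytes(byte[] bytes) {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // One stream per object, as when each is stored as its own record.
    static boolean sharedAfterSeparateStreams() {
        Address shared = new Address("Springfield");
        Person a = (Person) fromBytes(toBytes(new Person("a", shared)));
        Person b = (Person) fromBytes(toBytes(new Person("b", shared)));
        return a.home == b.home;
    }

    // Both objects written in a single stream.
    static boolean sharedAfterSingleStream() {
        Address shared = new Address("Springfield");
        Person[] both = {new Person("a", shared), new Person("b", shared)};
        Person[] copy = (Person[]) fromBytes(toBytes(both));
        return copy[0].home == copy[1].home;
    }

    public static void main(String[] args) {
        System.out.println("separate streams: " + sharedAfterSeparateStreams());
        System.out.println("single stream:    " + sharedAfterSingleStream());
    }
}
```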

Like you mention, this implies an overhead for each object placed into 
the hash table.  This overhead is roughly 25 bytes for user-defined 
serialized objects.  For strings, it's 7 bytes (including string 
length).   It's not so bad, but can be a concern for databases where a large 
number of small objects are stored.
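
These figures can be checked with standard java.io serialization. The sketch below is not JDBM code: it writes a value into its own ObjectOutputStream and counts the bytes. For a short ASCII string the difference between the serialized size and the character data is the 4-byte stream header, a 1-byte type tag and a 2-byte length. The Point class is a made-up example; the size of a user-defined object depends on its class name and fields, so no particular number is claimed for it here.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Measures standard Java serialization sizes; not JDBM code.
class SerializationOverhead {
    // Made-up example class for comparison with strings.
    static class Point implements Serializable {
        private static final long serialVersionUID = 1L;
        final int x;
        final int y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    // Number of bytes produced when the object gets its own stream.
    static int serializedSize(Object o) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
                out.writeObject(o);
            }
            return buf.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Bytes beyond the character data. Meaningful for short ASCII strings,
    // where the stream's encoding and UTF-8 have the same length.
    static int stringOverhead(String s) {
        return serializedSize(s) - s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println("string overhead: " + stringOverhead("hello"));
        // 8 bytes of field data (two ints); the rest is header and class info.
        System.out.println("Point total size: " + serializedSize(new Point(1, 2)));
    }
}
```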

> I like the simple interfaces provided in the HTree and BTree, but since they
> do nothing special to preserve object graphs, I think an interface that
> operates on byte[] keys and values is more flexible. A simple object
> serialization strategy could be placed above this level, or a more
> sophisticated persistence model could be developed with less storage
> overhead.

Well, from my experience, the 'simple serialization strategy' you 
mention can become quite complex, depending on what exactly you have in 
mind.  I'm not against the approach.  In fact, I'd like to have such a 
feature if it really is not too complicated and doesn't impact the size 
of the core too much.

If you have ideas in mind, please send them on this list for discussion.

> Separating the HTree and BTree from the RecordManager is really nice. I
> think more levels of layering will make it easier to develop many kinds of
> persistence models. Making object persistence available at the RecordManager
> level might not be the best place for this, since I consider this to be a
> very "high level" function.

Again, the current object persistence in the RecordManager is really 
just a utility function which serializes the object into a byte[].  If 
you want more functionality, you build it on top.

This goes for any type of service generally found in OODBMS such as 
collection classes, concurrency policy (pessimistic/optimistic locking), 
transaction models (nesting, rollback ability, ...), etc.

I've argued in the past that many of these features should be part of a 
separate project, in order to keep JDBM small and simple.  I guess it 
depends on the scope of the overall project and whether you want to 
create a full-blown OODBMS or simply extend JDBM in ways that are 
compatible with its original objectives.

> 
> I'm going to start mucking with JDBM, to see how feasible it is to have
> higher level tree implementations built upon simpler ones. Has there been
> any other talk of this, or have there been any such implementations?
> 

The Exolab folks have built some services on top of JDBM in the past.  I 
remember seeing locking and extended transaction management in the core 
library.  I should check again to see if we could merge some of the code 
back into JDBM.

cheers,
alex
