<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD>
<META http-equiv=Content-Type content=text/html;charset=ISO-8859-1>
<META content="MSHTML 6.00.2900.3059" name=GENERATOR></HEAD>
<BODY text=#000000 bgColor=#ffffff leftMargin=1 topMargin=1 rightMargin=1><FONT face=Arial size=2>
<DIV>There were discussions a while back about adding another container-type object to jdbm - I can't remember what Bryan called them - but the idea was to provide list storage using a linked list of paged elements. It's basically a degenerate B+Tree. This kind of thing would be fairly simple to implement, and would be similar to what your associate described (except that all of the pages of the list would be in a single jdbm database file).</DIV>
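<DIV>A minimal sketch of the paged list idea, in Java and kept in memory only. The names PagedList and Page are hypothetical and not part of jdbm; in a real implementation each page would presumably be stored as its own record, with the link held as the record id of the next page rather than an object reference.</DIV>
<PRE>
```java
// Hypothetical sketch: PagedList and Page are illustrative names, not jdbm classes.
// In a real store each Page would be one record, and `next` would hold the
// record id of the following page instead of an object reference.
final class PagedList {
    private static final class Page {
        final long[] items;   // fixed-capacity slot array
        int count;            // number of slots in use
        Page next;            // following page, or null at the tail

        Page(int capacity) { items = new long[capacity]; }
    }

    private final int pageSize;
    private final Page head;
    private Page tail;
    private int size;

    PagedList(int pageSize) {
        if (pageSize < 1) throw new IllegalArgumentException("pageSize must be positive");
        this.pageSize = pageSize;
        this.head = new Page(pageSize);
        this.tail = head;
    }

    // Appends to the tail page, chaining a new page when the tail is full.
    void add(long value) {
        if (tail.count == pageSize) {
            Page p = new Page(pageSize);
            tail.next = p;
            tail = p;
        }
        tail.items[tail.count++] = value;
        size++;
    }

    // Walks the page chain; only one page needs to be resident at a time.
    long get(int index) {
        if (index < 0 || index >= size) throw new IndexOutOfBoundsException("index " + index);
        Page p = head;
        int remaining = index;
        while (remaining >= p.count) {
            remaining -= p.count;
            p = p.next;
        }
        return p.items[remaining];
    }

    int size() { return size; }

    // Number of pages in the chain, including a partially filled tail.
    int pageCount() {
        int n = 0;
        for (Page p = head; p != null; p = p.next) n++;
        return n;
    }
}
```
</PRE>
<DIV>Lookups by position are linear in the number of pages, which is the price of dropping the interior nodes of the B+Tree; appends touch only the tail page.</DIV>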
<DIV> </DIV>
<DIV>The idea of using linear, write-only strategies for journaling was much discussed (and is still, in my opinion, a very valid option) for some future jdbm log file implementation. I believe that there are a ton of emails about this in the developers listserv history. I know that Bryan had some fascinating research articles on the subject.</DIV>
<DIV> </DIV>
<DIV> </DIV>
<DIV>In terms of meta data, in one of our apps, we do capture in-memory meta data models of records as the records are read from disk. This is done in a layer above jdbm. In our case, meta data is constructed entirely from the record contents, so there's no need to store it, per se. We actually use BTrees to capture indexes of records stored directly into the record manager, so the meta data is necessary for updating the indexes when records change.</DIV>
<DIV> </DIV>
<DIV>- K<BR></DIV>
<DIV> </DIV></FONT>
<DIV style="FONT-SIZE: x-small; FONT-FAMILY: Tahoma">
<DIV>----------------------- <B>Original Message</B> -----------------------</DIV>
<DIV> </DIV>
<DIV><B>From:</B> Mark Proctor &lt;mproctor@...&gt;</DIV>
<DIV><B>To:</B> "jdbm-general@... &gt;&gt; JDBM General listserv" &lt;jdbm-general@...&gt;</DIV>
<DIV><B>Cc:</B> </DIV>
<DIV><B>Date:</B> Tue, 17 Jun 2008 19:01:25 +0100</DIV>
<DIV><B>Subject: <U>[Jdbm-general] how JDBM works (was extensible serializer)</U></B></DIV>
<DIV> </DIV></DIV>Mark Proctor wrote:
<BLOCKQUOTE cite=mid:4857619A.2070105@... type="cite">Kevin Day wrote:
<BLOCKQUOTE cite=mid:RERPQVMzRShWVmBVJi0+MjA3MjA0OTQ4NA@... type="cite"><PRE wrap="">Cees-
I have no objections. I actually did a pretty thorough code review of Bryan's work and found his changes to be very well thought through. Haven't used it in production yet, but probably will. Also, as he points out, it's easy to drop back to the old serialization mechanism (or even the *shudder* Java serialization mechanism).
</PRE></BLOCKQUOTE>For me, I just want to be able to efficiently write my own byte[], without having to go through any serialisation mechanism. When I last looked, the Serializer interface allowed this, so it should be fine. <BR>
<BLOCKQUOTE cite=mid:RERPQVMzRShWVmBVJi0+MjA3MjA0OTQ4NA@... type="cite"><PRE wrap="">It would be nice to be able to factor things so alternative serializers could be specified, but as Bryan points out, a *lot* of tweaks and hooks were required to add this one other serialization mechanism. I suspect that a serious refactoring of the entire jdbm codebase would be required to make it truly pluggable (and there are, I believe, more desirable goals for future development - like true transaction isolation and rollback support).
</PRE></BLOCKQUOTE>Getting JDBM to play in JTA transactions is important. Someone said in an old posting that they had created some code for this and submitted it - anyone know what happened to that?<BR><BR>Interestingly, the other day I was talking to someone who built a key/value store for journal logging. They used to use BDB, but moved away as the B-tree was overkill. They built a system that creates 10 x 10 MB files (configurable). You insert a byte[] and it returns a long handle. The byte[] is written to the head location, which points into a specific file; if there isn't enough space left, the head moves to the next file. Entries never span files, and the leftover free space is never written to later, because they do not seek: they only ever write at the head location. While you can remove entries, the file space of an individual entry is not reclaimed; a file is deleted only once all of its entries are marked as removed. They claim this approach gives blisteringly fast speed, as there is no seek at write time and only one seek at read time to find the start position, because they do not fragment their files to fit entries into gaps. Most journalling systems have entries numbering perhaps only in the hundreds, and those entries won't exist forever, so you get something that is less efficient with space but much faster. I was wondering whether the RecordManager in JDBM could be extended to do something similar, as another possible backend?<BR><BR>Talking of which, would it be possible for someone to write some docs on how RecordManager and JDBM work in general?<BR></BLOCKQUOTE>Chatting to my colleague about his approach: his stuff is here <A class=moz-txt-link-freetext href="http://anonsvn.jboss.org/repos/messaging/trunk/src/main/org/jboss/messaging/core/journal/">http://anonsvn.jboss.org/repos/messaging/trunk/src/main/org/jboss/messaging/core/journal/</A>. The files are made small enough to fit into a disk cylinder and each one is append-only, which maximises throughput. It is geared more towards journalling, but it does support add, remove and update of long/value pairs. They have record types, so the TX info can be appended to the same log file, avoiding the seek between the .db and .log files. Anyway, I thought a single append-only log might make a good "optimisation" backend for JDBM.<BR><BR>
I've been going through the JDBM code and it's quite well written, so I'm able to understand it. I can see that RecordManager can be used directly without BTree, as a basic store using a long key, which is interesting. There seems to be no repeated disk seeking on write, as the location is determined in memory and then it's one continuous write. A delete, again, is an in-memory lookup and a single write to mark the location free; it doesn't have to actually free all the bytes on disk. This looks pretty optimal to me. Buffering is optional; out of interest, when does it pay off? I can't imagine the logical-to-physical mapping has any measurable cost. The downside is that the byte[] size needs to be known ahead of time, so async streaming for writes won't work. I saw that the docs mention replacing some of the code with DBCache, but I couldn't find the mailing list discussion on this - any details?<BR><BR>
There is obviously the issue of multithreading. Does the move to JDK1.5 help with that? JDBM should probably at least allow multiple concurrent reads, regardless of writes, somewhat like ConcurrentHashMap. I'm guessing that at this point it might be preferable to split up dbs, to allow concurrent writes too, via striping?<BR><BR>
One of the use cases I'd like to support is in-memory metadata for each long, without having to iterate over the entire .db reading in all files. I will probably do this as a second db that holds only metadata, although then I need to get transactions to span both dbs. This would hold the long id to the read data. I'd then keep a permanent in-memory cache of data, as we'd only ever have a few hundred items anyway. I'm wondering if the idea of "meta" data for records could be built into the main .db. It should then be possible to start up a record manager and load in all the metadata. This way the metadata and the record can be written contiguously, together.<BR><BR>Mark<BR>
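<DIV>A minimal in-memory sketch of the append-only journal described above, assuming byte arrays in place of the fixed-size files and a handle that packs the segment number and offset into a long. AppendOnlyJournal and all of its members are hypothetical names; this is neither the JBoss Messaging journal nor a jdbm backend, and the removal bookkeeping here lives only in memory, whereas a real journal would presumably append a delete record instead.</DIV>
<PRE>
```java
import java.util.Arrays;
import java.util.BitSet;

// Hypothetical sketch: not jdbm or JBoss Messaging code.
// Byte arrays stand in for the fixed-size journal files.
final class AppendOnlyJournal {
    private static final int HEADER = 4;   // 4-byte length prefix per entry

    private final byte[][] segments;       // one array per "file"
    private final int[] used;              // bytes written in each segment
    private final int[] live;              // entries not yet removed, per segment
    private final BitSet[] removed;        // offsets of removed entries, per segment
    private int head;                      // segment currently appended to

    AppendOnlyJournal(int segmentCount, int segmentSize) {
        if (segmentCount < 1 || segmentSize <= HEADER) {
            throw new IllegalArgumentException("need at least one usable segment");
        }
        segments = new byte[segmentCount][segmentSize];
        used = new int[segmentCount];
        live = new int[segmentCount];
        removed = new BitSet[segmentCount];
        for (int i = 0; i < segmentCount; i++) removed[i] = new BitSet();
    }

    // Writes at the head only; an entry never spans two segments.
    long append(byte[] data) {
        int need = HEADER + data.length;
        if (need > segments[head].length) {
            throw new IllegalArgumentException("entry larger than a segment");
        }
        if (used[head] + need > segments[head].length) {
            int next = (head + 1) % segments.length;
            if (live[next] != 0) throw new IllegalStateException("next segment still in use");
            reclaim(next);
            head = next;                   // the tail of the old segment stays unused
        }
        int off = used[head];
        byte[] seg = segments[head];
        seg[off] = (byte) (data.length >>> 24);
        seg[off + 1] = (byte) (data.length >>> 16);
        seg[off + 2] = (byte) (data.length >>> 8);
        seg[off + 3] = (byte) data.length;
        System.arraycopy(data, 0, seg, off + HEADER, data.length);
        used[head] = off + need;
        live[head]++;
        return ((long) head << 32) | off;  // handle = segment number + offset
    }

    // One position lookup, then a contiguous read of the whole entry.
    byte[] read(long handle) {
        int seg = (int) (handle >>> 32);
        int off = (int) handle;
        if (!isLive(seg, off)) throw new IllegalArgumentException("no live entry for " + handle);
        byte[] s = segments[seg];
        int len = ((s[off] & 0xff) << 24) | ((s[off + 1] & 0xff) << 16)
                | ((s[off + 2] & 0xff) << 8) | (s[off + 3] & 0xff);
        return Arrays.copyOfRange(s, off + HEADER, off + HEADER + len);
    }

    // Marks the entry removed; space is reclaimed only per whole segment.
    boolean remove(long handle) {
        int seg = (int) (handle >>> 32);
        int off = (int) handle;
        if (!isLive(seg, off)) return false;
        removed[seg].set(off);
        if (--live[seg] == 0 && seg != head) reclaim(seg);
        return true;
    }

    boolean isEmpty(int seg) { return used[seg] == 0; }

    private boolean isLive(int seg, int off) {
        if (seg < 0 || seg >= segments.length) return false;
        if (off < 0 || off >= used[seg]) return false;
        return !removed[seg].get(off);
    }

    private void reclaim(int seg) {
        used[seg] = 0;
        live[seg] = 0;
        removed[seg].clear();
    }
}
```
</PRE>
<DIV>The sketch does not guard against a stale handle that points into a segment which has since been reclaimed and reused; a generation counter in the handle would be one way to cover that.</DIV>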
<BLOCKQUOTE cite=mid:4857619A.2070105@... type="cite">
<BLOCKQUOTE cite=mid:RERPQVMzRShWVmBVJi0+MjA3MjA0OTQ4NA@... type="cite"><PRE wrap="">- K
Kevin Day
Trumpet, Inc.
<A class=moz-txt-link-abbreviated href="http://www.trumpetinc.com" moz-do-not-send="true">www.trumpetinc.com</A>
kevin@...
480-961-6003 x1002
----------------------- Original Message -----------------------
From: "Cees de Groot" &lt;cdegroot@...&gt;
To: "Bryan Thompson" &lt;bryan@...&gt;
Cc: Mark Proctor &lt;mproctor@...&gt;, jdbm-general@..., cowtowncoder@...
Date: Tue, 10 Jun 2008 20:34:25 +0200
Subject: Re: [Jdbm-general] extensible serializer
If the license issue is sorted, I'm more than happy to keep it in. The
original JDBM interfaces are still there, and I'll just move it to
jdbm.extser or similar.
Any objections? I mean, upside, no license problems, and through code
inclusion no dependency issues. Sounds ok to me...
On Tue, Jun 10, 2008 at 7:11 PM, Bryan Thompson &lt;bryan@...&gt; wrote:
</PRE>
<BLOCKQUOTE type="cite"><PRE wrap="">Interesting.
I had to do some significant work in order to get jdbm to persist the
serializer state. I would suggest that you look at the code more carefully
rather than just rolling it back. Assuming that xstream uses a stateful
serializer, you are going to want to preserve the integration points.
By "stateful" serializer, I mean one that maintains persistent state NOT
recorded in the individual serialized records. extser factors out what
serializers are declared, the class ids (int's) assigned to each class for
which there is a registered serializer, and the corresponding serializer
version(s) and puts that all into a persistent record accessed off of one of
the named roots for the store. This makes it extremely compact when
serializing object graphs. The shared state is all factored out. In order
to support that I had to put in a bunch of hooks that you will want to keep
around.
Another issue with versioned serializers is that they basically have to be
inner classes in order to access the various fields (unless you want the
overhead of reflection during serialization!). One of the changes that I
introduced with the extser integration was transparent versioning for the
btree nodes and leaves for stores that choose to enable extser. If that
forward versioning is important then you are going to wind up with
something that's tightly coupled regardless.
If the broader issue is the dependency, then Cees already imported extser
and I "authorize" its relicensing under the license for the jdbm project.
-bryan
________________________________
From: jdbm-general-bounces@...
[mailto:jdbm-general-bounces@...] On Behalf Of Mark
Proctor
Sent: Tuesday, June 10, 2008 12:54 PM
To: Bryan Thompson
Cc: 'Cees de Groot'; jdbm-general@...;
cowtowncoder@...
Subject: Re: [Jdbm-general] extensible serializer
Bryan Thompson wrote:
Well, easy come, easy go - but you might want to see who's using it before
you drop it out.
There is zero overhead when extser is not enabled.
It's not so much the overhead, it's the extra dependency. Even if you don't
use it you are forced to include it, because Serialiser extends it. So if we
are to use it, it needs to be "plugged" in, so that it's optional.
The LGPL is a weird issue, basically Apache takes a stand against LGPL and
will not allow any of its projects to depend on an LGPL dependency. This
would force Apache DS to have to fork JDBM to maintain a version without
that LGPL dependency. It's not that they think that LGPL is causing any
weird violation, they just don't like some of the ambiguity, and thus err
on the side of caution.
-b
________________________________
From: jdbm-general-bounces@...
[mailto:jdbm-general-bounces@...] On Behalf Of Mark
Proctor
Sent: Monday, June 09, 2008 10:58 AM
To: Cees de Groot
Cc: jdbm-general@...; cowtowncoder@...
Subject: Re: [Jdbm-general] extensible serializer
Cees de Groot wrote:
On Mon, Jun 9, 2008 at 4:24 PM, Cees de Groot &lt;cdegroot@...&gt; wrote:
I grabbed the extensible serializer source code and added it to the
source tree - the original project doesn't seem to exist anymore so I
thought this was the quickest way to get rid of a binary-only
dependency.
Great, I'll update and look over it. For me I just want JDBM to write my
byte[], I don't want it to go anywhere near a serialisation method call, I'm
currently just trying to find out if that is possible.
On second thought, I agree it's not good to have JDBM depend on a
single serializer. So I'm removing the dependency (rolling back to the
pre-extser tag in CVS and checking what happened after that)
</PRE></BLOCKQUOTE><PRE wrap=""><!---->
</PRE></BLOCKQUOTE><BR><PRE wrap=""><HR width="90%" SIZE=4>
-------------------------------------------------------------------------
Check out the new SourceForge.net Marketplace.
It's the best place to buy or sell services for
just about anything Open Source.
<A class=moz-txt-link-freetext href="http://sourceforge.net/services/buy/index.php">http://sourceforge.net/services/buy/index.php</A></PRE><PRE wrap=""><HR width="90%" SIZE=4>
_______________________________________________
Jdbm-general mailing list
Jdbm-general@...
<A class=moz-txt-link-freetext href="https://lists.sourceforge.net/lists/listinfo/jdbm-general">https://lists.sourceforge.net/lists/listinfo/jdbm-general</A>
</PRE></BLOCKQUOTE><BR>
</BODY></HTML>