From: Alex B. <boi...@in...> - 2001-06-26 03:45:43
Brian O'Neill wrote:

> I think JDBM is a great little persistence engine with tons of potential.
> Its nice to see such a project like this being actively developed. I do see
> some things that can be improved.

I'm certainly open to improvements, either to the existing core or by
adding extra functionality.

Something I've articulated in the past is that I'd like to see the core
of JDBM remain small, so that it remains an interesting choice for
small projects that only want a simple persistence engine. However, we
can bundle a number of optional helpers/utilities around it to make it
attractive to those who need a few more features. For example, support
for (or emulation of) the Java2 Collection classes has been discussed
here in the past and I still believe it would make a nice addition.

> When I save a key or value into a JDBM table, the object is serialized in
> its own stream. If this object has a reference to another shared object, the
> original object graph is not preserved across hash table entries. The
> approach that JDBM uses to serialize also has space overhead from all the
> stream headers and class info written to each record.

You're right. Currently, the object is serialized in its own stream
(along with its complete object graph) and converted into a byte[] for
storage in the RecordManager.

As you mention, this implies an overhead for each object placed into
the hash table. This overhead is roughly 25 bytes for user-defined
serialized objects. For strings, it's 7 bytes (including the string
length). It's not so bad, but it can be a concern for databases where a
large number of small objects are stored.

> I like the simple interfaces provided in the HTree and BTree, but since they
> do nothing special to preserve object graphs, I think an interface that
> operates on byte[] keys and values is more flexible.
> A simple object serialization strategy could be placed above this level,
> or a more sophisticated persistence model could be developed with less
> storage overhead.

Well, from my experience, the 'simple serialization strategy' you
mention can become quite complex, depending on what exactly you have in
mind. I'm not against the approach. In fact, I'd like to have such a
feature if it really is not too complicated and doesn't impact the size
of the core too much. If you have ideas in mind, please send them to
this list for discussion.

> Separating the HTree and BTree from the RecordManager is really nice. I
> think more levels of layering will make it easier to develop many kinds of
> persistence models. Making object persistence available at the RecordManager
> level might not be the best place for this, since I consider this to be a
> very "high level" function.

Again, the current object persistence in the RecordManager is really
just a utility function which serializes the object into a byte[]. If
you want more functionality, you build it on top. This goes for any
type of service generally found in an OODBMS, such as collection
classes, concurrency policy (pessimistic/optimistic locking),
transaction models (nesting, rollback ability, ...), etc.

I've argued in the past that many of these features should be part of a
separate project, in order to keep JDBM small and simple. I guess it
depends on the scope of the overall project and whether you want to
create a full-blown OODBMS or simply extend JDBM in ways that are
compatible with its original objectives.

>
> I'm going to start mucking with JDBM, to see how feasible it is to have
> higher level tree implementations built upon simpler ones. Has there been
> any other talk of this, or have there been any such implementations?
>

The Exolab folks have built some services on top of JDBM in the past. I
remember seeing locking and extended transaction management in the core
library.
I should check again to see if we could merge some of the code back
into JDBM.

cheers,
alex
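
The two effects discussed in this message (a fixed per-record cost, and shared references being lost when each entry is serialized in its own stream) can be reproduced with plain Java serialization. The sketch below is only an illustration and does not use JDBM's API: the class `PerRecordSerialization`, the `Owner` and `Entry` types, and all method names are hypothetical, and `toBytes` merely imitates the "one fresh stream per object, stored as a byte[]" approach described above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical demo class; not part of JDBM.
class PerRecordSerialization {

    /** Hypothetical object that several entries may reference. */
    static class Owner implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        Owner(String name) { this.name = name; }
    }

    /** Hypothetical stored value holding a reference to an Owner. */
    static class Entry implements Serializable {
        private static final long serialVersionUID = 1L;
        final Owner owner;
        Entry(Owner owner) { this.owner = owner; }
    }

    /** Serializes one object in its own fresh stream and returns the bytes. */
    static byte[] toBytes(Object obj) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        }
        return buffer.toByteArray();
    }

    /** Reads back a single object from bytes produced by toBytes. */
    static Object fromBytes(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    /** Bytes written beyond the characters themselves, for a short ASCII string. */
    static int stringOverhead(String ascii) throws IOException {
        return toBytes(ascii).length - ascii.length();
    }

    /** Two entries sharing one Owner, each serialized in its own stream. */
    static boolean sharedAfterSeparateStreams() throws Exception {
        Owner shared = new Owner("shared");
        Entry a = (Entry) fromBytes(toBytes(new Entry(shared)));
        Entry b = (Entry) fromBytes(toBytes(new Entry(shared)));
        return a.owner == b.owner; // each stream produced its own Owner copy
    }

    /** The same two entries written together in a single stream. */
    static boolean sharedAfterSingleStream() throws Exception {
        Owner shared = new Owner("shared");
        Entry[] pair = { new Entry(shared), new Entry(shared) };
        Entry[] copy = (Entry[]) fromBytes(toBytes(pair));
        return copy[0].owner == copy[1].owner; // back-reference preserved
    }

    public static void main(String[] args) throws Exception {
        System.out.println("string overhead: " + stringOverhead("abc"));          // 7
        System.out.println("separate streams: " + sharedAfterSeparateStreams());  // false
        System.out.println("single stream: " + sharedAfterSingleStream());        // true
    }
}
```

The 7 bytes for a short string are the 4-byte stream header, a 1-byte type tag, and a 2-byte length. The cost for user-defined classes is not measured here, since it depends on the class name and fields written into each stream.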