Thread: RE: [Jdbm-developer] extensible serialization


Kevin,

I tried a variation of this in which the new interface did extend
Serializer, and backed out of it since it introduced what appeared to be additional
complexity.  If we go this way, then we wind up in a position where
we have no common interface, but we could add an ISerializer for that.
So, what you are proposing might look like:

interface ISerializer {}; // marker interface.
interface Serializer extends ISerializer, Serializable {...}; // as now.
interface CompoundSerializer extends ISerializer, Serializable {}; // w/ the new API methods

I did find that the "state" requirements for the "recman aware"
serializer were close to those for the "serialization handler", which
is another reason that I didn't introduce a serialization handler
interface and backed out of a design in which I used another interface
for handling compound records.  It seemed a bit confusing and I was
not sure that the additional interfaces would clarify anything for
people.  Another reason that I hesitate to do this is that the additional
interfaces might make it more complicated to define a stream-based
serialization API since that could be orthogonal to the question of
"recman aware" resulting in a 2 x 2 design, which just seems too much.

With respect to the record header, the extensible serialization
approach is not placing any data into the record header.  It all
winds up in a "data" header before the rest of the serialized
data.  I was initially using the record header for this metadata,
but that (a) breaks binary compatibility and (b) does not support
the use of the extensible serializer within compound records, hence
the revision to use a "data" header.
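
A minimal sketch of the "data" header idea, for illustration only. The
thread does not give the actual header layout, so the 4-byte serializer
id prefix and the class name DataHeader are assumptions. The point it
shows is that the metadata travels with the serialized bytes, so the
same scheme nests inside a compound record without touching the record
header.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical helper: prefixes serialized data with a small "data" header.
// The layout (one 4-byte serializer id) is an assumption, not jdbm's format.
final class DataHeader {
    static final int HEADER_BYTES = Integer.BYTES;

    // Returns [serializerId (4 bytes)][payload].
    static byte[] wrap(int serializerId, byte[] payload) {
        return ByteBuffer.allocate(HEADER_BYTES + payload.length)
                .putInt(serializerId)
                .put(payload)
                .array();
    }

    // Reads the serializer id back from the front of the data.
    static int serializerId(byte[] data) {
        return ByteBuffer.wrap(data).getInt();
    }

    // Returns the bytes that follow the header.
    static byte[] payload(byte[] data) {
        return Arrays.copyOfRange(data, HEADER_BYTES, data.length);
    }
}
```

Because the header is part of the data, wrap(1, wrap(2, bytes)) is a
valid compound record: the outer payload is itself a headed record.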

Can you elaborate on your point a bit more given that the metadata
is part of the record and not the record header?

Thanks,

-bryan

-----Original Message-----
From: Kevin Day
To: Thompson, Bryan B.; 'jdb...@li... '
Sent: 10/13/2005 1:16 PM
Subject: re: [Jdbm-developer] extensible serialization

Bryan-

I'm really of two minds about this myself...

There are two competing requirements:  Making the system easy to use for
end users (people who are just using it as a data store), and making it
easier for developers who are creating their own containers...

The tradeoff in the design so far has been to lean towards the end
users' requirements.

I have one suggestion that may work around all of this.

What do you think about adding a RecManAwareSerializer interface?  It
would *not* inherit from Serializer, and its interface would be:

byte[] serialize(RecMan, recid, Obj)
Object deserialize(RecMan, recid, byte[])

The record manager can do a quick instanceof check on the supplied
serializer (or the recovered serializer if we are going to store that
info in the record header) and call the appropriate method.
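
A rough sketch of how that instanceof dispatch might look. The RecMan
and Serializer types here are simplified stand-ins (the real Serializer
methods also declare exceptions), and SerializerDispatch and the two
example serializers are hypothetical names used only for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the record manager.
interface RecMan {}

// Simplified stand-in for the existing Serializer used by business objects.
interface Serializer {
    byte[] serialize(Object obj);
    Object deserialize(byte[] data);
}

// The proposed interface. It deliberately does NOT extend Serializer.
interface RecManAwareSerializer {
    byte[] serialize(RecMan recman, long recid, Object obj);
    Object deserialize(RecMan recman, long recid, byte[] data);
}

// Example business-object serializer: no knowledge of the store.
final class Utf8Serializer implements Serializer {
    public byte[] serialize(Object obj) {
        return ((String) obj).getBytes(StandardCharsets.UTF_8);
    }
    public Object deserialize(byte[] data) {
        return new String(data, StandardCharsets.UTF_8);
    }
}

// Example container-level serializer: receives recman and recid on each call.
final class RecidTaggingSerializer implements RecManAwareSerializer {
    public byte[] serialize(RecMan recman, long recid, Object obj) {
        return ((String) obj).getBytes(StandardCharsets.UTF_8);
    }
    public Object deserialize(RecMan recman, long recid, byte[] data) {
        // Uses the recid to show that it is available during deserialization.
        return recid + ":" + new String(data, StandardCharsets.UTF_8);
    }
}

// What the record manager would do internally: check the type, pick the call.
final class SerializerDispatch {
    static byte[] write(RecMan recman, long recid, Object serializer, Object obj) {
        if (serializer instanceof RecManAwareSerializer) {
            return ((RecManAwareSerializer) serializer).serialize(recman, recid, obj);
        }
        if (serializer instanceof Serializer) {
            return ((Serializer) serializer).serialize(obj);
        }
        throw new IllegalArgumentException("Unsupported serializer: " + serializer);
    }

    static Object read(RecMan recman, long recid, Object serializer, byte[] data) {
        if (serializer instanceof RecManAwareSerializer) {
            return ((RecManAwareSerializer) serializer).deserialize(recman, recid, data);
        }
        if (serializer instanceof Serializer) {
            return ((Serializer) serializer).deserialize(data);
        }
        throw new IllegalArgumentException("Unsupported serializer: " + serializer);
    }
}
```

Since the two interfaces share no supertype, the serializer parameter
has to be typed as Object in this sketch.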

The reality of things is that even my technique of storing metadata for
each object requires that the recman be made available during
deserialization of the ObjReference objects (but that's the ONLY time it
is needed).  I'm using a factory to deal with this right now, but your
comments on the limitations of factories in the BTree and HTree
construction are dead on, and I am sick of jumping through the hoops
that requires.

I think that if we recognize that there are actually two layers of
objects that need to get stored in the record manager, with two
completely different serialization needs, then we can address both
needs.

Business objects continue to use the Serializer interface (which allows
them to migrate quickly over to one of the containers if the developer
decides to do that).  Container objects will use the new interface
(we'll have to update the BTree and HTree implementation - but that is
well worth it if it allows us to easily sub-class them).  This keeps the
persistence logic separation for the business objects, but doesn't
artificially prevent it for lower level container type objects that
actually have valid reason to have access to the record manager
implementation.

None of the above, of course, gets at whether we need to store the class
and serializer in the record header (or whether it should be encoded by
some sort of serialization handler interface)...  What do you think
about my comment about including the serializer id in the record header,
but not the class ID (Because the class will be recoverable from the
serializer if it's needed)?  That would help to keep record header
overhead down, and I don't think we are actually losing any
functionality/information by doing so.

It also doesn't address the question of adjusting the API to include
hints (I'm actually quite interested in Alex's comment that this
behavior really belongs in the serializer...) - but hopefully it will
give some food for thought.

What do you guys think?  Any problems with the instanceof technique that
I'm not seeing?  (I researched the performance hit of instanceof (just web
research - no actual testing), and the literature says that instanceof
check overhead is very, very low.)

- K

 > Kevin,

Thanks for your excellent feedback.  I think that the question goes
(1) to the complexity of supporting serializers that can encapsulate
the required state (recman and possibly recid) vs having that state
in the API with a stateless serializer and (2) whether the recman and
recid are required to be passed through to support compound records
(the practice of using serializers within a record as well as for the
overall record).

My position on (1) is that it is more complicated to provide a
serializer constructor such as MySerializer( RecordManager recman,
long recid ), or to provide a callback on ISerializer that notes
the recman and recid, than it is to pass these through the API.  In
fact I require Serializers used by the extensible serialization
(the serialization handler) to be stateless and to have a public
zero argument constructor.  The serialization handler is the only
thing with state.
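
To make position (1) concrete, here is a hedged sketch of the shape
being described: stateless serializers with a public zero-argument
constructor, and a handler that holds the recman and passes it, with
the recid, through every call. All names (Store, StatelessSerializer,
SerializationHandler, Utf8RecordSerializer) are hypothetical and are
not the committed API.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for RecordManager.
interface Store {}

// Sketch of a serializer API that receives recman and recid as arguments.
interface StatelessSerializer {
    byte[] serialize(Store recman, long recid, Object obj);
    Object deserialize(Store recman, long recid, byte[] data);
}

// A serializer with no fields and a public zero-argument constructor.
final class Utf8RecordSerializer implements StatelessSerializer {
    public Utf8RecordSerializer() {}

    public byte[] serialize(Store recman, long recid, Object obj) {
        return ((String) obj).getBytes(StandardCharsets.UTF_8);
    }
    public Object deserialize(Store recman, long recid, byte[] data) {
        return new String(data, StandardCharsets.UTF_8);
    }
}

// The only object with state: it knows the recman and caches serializers.
final class SerializationHandler {
    private final Store recman;
    private final Map<Class<?>, StatelessSerializer> cache = new HashMap<>();

    SerializationHandler(Store recman) {
        this.recman = recman;
    }

    // Instantiates a serializer via its public zero-argument constructor.
    StatelessSerializer serializerFor(Class<? extends StatelessSerializer> type) {
        StatelessSerializer s = cache.get(type);
        if (s == null) {
            try {
                s = type.getConstructor().newInstance();
            } catch (ReflectiveOperationException e) {
                throw new IllegalArgumentException(
                        "Serializer needs a public zero-argument constructor: " + type, e);
            }
            cache.put(type, s);
        }
        return s;
    }

    byte[] serialize(long recid, Object obj, Class<? extends StatelessSerializer> type) {
        return serializerFor(type).serialize(recman, recid, obj);
    }

    Object deserialize(long recid, byte[] data, Class<? extends StatelessSerializer> type) {
        return serializerFor(type).deserialize(recman, recid, data);
    }
}
```

Because the serializers carry no state, one cached instance per class
can be shared across records.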

My position on (2) is that I have a use case for compound records.
Without passing through the required state (whether it is encapsulated
or not) it is not possible to use compound records.  BPage succeeds at
this practice because the recman and recid are available as transient
state on the BPage.  By addressing (1) we also open up the possibility
of writing constructors that insert objects (including BTrees) into the
store, which means that people can now subclass BTree - a significant
advantage in my mind.

With respect to the interesting practice of using a weak hash map to
recover the recid (or a reference object), that is a nice way of
handling things.  Of course it does require access to the recman,
which means that we can't practice that inside of a serializer if
the intention is to mark that information on the object, which, I
know, goes against your recommendation.  I am not against the practice
of hiding the persistence layer, quite the contrary!  However I handle
encapsulation in a different manner with a framework over jdbm (or
other persistence layers).  Without arguing for (or against) any
specific encapsulation technique, I feel that it raises a significant
barrier to other practices by not having this information (recman, recid)
in the Serializer API.
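
For readers unfamiliar with the weak hash map practice, a minimal
sketch follows. It is a guess at the general shape, not Kevin's code:
RecidRegistry is a hypothetical name, and it assumes the tracked
objects do not override equals/hashCode, since java.util.WeakHashMap
compares keys with equals rather than by identity.

```java
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical side table mapping live objects to their recids.
// Entries vanish once the object is no longer strongly reachable,
// so the table never keeps a business object alive on its own.
final class RecidRegistry {
    private final Map<Object, Long> recids = new WeakHashMap<>();

    // Records the recid under which obj was stored or fetched.
    void remember(Object obj, long recid) {
        recids.put(obj, recid);
    }

    // Returns the recid for obj, or null if obj is not tracked.
    Long recidOf(Object obj) {
        return recids.get(obj);
    }
}
```

The object itself carries no persistence state; the recid is recovered
from the table, which would typically be owned by the record manager.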

What I would like to do at this point is update the write-up of the
extensible serialization framework, incorporate your recommendation on
how to encapsulate references using a weak hash map approach, develop
a runtime RecordManager option to support that approach, and commit the
existing extensible serialization code which passes the recman and recid
through a modified Serializer API.  I think that this supports multiple
approaches to encapsulating persistence and guides people towards some
alternatives.

If you had some code that you would like to contribute to support the
practice you have outlined I would be happy to integrate and test that
as part of this effort.

Thanks,

-bryan
