If you do not update the indices immediately within the scope of the tx in
which the attribute was modified then the indices become stale with respect
to the runtime data. Is that acceptable?
on behalf of Kevin Day
Sent: Wed 4/12/2006 2:33 PM
To: JDBM Developer listserv
Subject: re: [Jdbm-developer] primary and secondary indexes in the
My approach to the indices question is that the indexes are part of the
'updated' state of the database. If an update call hasn't been made,
the changes caused by that update call are not reflected in the index.
This is consistent with cursor-based system behavior.
Forcing a user to modify their core code is certainly the easiest thing to
do - but it just doesn't fly in many, many development scenarios - it
severely violates IOC, makes testing much harder, etc... EJBs went down
this path, and there is a massive developer backlash against it right now.
If immediate index update notification is required, then using aspects and
configuring a set of method calls that result in index updates could do it -
but if we go down that path, then I would challenge the use of the update()
call being part of the recman api at all. I think that having changes
marked by an update() call is a reasonable trade-off in the design of the
recman, and having that be the trigger to update indexes is probably also
In terms of the public APIs for jdbm, I agree that they should offer
simplicity for the "common man" who just needs persistence and
current API is well suited to this, and I would be inclined to make it even
more bare bones. The real trick here is going to be devising the
APIs so that we partition the different areas of concern in a flexible and
effective manner. Creating a concrete class encompassing a good general
solution for access to persistent records and indices will not be hard, but
some applications will want to get under the hood and I want to support that
With regard to the question of "aware" indices, the case that is
to support is when a user modifies an indexed field on a persistent object.
Unless you require property event notification for indexable attributes,
this is always going to go unnoticed and that will result in incoherent
indices. What about requiring people to implement the appropriate
change event mechanism? That way you get the old and new property value in
the event. That is all you really need.
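For illustration, a minimal sketch of the change-event idea using the standard
java.beans classes. The Person class, its "name" property, and the NameIndex
class (a TreeMap standing in for a secondary index) are hypothetical and not
part of jdbm; names are assumed to be non-null and unique.

```java
import java.beans.PropertyChangeListener;
import java.beans.PropertyChangeSupport;
import java.util.TreeMap;

// Hypothetical persistent object with one indexed attribute ("name").
class Person {
    private final PropertyChangeSupport changes = new PropertyChangeSupport(this);
    private final long recid; // hypothetical record id assigned by the store
    private String name;

    Person(long recid, String name) {
        this.recid = recid;
        this.name = name;
    }

    long getRecid() { return recid; }
    String getName() { return name; }

    void setName(String newName) {
        String oldName = this.name;
        this.name = newName;
        // No event is fired when old and new values are equal;
        // otherwise the event carries both values.
        changes.firePropertyChange("name", oldName, newName);
    }

    void addPropertyChangeListener(PropertyChangeListener listener) {
        changes.addPropertyChangeListener(listener);
    }
}

// Stand-in for a secondary index: name -> recid (a TreeMap, not a jdbm BTree).
class NameIndex {
    final TreeMap<String, Long> entries = new TreeMap<>();

    void track(Person p) {
        entries.put(p.getName(), p.getRecid());
        p.addPropertyChangeListener(evt -> {
            if ("name".equals(evt.getPropertyName())) {
                // The old value locates the stale entry; the new value replaces it.
                entries.remove((String) evt.getOldValue());
                entries.put((String) evt.getNewValue(), p.getRecid());
            }
        });
    }
}
```

The cost is that every indexable class has to carry the listener plumbing,
which is exactly the kind of intrusion into user code debated in this thread.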
On Behalf Of Kevin Day
Sent: Wednesday, April 12, 2006 1:35 PM
To: JDBM Developer listserv
Subject: re: [Jdbm-developer] primary and secondary indexes in the record
Let's take a crack at how to handle self-concurrency managed objects.
We cannot use the regular recman api for working with these objects (BTree or
BPage), because the recman will cause the object to interact with the
standard concurrency control mechanism (MVCC coupled with either 2PL or
From an architecture perspective, do we need to have the recman expose an
advanced api for allowing direct access to the logical record store
(bypassing the locking sub-system and version manager)?
My preference would be to keep the recman API for regular users as simple as
As for managing indexes, I certainly see ways of doing it (heck, I'm doing
it now in my own code) - the point of the discussion was to talk about what
options are available to us. The options I currently see are:
1. Require user to add behavior to their objects to support triggering
2. Capture the index key set of each record when it is fetched, and cache
it in the record cache
3. Use a reverse index for mapping recids to the index key set
4. Use a reverse index for mapping recids to the index page and slot
5. During update, deserialize the source object (get an 'as stored'
vs the 'as changed' object that resides in the cache at update time), then
capture the key index set based on the 'as stored' object.
6. Require that the serialized form of the object include sufficient
information for extracting the index key set without having to deserialize
the entire object
I'm wondering if there are others.
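As a rough illustration of option 3 above, here is a sketch of a reverse index
that maps each recid to the key set currently held in the forward index. The
IndexedStore class and its reindex() method are hypothetical names, and plain
java.util collections stand in for the jdbm BTree.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical in-memory model of a forward index plus a reverse index.
class IndexedStore {
    // Forward index: key -> recids of the records carrying that key.
    final TreeMap<String, Set<Long>> index = new TreeMap<>();
    // Reverse index: recid -> keys currently stored for it in the forward index.
    final Map<Long, Set<String>> reverse = new HashMap<>();

    // Called at update time with the key set of the 'as changed' object.
    void reindex(long recid, Set<String> newKeys) {
        Set<String> oldKeys = reverse.getOrDefault(recid, Collections.emptySet());
        // Remove forward entries for keys the record no longer has.
        for (String key : oldKeys) {
            if (!newKeys.contains(key)) {
                Set<Long> ids = index.get(key);
                if (ids != null) {
                    ids.remove(recid);
                    if (ids.isEmpty()) {
                        index.remove(key);
                    }
                }
            }
        }
        // Add (or keep) forward entries for the current keys.
        for (String key : newKeys) {
            index.computeIfAbsent(key, k -> new TreeSet<>()).add(recid);
        }
        // Every key is now stored twice: once forward, once in reverse.
        reverse.put(recid, new HashSet<>(newKeys));
    }
}
```

The old key set never has to be recovered from the object itself, at the price
of keeping a second copy of every key.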
#1 is most definitely not palatable - a tool like jdbm should not force
users to completely change their coding style or class hierarchies.
#2 works (it's what I'm doing now), but it is inefficient because we have to
deserialize the key set every time any record is fetched from the store. It
also requires that a lot of data be kept in memory for records that probably
won't ever be updated.
#3 doubles the size of the indexes (the keys must be stored twice - once in
the index and once in the reverse index)
#4 requires a significant amount of cooperation between the index and the
reverse index - but would be more space efficient than #3
#5 will slow down inserts and updates (we basically have to deserialize the
stored byte stream before we can serialize the new version of the object),
but requires no additional space in the db, and does not impact fetch
#6 requires the user to change their coding practices, but only in the
serializer, so it won't impact the core of the user's design. This
completely precludes the use of default java serialization.
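To make option 5 concrete, here is a sketch in which update() deserializes the
'as stored' bytes, compares the old key with the 'as changed' object, and only
then rewrites the record. TinyStore, Item, and the single "color" key are
hypothetical; the sketch uses default Java serialization and in-memory maps
rather than the recman, and assumes non-null keys.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical user object; "color" is the indexed attribute.
class Item implements Serializable {
    private static final long serialVersionUID = 1L;
    String color;
    Item(String color) { this.color = color; }
}

// Hypothetical store: recid -> serialized bytes, plus one secondary index.
class TinyStore {
    private final Map<Long, byte[]> records = new HashMap<>();
    final TreeMap<String, Set<Long>> colorIndex = new TreeMap<>();
    private long nextRecid = 1;

    long insert(Item item) {
        long recid = nextRecid++;
        records.put(recid, serialize(item));
        colorIndex.computeIfAbsent(item.color, k -> new TreeSet<>()).add(recid);
        return recid;
    }

    void update(long recid, Item changed) {
        // Extra cost of this option: read back the 'as stored' version first.
        Item stored = deserialize(records.get(recid));
        if (!stored.color.equals(changed.color)) {
            Set<Long> ids = colorIndex.get(stored.color);
            if (ids != null) {
                ids.remove(recid);
                if (ids.isEmpty()) {
                    colorIndex.remove(stored.color);
                }
            }
            colorIndex.computeIfAbsent(changed.color, k -> new TreeSet<>()).add(recid);
        }
        records.put(recid, serialize(changed));
    }

    private static byte[] serialize(Item item) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(item);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    private static Item deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (Item) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Until update() is called the index still reflects the stored state, so a
modified but not yet updated object is found under its old key only.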
I'm quite interested to see if there are any other options that you guys can
I would rather keep the existing features and promote the design to a b-link
I actually have a use case for generalized values for the btrees. In
to improve read performance over indexed statements some of the attributes
of the statement are redundantly persisted in the btree values.
In terms of an indexing and constraint system, I am fine with that as a
feature but I see it as layered on, separable and not critical path for a
jdbm2 initial release. Plus I just don't see how you are going to get
without one thing or another that you seem to find distasteful.