1.1 RC Feature List

Kevin Day
2005-08-13
2013-06-03
  • Kevin Day
    2005-08-13

    I'd like to start gathering a list of changes/improvements to consider for the 1.1 RC build (and possibly the 2.0 build).  Once we have a list together, we can decide which ones are most important, and who will be involved in the development of each.

    For starters, here are some things that have occurred to me, and that have been mentioned in these forums by others:

    1.  Key compression at the byte[] level (i.e. generalized key compression - not just String key compression)

    2.  Inclusion of an additional standard key type (MultiPartLongValue) along with a comparator and serializer.

    3.  Implement true rollback semantics in the transaction logic

    4.  Implementation of a weak reference queue (WRQ) to ensure object identity when the same object is restored from disk multiple times.  I'm not entirely convinced that this should be performed at the RecordCache level...

    5.  Creation of a standard method/interface for index clustering and synchronization - i.e. use multiple BTree (or HTree) objects to allow indexed retrieval of the same set of objects.

    6.  Add getCount(startKey, endKey) to BTree to efficiently determine the value count between two keys (inclusive); a rough client-side sketch follows this list

    7.  Add delete(startKey, endKey) to BTree to efficiently remove all values between two keys (inclusive)

    8.  Replace HashMap with primitive hash map implementations in performance sensitive areas

    9.  Change file access to use NIO

    10.  Support for optimized monotonically increasing key insertions

    11.  Add support for concurrent modification to TupleBrowser
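
    As a rough client-side illustration of item 6: until BTree offers a native getCount(), a caller can walk a TupleBrowser and count the entries in a range.  This sketch assumes jdbm's BTree.browse(key) and TupleBrowser.getNext(Tuple), plus a caller-supplied Comparator (the tree's own comparator is not publicly exposed today); the point of the feature request is to push this into BTree itself, where page-level entry counts could avoid the scan.

    import java.io.IOException;
    import java.util.Comparator;

    import jdbm.btree.BTree;
    import jdbm.helper.Tuple;
    import jdbm.helper.TupleBrowser;

    // Stand-in for the proposed BTree.getCount(startKey, endKey).
    // O(n) in the size of the range; a native implementation could do better.
    public class RangeCount {

        // Counts entries with startKey <= key <= endKey.
        public static long getCount(BTree tree, Comparator comparator,
                                     Object startKey, Object endKey) throws IOException {
            TupleBrowser browser = tree.browse(startKey); // positioned just before startKey
            Tuple tuple = new Tuple();
            long count = 0;
            while (browser.getNext(tuple)) {
                if (comparator.compare(tuple.getKey(), endKey) > 0) {
                    break; // past the end of the range
                }
                count++;
            }
            return count;
        }
    }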

    If you have others, please post...

    - K

     
    • mpinl
      2005-08-13

      12. Span a RecordManager over more than one physical file (i.e. split the data across multiple files to work around file system size limits)

       
      • Alex Boisvert
        2005-08-15

        I'm not a big fan of adding this feature in JDBM since modern file systems now support very large files (1TB+) and you also have a large array of choices when it comes to virtual file systems / volume managers on Windows, Linux, Solaris, ...

        I don't see the advantage of doing this in JDBM versus having the operating system handle it.

        Are you running in a special environment (e.g. J2ME) where there is no support for volume management?

        alex

         
    • Kevin Day
      2005-08-14

      13.  Add pack/rebuild/optimize/defrag type operations

      14.  Allow specifying the full database and log filename (instead of automatically appending extensions)

      15.  Addition of another data structure type - LargeBlobHolder - to store large chunks of data using the file system as the data store (e.g. instead of storing huge data sets in the record manager, store a reference in the record manager and store the data in individual files in the file system).  The trick here will be to manage this all in a transactionally safe manner... (a rough holder sketch follows this list)

      16.  Add support for encrypting BTree and BPage data.
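
      As a rough illustration of item 15, here is a hypothetical LargeBlobHolder: the record manager would store only this small serializable holder, while the blob bytes live in an ordinary file.  The class name and layout are illustrative only, and the hard part (making file creation and deletion transactionally safe alongside the record manager) is deliberately not addressed.

      import java.io.File;
      import java.io.Serializable;

      // Hypothetical holder: the record manager stores this small object;
      // the blob itself lives in the file system at 'path'.
      public class LargeBlobHolder implements Serializable {

          private final String path;   // location of the blob file
          private final long length;   // size of the blob in bytes

          public LargeBlobHolder(String path, long length) {
              this.path = path;
              this.length = length;
          }

          public File getFile() {
              return new File(path);
          }

          public long getLength() {
              return length;
          }
      }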

       
      • Bryan Thompson
        2005-08-15

        17. Concurrent I/O, including asynchronous pre-fetch support.

         
    • Bryan Thompson
      2005-08-16

      BTree:

      - Expose the {@link Comparator} using a public method.  This makes it possible to apply the comparator when writing a wrapper over TupleBrowser that stops after it exceeds (or precedes) a specified key (a sketch of such a wrapper follows this list).

      - Efficient count of the number of keys in a key range.

      - Return the last key in the BTree (or a Browser positioned immediately before that key).

      - Range browsing (I have implemented this in a wrapper class).

      - Support BTree traversal with concurrent modifications (already on the feature request list).

      - Support compressed keys for BTree (already on the feature request list).
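
      A sketch of the wrapper referred to in the first bullet above, assuming TupleBrowser's getNext(Tuple)/getPrevious(Tuple) contract: it delegates to an underlying browser and reports end-of-range once the key passes a caller-supplied upper bound.  Today the Comparator has to be handed in by the caller, which is exactly why exposing BTree's comparator would help.

      import java.io.IOException;
      import java.util.Comparator;

      import jdbm.helper.Tuple;
      import jdbm.helper.TupleBrowser;

      // Wraps another TupleBrowser and stops once the key exceeds endKey.
      public class BoundedTupleBrowser extends TupleBrowser {

          private final TupleBrowser delegate;
          private final Comparator comparator;
          private final Object endKey;

          public BoundedTupleBrowser(TupleBrowser delegate, Comparator comparator, Object endKey) {
              this.delegate = delegate;
              this.comparator = comparator;
              this.endKey = endKey;
          }

          public boolean getNext(Tuple tuple) throws IOException {
              if (!delegate.getNext(tuple)) {
                  return false;
              }
              // Report end-of-range as soon as the key passes the upper bound.
              return comparator.compare(tuple.getKey(), endKey) <= 0;
          }

          public boolean getPrevious(Tuple tuple) throws IOException {
              // Backward traversal is simply delegated; no lower bound is enforced here.
              return delegate.getPrevious(tuple);
          }
      }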

       
    • Bryan Thompson
      2005-08-16

      - BLOB/CLOB support.

      I have implemented a sketch of this that uses a linked list of records (called "segments") to store the blob.  Each record is up to a given size, e.g., 10k or 100k or whatever.  The client gets an output stream (or writer) and writes on it.  The data are streamed into a buffer.  When the buffer is full a new blob segment is written to disk.

      This strategy could be extended to use write ahead (more than one buffer) and read ahead (prefetch).
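
      For illustration only, here is a rough sketch of that streaming writer.  It buffers bytes up to a segment size and inserts each full buffer as its own record through RecordManager.insert().  To keep the sketch short it collects the segment recids and writes them into a header record on close, rather than chaining segments in a linked list as described above; none of these classes exist in jdbm.

      import java.io.IOException;
      import java.io.OutputStream;
      import java.util.ArrayList;

      import jdbm.RecordManager;

      // Streams a blob into fixed-size segment records; the header record
      // (written on close) lists the segment recids in order.
      public class BlobOutputStream extends OutputStream {

          private final RecordManager recman;
          private final byte[] buffer;
          private final ArrayList segmentRecids = new ArrayList();
          private int pos = 0;
          private long headerRecid = -1;

          public BlobOutputStream(RecordManager recman, int segmentSize) {
              this.recman = recman;
              this.buffer = new byte[segmentSize];
          }

          public void write(int b) throws IOException {
              if (pos == buffer.length) {
                  flushSegment();
              }
              buffer[pos++] = (byte) b;
          }

          // Inserts the current buffer contents as one blob segment record.
          private void flushSegment() throws IOException {
              if (pos == 0) {
                  return;
              }
              byte[] segment = new byte[pos];
              System.arraycopy(buffer, 0, segment, 0, pos);
              segmentRecids.add(Long.valueOf(recman.insert(segment)));
              pos = 0;
          }

          public void close() throws IOException {
              flushSegment();
              // The header recid is the client's handle for the whole blob.
              headerRecid = recman.insert(segmentRecids);
          }

          // Valid after close().
          public long getHeaderRecid() {
              return headerRecid;
          }
      }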

      I'm not sure why we would want to handle blobs in the filesystem vs in the store, though I've heard Alex say this before.  I like the notion that everything is in the store file and I think that we can get the I/O rates up quite high with write ahead and prefetch strategies.

      There is, of course, a cost today if the in-memory transaction buffer is used, since a copy of the blob is still held in memory.  So, I would also like to add:

      - Review the transaction management strategy.

      to the list.

      -bryan

       
    • Bryan Thompson
      2005-08-16

      - ObjectManager.

      I've been working on an object manager layer for jdbm.  It keeps two transient pieces of information for each runtime surrogate: the recid of the persistent object and a reference to the object manager.  The result is a much simpler API for managing objects, but it has the cost that persistent objects managed by the object manager must all extend a common base class (JDBMObject).

      I guarantee reference equality (fetching the same object twice returns the same reference), which requires a weak reference cache.

      The ObjectManager API does not throw IOExceptions.  I find these exceptions clutter up the code since they are never trapped except by some ultimate routine driving the transaction, and they are only handled by rolling back the transaction.  We don't need to change the RecordManager interface to hide IOExceptions (though I would not mind), but I find that they make no sense if you are trying to hide the persistence layer behind an object interface.
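
      For readers following along, here is a rough sketch of the shape of that layer (none of these classes are part of jdbm): each persistent object extends JDBMObject, fetch() hides IOException behind an unchecked exception, and a weak-reference cache keyed by recid means two fetches of the same recid return the same Java object.  A real implementation would also drain cleared references through a reference queue, which ties back to item 4.

      import java.io.IOException;
      import java.io.Serializable;
      import java.lang.ref.WeakReference;
      import java.util.HashMap;
      import java.util.Map;

      import jdbm.RecordManager;

      // Common base class for persistent objects managed by the ObjectManager.
      abstract class JDBMObject implements Serializable {
          transient long recid = -1;        // persistent identity, not serialized
          transient ObjectManager manager;  // back-reference, not serialized
      }

      class ObjectManager {

          private final RecordManager recman;
          private final Map cache = new HashMap(); // recid -> WeakReference to JDBMObject

          ObjectManager(RecordManager recman) {
              this.recman = recman;
          }

          JDBMObject fetch(long recid) {
              WeakReference ref = (WeakReference) cache.get(Long.valueOf(recid));
              JDBMObject obj = (ref == null) ? null : (JDBMObject) ref.get();
              if (obj != null) {
                  return obj; // same reference as the earlier fetch
              }
              try {
                  obj = (JDBMObject) recman.fetch(recid);
              } catch (IOException e) {
                  // No checked exception leaks into the object-level API; the
                  // caller is expected to roll back the transaction.
                  throw new RuntimeException(e);
              }
              obj.recid = recid;
              obj.manager = this;
              cache.put(Long.valueOf(recid), new WeakReference(obj));
              return obj;
          }
      }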

      -bryan

       
    • Kevin Day
      2005-08-19

      18.  Inclusion of a standard StringSerializer class.  The current method of using default serialization is wasteful (ObjectOutputStream embeds the full name of the class in the output byte array)
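
      A minimal sketch of such a serializer, assuming jdbm's Serializer interface (serialize/deserialize to and from byte[]): it stores only the UTF-8 bytes of the string, with none of the class-name overhead that ObjectOutputStream adds.

      import java.io.IOException;

      import jdbm.helper.Serializer;

      // Serializes strings as raw UTF-8 bytes.
      public class StringSerializer implements Serializer {

          public byte[] serialize(Object obj) throws IOException {
              return ((String) obj).getBytes("UTF-8");
          }

          public Object deserialize(byte[] serialized) throws IOException {
              return new String(serialized, "UTF-8");
          }
      }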