My first query stems from not being sure whether I should create a separate RecordManager database for every HashMap I'm replacing, or simply create a different BTree for each from a single RecordManager. Each of these maps contains a different set of object types, and I can't appear to have separate caches for each BTree. Not sure whether this makes any difference, so I plumped for the latter.
The second issue arose while building my BTree representation: I copied one of the examples verbatim, which calls commit() every 1000 inserts or so. Is this necessary, and is there any leeway to disable transactions before populating the data and only commit at the end? Which approach is likely to be most efficient, in both speed and running memory consumption?
Another problem came after creating my BTree and committing: the bulk of the data seems to reside in the transaction log (.lg) rather than the database file (.db). Changes were made to an earlier version, at Rickard's behest, to allow the transaction log to be purged. But the given example relies on getting hold of the BaseRecordManager, which in turn relies on having a reference to the current cache. In the CVS version, rather than explicitly creating a cache, we now pass a cache-size property to the RecordManager constructor, so how do we get the BaseRecordManager? I've written some cheesy code that checks the instance type, casts the active RecordManager to a CacheRecordManager, and obtains the BaseRecordManager from that, but it feels dirty. After jumping through these hoops I can purge the transaction log, but do I get any benefit from doing so? Does JDBM still work as efficiently regardless of which file the data lives in? You would expect the transaction log to be slightly less efficient.
First question: Yes, I believe it's better to have a single RecordManager and multiple BTrees. If you need individual caches for your HashMaps, you can add them transparently by placing a caching Map in front of each BTree.
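To make the single-RecordManager approach concrete, here is a minimal sketch assuming the JDBM 1.0-style API (RecordManagerFactory, named objects). The database name "mydb" and the tree names are placeholders, not anything from the original post:

```java
import java.util.Properties;

import jdbm.RecordManager;
import jdbm.RecordManagerFactory;
import jdbm.RecordManagerOptions;
import jdbm.btree.BTree;
import jdbm.helper.StringComparator;

public class MultipleTreesExample {

    // Look up an existing BTree by name, or create and register a new one.
    // Registering the recid as a named object lets us find the tree again
    // after the process restarts.
    static BTree loadOrCreate(RecordManager recman, String name)
            throws Exception {
        long recid = recman.getNamedObject(name);
        if (recid != 0) {
            return BTree.load(recman, recid);
        }
        BTree tree = BTree.createInstance(recman, new StringComparator());
        recman.setNamedObject(name, tree.getRecid());
        return tree;
    }

    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Cache size is passed as a property rather than by wrapping
        // the record manager in an explicit cache object.
        props.setProperty(RecordManagerOptions.CACHE_SIZE, "1000");
        RecordManager recman =
            RecordManagerFactory.createRecordManager("mydb", props);

        // One BTree per former HashMap, all in the same database.
        BTree users = loadOrCreate(recman, "users");
        BTree orders = loadOrCreate(recman, "orders");

        users.insert("alice", "Alice A.", true);
        orders.insert("o-1", "widget x 3", true);

        recman.commit();
        recman.close();
    }
}
```

Each tree keeps its own keys and comparator, so mixing object types across trees is no problem, while the single RecordManager shares one cache and one pair of files.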
Second, the commits every 1000 inserts are necessary (although the exact number is arbitrary) because JDBM holds the last 10 transactions in memory for performance reasons. You can disable transactions on the record manager during large batch-loading operations to achieve greater performance.
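A sketch of what such a batch load might look like, assuming the RecordManagerOptions.DISABLE_TRANSACTIONS property from the JDBM 1.0-era API (the database name and key/value strings are illustrative only):

```java
import java.util.Properties;

import jdbm.RecordManager;
import jdbm.RecordManagerFactory;
import jdbm.RecordManagerOptions;
import jdbm.btree.BTree;
import jdbm.helper.StringComparator;

public class BatchLoad {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Skip the transaction log for the duration of the bulk load.
        // The trade-off: a crash mid-load means redoing the whole load,
        // since there is no log to recover from.
        props.setProperty(RecordManagerOptions.DISABLE_TRANSACTIONS, "true");
        RecordManager recman =
            RecordManagerFactory.createRecordManager("bulkdb", props);

        BTree tree = BTree.createInstance(recman, new StringComparator());
        for (int i = 0; i < 100000; i++) {
            tree.insert("key-" + i, "value-" + i, true);
            // Periodic commits still help bound memory use by flushing
            // dirty pages, even though no transactions are being logged.
            if (i % 1000 == 0) {
                recman.commit();
            }
        }
        recman.commit();
        recman.close();
    }
}
```

With transactions disabled there is no in-memory transaction backlog to worry about, so this tends to be the fastest and leanest route for a one-off population, after which the database can be reopened with transactions enabled for normal use.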
Third, JDBM will perform just as fast during normal operation even if the log grows larger (but keep in mind that a copy is kept in memory, so you don't want it to grow indefinitely). The size of the log file only affects recovery time, when the log is replayed and committed to the main database file. Controlling the size of the log file falls into the optimization category and is therefore a trade-off between operational speed and recovery time. As for the casting to reach the BaseRecordManager through the CacheRecordManager, you are dealing with implementation details, so it is currently unavoidable.
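For reference, the "cheesy" unwrapping described above might be sketched as follows. This assumes the JDBM 1.0 class layout in which CacheRecordManager is a decorator exposing the wrapped instance via getRecordManager(); the helper name baseOf is made up for this example, and the actual log-purging call on BaseRecordManager is left out since it is version-specific:

```java
import jdbm.RecordManager;
import jdbm.recman.BaseRecordManager;
import jdbm.recman.CacheRecordManager;

public class Unwrap {
    // Peel off the cache decorator to reach the underlying
    // BaseRecordManager that the log-purging code needs.
    static BaseRecordManager baseOf(RecordManager recman) {
        if (recman instanceof CacheRecordManager) {
            recman = ((CacheRecordManager) recman).getRecordManager();
        }
        if (recman instanceof BaseRecordManager) {
            return (BaseRecordManager) recman;
        }
        throw new IllegalArgumentException(
            "Unexpected RecordManager implementation: "
            + recman.getClass().getName());
    }
}
```

Since the factory decides which concrete classes to stack, instanceof checks like these are the only way through until the API exposes the operation directly.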