Wanted to float an intermediate idea here - it
depends on the database utilization, but may provide a quick out.
The primary issue with the first fit algorithm
was that, in certain reasonable usage models, it resulted in massive amounts
of wasted space. This was more a result of the fact that the allocator
doesn't support splitting or coalescing than anything else. The downside of
the current 'fix' for this issue is that other (quite reasonable) usage
scenarios result in a huge amount of wasted CPU cycles (dontcha love
these kinds of tradeoffs?).
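To make the wasted-space failure mode concrete, here is a toy model (plain Java, nothing to do with jdbm's actual classes) of an allocator that cannot split slots: a request takes the first free slot big enough, and the remainder of that slot is lost.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a first-fit allocator that cannot split free slots: each
// request consumes an entire slot, so (slot size - request size) bytes
// become unusable slack. Illustrative only, not jdbm code.
public class FirstFitWaste {
    static long wastedBytes(List<Integer> freeSlots, int[] requests) {
        List<Integer> slots = new ArrayList<>(freeSlots);
        long waste = 0;
        for (int req : requests) {
            for (int i = 0; i < slots.size(); i++) {
                if (slots.get(i) >= req) {
                    waste += slots.get(i) - req;  // remainder is unusable
                    slots.remove(i);
                    break;
                }
            }
        }
        return waste;
    }

    public static void main(String[] args) {
        // Three large freed records, then a stream of small inserts.
        List<Integer> free = List.of(1000, 1000, 1000);
        int[] small = {16, 16, 16};
        System.out.println(wastedBytes(free, small)); // prints 2952
    }
}
```

Three 16-byte inserts eat three 1000-byte slots, losing 2952 bytes: exactly the "certain reasonable usage models" problem.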
The ultimate fix for this, of course, is to
re-design the allocation strategy completely, and include splitting and
coalescing (or have that behavior be implicit to the algorithm). There
has been quite a bit of discussion about strategies in this area, but things
move slowly, so I'm not holding my breath for any of them.
It seems to me that the immediate workaround here
is to add pack capability to jdbm. I believe that this could be done,
even with the current data structures, by creating a temporary lookup table
for reverse mapping physical offsets to recids. The downside to this
approach is that it might require some sort of quiescence in the system to get
it right, so if the database is going to be hammered 24x7, then it may not be
the right solution - but I thought I'd toss it out.
Here is how the algorithm could work:

1) The pack operation is triggered.

2) jdbm creates a new, separate jdbm database with transactions turned
off, optimized purely for speed (this is a temporary database).

3) jdbm adds a BTree to act as a reverse lookup table. Keys are Long
physical offsets. Values are Long recids.

4) jdbm cranks through the translation table and inserts a physical
offset -> recid mapping into the BTree for each entry.

5) jdbm then grabs the first data page, and checks to see if the first
record on the page has identical length and used values. If not, then
in a single transaction, it moves the second record to consume all
space, updates the length and used values of the first record, looks up
the recid of the second physical record in the reverse lookup table,
updates the offset of the second physical record in the translation
table, and closes the transaction. This repeats for each subsequent
record.
As an additional check, the system should check
the physical offset in the current translation table against the physical
offset of the actual record. If they are not the same, then the record
shouldn't be moved, and the next record should be considered.
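A minimal in-memory sketch of steps 3-5 plus the safety check, using a TreeMap as a stand-in for the temporary reverse-lookup BTree. The record layout and names here are hypothetical, not jdbm's actual data structures:

```java
import java.util.*;

// Hypothetical in-memory model of the pack pass described above.
// Physical records carry only (offset, length, used); the logical recid
// is recovered via the reverse lookup table, which is the role the
// temporary BTree plays. A TreeMap<offset, recid> stands in for that
// BTree; a Map<recid, offset> stands in for the translation table.
public class PackSketch {
    static class Rec {
        long offset; int length; int used;
        Rec(long o, int len, int u) { offset = o; length = len; used = u; }
    }

    /** Slides each record up against its predecessor, shrinking slack,
     *  and fixes the translation table via the reverse lookup table. */
    static void pack(List<Rec> records, Map<Long, Long> translation) {
        // Steps 3/4: build reverse lookup table, physical offset -> recid.
        TreeMap<Long, Long> reverse = new TreeMap<>();
        for (Map.Entry<Long, Long> e : translation.entrySet())
            reverse.put(e.getValue(), e.getKey());

        for (int i = 0; i + 1 < records.size(); i++) {
            Rec first = records.get(i);
            if (first.length == first.used) continue;     // no slack here
            Rec second = records.get(i + 1);
            Long recid = reverse.get(second.offset);      // record to move
            // Additional check: if the translation table disagrees with
            // the record's actual offset, don't move it; consider the next.
            if (recid == null || translation.get(recid) != second.offset)
                continue;

            first.length = first.used;                    // shrink to fit
            reverse.remove(second.offset);
            second.offset = first.offset + first.length;  // slide record up
            reverse.put(second.offset, recid);            // keep reverse map current
            translation.put(recid, second.offset);        // fix translation entry
        }
    }

    public static void main(String[] args) {
        List<Rec> recs = new ArrayList<>(
            List.of(new Rec(0, 100, 40), new Rec(100, 50, 50)));
        Map<Long, Long> trans = new HashMap<>(Map.of(1L, 0L, 2L, 100L));
        pack(recs, trans);
        // Second record slid up from offset 100 to 40; translation fixed.
        System.out.println(recs.get(1).offset + " " + trans.get(2L)); // prints 40 40
    }
}
```

In the real thing each iteration of the loop body would be one transaction, which is also what makes the partial-pack idea below plausible.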
I'm pretty sure that it should be possible to
implement this in a manner that allows for partial packs.
The database will certainly grow, but the pack
operation will shrink it back down.
Anyway, just wanted to get my thoughts out there
on this in case anyone wants to run with it. It should be
considerably easier (and less error prone) than implementing a new
allocator, and it should provide the basis of a migration path into a new file
format when that happens...
If such a system were in place, it would be
possible to go back to a first fit algorithm (with significantly better
performance across a much wider field of applications), and still (hopefully)
keep the fragmentation down.
_______________________________________________ Jdbm-developer mailing list
I am not seeing as much grief from the allocation strategy as you are.
I expect that there is an interaction with how the allocations are
being performed by your application. The change was made since it
produced smaller stores for our RDFS database without a loss in speed,
but clearly it is a loss for you. Let me look at the code and see what
we can change.
As to record copying, you should be able to refactor what you need out
of the dump utility. It has mechanisms for scanning from a page
perspective, but it does not attempt to do inserts with a pre-defined
logical record id. I can look into that I suppose, but let's see where
the update (below) gets us.
I've attached a modified version of FreePhysicalRowIdPage. Try it out
and let me know what impact it has on your application. It passes the
jdbm test suite, but there are no performance tests that specifically
target this feature. It has no performance impact (time or disk space)
on our RDFS database.
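For context on the tradeoff behind that change, the two slot-selection policies in play can be sketched like this (a toy model, not jdbm's actual FreePhysicalRowIdPage logic): first fit stops at the first slot that is large enough, while a tighter fit scans further for a closer size match, trading CPU for less slack.

```java
import java.util.List;

// Toy comparison of two free-slot selection policies over a list of
// free-slot sizes. Illustrative only; not jdbm's FreePhysicalRowIdPage.
public class FitPolicies {
    // First fit: index of the first slot that is big enough. Cheap scan,
    // but may grab a huge slot for a tiny request.
    static int firstFit(List<Integer> slots, int want) {
        for (int i = 0; i < slots.size(); i++)
            if (slots.get(i) >= want) return i;
        return -1;
    }

    // Best fit: index of the tightest match, minimizing slack at the
    // cost of scanning every slot.
    static int bestFit(List<Integer> slots, int want) {
        int best = -1;
        for (int i = 0; i < slots.size(); i++) {
            int size = slots.get(i);
            if (size >= want && (best == -1 || size < slots.get(best)))
                best = i;
        }
        return best;
    }

    public static void main(String[] args) {
        List<Integer> free = List.of(4096, 64, 128);
        System.out.println(firstFit(free, 60)); // prints 0 (takes the 4096 slot)
        System.out.println(bestFit(free, 60));  // prints 1 (takes the 64 slot)
    }
}
```

Which policy wins depends entirely on the request-size distribution of the application, which matches the observation above that one workload saw smaller stores while another saw wasted cycles.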