re[2]: [Jdbm-developer] Free page selection bug

SourceForge Headquarters 225 Broadway Suite 1600 San Diego, CA 92101 +1 (858) 454-5900

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0
 Transitional//EN">
<HTML><HEAD>
<STYLE type=text/css> P, UL, OL, DL, DIR,
 MENU, PRE { margin: 0 auto;}</STYLE>

<META content="MSHTML 6.00.2900.2802" name=GENERATOR></HEAD>
<BODY leftMargin=1 topMargin=1 rightMargin=1><FONT
 face=Tahoma>
<DIV><FONT face=Arial size=2>Alex-</FONT></DIV>
<DIV><FONT face=Arial size=2></FONT>&nbsp;</DIV>
<DIV><FONT face=Arial size=2></FONT></DIV>
<DIV><FONT face=Arial size=2>Yes - I think
 that Bryan's idea of making the '128 bytes'
 configurable will achieve the same thing
 (set that value to Integer.MAX_VALUE, and
 you've effectively reverted back to first
 fit...) - or am I missing something subtle?</FONT></DIV>
<DIV><FONT face=Arial size=2></FONT>&nbsp;</DIV>
<DIV><FONT face=Arial size=2>- K</FONT></DIV>
<DIV><FONT face=Arial size=2>&nbsp;</FONT>

<TABLE>
<TBODY>
<TR>
<TD width=1 bgColor=blue></TD>
<TD>&gt;
 A complementary idea would be to define a
 property that provides a hint to the allocator
 algorithm to enable/disable the first-fit
 strategy. alex Kevin Day
 wrote: Gents- &nbsp; Wanted to
 float an intermediate idea here - it depends
 on the database utlization, but may provide
 a quick out. &nbsp; The primary issue
 with the first fit algorithm was that, in
 certain reasonable usage models, it resulted
 in massive amounts of wasted space. &nbsp;This
 was more a result of the fact that the allocator
 doesn't support splitting or coallescene
 than anything. &nbsp;The downside of the
 current 'fix' for this issue is that other
 (quite reasonable) usage scenarios result
 in a huge amount of wasted CPU cycles (dontcha
 love these kinds of tradeoffs?). &nbsp; The
 ultimate fix for this, of course, is to re-design
 the allocation strategy completely, and include
 splitting and coallescing (or have that behavior
 be implicit to the algorithm). &nbsp;There
 has been quite a bit of discussion about
 strategies in this area, but things move
 slowly, so I'mnot holding my breath for any
 of them. &nbsp; &nbsp; It seems
 to me that the immediate workaround here
 is to add pack capability to jdbm. &nbsp;I
 believe that this could be done, even with
 the current data structures, by creating
 a temporary lookup table for reverse mapping
 physical offsets to recids. &nbsp;The downside
 to this approach is that it might require
 some sort of quiessence in the system to
 get it right, so if the database is going
 to be hammered 24x7, then it may not be the
 right solution - but I thought I'd toss it
 out. &nbsp; Here is how the algorithm
 could work: &nbsp; The pack operation
 is triggered jdbm creates anew, separate
 jdbm database with transactions turned off,
 optimized purely for speed (this is a temporary
 file) jdbm adds a BTree to act as a reverse
 lookup table. &nbsp;Keys are Long, physical
 offset. &nbsp;Values are Long, recid jdbm
 cranks through the translation table and
 inserts physical offset-&gt;recid mapping
 into the BTree jdbm then grabs the first
 data page, and checks to see if the first
 record on the page has identical length and
 used values. &nbsp;If not, then in a single
 transaction, it moves the second record to
 consume all space, updates the length and
 used value of the first record, lookups up
 the recid of the second physical record,
 updates the offset of the second physical
 record in the translation table, closes transaction
 &nbsp; As an additional check, the
 system should check the physical offset in
 the current translation table against the
 physical offset of the actual record. &nbsp;If
 they are not the same, then the record shouldn't
 be moved, and the next record should be considered.
 &nbsp; I'm pretty sure that it should
 be possible to implement this in a manner
 that allows for partial pack operations.
 The database will certainly grow,
 but the pack operation will shrink it back
 down. &nbsp; &nbsp; Anyway, just
 wanted to get my thoughts out there on this
 in case anyone wants to run with it. &nbsp;It
 should be considerably easier (and less error
 prone) than implementing a new allocator,
 and it should provide the basis of a migration
 path into a new file format when that happens...
 &nbsp; If such a system were in place,
 it would be possible to go back to a first
 fit algorithm (with significantly better
 performance across a much wider field of
 applications), and still (hopefully) keep
 the fragmentation down. &nbsp; Cheers,
 &nbsp; - Kevin &lt; </TD></TR></TBODY></TABLE></DIV></BODY></HTML>