<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0
Transitional//EN">
<HTML><HEAD>
<STYLE type=text/css> P, UL, OL, DL, DIR,
MENU, PRE { margin: 0 auto;}</STYLE>
<META content="MSHTML 6.00.2900.2722" name=GENERATOR></HEAD>
<BODY leftMargin=1 topMargin=1 rightMargin=1><FONT
face=Tahoma size=2>
<DIV>Bryan-<BR></DIV>
<DIV>Brief comments:</DIV>
<DIV> </DIV>
<DIV>FreePhyiscalRowIdPage does appear
to stick around - that's the reason that
my hack for getFirstFiree works (I was actualy
concerned about this when I first proposed
it, but it does seem to work out).
You actually wrote FreeLogicalRowIdPage in
your email, but FreeLogicalRowIdPage doesn't
even have a getFirstGreaterThan method, so
I think you may be getting the Logical/Phyiscal
free lists mixed up (it took me some time
before I finally got my head around the fact
that these two are quite different from each
other).</DIV>
<DIV> </DIV>
<DIV>Keep in mind that getFirstFree and getFirstGreaterThan
are *completely* different in what they do.
getFirstFree looks for an available *slot*
in the free list (for inserting additional
free record meta data into the free
list). getFirstGreaterThan looks for the
first populated slot in the firee physical
list that has size meta data larger
than the desired value.</DIV>
<DIV> </DIV>
<DIV>Finally, the FreePhyiscalRowIdPage absolutely
holds size meta data for the free rows that
it points to (see FreePhysicalRowId.getSize()
and .setSize()).</DIV>
<DIV> </DIV>
<DIV> </DIV>
<DIV>In terms of the in-memory data structure
- a map like that will work, but it
will require a decent chunk of memory,
and non-trivial overhead to keep it all in
sync with the disk contents - definnitely
doable, but it's going to require a
lot of unit tests to make sure that all of
the bases are covered.</DIV>
<DIV> </DIV>
<DIV>My general thinking on this is that
the effort required to develop a data structure
like this is going to be on the same order
as developing a best fit (or near-best-fit)
allocator, and that it might be better if
we put our efforts into that - but I could
be wrong on the relative effort required.</DIV>
<DIV> </DIV>
<DIV>If we do decide to get into this effort,
I would encourage use of a non power series
for determing the sequence of buckets for
the linked lists. Powers of two grow
very quickly, and I think you would wind
up with a significantly sub-optimal allocation.
A modified Fibonacci series (implemented
as a lookup table) may better serve our needs
(say, starting with 6 and 11 - so 6, 11,
17, 28, etc...). Then hold a static
lookup table for sizes 6 through ~4000 to
get the offset into an array holding the
linked lists.</DIV>
<DIV> </DIV>
<DIV>Anyway, just thinking out loud...</DIV>
<DIV> </DIV>
<DIV>- K</DIV>
<DIV> </DIV>
<DIV><BR><BR> > Kevin,<BR><BR>Let
me propose an alternative design in a similar
spirit. Let's read<BR>the data from
the free physical page into a transient data
structure<BR>on the first getFirstGreaterThan
(and getFirstFree?) for that page.<BR>The
data structure would be maintained while
the page was cached and<BR>would mirror the
state of the page (all updates write through
to the<BR>page). When the page is released,
the transient data is discarded.<BR>The free
page list scan is as before, but getFirstFree
and<BR>getFirstGreaterThan will operate against
the transient data structure<BR>for each
in memory page.<BR><BR>The data structure
is a hash map of lists, where each list holds<BR>metadata
about free records described on that page
whose capacity is<BR>some power of 2 up to
the maximum record size (so, this is up to
63<BR>lists per page). We could either
use the same hack that you did on<BR>the
FreeLogicalRowIdPage for the free slots or
keep one more list of<BR>the free slots.
Since we need to know the slot number,
we can either<BR>add that as a field to FreePhysicalRowId
(a transient data structure)<BR>and hold
instances of that in our lists or just wrap
the slot as an<BR>Integer and hold the slot#s
in our lists. Note that FreePhysicalRowId<BR>does
NOT cache the record metadata (physical location
and capacity)<BR>for a slot - it just knows
the location of the slot on the page and<BR>uses
BlockIo to read the metadata from the page.
I wonder if this is<BR>something that
would be worth changing?<BR><BR>Once the
data structure is built, we add and remove
entries in the<BR>free( int slot ) and alloc(
int slot ) methods. The getFirstGreaterThan<BR>method
would find the appropriate list and then
return either the first<BR>fit or best fit
slot on that list. The slot would remain
on the list<BR>until it was released by free(
int slot ). If there was no suitable<BR>allocated
slot, then the next page of the free list
would be checked<BR>by the caller. The
transient data structure would remain in
memory<BR>until the FreePhysicalRowIdPage
was released, so more than one page of<BR>the
free list might be described in this manner
at a time.<BR><BR>This suggestion appears
to hinge on how long a FreePhysicalRowIdPage<BR>actually
remains in memory. This class extends
PageHeader which holds<BR>a hard reference
to the BlockIo containing the data, and the
BlockIo<BR>holds onto the most recently created
view (some type of PageHeader).<BR>Given
what I read into the operations of the RecordFile,
this suggests<BR>that a FreePhysicalRowIdPage
might be pretty likely to stay in memory<BR>for
a while.<BR><BR>Thoughts?<BR><BR>-bryan<BR><BR><BR>-----Original
Message-----<BR>From: Kevin Day <BR>To: Thompson,
Bryan B.<BR>Sent: 10/7/2005 8:11 PM<BR>Subject:
re[4]: [Jdbm-developer] re: [jdbm - Open
Discussion] RE: jdbm<BR>hotspo t in (re-)allocation
of record<BR><BR>Bryan-<BR><BR>gotcha. I
still think we need to apply my getFirstFree
patch to the<BR>physical pages (there are
some operations where that becomes a hot
spot<BR>as well), but you are right that
the getFirstGreaterThan is the biggest<BR>problem.
That's actually what started the whole
allocator discussion<BR>for me...<BR><BR>-
K<BR><BR><BR><BR> > No,
we are on the same page. I was focused
on the<BR>getFirstGreaterThan<BR>method --
that is what shows up in the profiler for
me. -bryan<BR><BR>-----Original Message-----<BR>From:
<A href="mailto:jdbm-developer-admin@...
color=#0000ff><mailto:jdbm-developer-admin@...
href="mailto:jdbm-developer-admin@...
color=#0000ff>jdbm-developer-admin@...:
'JDBM Developer listserv '<BR>Sent: 10/7/2005
2:58 PM<BR>Subject: re[2]: [Jdbm-developer]
re: [jdbm - Open Discussion] RE: jdbm<BR>hotspo
t in (re-)allocation of record<BR><BR>Bryan-<BR><BR>I
think you might be missing something. getFirstGreaterThan
returns an<BR>available RECORD of a given
size. My optimization has nothing to
do<BR>with finding available records - it
has only to do with finding open<BR>slots
in the list that holds on to available records.<BR><BR>An
open slot has no size associated with it,
so you would never search<BR>for an available
slot of a certain size. getFirstGreater
is COMPLETELY<BR>different thatn getFirstFree.
getFirstGreater searches the list for<BR>used
slots of a certain size. getFirstFree
searches the list for empty<BR>slots.<BR><BR>I
agree that getFirstGreater needs to be fixed
(in fact, it probably<BR>needs to be completely
thrown out and replaced with an efficient<BR>allocator),
but my optimization will absolutely fix the
performance hot<BR>spot of getFirstFree.<BR><BR>Cheers,<BR><BR>-
K<BR><BR><BR>> Kevin,<BR><BR>No problem.
Enjoy your weekend. My point
is that getFirstGreaterThan<BR>has different
semantics than getFirstAllocated, resulting
in scanning<BR>over allocated slots that
do not satisify the size parameter. <BR><BR>-bryan<BR><BR>-----Original
Message-----<BR>From: <A
href="mailto:jdbm-developer-admin@...
color=#0000ff><mailto:jdbm-developer-admin@...
href="mailto:jdbm-developer-admin@...
color=#0000ff><mailto:jdbm-developer-admin@...
href="mailto:jdbm-developer-admin@...
color=#0000ff><mailto:jdbm-developer-admin@...
href="mailto:jdbm-developer-admin@...
color=#0000ff>jdbm-developer-admin@...:
JDBM Developer listserv<BR>Sent: 10/7/2005
2:22 PM<BR>Subject: [Jdbm-developer] re:
[jdbm - Open Discussion] RE: jdbm hotspot<BR>in<BR>(re-)allocation
of record<BR><BR>Bryan-<BR><BR>Sorry I haven't
been keeping up with your posts this week
- it's a<BR>really bad week for work.<BR><BR>I
will review this email (hopefully over the
weekend) and get back to<BR>you - but for
now, I have put together a plan for addressing
this<BR>bottleneck (and the FreeLogicalRodIdPage
class) - the current management<BR>of the
free lists is nowhere near as efficient as
it should be.<BR><BR>Note that the changes
I made to FreeLogicalRowIdPage were directed
at<BR>reducing the time it takes to find
an available SLOT in the list (i.e.<BR>an
available slot in which to place the reference
to the free physical<BR>row id). The
same situation exists in FreePhysicalRowIdPage,
and it<BR>should have nothing to do with
the size parameter - the search that I<BR>optimized
is geared at managing open slots in the free
list - not in<BR>actually determining which
entry in the free list to pull during an<BR>allocation
request.<BR><BR>To do any optimization on
the allocation of records, we need to get
into<BR>the redesign of the allocator. I'm
still quite fond of the idea of<BR>using
a set of linked lists, one for each size
range, that the allocator<BR>can quickly
pull from.<BR><BR>I'll catch up on all of
the posts soon - sorry for the delay.<BR><BR>Cheers,<BR><BR>-
K<BR><BR><BR>> <BR>Read and respond to
this message at: <BR><A href="https://sourceforge.net/forum/message.php?msg_id=3372174"><FONT
color=#0000ff><https://sourceforge.net/forum/message.php?msg_id=3372174></FONT></A><BR><A
href="https://sourceforge.net/forum/message.php?msg_id=3372174"><FONT
color=#0000ff><https://sourceforge.net/forum/message.php?msg_id=3372174></FONT></A><BR><A
href="https://sourceforge.net/forum/message.php?msg_id=3372174"><FONT
color=#0000ff><https://sourceforge.net/forum/message.php?msg_id=3372174></FONT></A><BR><A
href="https://sourceforge.net/forum/message.php?msg_id=3372174"><FONT
color=#0000ff><https://sourceforge.net/forum/message.php?msg_id=3372174></FONT></A><BR><A
href="https://sourceforge.net/forum/message.php?msg_id=3372174"><FONT
color=#0000ff><https://sourceforge.net/forum/message.php?msg_id=3372174></FONT></A><BR><A
href="https://sourceforge.net/forum/message.php?msg_id=3372174"><FONT
color=#0000ff><https://sourceforge.net/forum/message.php?msg_id=3372174></FONT></A><BR><A
href="https://sourceforge.net/forum/message.php?msg_id=3372174"><FONT
color=#0000ff><https://sourceforge.net/forum/message.php?msg_id=3372174></FONT></A><BR><A
href="https://sourceforge.net/forum/message.php?msg_id=3372174"><FONT
color=#0000ff>https://sourceforge.net/forum/message.php?msg_id=3372174</FONT></A><BR>By:
thompsonbry<BR><BR>Kevin,<BR><BR>I've started
looking at the FreePhysicalRowIdPage class.
The changes<BR>that you<BR>made to
FreeLogicalRowIdPage do not carry over directly
since the test<BR>is not<BR>for an "allocated"
slot, but for one which is allocated and
where the<BR>record<BR>has at least a given
size.<BR><BR>I tried out the same method
anyway, but it results in both more time<BR>(slower<BR>execution)
and more space (larger store) since it is
skipping over<BR>allocated<BR>slots for free
physical records which do not satisify the
current<BR>request, but<BR>which might satisify
a subsequent request.<BR><BR>I'm going to
think about some alternative approaches that
we might try<BR>here.<BR>For example, maintaining
a more complex transient data structure than<BR>the
two<BR>counters you added to FreeLogicalRowIdPage
or reordering the slots so<BR>that the<BR>slots
are dense and sorted by size, in which case
we could just use a<BR>binary<BR>search for
both getFirstFree() and getFirstGreaterThan(int
size). We<BR>could<BR>re-order when
slots are freed or allocated.<BR><BR>There
may be other solutions as well, but I would
like to resolve this<BR>bottleneck.<BR><BR>-bryan<BR><BR>______________________________________________________________________<BR>You
are receiving this email because you elected
to monitor this forum.<BR>To stop monitoring
this forum, login to SourceForge.net and
visit: <BR><A href="https://sourceforge.net/forum/unmonitor.php?forum_id=12569"><FONT
color=#0000ff><https://sourceforge.net/forum/unmonitor.php?forum_id=12569></FONT></A><BR><A
href="https://sourceforge.net/forum/unmonitor.php?forum_id=12569"><FONT
color=#0000ff><https://sourceforge.net/forum/unmonitor.php?forum_id=12569></FONT></A><BR><A
href="https://sourceforge.net/forum/unmonitor.php?forum_id=12569"><FONT
color=#0000ff><https://sourceforge.net/forum/unmonitor.php?forum_id=12569></FONT></A><BR><A
href="https://sourceforge.net/forum/unmonitor.php?forum_id=12569"><FONT
color=#0000ff><https://sourceforge.net/forum/unmonitor.php?forum_id=12569></FONT></A><BR><A
href="https://sourceforge.net/forum/unmonitor.php?forum_id=12569"><FONT
color=#0000ff><https://sourceforge.net/forum/unmonitor.php?forum_id=12569></FONT></A><BR><A
href="https://sourceforge.net/forum/unmonitor.php?forum_id=12569"><FONT
color=#0000ff><https://sourceforge.net/forum/unmonitor.php?forum_id=12569></FONT></A><BR><A
href="https://sourceforge.net/forum/unmonitor.php?forum_id=12569"><FONT
color=#0000ff><https://sourceforge.net/forum/unmonitor.php?forum_id=12569></FONT></A><BR><A
href="https://sourceforge.net/forum/unmonitor.php?forum_id=12569"><FONT
color=#0000ff>https://sourceforge.net/forum/unmonitor.php?forum_id=12569</FONT></A><BR><BR><<BR>-------------------------------------------------------
This SF.Net<BR>email is sponsored by: Power
Architecture Resource Center: Free content,<BR>downloads,
discussions, and more.<BR><A href="http://solutions.newsforge.com/ibmarch.tmpl"><FONT
color=#0000ff><http://solutions.newsforge.com/ibmarch.tmpl></FONT></A><BR><A
href="http://solutions.newsforge.com/ibmarch.tmpl"><FONT
color=#0000ff><http://solutions.newsforge.com/ibmarch.tmpl></FONT></A><BR><A
href="http://solutions.newsforge.com/ibmarch.tmpl"><FONT
color=#0000ff><http://solutions.newsforge.com/ibmarch.tmpl></FONT></A><BR><A
href="http://solutions.newsforge.com/ibmarch.tmpl"><FONT
color=#0000ff>http://solutions.newsforge.com/ibmarch.tmpl</FONT></A><BR>_______________________________________________
Jdbm-developer mailing<BR>list <A
href="mailto:Jdbm-developer@...
color=#0000ff><mailto:Jdbm-developer@...
href="mailto:Jdbm-developer@...
color=#0000ff><mailto:Jdbm-developer@...
href="mailto:Jdbm-developer@...
color=#0000ff><mailto:Jdbm-developer@...
href="mailto:Jdbm-developer@...
color=#0000ff>Jdbm-developer@...
href="https://lists.sourceforge.net/lists/listinfo/jdbm-developer"><FONT
color=#0000ff><https://lists.sourceforge.net/lists/listinfo/jdbm-developer></FONT></A><BR><A
href="https://lists.sourceforge.net/lists/listinfo/jdbm-developer"><FONT
color=#0000ff><https://lists.sourceforge.net/lists/listinfo/jdbm-developer></FONT></A><BR><A
href="https://lists.sourceforge.net/lists/listinfo/jdbm-developer"><FONT
color=#0000ff><https://lists.sourceforge.net/lists/listinfo/jdbm-developer></FONT></A><BR><A
href="https://lists.sourceforge.net/lists/listinfo/jdbm-developer"><FONT
color=#0000ff>https://lists.sourceforge.net/lists/listinfo/jdbm-developer</FONT></A><BR><BR><<BR>-------------------------------------------------------
This SF.Net<BR>email is sponsored by: Power
Architecture Resource Center: Free content,<BR>downloads,
discussions, and more.<BR><A href="http://solutions.newsforge.com/ibmarch.tmpl"><FONT
color=#0000ff><http://solutions.newsforge.com/ibmarch.tmpl></FONT></A><BR><A
href="http://solutions.newsforge.com/ibmarch.tmpl"><FONT
color=#0000ff>http://solutions.newsforge.com/ibmarch.tmpl</FONT></A><BR>_______________________________________________
Jdbm-developer mailing<BR>list <A href="mailto:Jdbm-developer@...
color=#0000ff><mailto:Jdbm-developer@...
href="mailto:Jdbm-developer@...
color=#0000ff>Jdbm-developer@...
href="https://lists.sourceforge.net/lists/listinfo/jdbm-developer"><FONT
color=#0000ff><https://lists.sourceforge.net/lists/listinfo/jdbm-developer></FONT></A><BR><A
href="https://lists.sourceforge.net/lists/listinfo/jdbm-developer"><FONT
color=#0000ff>https://lists.sourceforge.net/lists/listinfo/jdbm-developer</FONT></A><BR><BR><<BR> <BR><BR><</DIV></FONT></BODY></HTML>
|