I don't think we actually have the logical id -> physical id map that
we need... We'd have to have all records on the source page in
cache, read via their logical row ids, in order for it to work. This is
not at all likely.
This, of course, assumes that we are just moving pages around.
I suppose it would also be possible to actually move records around
instead of entire pages... The strategy would be to just order the
entire file by the order of logical row ids. This would take a huge
amount of disk thrashing, but it would wind up with a fully packed
database. My idea of doing this at the page level would be much faster
(in fact, it might be fast enough to do on the fly), but would not result in
as tight a packing.
Even if we sweep the logical row id list, we still have to have the
ability to move a record when we only have the physical location, so a
reverse map is still needed...
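To make the requirement concrete, here is a minimal sketch of the forward/reverse pair I mean, using in-memory maps just to illustrate the lookup direction (all names here are hypothetical, not actual jdbm classes):

```java
import java.util.HashMap;
import java.util.Map;

class ReverseMapSketch {
    // forward: logical row id -> physical location (block/offset packed in a long)
    static Map<Long, Long> logicalToPhysical = new HashMap<>();
    // reverse: physical location -> logical row id
    static Map<Long, Long> physicalToLogical = new HashMap<>();

    static void insert(long logical, long physical) {
        logicalToPhysical.put(logical, physical);
        physicalToLogical.put(physical, logical);
    }

    // Relocate the record at oldPhysical to newPhysical. Without the
    // reverse map, finding which logical id points at oldPhysical would
    // be a linear scan of the whole translation table.
    static void relocate(long oldPhysical, long newPhysical) {
        Long logical = physicalToLogical.remove(oldPhysical);
        logicalToPhysical.put(logical, newPhysical);
        physicalToLogical.put(newPhysical, logical);
    }
}
```

In the actual store the reverse pointer would of course live on disk rather than in a heap map, which is what the record-header idea below in the thread is about.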
There is one time when we do have the logical id -- that is when we are
in the middle of a jdbm operation. We could do on-the-fly relocation
of pages. The main goal of relocation is, I think, to migrate pages
from further down in the store into free pages earlier in the store.
If we always had the block of the first free page in the store,
could we do runtime relocation of any fetched page with a higher
block id? But that seems wildly suboptimal. Unless we are trying to
get all pages of a certain type up front, it seems that it is only
pages at the end of the store that we want to swap for free pages
earlier in the store. There is probably a very simple operation that
we can do to test if a page qualifies. I.e., there are N free pages,
so the N pages from the end of the store are valid candidates for
relocation. If the page is already a free page, then we don't relocate
it. Otherwise, we swap it with the first free page from earlier in the
store. Not a complete algorithm, but it seems possible to devise one.
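The qualification test really is simple. A sketch of what I mean (names and signature are made up for illustration):

```java
class RelocationPolicy {
    // With freePageCount free pages in a store of totalPages pages,
    // only pages in the final freePageCount positions are worth
    // relocating into earlier free slots; a free page is never moved.
    static boolean qualifies(long pageId, long totalPages,
                             long freePageCount, boolean pageIsFree) {
        if (pageIsFree) {
            return false; // already free, nothing to relocate
        }
        return pageId >= totalPages - freePageCount;
    }
}
```

This check could run cheaply on every fetch, which is what would make the on-the-fly version plausible.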
To: JDBM Developer listserv
Sent: 9/27/2005 12:30 AM
Subject: [Jdbm-developer] More thoughts on new translation page
So I've been thinking... The main purpose of re-implementing the
translation page is to allow it to be moved around without messing
things up. I started thinking about what other things would prevent us
from moving pages around, and, unless I'm missing something, the
current implementation does not allow for moving data pages around at
all... The problem is that there is no efficient way to look up the
logical row id given a physical row id. If we are trying to re-arrange
blocks, we will have to do a linear search of the entire translation
table for every single block we move. Ugly. Same problem applies if we
are wanting to move a record inside a block.
I'm pretty sure that this calls for reverse lookup capability, and
that the best place to do it is going to be in the record header. We
could store it in an external lookup list, but that just complicates
things (and the same amount of disk space still gets used).
So, this got me thinking about the record header... If we are going to
implement a split() operation on a record, then I think we can
actually drastically change the way the current record header works.
Instead of storing the available and used lengths (as integers), store
available and unused lengths. available will still be an integer, but
we should be able to make the unused length a single byte. When we add
in the logical row id, the record header becomes 13 bytes instead of
the current 8. We may also want to carefully consider the actual # of
bytes we need to capture the available range of logical row ids that
can exist in a file.
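For discussion, the proposed layout could be serialized like this. This is only a sketch of the idea above -- available as a 4-byte int, unused as a single byte, plus the logical row id, which I've left at a full 8 bytes here even though fewer would probably do (class and method names are mine, not jdbm's):

```java
import java.nio.ByteBuffer;

class RecordHeaderSketch {
    static final int SIZE = 13; // 4 (available) + 1 (unused) + 8 (logical row id)

    static byte[] write(int available, byte unused, long logicalRowId) {
        ByteBuffer buf = ByteBuffer.allocate(SIZE);
        buf.putInt(available);     // 4 bytes: total space in the slot
        buf.put(unused);           // 1 byte: slack at the end of the record
        buf.putLong(logicalRowId); // 8 bytes: the reverse pointer
        return buf.array();
    }

    static long readLogicalRowId(byte[] header) {
        // absolute read, skipping the 4 + 1 byte length fields
        return ByteBuffer.wrap(header).getLong(5);
    }
}
```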
Let's say that we have a file completely full of records with an
average size of 8 bytes each (I think this is a very conservative
average). If we had only data pages in the file, and we honor the
current jdbm limit on the maximum block id (2^47), we will need only 7
bytes to capture all possible logical row ids. So, we can actually
drop the new header size down to 12 bytes. Just in case anyone cares,
the average data size would have to be 4096 bytes before we could
recover an additional byte.
Another way to approach this is to think about the biggest file size
that we expect to see being used (the addressing above allows for file
sizes up to ~8,000,000 TB - yup, 8 million terabytes). This seems
quite excessive, but I'm not sure where the reasonable break point
should be - I'm sure that 20 MB used to sound like a lot...
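A back-of-envelope helper for checking this kind of sizing argument: given a maximum file size and an average record size, how many whole bytes does a logical row id need? The 2^59-byte ceiling in the example below is my own assumption (2^47 blocks at an assumed 4 KB block size -- the post above doesn't state the block size), so treat the exact figures as illustrative only:

```java
class RowIdSizing {
    static int bytesForRowIds(long maxFileBytes, long avgRecordSize) {
        long maxRecords = maxFileBytes / avgRecordSize;
        // bits needed to address ids 0 .. maxRecords-1, rounded up to bytes
        int bits = 64 - Long.numberOfLeadingZeros(maxRecords - 1);
        return (bits + 7) / 8;
    }
}
```

Under that block-size assumption, 8-byte average records need 7-byte row ids and 4096-byte averages drop it to 6, which lines up with the two figures quoted above.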
Anyway, I think that implementing an effective defrag system is going
to require this kind of reverse pointer - I'd love to hear if any of
you have any other ideas.
BTW - I haven't heard from Alex in a while - is he OK? Did he maybe
get hit with one of these hurricanes? Or is he just ignoring all this
hoping we'll go away?