I think that we have a lot of latitude with the design. We could keep the existing page view mechanisms and simply augment them along the lines that we are discussing, or we could go to the design that I am proposing and only support the translation page variation for now.
In terms of GC, I have been thinking of deferring physical row deletions and then batching them.
On Behalf Of Kevin Day
Sent: Tuesday, April 11, 2006 5:48 PM
To: JDBM Developer listserv
Subject: re: [Jdbm-developer] record managment
I certainly agree that writing to a clean page is a dead simple strategy when you are doing a bunch of inserts and makes it especially easy to cluster objects since there is only one tx doing the allocation and with data on the page.
KD - What is unknown is whether this strategy is any worse than any other strategy. It's so simple that it seems like it almost has to be worse - but I can't really put my finger on exactly how it would worse. I'm wondering if there is any research in the literature that critiques this approach.... ?
One improvement to this strategy would be to have a bit of interaction with the page cache. We evict pages in chunks, and perform a GC run across all the pages in a given chunk. If GC actually results in reclaimed space, the changed pages get pushed back into the safe at the top of the cache.
Tuning of the cache then becomes (in part) choosing a size to where page evictions result in entire pages being removed from the write set.
Like I said, this may be a complete loser approach - but all of that extra GC, etc... is ocurring in a background process, so maybe not... My biggest issue is that it requires multiple writes of the pages to the safe... And when we add translation pages to the mix, a single optimization of a given data page could result in a large number of pages having to be re-saved to cache.
An optimization would be to do what you are proposing and have an allocator strategy that only tracks open record ranges in cached pages (with slots on dirty pages being given higher priority).
Certainly, having the translation info and data on the same page would fix some of this. How about the following as a possible compromise that may give the best of both worlds:
Maintain a PAGE translation system at the beginning of the db file, but keep record location inside the page itself with the data. A lookup would still consist of translation lookup to determine the physical page, but the position of a given record on it's page could change dynamically. This would allow us to quickly move pages around without having to adjust potentially scattered OIDs, and allow us to quickly coallesce freed records to maximize available space on a page.
I'm still noodling on this idea - I think there is something here - I'm just not sure about the details.
------------------------------------------------------- This SF.Net email is sponsored by xPML, a groundbreaking scripting language that extends applications into web and mobile media. Attend the live webcast and join the prime developer group breaking into this new coding territory! http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642 _______________________________________________ Jdbm-developer mailing list Jdbmemail@example.com https://lists.sourceforge.net/lists/listinfo/jdbm-developer