All,

 

I have been rethinking record management and want to share some of my thoughts here.  I would also like to know whether the version management requirements for Kevin’s MVCC module could be satisfied by such a design.

 

The main insight is that the existing translation page design can be seen as one extreme of a more flexible, and hopefully more efficient, design.  The basic change is that each page could store logical-to-physical translation slots, the physical records themselves, or a mix of both.  The same slot map would be reused for both purposes, with appropriate coding to distinguish an on-page record from a record that is indirected to another page.

 

Setting aside the page header for the moment, the design would have records growing up from the bottom of the page (lower addresses) and the slot map growing down from the top of the page.  The lower bits of the OID would address slots in the page.  Each slot would give either an on-page offset or the pageId and slot of the record on another page.  The data space and the slot map space on the page would each be kept dense, so there would never be free space between records.  This should be a consistent data structure, and reconciliation of page images should be possible.
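
To make the layout concrete, here is a rough sketch in Java.  Everything in it is an assumption for illustration only: the class name SlottedPage, the 8-byte slot width, the use of the high bit to mark an indirected slot, and the choice to keep a length alongside the on-page offset.  It ignores the page header and does not handle deletes or the compaction that would be needed to keep the page dense.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch of the proposed page layout; names and sizes are assumptions. */
final class SlottedPage {
    static final int SLOT_BYTES = 8;             // assumed: one long per slot
    static final long INDIRECT_FLAG = 1L << 63;  // assumed coding: high bit marks an indirected slot

    private final ByteBuffer buf;
    private int dataEnd = 0;    // records grow up from offset 0
    private int slotCount = 0;  // slot map grows down from the end of the page

    SlottedPage(int pageSize) { buf = ByteBuffer.allocate(pageSize); }

    /** Bytes left between the top of the record area and the bottom of the slot map. */
    int freeSpace() { return buf.capacity() - slotCount * SLOT_BYTES - dataEnd; }

    private int slotPos(int slot) { return buf.capacity() - (slot + 1) * SLOT_BYTES; }

    private int addSlot(long value) {
        buf.putLong(slotPos(slotCount), value);
        return slotCount++;
    }

    /** Stores a record on this page; returns its slot index, or -1 if it does not fit. */
    int insertLocal(byte[] rec) {
        if (rec.length + SLOT_BYTES > freeSpace()) return -1;
        int off = dataEnd;
        for (int i = 0; i < rec.length; i++) buf.put(off + i, rec[i]);
        dataEnd += rec.length;
        // On-page slot: offset in the upper 32 bits (flag clear), length in the lower 32.
        return addSlot(((long) off << 32) | rec.length);
    }

    /** Adds a slot that points at (pageId, slot) on another page; pageId must be >= 0. */
    int insertIndirect(int pageId, int slot) {
        if (SLOT_BYTES > freeSpace()) return -1;
        return addSlot(INDIRECT_FLAG | ((long) pageId << 32) | (slot & 0xFFFFFFFFL));
    }

    long slot(int i) {
        if (i < 0 || i >= slotCount) throw new IndexOutOfBoundsException("slot " + i);
        return buf.getLong(slotPos(i));
    }

    static boolean isIndirect(long s) { return (s & INDIRECT_FLAG) != 0; }
    static int hi(long s) { return (int) ((s & ~INDIRECT_FLAG) >>> 32); } // offset or pageId
    static int lo(long s) { return (int) s; }                             // length or slot

    /** Reads an on-page record; fails if the slot is indirected. */
    byte[] readLocal(int i) {
        long s = slot(i);
        if (isIndirect(s)) throw new IllegalStateException("slot " + i + " is indirected");
        int off = hi(s), len = lo(s);
        byte[] out = new byte[len];
        for (int k = 0; k < len; k++) out[k] = buf.get(off + k);
        return out;
    }
}
```

Because both areas only ever grow toward each other, the free space is always the single gap in the middle of the page, which is what makes the "dense" property easy to check.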

 

This design allows us to look up records directly by OID, without a mandatory indirection step.  In the best case this means one fetch per object instead of two.  If the record is on the page, we are done.  Otherwise we indirect, and the cost is exactly that of the existing translation pages.  In the extreme, if all records are indirected, the design is essentially the same as the existing translation page design.
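
The one-fetch vs. two-fetch behavior can be sketched as follows.  This is a toy in-memory model, not a proposal for the actual code: the OidLookup and Slot names, the 8-bit slot field in the OID, and the restriction to a single level of indirection are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of OID lookup with optional indirection; all names are assumptions. */
final class OidLookup {
    static final int SLOT_BITS = 8; // assumed: low 8 bits of the OID select the slot

    /** A slot holds either the record itself or a forwarding (pageId, slot) pair. */
    static final class Slot {
        final byte[] record; // non-null for an on-page record
        final int fwdPage;
        final int fwdSlot;

        private Slot(byte[] record, int fwdPage, int fwdSlot) {
            this.record = record;
            this.fwdPage = fwdPage;
            this.fwdSlot = fwdSlot;
        }

        static Slot local(byte[] record) { return new Slot(record, -1, -1); }
        static Slot indirect(int pageId, int slot) { return new Slot(null, pageId, slot); }
    }

    final Map<Integer, Slot[]> pages = new HashMap<>();
    int fetches = 0; // counts page fetches so the two cases can be compared

    static long oid(int pageId, int slot) { return ((long) pageId << SLOT_BITS) | slot; }

    private Slot[] fetch(int pageId) {
        fetches++;
        Slot[] page = pages.get(pageId);
        if (page == null) throw new IllegalArgumentException("no such page: " + pageId);
        return page;
    }

    byte[] read(long oid) {
        int pageId = (int) (oid >>> SLOT_BITS);
        int slot = (int) (oid & ((1 << SLOT_BITS) - 1));
        Slot s = fetch(pageId)[slot];
        if (s.record != null) return s.record;     // best case: one fetch
        Slot target = fetch(s.fwdPage)[s.fwdSlot]; // indirected: second fetch, as with translation pages
        if (target.record == null) throw new IllegalStateException("chained indirection not modeled");
        return target.record;
    }
}
```

If every slot on a page is created with Slot.indirect, the page behaves just like one of the existing translation pages.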

 

I am still playing around with record management rules for the page.  There are clearly many possible designs, and clustering remains an important issue.

 

Got to run.

 

-bryan