Reference guide

silex6

This is preliminary documentation and is subject to change.

File format

Whitebear database files are implemented as a container that holds content:

  • The container is a set of 8 KB pages, linked together into a master B-Tree index structure. Each page has a unique number, a collection number and a changeset number. A collection is a group of pages related to a table, an index or a BLOB. A changeset is a set of recent changes and contains shadow copies of the pages recently updated. The container also has bitmaps in which the bit of a removed page is set to '0', meaning that its space may be reused.
  • The content may be table data, index data, BLOBs, schema objects, and so on. Table data may be clustered, in which case it uses the container's master index for the primary key. Index data consists of B-Tree nodes. The database schema is stored in a set of system tables and indexes. Views are stored in a system BLOB collection.
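
The page identifiers and the allocation bitmap described above can be pictured with a minimal Java sketch. The class, field and method names are hypothetical and are not taken from the Whitebear API; the policy of reusing the lowest free page first is also an assumption made for illustration.

```java
import java.util.BitSet;

/** Hypothetical sketch of container pages; not Whitebear's actual classes. */
class PageSketch {
    static final int PAGE_SIZE = 8 * 1024; // 8 KB pages, as stated in the text

    /** The three identifiers the text says every page carries. */
    static final class PageHeader {
        final long pageNumber;   // unique page number
        final int collectionId;  // table, index or BLOB the page belongs to
        final long changesetId;  // changeset that last touched the page

        PageHeader(long pageNumber, int collectionId, long changesetId) {
            this.pageNumber = pageNumber;
            this.collectionId = collectionId;
            this.changesetId = changesetId;
        }
    }

    /** Allocation bitmap: bit set = page in use, bit clear ('0') = space may be reused. */
    static final class AllocationBitmap {
        private final BitSet inUse = new BitSet();

        /** Returns a page number, reusing the lowest free one (assumed policy). */
        int allocate() {
            int page = inUse.nextClearBit(0);
            inUse.set(page);
            return page;
        }

        /** Marks a removed page as '0' so that a later allocate() can reuse it. */
        void release(int page) {
            inUse.clear(page);
        }

        boolean isInUse(int page) {
            return inUse.get(page);
        }
    }
}
```

In this sketch, releasing page 0 and then calling allocate() again returns 0, which is the reuse behaviour the bitmap is meant to enable.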

ACID transaction management is implemented in the container layer using multi-version concurrency control. It applies to tables, indexes, BLOBs and schema.

  • The HealthCheck tool checks the consistency of both the content and the container. It can fix errors in the container and rebuild indexes. It checks table data, indexes and schema data. It can also rebuild the container, i.e. the master B-Tree index.
  • The Backup tool visits the container, extracts the content as is, without checking it for consistency, and copies it to a backup file. It skips temporary data and open transactions.
  • The Restore tool reads the content of a backup file and restores the raw content by updating the container. A full restore first creates a blank container.

ISAM files?

ISAM (indexed sequential access method) is a principle, a library, and a file format first implemented by IBM in 1973. The Whitebear storage engine provides the functionality of IBM's ISAM library, as well as ACID transaction management, on a completely different file format based on B-tree structures.

Disk caching and memory usage

To reduce physical I/O operations, every page read from the database file is retained in memory until one of the following events occurs:

  • The Java virtual machine's garbage collector removes the page in order to free memory.
  • The disk caching background thread removes the page, based on a first-in first-out strategy.

As a result, the disk caching component can allocate a large amount of memory under heavy system use. There is no built-in limit on the amount of memory allocated for caching. The maximum amount of memory that will be used can be configured through the Java virtual machine's -Xmx parameter. Some parameters of the disk cache can also be used to tune how long a page is retained in memory.
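
The two eviction paths described above can be illustrated with a small Java sketch. It assumes that pages are held through java.lang.ref.SoftReference, so that the garbage collector may clear them under memory pressure, and that a background thread periodically drops the oldest entries. The class and method names are hypothetical; the actual Whitebear cache and its tuning parameters may work differently.

```java
import java.lang.ref.SoftReference;
import java.util.Iterator;
import java.util.LinkedHashMap;

/** Hypothetical page cache sketch; not Whitebear's actual implementation. */
class PageCacheSketch {
    // Insertion-ordered map: the first entry is the page that was cached first.
    private final LinkedHashMap<Long, SoftReference<byte[]>> pages = new LinkedHashMap<>();

    /** Caches a page that was just read from the database file. */
    synchronized void put(long pageNumber, byte[] content) {
        pages.put(pageNumber, new SoftReference<>(content));
    }

    /** Returns the cached page, or null if it was evicted or cleared by the GC. */
    synchronized byte[] get(long pageNumber) {
        SoftReference<byte[]> ref = pages.get(pageNumber);
        if (ref == null) {
            return null;
        }
        byte[] content = ref.get();
        if (content == null) {
            pages.remove(pageNumber); // the garbage collector reclaimed the page
        }
        return content;
    }

    /** What a background thread could call periodically: first-in, first-out eviction. */
    synchronized int evictOldest(int count) {
        int removed = 0;
        Iterator<Long> it = pages.keySet().iterator();
        while (removed < count && it.hasNext()) {
            it.next();
            it.remove();
            removed++;
        }
        return removed;
    }

    synchronized int size() {
        return pages.size();
    }
}
```

Because softly referenced objects are only reclaimed when the heap runs short, a cache built this way grows toward the heap limit; starting the JVM with, for example, -Xmx512m caps the heap at 512 MB and therefore bounds the cache as well.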

Data is written to the physical file asynchronously. The disk cache and the physical file are synchronized on transaction commit.

Transactions and multi-version concurrency

The database engine allows several versions of the same data to be stored in the database file. Every new transaction causes a new version number to be generated.

During a transaction, when a data page is updated, a temporary shadow copy of the page is created and stored in the transaction state.

A vacuum cleaner process checks closed transactions, looks for temporary copies and makes the changes permanent by copying the content of each shadow copy back to the original page. The vacuum cleaner also checks rolled-back transactions, removes unused shadow copies and reclaims free space.

A version conflict may occur when trying to commit a transaction that contains outdated shadow copies, that is, when a concurrent transaction has already committed a more recent version of the same pages.
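
The following Java sketch illustrates how shadow copies and version conflicts could interact. It is a simplified model, not the engine's code: all names are hypothetical, pages live in memory, and the copy-back step that the text attributes to the vacuum cleaner is folded into commit() for brevity. The sketch assumes that a shadow copy is outdated when the page's committed version has changed since the copy was made.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of shadow copies and commit-time conflict detection. */
class MvccSketch {
    private final Map<Long, byte[]> pages = new HashMap<>();      // committed content
    private final Map<Long, Long> pageVersions = new HashMap<>(); // version that last wrote each page
    private long lastVersion = 0;

    static final class Transaction {
        final long version;                                       // generated when the transaction starts
        final Map<Long, byte[]> shadowCopies = new HashMap<>();   // private copies of updated pages
        final Map<Long, Long> baseVersions = new HashMap<>();     // page version seen when first copied

        Transaction(long version) {
            this.version = version;
        }
    }

    /** Every new transaction causes a new version number to be generated. */
    synchronized Transaction begin() {
        return new Transaction(++lastVersion);
    }

    /** Updates go to a shadow copy; the committed page is left untouched. */
    synchronized void write(Transaction tx, long page, byte[] content) {
        tx.baseVersions.putIfAbsent(page, pageVersions.getOrDefault(page, 0L));
        tx.shadowCopies.put(page, content.clone());
    }

    /** Returns false on a version conflict, i.e. when a shadow copy is outdated. */
    synchronized boolean commit(Transaction tx) {
        for (Map.Entry<Long, Long> base : tx.baseVersions.entrySet()) {
            long current = pageVersions.getOrDefault(base.getKey(), 0L);
            if (current != base.getValue()) {
                return false; // a concurrent transaction committed this page first
            }
        }
        for (Map.Entry<Long, byte[]> copy : tx.shadowCopies.entrySet()) {
            pages.put(copy.getKey(), copy.getValue());
            pageVersions.put(copy.getKey(), tx.version);
        }
        return true;
    }

    synchronized byte[] read(long page) {
        return pages.get(page);
    }
}
```

If two transactions both update page 5 and the second one commits first, the first transaction's commit() returns false, because its shadow copy was taken from a version of the page that is no longer current.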

The SERIALIZABLE transaction mode forces transactions to run in sequence when this is needed to avoid version conflicts. In this mode, time-outs prevent a transaction from being blocked indefinitely by the serialization process.
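
One way to picture serialization with a time-out is a single fair lock that each transaction must acquire within a bounded wait. The Java sketch below uses the standard java.util.concurrent.locks.ReentrantLock for this. It is only an illustration under that assumption: the class and method names are hypothetical, and the engine may serialize transactions by other means and only when a conflict would otherwise occur.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch: run transaction bodies one at a time, with a bounded wait. */
class SerializableSketch {
    // Fair lock: waiting transactions are served roughly in arrival order.
    private final ReentrantLock serialLock = new ReentrantLock(true);

    /**
     * Runs the body alone. Returns false if the lock could not be obtained
     * within timeoutMillis, so the caller is never blocked indefinitely.
     */
    boolean runSerialized(long timeoutMillis, Runnable transactionBody) {
        try {
            if (!serialLock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS)) {
                return false; // timed out while another transaction was running
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        }
        try {
            transactionBody.run();
            return true;
        } finally {
            serialLock.unlock();
        }
    }
}
```

A caller that receives false can retry later or report the time-out to the application, rather than waiting forever behind a long-running transaction.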

Temporary tables

Temporary data is made of shadow copies stored in a transaction state that will never be committed.

API reference

The API reference can be found in the Javadoc-style documents included in the /javadoc folder of the package.

The online API reference is available at <http://whitebear.sourceforge.net/javadoc>.
