In this database software stack, the file layer manipulates the storage structures. It implements a container and its content.
The layer is split into two layers: the container layer and the content layer.
The file layer implementation is in org.whitebear.file. The file layer has been stable since 2024.
The container layer implements virtual file manipulation. A virtual file is a set of virtual pages. Each file has a unique collection number.
It implements shadow copies and multi-version concurrency. When the content layer writes data to a page, the container first duplicates the page and then applies the changes to the copy. There can be more than one copy of each page.
The changes are applied to the original page once the changeset has been committed.
When a changeset is committed, the shadow copies it contains immediately become visible to all threads.
The container layer does not keep the full history of changes in the database: FileSpaceRecyclerThread is a background thread that merges all committed changesets. It visits each changeset's shadow copies, applies the changes to the original page, and deletes the shadow copies.
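The shadow-copy life cycle described above can be sketched as follows. This is a minimal illustration only, not the org.whitebear.file implementation: the names ShadowPageStore, ChangeSet, begin, read and recycle are hypothetical, pages live in memory, and there is no conflict detection between changesets that modify the same page.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory sketch of shadow-copy paging (all names are assumptions). */
class ShadowPageStore {
    static final int PAGE_SIZE = 8192;

    /** One committed shadow copy of a page, waiting to be merged. */
    private static final class Shadow {
        final long pageNo;
        final byte[] data;
        Shadow(long pageNo, byte[] data) { this.pageNo = pageNo; this.data = data; }
    }

    private final Map<Long, byte[]> originals = new HashMap<>();
    // Committed shadow copies, oldest first.
    private final List<Shadow> committed = new ArrayList<>();

    /** A changeset holds private copies of the pages it modifies. */
    final class ChangeSet {
        private final Map<Long, byte[]> shadows = new HashMap<>();

        void write(long pageNo, int offset, byte value) {
            // Duplicate the page on the first write, then change only the copy.
            byte[] copy = shadows.computeIfAbsent(pageNo, n -> visiblePage(n).clone());
            copy[offset] = value;
        }

        void commit() {
            // Publishing the copies makes them visible to every reader at once.
            synchronized (ShadowPageStore.this) {
                for (Map.Entry<Long, byte[]> e : shadows.entrySet()) {
                    committed.add(new Shadow(e.getKey(), e.getValue()));
                }
            }
            shadows.clear();
        }
    }

    ChangeSet begin() { return new ChangeSet(); }

    /** Returns the newest committed version of a page, or the original page. */
    private synchronized byte[] visiblePage(long pageNo) {
        for (int i = committed.size() - 1; i >= 0; i--) {
            if (committed.get(i).pageNo == pageNo) return committed.get(i).data;
        }
        return originals.computeIfAbsent(pageNo, n -> new byte[PAGE_SIZE]);
    }

    synchronized byte read(long pageNo, int offset) {
        return visiblePage(pageNo)[offset];
    }

    synchronized int shadowCount() { return committed.size(); }

    /** What a recycler thread would do: merge the copies into the originals, then drop them. */
    synchronized void recycle() {
        for (Shadow s : committed) originals.put(s.pageNo, s.data);
        committed.clear();
    }
}
```

In this sketch an uncommitted write stays invisible to read(), a commit publishes it, and recycle() leaves the same data readable while removing the shadow copies.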
The container layer implements a primary B-Tree index of the pages so that pages can be retrieved by their virtual location. The B-Tree structure is maintained by the container layer. This critical database structure can be rebuilt if it is broken. The content layer also uses this B-Tree structure to implement clustered storage.
The container layer also implements space allocation and the reuse of free space in the database file. It maintains bitmaps in which '1' means the page is in use. The bitmaps can be rebuilt if they are broken. The container implements the banker's algorithm to make concurrent changes to the database file safe: it avoids deadlocks and overwrites that could lead to a broken database.
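A page-allocation bitmap of this kind can be sketched with java.util.BitSet. The class PageBitmap and its methods are hypothetical, and rebuilding the bitmap from a list of reachable pages is an assumption about how a rebuild could work, not a description of the actual container code.

```java
import java.util.BitSet;

/** Hypothetical allocation bitmap: a set bit ('1') means the page is in use. */
class PageBitmap {
    private final BitSet inUse = new BitSet();

    /** Allocates the lowest free page, so freed pages are reused first. */
    synchronized int allocate() {
        int page = inUse.nextClearBit(0);
        inUse.set(page);
        return page;
    }

    /** Marks a page as free so that a later allocation can reuse it. */
    synchronized void free(int page) {
        inUse.clear(page);
    }

    synchronized boolean isInUse(int page) {
        return inUse.get(page);
    }

    /** Rebuilds the bitmap from the pages known to be reachable (assumed input). */
    synchronized void rebuild(int[] reachablePages) {
        inUse.clear();
        for (int p : reachablePages) inUse.set(p);
    }
}
```

The sketch covers only allocation and reuse; it does not attempt the banker's algorithm.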
The container layer also provides a file backup and restore API, as well as an API to check the database structure, recover data, and repair damage.
In the file content layer, the Transaction API gives access to indexes, records and BLOBs while keeping track of the changes by creating shadow copies.
The population API gives access to a collection of tuples. A tuple is a set of key/value pairs used to store record data or object properties. The API can read, insert, and delete tuples, and move to a specific location in the population.
The index API implements the manipulation of B-Tree structures. B-Trees are used for indexes. The API provides operations such as searching for, adding, and removing indexed keys.
A B-Tree is a sophisticated tree structure. A tree node is split into two parts if it contains too many keys. The algorithm also tries to merge tree nodes that have too few keys. This keeps the tree balanced (the B is commonly read as "balanced"), even if there are many duplicates.
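The split and merge rules can be illustrated on a single leaf node. This is a simplified sketch under stated assumptions: LeafNode, MAX_KEYS and MIN_KEYS are hypothetical, keys are plain integers, and the parent-node bookkeeping of a full B-Tree is left out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical B-Tree leaf showing only the split and merge rules. */
class LeafNode {
    static final int MAX_KEYS = 4;            // assumed capacity, for illustration
    static final int MIN_KEYS = MAX_KEYS / 2; // below this, the node is a merge candidate

    final List<Integer> keys = new ArrayList<>();

    /** Inserts a key at its sorted position; duplicates are allowed. */
    void insert(int key) {
        int pos = Collections.binarySearch(keys, key);
        if (pos < 0) pos = -pos - 1;          // not found: convert to the insertion point
        keys.add(pos, key);
    }

    boolean isOverfull()  { return keys.size() > MAX_KEYS; }
    boolean isUnderfull() { return keys.size() < MIN_KEYS; }

    /** Moves the upper half of the keys into a new right sibling and returns it. */
    LeafNode split() {
        int mid = keys.size() / 2;
        LeafNode right = new LeafNode();
        List<Integer> upper = keys.subList(mid, keys.size());
        right.keys.addAll(upper);
        upper.clear();                        // clearing the view removes the range from keys
        return right;
    }

    /** Absorbs all keys of the right sibling; the caller checks that the result fits. */
    void merge(LeafNode right) {
        keys.addAll(right.keys);
        right.keys.clear();
    }
}
```

Because a split always cuts a node near the middle, duplicate keys may end up on both sides of the cut, which is one way a tree can stay balanced when many keys are equal.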
The LobSet API gives access to a collection of BLOBs. Each BLOB has a unique number. The API provides the standard java.io.InputStream and java.io.Reader types to read / write BLOBs. BLOBs typically exceed the page size (8 KB) and are stored in several pages. The BLOB feature is used by the catalog layer to store views and stored procedures.
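To show why a BLOB spans several pages, the sketch below cuts a stream into 8 KB chunks. BlobPager is a hypothetical helper, not part of the LobSet API, and it assumes the whole page is available for payload; a real page would likely reserve some bytes for a header.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper that cuts BLOB content into page-sized chunks. */
class BlobPager {
    static final int PAGE_SIZE = 8 * 1024;   // page size stated in the text

    /** Number of pages needed for a BLOB of the given length (rounded up). */
    static long pagesNeeded(long blobLength) {
        return (blobLength + PAGE_SIZE - 1) / PAGE_SIZE;
    }

    /** Reads the stream to its end and returns one byte array per page. */
    static List<byte[]> split(InputStream in) throws IOException {
        List<byte[]> pages = new ArrayList<>();
        byte[] buffer = new byte[PAGE_SIZE];
        int filled = 0;
        int n;
        // read() may return fewer bytes than requested, so fill the buffer in steps.
        while ((n = in.read(buffer, filled, PAGE_SIZE - filled)) != -1) {
            filled += n;
            if (filled == PAGE_SIZE) {
                pages.add(buffer.clone());
                filled = 0;
            }
        }
        if (filled > 0) pages.add(Arrays.copyOf(buffer, filled)); // last, partial page
        return pages;
    }
}
```

For example, a 20,000-byte BLOB needs three pages under these assumptions: two full pages of 8,192 bytes and a final chunk of 3,616 bytes.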
When data is written to a population, index or BLOB, the file layer creates shadow copies, and the change is applied permanently to the database only after the transaction has been committed. The shadow copy feature also limits the number of locks on database pages.
Access to the data goes through the datatype framework layer, which provides a generic interface to load / store data regardless of its type.
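A generic load / store interface of this kind might look like the sketch below. ValueCodec, IntCodec and TextCodec are hypothetical names chosen for illustration, and the on-page encoding (a 4-byte integer, and a length-prefixed UTF-8 string) is an assumption rather than the actual datatype framework format.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical generic codec: callers load and store values without knowing the type's layout. */
interface ValueCodec<T> {
    void store(ByteBuffer out, T value);
    T load(ByteBuffer in);
}

/** Stores an integer as four bytes. */
final class IntCodec implements ValueCodec<Integer> {
    public void store(ByteBuffer out, Integer value) { out.putInt(value); }
    public Integer load(ByteBuffer in) { return in.getInt(); }
}

/** Stores a string as a 4-byte length followed by its UTF-8 bytes. */
final class TextCodec implements ValueCodec<String> {
    public void store(ByteBuffer out, String value) {
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        out.putInt(bytes.length);
        out.put(bytes);
    }
    public String load(ByteBuffer in) {
        byte[] bytes = new byte[in.getInt()];
        in.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

Code that writes a tuple can then iterate over its fields and call store on each field's codec, without branching on the data type.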