[srvx-devel] draft: srvx-2.0 database abstraction interface
From: Entrope <en...@us...> - 2002-10-19 03:37:51
Feedback, if you have any :)  I do not know how clear this will be to
people besides me; if it is not clear, or if you have concerns about
what it can do, let me know where.  Assuming nobody finds serious
problems in it, I'll start implementing it this weekend.

-- Entrope

Database Abstraction
--------------------

srvx-2.0 uses (will use) a database abstraction layer to hide the
structural and representation differences between different supported
database backends, and to make it easier to manage the database
behavior at a high level.  This document describes the behavior and
interfaces of that layer -- both what it exposes to the higher levels
of srvx, and what it expects from the database backends.

Database Model
--------------

The top level of the database consists of tables.  These are named
mappings of keys to rows.  Each row has one primary key column, other
columns, and child tables.  A column has a name, and data of one of
several types:
 - unsigned integer
 - datetime
 - string
 - string list
 - child table (of a specific type)

Within a given table, each column contains the same type of data in
all rows (the special value 'null' may exist in any column except the
primary key column; whether it occurs, and the meaning if it does, is
table-specific).

A child table is distinguished by having a parent row in another
table; rows in top-level tables do not have parents.  For relational
databases, this usually means the child table has additional columns
in the primary key to identify the parent row.  Child objects cannot
look up their parents directly, although if they do not have a parent
object pointer, they must have a list of parent keys (to know where
their children are and to write themselves out).

Schema Representation
---------------------

Each table type is represented by a descriptor and control object in
srvx.  These are created by specifying the table name, column names
and types, primary key column name, and factory to load objects from
the database store.
These type descriptors are created by a factory method on the database
backend object.

Object Loading
--------------

The table type descriptor object can load a database object, given the
child primary key and either the parent object or the full list of
parent keys.  When a row is loaded from the database, all its child
rows may be loaded in the same query (the load call specifies whether
to do so, and defaults to loading children), but its parent (if one
exists) is not loaded.  The object is entered into both a per-table
cache and a database-wide cache.  The representation passed to the
object loading factory should be similar to the "struct record_data"
in srvx-1.x.

Object Storing
--------------

A database object can be marked dirty (and probably should be marked
dirty by mutator methods), which moves it to the dirty list in the
database.  The base database object class has a (virtual) method to
write the object out, using an interface similar to the saxdb write
interface in srvx-1.x.

The database backend must also support transaction levels, and must
not permanently write dirty objects until the transaction is
committed.  If the transaction (or a parent transaction) is rolled
back, the object must be reverted to its pre-transaction state.

[Q: how will rollback interact with adding or dropping rows?  Probably
just reference the added/dropped rows in the transaction record.]

Caching
-------

Each table type contains a map of loaded objects.  The whole database
contains LRU lists of clean and dirty objects.  The desired size of
these lists can be tuned at runtime.  The database can be told to
reload its cache (for example, if a third party updates the data
store).
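To make the clean/dirty LRU bookkeeping concrete, here is a rough sketch in C.  It is only an illustration of the idea, not the planned srvx-2.0 code: every name in it (struct db_object, struct lru_list, db_touch, db_mark_dirty, db_evict_clean) is hypothetical, and it leaves out the per-table map, the size tuning and the transaction handling.

```c
#include <assert.h>
#include <stddef.h>

/* All names below are hypothetical; none are taken from srvx. */
struct lru_list;

struct db_object {
    struct db_object *prev, *next; /* neighbours in the current LRU list */
    struct lru_list *list;         /* list holding the object, or NULL */
    int dirty;                     /* nonzero once a mutator has run */
};

struct lru_list {
    struct db_object *head;        /* most recently used */
    struct db_object *tail;        /* least recently used */
    size_t count;
};

struct database {
    struct lru_list clean;
    struct lru_list dirty;
};

/* Remove obj from whichever list holds it (no-op if it is in none). */
static void lru_unlink(struct db_object *obj)
{
    struct lru_list *list = obj->list;
    if (!list)
        return;
    assert(list->count > 0);
    if (obj->prev) obj->prev->next = obj->next; else list->head = obj->next;
    if (obj->next) obj->next->prev = obj->prev; else list->tail = obj->prev;
    obj->prev = obj->next = NULL;
    obj->list = NULL;
    list->count--;
}

/* Move obj to the most-recently-used end of list. */
static void lru_push_front(struct lru_list *list, struct db_object *obj)
{
    lru_unlink(obj);
    obj->next = list->head;
    if (list->head) list->head->prev = obj; else list->tail = obj;
    list->head = obj;
    obj->list = list;
    list->count++;
}

/* Record a use of obj (for example after loading it). */
void db_touch(struct database *db, struct db_object *obj)
{
    lru_push_front(obj->dirty ? &db->dirty : &db->clean, obj);
}

/* What a mutator method would call: flag obj and move it to the dirty list. */
void db_mark_dirty(struct database *db, struct db_object *obj)
{
    obj->dirty = 1;
    lru_push_front(&db->dirty, obj);
}

/* Detach and return the least recently used clean object, or NULL. */
struct db_object *db_evict_clean(struct database *db)
{
    struct db_object *victim = db->clean.tail;
    if (victim)
        lru_unlink(victim);
    return victim;
}
```

Only clean objects are eviction candidates in this sketch; a dirty object would have to be written out (and moved back to the clean list) before it could be dropped.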

Backends
--------

This abstraction layer is designed to efficiently support three
specific backends:
 - a plain text backend that can read srvx-1.x format recdb files (and
   write similarly formatted files -- just without "" for strings that
   are clearly tokens),
 - Berkeley/Sleepycat DB databases,
 - a SQL backend (that knows how to do RFC1459 casemapping).

The major provisos for the backends are:
 - the plain text backend must load and store the entire database at
   once, and thus caches the entire thing,
 - plain text and Sleepycat DBs are vulnerable to in-process
   corruption,
 - SQL backend is susceptible to high IPC traffic and serialization
   cost.
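For reference, the RFC1459 casemapping mentioned for the SQL backend could look roughly like the C sketch below.  It follows RFC 1459 section 2.2, where {, } and | are the lower case forms of [, ] and \.  Some IRC servers additionally treat ~ and ^ as a pair; that is deliberately left out here.  The function names are hypothetical, and how the SQL backend would actually apply the mapping (for example by folding keys before they reach a query) is an assumption, not something this draft specifies.

```c
#include <assert.h>
#include <stddef.h>

/* Fold one byte to lower case using the RFC 1459 section 2.2 rules.
 * Hypothetical helper; not taken from srvx. */
static unsigned char rfc1459_tolower(unsigned char c)
{
    if (c >= 'A' && c <= 'Z')
        return (unsigned char)(c - 'A' + 'a');
    switch (c) {
    case '[':  return '{';
    case ']':  return '}';
    case '\\': return '|';
    default:   return c;
    }
}

/* Compare two NUL-terminated names, ignoring case the IRC way.
 * Returns <0, 0 or >0 like strcmp(). */
int rfc1459_casecmp(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    for (;; a++, b++) {
        unsigned char ca = rfc1459_tolower((unsigned char)*a);
        unsigned char cb = rfc1459_tolower((unsigned char)*b);
        if (ca != cb)
            return ca < cb ? -1 : 1;
        if (ca == '\0')
            return 0;
    }
}

/* Fold a name in place, e.g. to build a key whose plain byte-wise
 * equality matches IRC name equality. */
void rfc1459_fold(char *s)
{
    assert(s != NULL);
    for (; *s; s++)
        *s = (char)rfc1459_tolower((unsigned char)*s);
}
```

Folding keys once on the srvx side would let the data store use ordinary byte-wise comparison, at the cost of also keeping the original spelling wherever it has to be displayed.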