[ooc-compiler] Announcing libpobj

Introduction
============

   Persistent Objects for OOC is a framework for storing object data
in an RDBMS.  Or, if one starts with a database, a framework that
presents relational data as a set of objects.  It consists of a thin
library ("thin" because aspects of the RDBMS shine through in parts)
and a code generator that provides the glue code bridging the gap
from internal to external data representation.  For the most part,
the "glue" consists of get and set methods on otherwise hidden record
fields within classes representing table rows.

   A number of basic assumptions guide the design and implementation of
the persistence layer:

* The goal is to make the use of persistent objects in a program as
  convenient as possible.  Developers should be able to operate on
  persistent objects much as they do on normal objects.

* Referential integrity between table rows is preserved in memory.
  That is, all foreign key references to a particular row in the
  relational database are mapped to pointers to a single object in
  memory, and vice versa.

* Access is provided to table data, views, and arbitrary selects.  The
  API provides a similar interface for all of these variants.  A
  subset of SQL is made available to the application as a simple query
  language.  Depending on the type of application, this subset is
  powerful enough to cover a large part of the queries it needs to
  run.  If an application manipulates symbolic data, then all of its
  queries can typically be formulated in this simple language.  On
  the other hand, for an application that includes number crunching,
  the aggregation of data must be done with manually crafted SQL
  selects.

* The framework offers sufficient flexibility to tune the tradeoff
  between convenience and performance.  For example, the running time
  of most applications is determined by the number of interactions
  between client program and database server.  Accordingly, the
  library offers the ability to switch from fine-grained access to
  single objects to coarse-grained access that internalizes thousands
  of objects with a single `select' statement.

* libpobj does not provide an object oriented view on *any* SQL
  database schema.  The persistence layer supports a limited number of
  mappings between relational data and objects.

* For the persistence mechanism to work, classes must be explicitly
  programmed for it.  That is, the library is not a mechanism that
  magically allows existing classes to be stored in a database.
  Instead, persistence-capable classes must be derived from a common
  base class, `PObj.Object'.

* Not all aspects of persistence are hidden from the developer.
  Persistent objects differ in some details from normal objects.  For
  example, concurrency is not an issue if an object is under the full
  control of a program, but becomes very important if object data is
  serialized in a database, where several applications and users may
  access it simultaneously.  The persistence layer exposes details of
  the lower level database access if these details cannot be
  abstracted away without loss of control.  Most notably, the
  application must deal with transactions.

Basic Concepts
==============

   Three basic concepts summarize the operation of the persistent
object abstraction layer:

  1. Every table corresponds to a class.

  2. Every table row corresponds to an instance of the table's class.

  3. Every attribute in a table row corresponds to a member of the
     table's class.

   As far as the application is concerned, the database is a container
of objects

   * from which it can retrieve objects that were made persistent
     earlier,

   * where it can create new objects or can make changes to objects
     persistent, and

   * which it can query for lists of objects satisfying a given set of
     criteria.

   The authoritative source of persistent data is always the database.
In general, it must be assumed that multiple processes are reading
from and writing to the same database at the same time.  This means
that all objects and their state held locally by the application are
only copies of the original data, and may become out of date
immediately after they were taken from the database.  (This is
another aspect of a shared database that is _not_ abstracted away by
libpobj, although some features of its implementation try to lessen
the impact of this fact.)

   As far as possible, data items from the database are presented to the
application in a format suitable for easy access by the programming
language.  For example, a table attribute like `VARCHAR(n)' is mapped
onto string objects, while a foreign key constraint is translated into
an object reference (aka pointer) representing the row of the key's
target table.
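
To make the foreign key case concrete, here is a minimal sketch of the
database side only.  The tables `Dept' and `Employee' and all of their
columns are hypothetical and chosen purely for illustration; they are
not part of libpobj or of its examples.

```sql
-- Hypothetical target table of the foreign key.
CREATE TABLE Dept (
  id INTEGER CONSTRAINT nn_Dept_id NOT NULL,
  CONSTRAINT pk_Dept PRIMARY KEY(id),
  name VARCHAR(64) CONSTRAINT nn_Dept_name NOT NULL
);

-- Hypothetical referencing table: the column `dept' holds the key
-- of a row in Dept.
CREATE TABLE Employee (
  id INTEGER CONSTRAINT nn_Employee_id NOT NULL,
  CONSTRAINT pk_Employee PRIMARY KEY(id),
  dept INTEGER CONSTRAINT nn_Employee_dept NOT NULL,
  CONSTRAINT fk_Employee_dept FOREIGN KEY(dept) REFERENCES Dept(id)
);
```

Under the mapping described above, the application would not see the
integer in `dept' but a reference to a `Dept' object, and all
`Employee' objects whose rows carry the same key would refer to the
same `Dept' object in memory.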

Examples
========

The most basic example assumes a table representing a person.  It has
two data columns, `firstName' and `lastName', both of type string.
Additionally, it has an integer primary key column (implicitly named
`id') that is fed from a database sequence `Seq_Person'.

Class definition in the meta-data file:

    <class name="Person">
      <idField sequenceName="Seq_Person"/>
      <field name="firstName" schemaName="first_name">
        <string size="128"/>
      </field>
      <field name="lastName" schemaName="last_name">
        <string size="128"/>
      </field>
    </class>

The table definition for our RDBMS (Postgres in this case) looks like
this:

    CREATE SEQUENCE Seq_Person;
    CREATE TABLE Person (
      id INTEGER CONSTRAINT nn_Person_id NOT NULL,
        CONSTRAINT pk_Person PRIMARY KEY(id),
      first_name VARCHAR(64) CONSTRAINT nn_Person_first_name NOT NULL,
      last_name VARCHAR(64) CONSTRAINT nn_Person_last_name NOT NULL
    );

Now to the application's view of this table and its rows.  First,
create our object container `cnt' on top of the database connection
`conn' and start a transaction:

    cnt := NEW(PostgresContainer.Container, conn);
    cnt.BeginTransaction();

Add an object to the database table `Person':

    person := NEW(DB.Person, "Michael", "van Acken");
    cnt.MakePersistent(person);
    person.Store();

Retrieve all `Person' objects from the corresponding table:

    q := cnt.NewQuery();
    q.AddTableExpr(DB.person, FALSE);
    res := q.Execute();

At this point, `res[0,0]' holds the first object, `res[1,0]' the
second, and so on.  To retrieve a subset of all persons matching a
particular pattern, one needs to add a filter predicate to the query:

    q := cnt.NewQuery();
    q.AddTableExpr(DB.person, FALSE);
    q.AddConj(PObj.Like(PObj.Member1(0, DB.personFirstName),
                        NEW(PObj.Literal, "Mich%")));
    res := q.Execute();

This returns all `Person' objects whose first name begins with `Mich'.
Removing a table row:

    person.Delete();
    person.Store();

And finally, closing the transaction:

    cnt.CommitTransaction();
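
For orientation, the session above corresponds roughly to the SQL
below.  This is a sketch of what a mapping layer would plausibly send
to Postgres, not output captured from libpobj: the exact statements,
the column order, and the moment at which each statement is issued are
assumptions, and the id used in the `DELETE' is a placeholder.

```sql
BEGIN;

-- Storing the new Person: the id is drawn from the sequence.
INSERT INTO Person (id, first_name, last_name)
  VALUES (nextval('Seq_Person'), 'Michael', 'van Acken');

-- Query without a filter: every row becomes one Person object.
SELECT id, first_name, last_name FROM Person;

-- Query with the Like predicate on firstName.
SELECT id, first_name, last_name FROM Person
  WHERE first_name LIKE 'Mich%';

-- Removing the row again; 1 stands in for the object's actual id.
DELETE FROM Person WHERE id = 1;

COMMIT;
```

Nothing becomes visible to other database clients until the final
`COMMIT', which is why the application has to deal with transactions
explicitly.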

--------

The text above can also be found in the project's README file at 
http://cvs.sourceforge.net/viewcvs.py/ooc/libpobj/

One might ask why I am going to the trouble of announcing this
particular library.  After all, this project is more than a year old,
as an inspection of the CVS repository confirms.

The reason is that I may be able to improve and extend this project
in the near future, due to some synergies between my paid job and
this Oberon-2 library and its tools.

To take full advantage of this, the library needs more people who
actively use it.  There are a lot of different kinds of applications
out there, all with their own particular view on their data.  I am
able to provide one or two different kinds of applications with
persistent data (and libpobj is perfectly suited to serve these
applications as it is :-), but experience shows that this is not
enough to get to a design that is both simple and powerful for a wide
range of applications.

This mail is a poll to find out if there are people on this mailing
list who are able and willing to actively use this library, and to
make sure that it covers a wide range of needs.  Without such
participation, I can do without such tedious things as
documentation, examples, tutorial, automated tests, versioning,
releases and the like, making things vastly simpler for me and saving
lots and lots of time :-)

-- mva