[Ogdl-core] Schema Specification ideas

Hi guys,

I've never used mailing lists before, so I apologise in advance if I've
sent this to the wrong place.

I like the OGDL specification; it's simple and versatile. I'm using it in a
project, but I have a different idea of how the schema should be
implemented. Here is an example config file specifying the columns in a SQL
database:

user
	id
		type int
		required

	name
		type string
		required

	friends
		type array( reference( user ) )

Here "user" is the table name, and "id", "name" and "friends" are the
columns in the table.
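
To make the structure concrete, here is a minimal Python sketch that reads a shortened version of the example into a tree of (name, children) tuples. It is not a full OGDL parser: it assumes tab indentation only and keeps each line as a single label (so "type string" stays one node rather than being split on the space). The names parse_indented and EXAMPLE are made up for this illustration.

```python
def parse_indented(text):
    """Parse tab-indented lines into a list of (name, children) nodes."""
    root = ("", [])
    stack = [(-1, root)]  # (indent level, node), innermost last
    for raw in text.splitlines():
        if not raw.strip() or raw.lstrip().startswith("#"):
            continue  # skip blank lines and comments
        indent = len(raw) - len(raw.lstrip("\t"))  # number of leading tabs
        node = (raw.strip(), [])
        # climb back up until the top of the stack is this line's parent
        while stack[-1][0] >= indent:
            stack.pop()
        stack[-1][1][1].append(node)
        stack.append((indent, node))
    return root[1]


# Shortened copy of the config example (assumption: tabs, one label per line)
EXAMPLE = (
    "user\n"
    "\tname\n"
    "\t\ttype string\n"
    "\t\trequired\n"
    "\n"
    "\tfriends\n"
    "\t\ttype array( reference( user ) )\n"
)

tables = parse_indented(EXAMPLE)
for table, columns in tables:
    print(table, [column for column, _ in columns])
```

The table comes out as the top-level node and each column as one of its children, which is the shape a specification would then be checked against.
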
As a specification for the config file above, you could use the following file:

# where @identifier is used, it is replaced by /^[a-zA-Z0-9_]*$/
&identifier /^[a-zA-Z0-9_]*$/

# note that this one reference has many children, and all of its children
# become children of the "type" node under the column node
&types
	reference one: @types
	array one: @types
	int
	string
	bool
	timestamp

document any:

	# the node below describes a table
	@identifier (repeatable) any:

		# the node below describes a column
		@identifier (repeatable) any:
			type (required) one: @types
			required

This also introduces references to nodes: the node &identifier isn't
part of the document, but it can be used later by writing @identifier, and
the contents of the reference then replace the "@identifier" text. The
specification also contains a circular reference (&types refers to @types),
which should be legitimate.
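
One way the circular reference could be handled is to resolve "@types" only when the validator actually reaches it, instead of expanding it textually up front. The Python sketch below assumes a hand-written, in-memory form of the &types definition; DEFINITIONS and matches are hypothetical names, not part of OGDL or of any existing library.

```python
# Hypothetical in-memory form of "&types": each type name maps to the
# reference its single child must match, or to None for a leaf type.
DEFINITIONS = {
    "types": {
        "reference": "@types",
        "array": "@types",
        "int": None,
        "string": None,
        "bool": None,
        "timestamp": None,
    }
}


def matches(ref, node):
    """Check a (name, children) node against a reference such as "@types"."""
    allowed = DEFINITIONS[ref.lstrip("@")]  # looked up lazily, when reached
    name, children = node
    if name not in allowed:
        return False
    child_ref = allowed[name]
    if child_ref is None:
        return not children  # leaf types take no children
    # "one:" means exactly one child, itself checked against the reference
    return len(children) == 1 and matches(child_ref, children[0])


print(matches("@types", ("array", [("string", [])])))
print(matches("@types", ("array", [])))
```

Because each recursive call descends one level into the document being checked, the recursion ends when the document does, even though the definition refers to itself.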

Anything under the node "document" describes the structure of the document.
Each node can have options, e.g. it might be "required" or it might be
"repeatable", and each node can have children. Children are introduced by
the "any:", "maybe_one:" and "one:" nodes. So, the document can contain any
nodes which match the regex specified by @identifier; these nodes are
"repeatable" and can contain "any:" child nodes (which specify the
columns). These child nodes are themselves repeatable and can contain any
of the column configuration options, of which this example has only
two: "type" and "required". The "type" option is required and can have only
one child node, which can be any of "@types". The "required" node is
optional; columns are by default not required.
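
The "required" and "repeatable" options boil down to counting how often each child name appears. Here is a rough Python sketch of that check for the column level, assuming the options have already been read into a dictionary; COLUMN_SPEC and check_children are names invented for this example.

```python
from collections import Counter

# Hypothetical in-memory form of the column-level options from the spec:
# "type" is required, "required" is optional, and neither may repeat.
COLUMN_SPEC = {
    "type": {"required": True, "repeatable": False},
    "required": {"required": False, "repeatable": False},
}


def check_children(spec, child_names):
    """Return a list of error messages for one node's children."""
    errors = []
    counts = Counter(child_names)
    for name in counts:
        if name not in spec:
            errors.append(f"unexpected node: {name}")
    for name, options in spec.items():
        if options["required"] and counts[name] == 0:
            errors.append(f"missing required node: {name}")
        if not options["repeatable"] and counts[name] > 1:
            errors.append(f"node may not repeat: {name}")
    return errors


print(check_children(COLUMN_SPEC, ["type", "required"]))
print(check_children(COLUMN_SPEC, ["required"]))
```

Returning a list of messages instead of stopping at the first problem lets a caller report everything wrong with a file in one go.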

Like any good specification format, it can specify itself:

&node
	/.+/
		repeatable

		any:
			# where 'one:' is used, it is required that one
			# (and only one) child is specified in the configuration file
			# the child can match any of the children
			# of the 'one:' node, matching them in the order they were specified
			one: @node

			# where 'maybe_one:' is used, one or no
			# child may be specified in the configuration file
			# so only one can be specified but this
			# also allows no child to be specified
			maybe_one: @node

			# any means that any child may be included
			# or omitted, unless they are tagged with "required" in which case
			# they must be included.
			any: @node

			# note that the above nodes can be used in conjunction with one another, eg.
			# to specify a group of nodes where
			# only one can be included AND have optional parameters

			# this cannot be used on wildcard identifiers, ie. * or **
			# this is the default child node (or nodes)
			# if no node has been specified
			default: @node

			# by default, a node is unique (not repeatable) and not required.
			repeatable
			required

document any: @node
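
Read this way, the three group nodes are just limits on how many of a group's alternatives may appear in the config file. A tiny Python sketch of that reading follows; group_ok is a hypothetical helper, and "present" stands for the children in the config file that matched one of the group's alternatives.

```python
def group_ok(kind, present):
    """Check how many of a group's alternatives appear in the config file."""
    count = len(present)
    if kind == "one:":
        return count == 1  # exactly one child must be given
    if kind == "maybe_one:":
        return count <= 1  # one child or none
    if kind == "any:":
        return True  # any subset; "required" children are checked separately
    raise ValueError(f"unknown group node: {kind}")


print(group_ok("one:", ["int"]))
print(group_ok("maybe_one:", ["int", "string"]))
```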

So, the only thing that would be retained from the original specification
would be the regex syntax, /regexp/.
The wildcard character behaviour could be emulated, I think, using the
following snippet:

&* /.+/

&** @* (repeatable)
	@* (repeatable)

Then any use of @* would indicate any characters, and any use of @** would
indicate any graph.
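
In code, the two wildcards could be sketched roughly like this in Python, assuming nodes are (name, children) tuples; matches_star and matches_double_star are made-up names. The first applies the /.+/ regex to a single node name, the second applies it to every node of a subtree.

```python
import re

ANY_NAME = re.compile(r".+")  # the /.+/ pattern behind "&*"


def matches_star(name):
    """@*: any non-empty node name."""
    return ANY_NAME.fullmatch(name) is not None


def matches_double_star(node):
    """@**: any graph, i.e. every node in the subtree has a name matching @*."""
    name, children = node
    return matches_star(name) and all(matches_double_star(c) for c in children)


print(matches_star("friends"))
print(matches_double_star(("user", [("id", [("type", [])])])))
```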

When you parse a file in your program, you could supply a specification
for it, and if the file doesn't match the specification an error could
be thrown.
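
From the calling program's side, that could look something like the Python sketch below. SpecError, load_checked and require_type are hypothetical names, and the check is a stand-in that only looks for a "type" child on a column; a real validator would walk the whole specification.

```python
class SpecError(ValueError):
    """Raised when a parsed document does not match its specification."""


def load_checked(tree, check):
    """Return the tree unchanged if check() reports no errors, else raise."""
    errors = check(tree)
    if errors:
        raise SpecError("; ".join(errors))
    return tree


def require_type(column):
    """Stand-in check: a column node must have a child starting with 'type '."""
    name, children = column
    has_type = any(child[0].startswith("type ") for child in children)
    return [] if has_type else [f"column '{name}' is missing its type"]


try:
    load_checked(("name", [("required", [])]), require_type)
except SpecError as error:
    print("invalid config:", error)
```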

Let me know your thoughts; I'm going to try and implement this behavior in
PHP as it suits my project.

Cheers,
Stewart