[Ogdl-core] Schema Specification ideas
From: Stewart B <st...@gm...> - 2015-02-03 01:56:56
Hi guys,

I've never used mailing lists before, so I apologise in advance if I've sent this to the wrong place.

I like the OGDL specification; it's simple and versatile. I'm using it in a project; however, I have a different idea on how the schema should be implemented.

An example config file specifying the columns in a SQL database:

    user
        id
            type integer
            required
        name
            type string
            required
        friends
            type array( reference( user ) )

The "user" is the table name, and the "id", "name" and "friends" are the columns in the table.

As a specification for this file, you could use the following file:

    # where @identifier is used, it is replaced by /^[a-zA-Z0-9_]*$/
    &identifier /^[a-zA-Z0-9_]*$/

    # note that this one reference has many children, and all of its children become children
    # of the "type" node under the column node
    &types
        reference
            one:
                @types
        array
            one:
                @types
        int
        string
        bool
        timestamp

    document
        any:
            # the node below describes a table
            @identifier (repeatable)
                any:
                    # the node below describes a column
                    @identifier (repeatable)
                        any:
                            type (required)
                                one:
                                    @types
                            required

This also introduces references to nodes: the node &identifier isn't part of the document, but it can be used later by writing @identifier, and the contents of the reference then replace the "@identifier" text. The specification also contains a circular reference (&types refers to @types inside itself), which should be legitimate.

Anything under the node "document" describes the structure of the document. Each node can have options, eg. it might be "required" or it might be "repeatable", and each node can have children. These are indicated by the "any:", "maybe_one:" and "one:" nodes.

So, the document can contain any nodes which match the regex specified by @identifier, and these nodes are "repeatable" and can contain "any:" child nodes (which specify the columns). These child nodes are themselves repeatable and can contain any of the column configuration options, of which this example has only two: "type" and "required".
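As a rough illustration of the "any:", "required" and "repeatable" rules described above, here is a minimal sketch in Python (chosen purely for illustration). The names Rule and check_any, and the representation of a parsed file as (name, children) tuples, are assumptions made for this example and are not part of OGDL or of the proposal. "one:" and "maybe_one:" groups and @reference expansion are deliberately left out.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Rule:
    """Hypothetical in-memory form of one schema node."""
    pattern: str                  # regex that the node name must match in full
    required: bool = False        # default: not required
    repeatable: bool = False      # default: unique
    # Rules of this node's "any:" group; None means "children not checked here".
    any_children: Optional[List["Rule"]] = None


def check_any(children, rules, path="document"):
    """Check (name, children) tuples against an "any:" group; return error strings."""
    errors = []
    counts = [0] * len(rules)
    for name, grandchildren in children:
        # Pick the first rule whose pattern matches the whole node name.
        index = next((i for i, rule in enumerate(rules)
                      if re.fullmatch(rule.pattern, name)), None)
        if index is None:
            errors.append(f"{path}: unexpected node '{name}'")
            continue
        counts[index] += 1
        if rules[index].any_children is not None:
            errors += check_any(grandchildren, rules[index].any_children,
                                f"{path}/{name}")
    for rule, count in zip(rules, counts):
        if rule.required and count == 0:
            errors.append(f"{path}: missing required node /{rule.pattern}/")
        if not rule.repeatable and count > 1:
            errors.append(f"{path}: node /{rule.pattern}/ appears {count} times")
    return errors


# The table/column schema from the message, written by hand as Rule objects.
IDENT = r"^[a-zA-Z0-9_]*$"
column_options = [Rule("type", required=True), Rule("required")]
column = Rule(IDENT, repeatable=True, any_children=column_options)
table = Rule(IDENT, repeatable=True, any_children=[column])

good = [("user", [("id", [("type", [("integer", [])]), ("required", [])]),
                  ("name", [("type", [("string", [])])])])]
bad = [("user", [("id", [("required", [])])])]  # column without a "type"

print(check_any(good, [table]))
print(check_any(bad, [table]))
```

The children of "type" are skipped here (any_children is None) because checking them would need the "one:" group, which this sketch does not model.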
The "type" option is required and it can only have one child node, which can be any of "@types". The "required" node is optional; columns are by default not required.

Like any good specification format, it can specify itself:

    &node /.+/
        repeatable
        any:
            # where 'one:' is used, it is required that one
            # (and only one) child is specified in the configuration file
            # the child can match any of the children
            # of the 'one:' node, matching them in the order they were specified
            one:
                @node

            # where 'maybe_one:' is used, one or no
            # child may be specified in the configuration file
            # so only one can be specified but this
            # also allows no child to be specified
            maybe_one:
                @node

            # any means that any child may be included
            # or omitted, unless they are tagged with "required" in which case
            # they must be included.
            any:
                @node

            # note that the above nodes can be used in conjunction with one another, eg.
            # to specify a group of nodes where
            # only one can be included AND have optional parameters

            # this cannot be used on wildcard identifiers, ie. * or **
            # this is the default child node (or nodes)
            # if no node has been specified
            default:
                @node

            # by default, a node is unique (not repeatable) and not required.
            repeatable
            required

    document
        any:
            @node

So, the only thing that would be retained from the original specification would be the regex syntax /regexp/.

The wildcard character behaviour could be emulated, I think, using the following snippet:

    &* /.+/
    &**
        @* (repeatable)
            @* (repeatable)

Then any use of @* would indicate any characters, and any use of @** would indicate any graph.

When you parse a file in your program, you could supply a specification for it, and if the file doesn't match the specification an error could be thrown.

Let me know your thoughts; I'm going to try to implement this behaviour in PHP, as it suits my project.

Cheers,
Stewart