ogdl-core Mailing List for Ordered Graph Data Language (OGDL)
Brought to you by:
rveen
| Year | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2003 |  |  |  |  |  |  |  |  | 1 |  | 1 |  |
| 2004 |  | 105 | 74 | 55 | 9 | 6 |  |  |  |  | 69 | 3 |
| 2005 | 7 | 4 | 26 | 72 |  | 18 |  |  |  |  |  | 36 |
| 2006 | 13 |  |  |  |  |  |  | 1 |  |  |  |  |
| 2007 | 6 |  |  |  |  |  |  |  |  |  |  | 1 |
| 2008 |  |  |  |  |  | 1 |  |  |  |  |  |  |
| 2011 |  |  |  |  |  |  |  |  | 1 |  |  |  |
| 2014 | 1 |  |  |  |  |  |  |  |  |  |  |  |
| 2015 |  | 4 |  |  |  |  |  |  |  |  |  |  |
|
From: Rolf V. <rol...@gm...> - 2015-02-04 10:31:41
|
Hi, Stewart.
What I mean is that the parentheses themselves are lost, as you see in
your example.
And then there is the fact that nodes after parentheses are not
allowed, as you correctly point out. The reason for that is that a
parenthesized group can contain a tree, and there is no clean and clear
way to connect the nodes in that tree to the node that comes after the
closing parenthesis.
So my advice is that you don't use parentheses in the syntax of your
schema language, because they are used in the underlying language
(OGDL in this case). In fact, you could begin thinking in canonical
OGDL, the simplest way of writing it: each node on exactly 1 line:
document
any:
@identifier
repeatable:
any:
@identifier
repeatable:
any:
And after that, see how it behaves in compact form (using parentheses
and several nodes on one line).
There is one unclear behavior that has been debated but is not
clarified yet. Until recently, these two fragments were equivalent:
a b
c
and
a
b
c
What has been proposed (and implemented in the experimental Go version) is that
a b
c
is equivalent to
a
b
c
I'm not convinced by either of them.
Cheers,
Rolf
On Tue, Feb 3, 2015 at 11:24 PM, Stewart B <st...@gm...> wrote:
> Hello Rolf,
>
> I'm working on a parser, but I don't expect to release it as it only parses
> a subset of the specification so far. If it gets to the point where I am
> happy with the reliability of the code, I'll release it - but like all my
> projects, it's just been hacked together.
>
> How come parentheses don't appear in the final nodes? Does that mean they
> cannot appear in final nodes, or that they shouldn't according to best
> practice? My parser interprets the following snippet:
>
> ----------
> hello (world) child
> subchild
>
> parent (child 1)
> child 2
> ----------
>
> as:
>
> --------
> hello
> world
> child
> subchild
>
> parent
> child
> 1
> child
> 2
> ---------
>
> but I just realised that your specification doesn't allow nodes after
> parentheses.
>
> Cheers,
> Stewart
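
The open question in this message, where an indented node attaches when the line above holds several nodes, can be made concrete with a small sketch. This is not the OGDL reference parser: `parse`, `canonical` and the `attach` parameter are hypothetical names, and the sketch assumes the two readings under debate differ in whether the indented node becomes a child of the first or of the last node on the preceding line. It only handles space-separated words and indentation (no quotes, commas or parentheses).

```python
def parse(text, attach="last"):
    """Parse a tiny OGDL-like subset into nested [name, children] lists.

    attach="first": an indented line hangs off the first node of the line above.
    attach="last":  an indented line hangs off the last node of the line above.
    Both modes are assumptions made for illustration only.
    """
    root = ["", []]
    # Stack of (indent, node that more deeply indented lines attach to).
    stack = [(-1, root)]
    for line in text.splitlines():
        if not line.strip():
            continue
        indent = len(line) - len(line.lstrip(" "))
        # Drop lines that are indented at least as much as this one.
        while stack[-1][0] >= indent:
            stack.pop()
        parent = stack[-1][1]
        first = None
        for word in line.split():
            node = [word, []]
            parent[1].append(node)  # each word is a child of the previous word
            parent = node
            if first is None:
                first = node
        stack.append((indent, first if attach == "first" else parent))
    return root[1]


def canonical(nodes, depth=0):
    """Render a tree with one node per line, two spaces per level."""
    lines = []
    for name, children in nodes:
        lines.append("  " * depth + name)
        lines.extend(canonical(children, depth + 1))
    return lines


for mode in ("first", "last"):
    print(mode, canonical(parse("a b\n  c", attach=mode)))
```

With `attach="first"` the node `c` ends up as a sibling of `b` under `a`; with `attach="last"` it ends up as a child of `b`.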
|
|
From: Stewart B <st...@gm...> - 2015-02-03 22:24:59
|
Hello Rolf,
I'm working on a parser, but I don't expect to release it as it only parses
a subset of the specification so far. If it gets to the point where I am
happy with the reliability of the code, I'll release it - but like all my
projects, it's just been hacked together.
How come parentheses don't appear in the final nodes? Does that mean they
cannot appear in final nodes, or that they shouldn't according to best
practice? My parser interprets the following snippet:
----------
hello (world) child
subchild
parent (child 1)
child 2
----------
as:
--------
hello
world
child
subchild
parent
child
1
child
2
---------
but I just realised that your specification doesn't allow nodes after
parentheses.
Cheers,
Stewart
|
|
From: Rolf V. <rol...@gm...> - 2015-02-03 13:23:44
|
Hi, Stewart.

The Schema spec on the OGDL website is not field tested, and thus is not a
fixed thing. Maybe several domain specific schema languages instead of one
official one is a better solution. Thus, I encourage you to go ahead and
implement your proposal.

For the moment I have only one comment, and that is that parentheses don't
appear in the final nodes, because they are used to structure the text.

If you are working with PHP, does this mean that you have implemented an
OGDL parser for it?

Kind regards,
Rolf.

On Tue, Feb 3, 2015 at 2:56 AM, Stewart B <st...@gm...> wrote:
> Hi guys,
>
> I've never used mailing lists before, so I apologise in advance if I've sent
> this to the wrong place.
>
> I like the OGDL specification; it's simple and versatile. I'm using it in a
> project, however, I have a different idea on how the schema should be
> implemented. An example config file specifying the columns in a SQL
> database:
>
> user
>     id
>         type integer
>         required
>
>     name
>         type string
>         required
>
>     friends
>         type array( reference( user ) )
>
> The "user" is the table name, and the "id", "name" and "friends" are the
> columns in the table.
>
> As a specification for this file, you could use the following file:
>
> # where @identifier is used, it is replaced by /^[a-zA-Z0-9_]*$/
> &identifier /^[a-zA-Z0-9_]*$/
>
> # note that this one reference has many children, and all of its children
> # become children of the "type" node under the column node
> &types
>     reference one: @types
>     array one: @types
>     int
>     string
>     bool
>     timestamp
>
> document any:
>
>     # the node below describes a table
>     @identifier (repeatable) any:
>
>         # the node below describes a column
>         @identifier (repeatable) any:
>             type (required) one: @types
>             required
>
> This also introduces references to nodes; so the node &identifier isn't part
> of the document, but it can be used later by writing @identifier and having
> the contents of the reference replace the "@identifier" text. It also has a
> circular reference, which should be legitimate.
>
> Anything under the node "document" describes the structure of the document.
> Each node can have options, e.g. it might be "required" or it might be
> "repeatable", and each node can have children. These are indicated by the
> "any:", "maybe_one:" and "one:" nodes. So, the document can contain any nodes
> which match the regex specified by @identifier, and these nodes are
> "repeatable" and can contain "any:" child nodes (which specify the columns).
> These child nodes themselves are repeatable and can contain any of the
> column configuration options, of which for this example there are only two:
> "type" and "required". The "type" option is required and it can only have
> one child node, which can be any of "@types". The "required" node is
> optional; columns are by default not required.
>
> Like any good specification format, it can specify itself:
>
> &node
>     /.+/
>         repeatable
>
>         any:
>             # where 'one:' is used, it is required that one
>             # (and only one) child is specified in the configuration file
>             # the child can match any of the children
>             # of the 'one:' node, matching them in the order they were specified
>             one: @node
>
>             # where 'maybe_one:' is used, one or no
>             # child may be specified in the configuration file
>             # so only one can be specified but this
>             # also allows no child to be specified
>             maybe_one: @node
>
>             # any means that any child may be included
>             # or omitted, unless they are tagged with "required" in which case
>             # they must be included.
>             any: @node
>
>             # note that the above nodes can be used in conjunction with one another, e.g.
>             # to specify a group of nodes where
>             # only one can be included AND have optional parameters
>
>             # this cannot be used on wildcard identifiers, i.e. * or **
>             # this is the default child node (or nodes)
>             # if no node has been specified
>             default: @node
>
>             # by default, a node is unique (not repeatable) and not required.
>             repeatable
>             required
>
> document any: @node
>
> So, the only thing that would be retained from the original specification
> would be the regex syntax /regexp/
> The wildcard character behaviour could be emulated, I think, using the
> following snippet:
>
> &* /.+/
>
> &** @* (repeatable)
>     @* (repeatable)
>
> Then any use of @* would indicate any characters, and any use of @** would
> indicate any graph.
>
> When you parse a file in your program, you could specify a specification for
> it, and then if the file doesn't match the specification an error could be
> thrown.
>
> Let me know your thoughts; I'm going to try and implement this behavior in
> PHP as it suits my project.
>
> Cheers,
> Stewart
|
|
From: Stewart B <st...@gm...> - 2015-02-03 01:56:56
|
Hi guys,

I've never used mailing lists before, so I apologise in advance if I've sent
this to the wrong place.

I like the OGDL specification; it's simple and versatile. I'm using it in a
project, however, I have a different idea on how the schema should be
implemented. An example config file specifying the columns in a SQL
database:

user
    id
        type integer
        required

    name
        type string
        required

    friends
        type array( reference( user ) )

The "user" is the table name, and the "id", "name" and "friends" are the
columns in the table.

As a specification for this file, you could use the following file:

# where @identifier is used, it is replaced by /^[a-zA-Z0-9_]*$/
&identifier /^[a-zA-Z0-9_]*$/

# note that this one reference has many children, and all of its children become children
# of the "type" node under the column node
&types
    reference one: @types
    array one: @types
    int
    string
    bool
    timestamp

document any:

    # the node below describes a table
    @identifier (repeatable) any:

        # the node below describes a column
        @identifier (repeatable) any:
            type (required) one: @types
            required

This also introduces references to nodes; so the node &identifier isn't part
of the document, but it can be used later by writing @identifier and having
the contents of the reference replace the "@identifier" text. It also has a
circular reference, which should be legitimate.

Anything under the node "document" describes the structure of the document.
Each node can have options, e.g. it might be "required" or it might be
"repeatable", and each node can have children. These are indicated by the
"any:", "maybe_one:" and "one:" nodes. So, the document can contain any nodes
which match the regex specified by @identifier, and these nodes are
"repeatable" and can contain "any:" child nodes (which specify the columns).
These child nodes themselves are repeatable and can contain any of the
column configuration options, of which for this example there are only two:
"type" and "required". The "type" option is required and it can only have
one child node, which can be any of "@types". The "required" node is
optional; columns are by default not required.

Like any good specification format, it can specify itself:

&node
    /.+/
        repeatable

        any:
            # where 'one:' is used, it is required that one
            # (and only one) child is specified in the configuration file
            # the child can match any of the children
            # of the 'one:' node, matching them in the order they were specified
            one: @node

            # where 'maybe_one:' is used, one or no
            # child may be specified in the configuration file
            # so only one can be specified but this
            # also allows no child to be specified
            maybe_one: @node

            # any means that any child may be included
            # or omitted, unless they are tagged with "required" in which case
            # they must be included.
            any: @node

            # note that the above nodes can be used in conjunction with one another, e.g.
            # to specify a group of nodes where
            # only one can be included AND have optional parameters

            # this cannot be used on wildcard identifiers, i.e. * or **
            # this is the default child node (or nodes)
            # if no node has been specified
            default: @node

            # by default, a node is unique (not repeatable) and not required.
            repeatable
            required

document any: @node

So, the only thing that would be retained from the original specification
would be the regex syntax /regexp/
The wildcard character behaviour could be emulated, I think, using the
following snippet:

&* /.+/

&** @* (repeatable)
    @* (repeatable)

Then any use of @* would indicate any characters, and any use of @** would
indicate any graph.

When you parse a file in your program, you could specify a specification for
it, and then if the file doesn't match the specification an error could be
thrown.

Let me know your thoughts; I'm going to try and implement this behavior in
PHP as it suits my project.

Cheers,
Stewart
|
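
To illustrate the "required" and "repeatable" options described in this proposal, here is a minimal validation sketch. It is not Stewart's PHP implementation: `Rule` and `validate` are hypothetical names, the tree is assumed to be already parsed into nested `[name, children]` lists, and only the `any:` behaviour is modelled (no `one:`, `maybe_one:`, `default:` or `&`/`@` references). The type names are taken from the `&types` list.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Rule:
    pattern: str                 # regex the node name must match completely
    required: bool = False       # node must appear at least once
    repeatable: bool = False     # node may appear more than once
    children: list = field(default_factory=list)  # rules for the child nodes


def validate(nodes, rules, path="document"):
    """Return a list of error strings; an empty list means the tree conforms."""
    errors = []
    counts = [0] * len(rules)
    for name, kids in nodes:
        for i, rule in enumerate(rules):
            if re.fullmatch(rule.pattern, name):
                counts[i] += 1
                errors += validate(kids, rule.children, f"{path}/{name}")
                break
        else:
            errors.append(f"{path}: unexpected node '{name}'")
    for rule, seen in zip(rules, counts):
        if rule.required and seen == 0:
            errors.append(f"{path}: missing required node /{rule.pattern}/")
        if not rule.repeatable and seen > 1:
            errors.append(f"{path}: node /{rule.pattern}/ repeated {seen} times")
    return errors


IDENT = r"[a-zA-Z0-9_]*"
column_options = [
    Rule("type", required=True,
         children=[Rule(r"int|string|bool|timestamp", required=True)]),
    Rule("required"),
]
schema = [Rule(IDENT, repeatable=True,
               children=[Rule(IDENT, repeatable=True, children=column_options)])]

good = [["user", [["id", [["type", [["int", []]]], ["required", []]]]]]]
bad = [["user", [["id", [["required", []]]]]]]

print(validate(good, schema))
print(validate(bad, schema))
```

The first call reports no errors; the second reports that the `id` column lacks its required `type` node.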
|
From: Rolf V. <rol...@gm...> - 2014-01-22 12:15:23
|
Hi, all.

I recently announced a library for handling OGDL in the Go language. Some
minor changes to the specs have also been made at the same time, none of
them incompatible with earlier versions, except for the cycle production,
which I'm not aware anyone is using. The gpath command line version in Go is
far better than the old C version.

See http://godoc.org/github.com/rveen/ogdl for the documentation.
Code is here: https://github.com/rveen/ogdl

Best regards,
Rolf
|
|
From: Rolf V. <rol...@gm...> - 2011-09-19 15:44:16
|
Hi, all. I was never happy with the level 2 grammar (the part that
allows cycles). It was too complex from my point of view, and difficult to
implement. I've updated the spec with a small change in that section, so that
only one syntax for cycles remains:
a
b
c
#{2
Here, 'c' links to the node 2 lines above. Since in the canonical form of OGDL
there is no more than one node on each line, the number is a coordinate in
the vertical direction and uniquely identifies a node. Negative numbers point
to nodes after the current node.
Kind regards
Rolf.
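
The line-offset rule described in this message can be sketched as follows. This is only an illustration, not code from the spec or from any OGDL library: `resolve_links` is a hypothetical name, and the sketch assumes that the `#{n` marker sits on the line directly below the node that owns it and that the offset is counted from that owner.

```python
def resolve_links(lines):
    """Return (owner, target) name pairs for every '#{n' marker.

    `lines` holds canonical OGDL, one node per line.
    """
    names = [line.strip() for line in lines]
    links = []
    for i, name in enumerate(names):
        if not name.startswith("#{"):
            continue
        offset = int(name[2:].rstrip("}"))
        owner = i - 1            # assumption: marker is directly below its owner
        target = owner - offset  # positive counts upward, negative downward
        if owner < 0 or not 0 <= target < len(names):
            raise ValueError(f"line {i}: link points outside the document")
        links.append((names[owner], names[target]))
    return links


print(resolve_links(["a", "  b", "    c", "      #{2"]))  # [('c', 'a')]
```

Because the offset is a plain line count, the lookup needs no knowledge of the tree structure, which is what makes the single remaining syntax simple to implement.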
|
|
From: Bennett T. <be...@ra...> - 2007-01-27 18:13:02
|
2007-01-27T16:02:48 Lluís Batlle i Rossell:
> What are the people here using OGDL for the most? Really like RPC,
> or for storing data?

Different people are using it for different things. I'm using the tiniest
possible subset of OGDL as the language for software packaging spec files in
Bent Linux. A typical Bent Package Manager (bpm) spec looks like this:

pkg zile-2.2.27
url http://superb-east.dl.sourceforge.net/sourceforge/zile/zile-2.2.27.tar.gz
build \
    tar xf zile-2.2.27.tar.gz
    cd zile-2.2.27
    CC=gcc CFLAGS=-Os LDFLAGS='-static -s' ./configure --prefix=/usr \
        --mandir=/usr/share/man --infodir=/usr/share/info
    touch doc/AUTODOC
    make
    mkdir -p $BPM_ROOT/usr/share/info $BPM_ROOT/usr/man/man1 $BPM_ROOT/usr/bin
    make DESTDIR=$BPM_ROOT install
    rm -f $BPM_ROOT/usr/share/info/dir
isa editor emacs

-Bennett
|
|
From: <vir...@gm...> - 2007-01-27 16:02:54
|
Hi,

I see (in the repository) there isn't any C++ implementation of an OGDL
parser. I'd like to have one :)
I like the Java approaches between the parser and file reader objects.

What are the people here using OGDL for the most? Really like RPC, or for
storing data?

(Maybe I should get into ogdl-users or something like that :)
|
|
From: Rolf V. <rol...@er...> - 2007-01-10 10:54:44
|
Lluís Batlle wrote:
> Well, they already are in the svn repository, in the 'java' directory,
> but not in the 'j2me'.

Ok, I thought you made some modifications. SVN updated now.

> (So OGDL was born in the lands of Sagunt, eh? Impressive :)

:-)

Regards.
Rolf.
|
|
From: <vir...@gm...> - 2007-01-10 10:08:54
|
Well, they already are in the svn repository, in the 'java' directory, but
not in the 'j2me'. You'd better do an svn copy (I think the operation
exists). iirc, the files are:

Characters.java
OgdlParser.java
SyntaxException.java

They should be placed in the src/ogdl directory of the j2me package.

(So OGDL was born in the lands of Sagunt, eh? Impressive :)

Sincerely,
Lluís.

2007/1/10, Rolf Veen <rol...@er...>:
> Lluís Batlle wrote:
>
> > I started using OGDL in a J2ME application, and I found that there is
> > only a OgdlBinary parser there. I took the code for OgdlParser and
> > Characters from the "java" version, and got the text parser working in
> > CLDP.
> >
> > Maybe someone wants to check those changes into svn...
>
> Yes, of course. Just send the files to the list or to myself, and
> I'll upload them.
>
> Regards.
> Rolf.
|
|
From: Rolf V. <rol...@er...> - 2007-01-10 10:02:26
|
Lluís Batlle wrote:
> I started using OGDL in a J2ME application, and I found that there is
> only a OgdlBinary parser there. I took the code for OgdlParser and
> Characters from the "java" version, and got the text parser working in
> CLDP.
>
> Maybe someone wants to check those changes into svn...

Yes, of course. Just send the files to the list or to myself, and I'll
upload them.

Regards.
Rolf.
|
|
From: <vir...@gm...> - 2007-01-10 09:18:53
|
Hi,

I'm just new in OGDL, but I find it's really great.

I started using OGDL in a J2ME application, and I found that there is only a
OgdlBinary parser there. I took the code for OgdlParser and Characters from
the "java" version, and got the text parser working in CLDP.

Maybe someone wants to check those changes into svn...

(I've not really *strongly* tried the text parser - I only have a simple
OGDL text file in my JAR, and I can load the graph without any problems)

Regards,
Lluís.
|
|
From: Rolf V. <rol...@er...> - 2006-08-14 16:01:49
|
Hi, all.

It's been a long time, but I'm still here :-). I've just set up the
subversion repository for the project, meaning that CVS is from now on
deprecated.

New additions to the code are those related to the binary version of OGDL
and an OGDL/RF protocol implementation on top of TCP. There are RFClient and
RFServer implementations for each language, making it possible to bridge
language barriers with very low overhead (compared to SOAP, for example).

There is also a Java Microedition version that is able to load into a Java
device and remotely connect it to an OGDL/RF server to interchange (binary)
OGDL objects.

The java/ and j2me/ are Eclipse projects (the second one needs EclipseME).

Cheers.
Rolf.

PS. Didge: your code is preserved as is, and available in java-digde/.
|
|
From: Rolf V. <rol...@er...> - 2006-01-13 08:12:45
|
didge wrote:
> After 10 years of Java development, I haven't yet seen a practical benefit
> to starting the package hierarchy off with a TLD. Oracle doesn't do it and
> even Sun does not follow the practice consistently. I also wanted to make
> the .NET and Java versions as consistent as possible and .NET namespaces do
> not use TLDs as a practice.

Good to hear. I'll do the same.

>> I think that meta information (#?) can be safely scrapped from spec.
>
> I think that there is some benefit to some sort of parser instruction that
> is not included as part of the data for version identification, if nothing
> else. Let's pick it up in another thread.

Ok. Let's keep #? and discuss it further.

--
Rolf
|
|
From: Rolf V. <rol...@er...> - 2006-01-13 08:05:36
|
Patrick Doane wrote:
> Two comments on this:
>
> - It might be nice to support "random access" of the file. This could
> be implemented by including more size information. For example, each
> node could contain an offset to the next sibling and the next child.

The binary version is intended as a file format as well as a message
protocol, and this would add complexity at the emitter side and also add
quite a number of extra bytes. I'm thinking, for example, of using it to
exchange information between a small device (a measuring instrument, an
industrial device, etc.) and a host, through USB, CAN or whatever.

> - What is the reason for distinguishing between text-node and
> binary-node? Is it to save space by not including length data?

We need to identify text nodes without extra (external) type or schema
information because they are used to name other nodes. We would not be able
to do much with only untyped binary nodes. Normal OGDL is pure text, so it
seems natural for binary OGDL to be a superset of that.

--
Rolf.
|
|
From: didge <di...@fo...> - 2006-01-12 17:19:35
|
Rolf wrote:
>
> didge wrote:
> > Folks,
> >
> > I've just committed a number of new files to the project on a new branch
> > called DIDGE_001.
> > ...
>
> Really great work!

Thanks!

> > 1. Package names.
> > I've simplified the package names so as to place the most important classes
> > and interfaces at the top of the hierarchy. I believe this is more natural
> > to Java programmers and is more similar to the organization of similar open
> > source projects.
>
> You have used ogdl.* instead of org.ogdl.*. Is this more or less
> accepted now? Sun wanted the full domain name.

After 10 years of Java development, I haven't yet seen a practical benefit
to starting the package hierarchy off with a TLD. Oracle doesn't do it and
even Sun does not follow the practice consistently. I also wanted to make
the .NET and Java versions as consistent as possible and .NET namespaces do
not use TLDs as a practice.

> > 5. Not implemented
> > org.ogdl.reference.OgdlPullParser does not yet implement meta information or
> > references because I have a few questions about them.
>
> I think that meta information (#?) can be safely scrapped from spec.

I think that there is some benefit to some sort of parser instruction that
is not included as part of the data for version identification, if nothing
else. Let's pick it up in another thread.

> The level 2 productions are indeed unimplemented in the current
> code, and we have no experience with them. They are analog to
> symbolic links.
>
> > 1. I hope that this spurs plenty of discussion.
>
> I think this is a step forward. What we can discuss is a way to merge
> your code into the main branch (while respecting your branch).
>
> --
> Rolf.

It would be great to get some more input from other Java users. Also, I'll
have the C# version ready shortly and maybe we'll get some interest from
that camp, though it doesn't sound like there are many .NET users involved
with OGDL right now.

Regards,
didge
|
|
From: Patrick D. <pa...@wa...> - 2006-01-12 16:15:21
|
Two comments on this:

- It might be nice to support "random access" of the file. This could be
  implemented by including more size information. For example, each node
  could contain an offset to the next sibling and the next child.
- What is the reason for distinguishing between text-node and binary-node?
  Is it to save space by not including length data?

Rolf Veen wrote:
> Hi, all.
>
> I've uploaded a new spec of a binary version of
> OGDL, and an associated 'Remote functions' spec.
>
> The binary format is intended to be exchanged
> between machines, as it allows to intermix normal
> text nodes with images, etc.
>
> The RF spec defines a way to expose functions as
> remote services, by making the call and the response
> be OGDL binary objects.
>
> --
> Rolf.
|
|
From: Patrick D. <pa...@wa...> - 2006-01-12 16:08:10
|
Rolf Veen wrote:
>> 5. Not implemented
>> org.ogdl.reference.OgdlPullParser does not yet implement meta
>> information or references because I have a few questions about them.
>
> I think that meta information (#?) can be safely scrapped from spec.

We use the encoding and schema keys in our documents. Obviously it's not a
problem to remove from the spec - it'll be interpreted as a comment by
another parser.
|
|
From: Rolf V. <rol...@er...> - 2006-01-12 08:30:48
|
didge wrote:
> Folks,
>
> I've just committed a number of new files to the project on a new branch
> called DIDGE_001. As stated previously, the purpose of this branch is to
> avoid confusion with the current public release, especially since the new
> files represent experimental thinking that the whole team may not want as a
> whole.

Really great work!

> 1. Package names.
> I've simplified the package names so as to place the most important classes
> and interfaces at the top of the hierarchy. I believe this is more natural
> to Java programmers and is more similar to the organization of similar open
> source projects.

You have used ogdl.* instead of org.ogdl.*. Is this more or less accepted
now? Sun wanted the full domain name.

> 5. Not implemented
> org.ogdl.reference.OgdlPullParser does not yet implement meta information or
> references because I have a few questions about them.

I think that meta information (#?) can be safely scrapped from spec.

The level 2 productions are indeed unimplemented in the current code, and we
have no experience with them. They are analog to symbolic links.

> 1. I hope that this spurs plenty of discussion.

I think this is a step forward. What we can discuss is a way to merge your
code into the main branch (while respecting your branch).

--
Rolf.
|
|
From: Rolf V. <rol...@er...> - 2006-01-12 07:44:11
|
didge wrote:
> For me, this is a robustness issue. Being tolerant is great, but not at the
> cost of robustness. But, being new here, I'm not sure exactly what
> being tolerant means to you.
>
> Being tolerant, to me, means that users are free to use white space,
> indentation, grouping, quoting or blocking as it makes sense for their needs
> and tastes.

For me tolerance and robustness in this case are related. I prefer a parser
that doesn't exit in the mentioned cases. That doesn't exclude a 'strict'
option, as you propose (in your case the default).

> Disallowing extra punctuation in my humble opinion does not make OGDL any
> less tolerant according to my definition and makes OGDL more robust because
> OGDL without extra punctuation is easier to read and can't mask any
> potential errors.

Extra commas are one thing; () and ((a)) are another. I think the latter are
part of the language, just as (((1+2))) is in a math expression.

Anyway, let the parsers evolve and we'll see what is more coherent.

--
Rolf.
|
|
From: didge <di...@fo...> - 2006-01-11 23:18:41
|
I just completed checking in new changes to the DIDGE_001 branch. I'm
currently working on a port to C# (almost complete) and I've been
simultaneously changing the Java version to make the two more consistent,
where possible.

Regards,
didge
|
|
From: didge <di...@fo...> - 2006-01-05 22:04:44
|
Folks,
I've just committed a number of new files to the project on a new branch
called DIDGE_001. As stated previously, the purpose of this branch is to
avoid confusion with the current public release, especially since the new
files represent experimental thinking that the whole team may not want as a
whole.
My main goals with this branch were as follows:
1. Learn OGDL.
2. Create a pull parser that could reproduce the input stream exactly.
3. Provide a serializer for convenient generation of OGDL streams.
4. Clarify some of the ambiguities in the grammar.
5. Experiment with a simpler package structure.
To access these files, I suggest that you create a new working directory and
then perform a fresh checkout using:
cvs checkout -r DIDGE_001 java
I've organized the files in this branch as an OGDL Java SDK. Included are
ant build scripts, examples, documentation, and of course the source. After
you check it out, you can use Ant to build it by simply executing 'ant' in
the new working directory which will build everything including javadocs.
I did not attempt to bring forward everything that currently exists for Java
into this new branch. Please don't be alarmed by this.
What's New and Different
1. Package names.
I've simplified the package names so as to place the most important classes
and interfaces at the top of the hierarchy. I believe this is more natural
to Java programmers and is more similar to the organization of similar open
source projects.
2. Grammar
The grammar is mostly unchanged from a practical standpoint. The grammar is
listed in doc/grammar.txt.
Changes include:
1. Comments are not treated as data.
2. Extraneous commas and parens are not allowed, but a single
trailing comma or empty group is.
3. Parsers
OgdlPullParser
This new class parses OGDL files using the pull style discussed in previous
posts. There is a standard interface, org.ogdl.OgdlPullParser and a
reference implementation, org.ogdl.reference.OgdlPullParser.
The purpose of the separate interface is to encourage alternative but
interchangeable implementations.
org.ogdl.reference.OgdlPullParser is also executable. It parses either
stdin or a list of OGDL files specified on the command line. It then
outputs the result to stdout, stopping on the first syntax error. If there
is a syntax error, it reports the line number, column number, cause, and the
current line with a pointer to the error. For example,
$ java -cp lib/ogdl.jar org.ogdl.reference.OgdlPullParser
a\\
a
<stdin>: Unexpected token @ 0:3
a\\
^
OgdlPushParser
This class is similar to the existing OgdlParser, but is implemented using
the OgdlPullParser to ensure consistent behavior. Like OgdlPullParser,
there is both a standard interface and reference implementation provided.
Like org.ogdl.reference.OgdlPullParser, org.ogdl.reference.OgdlPushParser is
also executable and behaves exactly the same.
4. Serializer
org.ogdl.reference.OgdlSerializer supports convenience methods for streaming
OGDL.
5. Not implemented
org.ogdl.reference.OgdlPullParser does not yet implement meta information or
references because I have a few questions about them.
Also not implemented is a flag to the parser to allow it to accept extra
punctuation as discussed in previous posts.
6. Documentation
There is an introduction to the SDK and the javadocs.
Next Steps:
1. I hope that this spurs plenty of discussion.
2. I plan to start on a C# version shortly.
3. I need to ask some questions regarding meta information, references, and
quotes.
Regards,
didge
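
The relationship between the pull and push parsers described in this announcement, with the push parser built on top of the pull parser so both behave the same, can be sketched in a few lines. This is not the code on the DIDGE_001 branch: `pull_events` and `push_parse` are hypothetical names, the event names are invented, and the input is assumed to be canonical OGDL with one node per line.

```python
def pull_events(text):
    """Yield ("open", name) and ("close", name) events, one node per line.

    The caller pulls events one at a time by iterating over the generator.
    """
    stack = []  # (indent, name) of nodes that are still open
    for line in text.splitlines():
        name = line.strip()
        if not name:
            continue
        indent = len(line) - len(line.lstrip(" "))
        # Close every node that is not an ancestor of the current line.
        while stack and stack[-1][0] >= indent:
            yield ("close", stack.pop()[1])
        stack.append((indent, name))
        yield ("open", name)
    while stack:
        yield ("close", stack.pop()[1])


def push_parse(text, on_open, on_close):
    """Push-style front end: drives callbacks from the same event stream."""
    for kind, name in pull_events(text):
        (on_open if kind == "open" else on_close)(name)


for event in pull_events("a\n  b\nc"):
    print(event)
```

Since `push_parse` only forwards what `pull_events` produces, any fix to the pull side is automatically shared by the push side, which is the consistency argument made above.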
|
|
From: didge <di...@fo...> - 2006-01-03 21:59:21
|
For me, this is a robustness issue. Being tolerant is great, but not at
the cost of robustness. But, being new here, I'm not sure exactly what
being tolerant means to you.

Being tolerant, to me, means that users are free to use white space,
indentation, grouping, quoting or blocking as it makes sense for their
needs and tastes. But I'm still interested in hearing anyone's experience
in which extra punctuation was being produced (and why), because I think
that since extra punctuation has no useful purpose, it can only be the
product of sloppiness, laziness or errors on the part of the producer.

Disallowing extra punctuation, in my humble opinion, does not make OGDL
any less tolerant according to my definition, and it makes OGDL more
robust, because OGDL without extra punctuation is easier to read and
can't mask any potential errors.

I think I've made my point regarding extra punctuation, so I won't
belabor it anymore. But please read my other responses below to points
you make, if you have the time :)

Regards,
didge

Rolf wrote:
>
> didge wrote:
> > Allowing for a single trailing comma and empty end of group, each of
> > these examples is suspicious because the extra punctuation suggests
> > that there is something missing (denoted by the '?'):
> >
> > ?,a
> >
> > a,?,
> >
> > ?(a)
> >
> > a(?(b))
> >
> > a(?())
> >
> > a,?()
> >
> > Are there any examples of systems or individuals that currently
> > author OGDL with extra punctuation and why?
>
> I want the OGDL parser to be robust and tolerant, that is the main idea.
>
> Case by case:
>
> 1) ,a +1 to signal this as an error
>
> 2) a,, -0. I don't care too much, but tolerance is for me desirable.
> Just be sure that this will not round trip, no empty nodes
> will be generated.

Shouldn't a round trip reproduce the input as accurately as possible? In
other words, if the input contains case 2, should I expect this to be
preserved in a round trip? What is the definition of a round trip?

> 3) () -1. Empty group should be allowed. No nodes are generated.
> 4) (a) -1. Equivalent to scalar 'a'.
> 5) ((a)) -1. Equivalent to scalar 'a'.
>
> One thing is that we disallow nodes after groups, and another is to
> limit groups to more than one element.
>
> > What would be the downside of not allowing extra punctuation?
>
> Programs that generate OGDL do not need to care about these special
> cases, which can occur, especially groups with fewer than 2 elements.

To clarify, a group with one element is fine by me. The problem with
cases 3, 4 and 5 is that the grouping suggests that parent nodes for each
of the groups are missing.

> > For my needs, I want to treat extra punctuation as possible errors in
> > order to eliminate real errors in my systems as quickly as possible.
> > Therefore, my parser will by default disallow extra punctuation (and
> > probably trailing commas and empty groups, too), but may include an
> > option to allow them if desired.
>
> Totally acceptable of course, if the option remains. What you require
> from OGDL streams could be considered as a stricter and unambiguous
> syntax, canonical in some sense.
>

What I want is to clarify the usage of extraneous commas and parens such
that readers and parsers can read some OGDL, see an extra comma and
immediately know that it is an error and not just sloppy data.

I'm not sure what canonical means, however. I do think there is a form
that an OGDL document can have that might be called canonical form
(though I call it 'normal form', borrowing from RDBMS terminology), in
which the stream is composed of nodes, each of which is separated by line
breaks and the minimum number of space characters necessary to determine
structure. So, for example, the normal form of:

    a (b)

is

    a
     b
|
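The 'normal form' described in the message above can be illustrated with a small sketch. It assumes that one space of indentation per nesting level is the "minimum number of space characters" meant there, and it starts from a hand-built tree rather than parsed OGDL. The function name `to_normal_form` and the `(name, children)` tuple layout are made up for this example and do not come from any OGDL implementation.

```python
def to_normal_form(nodes, depth=0):
    """Render (name, children) pairs as one node per line.

    Assumption: one space of indentation per nesting level is enough
    to show the structure.
    """
    lines = []
    for name, children in nodes:
        lines.append(" " * depth + name)
        lines.extend(to_normal_form(children, depth + 1))
    return lines


# Hand-built tree for the fragment `a (b)`: node a with a single child b.
tree = [("a", [("b", [])])]
print("\n".join(to_normal_form(tree)))
```

Because the grouping punctuation never reaches the tree, any inline spelling of the same graph would render to the same lines under this sketch.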
|
From: Rolf V. <rol...@er...> - 2006-01-03 18:26:39
|
didge wrote:
> Allowing for a single trailing comma and empty end of group, each of these
> examples is suspicious because the extra punctuation suggests that there is
> something missing (denoted by the '?'):
>
> ?,a
>
> a,?,
>
> ?(a)
>
> a(?(b))
>
> a(?())
>
> a,?()
>
> Are there any examples of systems or individuals that currently author OGDL
> with extra punctuation and why?
I want the OGDL parser to be robust and tolerant, that is the main idea.
Case by case:
1) ,a +1 to signal this as an error
2) a,, -0. I don't care too much, but tolerance is for me desirable.
Just be sure that this will not round trip, no empty nodes
will be generated.
3) () -1. Empty group should be allowed. No nodes are generated.
4) (a) -1. Equivalent to scalar 'a'.
5) ((a)) -1. Equivalent to scalar 'a'.
One thing is that we disallow nodes after groups, and another is to
limit groups to more than one element.
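A minimal sketch of the tolerant reading in the case list above, covering only single-line fragments made of plain words, commas and parentheses. The names `parse_tolerant`, `_items` and `_group` and the `(name, children)` result shape are assumptions made for this example, not part of any OGDL parser. Quoting, multi-line input and space-separated nodes are left out, and the rule against nodes after groups is not enforced.

```python
import re

# Words, commas and parentheses only; quoting is ignored in this sketch.
TOKEN = re.compile(r"[(),]|[^\s(),]+")


def parse_tolerant(fragment):
    """Read a one-line fragment into a list of (name, children) pairs."""
    tokens = TOKEN.findall(fragment)
    if tokens and tokens[0] == ",":
        raise ValueError("leading comma")      # case 1: signalled as an error
    nodes, pos = _items(tokens, 0)
    if pos != len(tokens):
        raise ValueError("unexpected ')'")
    return nodes


def _items(tokens, pos):
    """Collect sibling nodes until ')' or the end of the input."""
    nodes = []
    while pos < len(tokens) and tokens[pos] != ")":
        tok = tokens[pos]
        if tok == ",":
            pos += 1                           # case 2: no empty node is generated
        elif tok == "(":
            inner, pos = _group(tokens, pos)
            nodes.extend(inner)                # cases 3-5: the group leaves no trace
        else:
            pos += 1
            children = []
            if pos < len(tokens) and tokens[pos] == "(":
                children, pos = _group(tokens, pos)
            elif pos < len(tokens) and tokens[pos] not in (",", ")"):
                raise ValueError("space-separated nodes are outside this sketch")
            nodes.append((tok, children))
    return nodes, pos


def _group(tokens, pos):
    """Parse the group that opens at tokens[pos] and return its nodes."""
    inner, pos = _items(tokens, pos + 1)
    if pos == len(tokens):
        raise ValueError("missing ')'")
    return inner, pos + 1


for sample in ["a,,", "()", "(a)", "((a))", "a(b)"]:
    print(repr(sample), parse_tolerant(sample))
```

In this sketch only a comma at the very start of the fragment is rejected; every other redundant comma or group is dropped silently, which is why such input cannot round trip.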
> What would be the downside of not allowing extra punctuation?
Programs that generate OGDL do not need to care about these special cases,
which can occur, especially groups with fewer than 2 elements.
> For my needs, I want to treat extra punctuation as possible errors in order
> to eliminate real errors in my systems as quickly as possible. Therefore,
> my parser will by default disallow extra punctuation (and probably trailing
> commas and empty groups, too), but may include an option to allow them if
> desired.
Totally acceptable of course, if the option remains. What you require
from OGDL streams could be considered as a stricter and unambiguous
syntax, canonical in some sense.
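As a rough illustration of the stricter reading, the check below flags each spot where one of the '?' placeholders quoted above would be needed: a comma with no node before it, or a group with no parent node. It is only a sketch. The name `find_extra_punctuation` and the token pattern are assumptions made for this example, and it handles single-line fragments of plain words, commas and parentheses, with no quoting and no check that parentheses balance.

```python
import re

# Words, commas and parentheses only; quoting is ignored in this sketch.
TOKEN = re.compile(r"[(),]|[^\s(),]+")
PUNCTUATION = {"(", ")", ","}


def find_extra_punctuation(fragment):
    """Return (offset, message) pairs, one per spot where a '?' would go."""
    issues = []
    prev = None  # previous token, or None at the start of the fragment
    for match in TOKEN.finditer(fragment):
        tok = match.group()
        prev_is_node = prev is not None and prev not in PUNCTUATION
        if tok == "," and prev in (None, ",", "("):
            issues.append((match.start(), "comma with no node before it"))
        elif tok == "(" and not prev_is_node:
            issues.append((match.start(), "group with no parent node"))
        prev = tok
    return issues


for sample in [",a", "a,,", "(a)", "a((b))", "a(())", "a,()", "a,", "a()"]:
    print(repr(sample), find_extra_punctuation(sample))
```

A single trailing comma (`a,`) and a single trailing empty group (`a()`) pass without a report, matching the two exceptions allowed in the quoted text. A parser with an option to allow extra punctuation could simply skip this check when the option is set.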
--
Rolf.
|
|
From: didge <di...@fo...> - 2006-01-03 17:40:47
|
Rolf,

Note: I use the term 'extra punctuation' from here on to mean any
extraneous commas and groups, other than a single trailing comma or
single trailing empty group.

The reason I'm strongly opposed to extra punctuation is because the
presence of such in the stream suggests that the graph may be missing
vital information. Take these examples again:

,a

a,,

(a)

a((b))

a(())

a,()

Allowing for a single trailing comma and empty end of group, each of
these examples is suspicious because the extra punctuation suggests that
there is something missing (denoted by the '?'):

?,a

a,?,

?(a)

a(?(b))

a(?())

a,?()

Are there any examples of systems or individuals that currently author
OGDL with extra punctuation and why?

What would be the downside of not allowing extra punctuation?

For my needs, I want to treat extra punctuation as possible errors in
order to eliminate real errors in my systems as quickly as possible.
Therefore, my parser will by default disallow extra punctuation (and
probably trailing commas and empty groups, too), but may include an
option to allow them if desired.

Regards,
didge

Rolf wrote:
>
> didge wrote:
>
> > The next set would be disallowed because even though one might argue
> > that the extra commas and parens are benign, my assumption is that
> > the fact that no person or program would likely author them under
> > normal circumstances makes them look suspicious:
> > ,a
>
> +1 to disallowing this.
>
> > a,,
>
> +0 to disallowing this.
>
> > (a)
> >
> > a((b))
> >
> > a(())
> >
> > a,()
>
> I don't see a problem with those. Groups formed by zero to n elements
> should be allowed. I see the rest as white space.
>
> --
> Rolf
|