From: Wacek K. <wa...@id...> - 2010-08-19 14:23:41
Erick Antezana wrote:
> David,
>
> I believe that you agree that any hand editing process is
> error-prone... those simple constraints (such as ordering) could, on
> the one hand, help us to actually avoid (or minimise) those possible
> issues (mainly syntactic) with the help of strict parsers that could
> point them out before our ontology goes public (so we keep our users
> happy, or at least they don't have to spend time re-checking the
> syntax, submitting bugs, etc...); on the other hand, the experience
> some of us have with "relatively relaxed" formats/specs demonstrates
> that the imagination of parser developers or in general tool
> developers, and/or data providers in a given format could have a
> negative impact on the evolution of a specification which in turn has
> an impact in the systems that had adopted a given "dialect" of a
> specification... Let's take for instance the case of the GFF format,
> which is a very comprehensive and useful format, however, you may know
> that there are few issues related to the spec GFF 2 (fixed in GFF3)
> due to a lack of "strictness" in a few spec details (I think column 9?)...
>
> Anyway, as Chris mentioned, the OBO files will still be valid... but
> the recommendation will still be there...

Erick,

While there should be no doubt that a precise, unambiguous syntax specification is desirable and greatly helps to keep various tools based on the same format interoperable, I wonder how much you'd gain by insisting on a specific tag order, in the particular case of OBO.

In principle, if there were some ordering imposed on tags as t1, t2, t3, ..., tn, then parsers could discover, before being done with the whole stanza, the following issues:

1. missing tags, e.g., t3 found after t1;
2. unordered tags, e.g., t1 found after t3.

This might make sense in that a parser could discover a missing obligatory tag (oops, the lack of a tag) as soon as it finds a tag placed further down in the order.
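
To make the two kinds of early detection concrete, here is a minimal Python sketch of an order-aware check over the tags of a single stanza. The tag order and the set of obligatory tags below are assumptions picked purely for illustration, not taken from the OBO spec, and `check_order` is a hypothetical helper name.

```python
# Hypothetical tag order and obligatory tags, chosen only for illustration.
TAG_ORDER = ["id", "name", "namespace", "def", "is_a"]
REQUIRED = {"id", "name"}

def check_order(tags, order=TAG_ORDER, required=REQUIRED):
    """Scan one stanza's tags in sequence; report issues at the position they become visible."""
    rank = {tag: i for i, tag in enumerate(order)}
    issues = []
    highest = -1  # rank of the furthest-ordered tag seen so far
    for pos, tag in enumerate(tags):
        r = rank.get(tag)
        if r is None:
            continue  # tags outside the assumed order are ignored in this sketch
        if r < highest:
            # Case 2: a tag appears after one that should follow it.
            issues.append((pos, f"unordered: {tag} found after {order[highest]}"))
            continue
        # Case 1: any obligatory tag skipped on the way to this one is missing.
        for skipped in order[highest + 1:r]:
            if skipped in required:
                issues.append((pos, f"missing: {skipped} expected before {tag}"))
        highest = r
    return issues

result = check_order(["id", "def", "name"])
```

Note that a stanza which merely has `name` in the wrong place yields both a "missing" and an "unordered" report, so the early report is only a guess until the stanza has been read in full.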
This might also make sense if there were any dependencies between tags, e.g., if it made no sense to include a tag when another tag is not present. (However, I'd imagine that a partial ordering would be more appropriate here.)

*But*, OBO allows one to spread the specification of an object across multiple stanzas, and this would obviously be in conflict. As the parser would have to defer its reaction until it has read the whole document (or even a batch of documents), there would be no obvious gain here. After having parsed the whole document, a parser can decide if obligatory tags are missing, or if tag dependencies are violated. True, waiting until the end of the document, and the need for a second pass through the whole data (or rather its internal representation), postpones error reporting, but again, it's enforced by the design of the language (multiple stanzas per object).

I can imagine that repetitive error reports while parsing OBO files due to wrong tag order (e.g., 'missing t1' when t3 is found first) would be more annoying than helpful for curators. On the other hand, it should not be a problem to have a tool reorder the tags for you -- and if you assume all OBO files are in fact automatically reordered, you could indeed have faster error detection at parse time.

Order constraints are essential in languages that demand, e.g., declaration of a variable before it is used. (Though it's not necessarily a syntactic issue.) OBO is a declarative language, and as much freedom as possible, within an unambiguous specification, seems rather a virtue.

vQ
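
P.S. A minimal Python sketch of the deferred check described above: stanzas are first merged by id, and obligatory tags are only checked once the whole input has been read. The representation of a stanza as a list of (tag, value) pairs, the choice of `name` as the obligatory tag, and the helper names are all assumptions made for illustration.

```python
from collections import defaultdict

REQUIRED = {"name"}  # hypothetical obligatory tag, for illustration only

def merge_stanzas(stanzas):
    """Union the tag-value pairs of all stanzas that share the same id."""
    merged = defaultdict(lambda: defaultdict(list))
    for stanza in stanzas:  # each stanza: list of (tag, value) pairs
        obj_id = next(value for tag, value in stanza if tag == "id")
        for tag, value in stanza:
            if tag != "id":
                merged[obj_id][tag].append(value)
    return merged

def missing_required(merged, required=REQUIRED):
    """Second pass: list obligatory tags absent from each merged object."""
    report = {}
    for obj_id, tags in merged.items():
        absent = sorted(required - set(tags))
        if absent:
            report[obj_id] = absent
    return report

stanzas = [
    [("id", "X:1"), ("is_a", "X:0")],   # no name here ...
    [("id", "X:1"), ("name", "foo")],   # ... but it is supplied by a later stanza
    [("id", "X:2"), ("def", "bar")],    # never gets a name
]
report = missing_required(merge_stanzas(stanzas))
```

Here `X:1` is complete only after its second stanza is seen, so a per-stanza check would have raised a false alarm; only `X:2` ends up in the report.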