From: Clark C. E. <cc...@cl...> - 2004-08-31 09:33:25
On Wed, Aug 25, 2004 at 09:39:08PM +0100, Damian Cugley wrote:
| I put some ideas about a parser API on the YAML kwiki at
| <http://yaml.kwiki.org/index.cgi?YamlEventParser>.

A SAX-like interface?

| My original idea was
| to describe a clear-cut interface (called YEP) between a pure parser
| and the part of one's YAML library that worries about implicit types
| and (eventually) schemas. Rereading the specification draft, I noticed
| that it mentions a parsing-events interface between the Parser and the
| Composer steps, which I guess describes exactly where my hypothetical
| API would sit. (Is there already a specification for this?)

I was picturing a 'pull' interface for the parser, and a 'push'-like
interface for the emitter; converting from pull to push is a simple
loop, while going from push to pull requires threads or coroutines.

Rather than use a bunch of arguments for each call, I'd use an 'event'
object, with an enumeration of which kind you have (begin-map, end-map,
etc.) and any extra data. The advantage of an object is that push and
pull can share the same event objects.

    def pull2push(input, output):
        # Read events from the pull side until it is exhausted,
        # forwarding each one to the push side.
        while True:
            event = input.read()
            if not event:
                break
            output.send(event)

The 'event' object also has the advantage of having a 'syntax' facet
for style information, etc. The base class does the 'representation'
model, a derived class does 'serialization', and a grand-child does
'presentation' (see the layering in the spec).

| My main aim was to convince myself that it is reasonable to layer a
| parser this way, with most of the YAML syntax (including directives and
| the names of anchors) invisible to the next layer (composer/builder).

*nod*

| Partly this is because as it happens I have not needed implicit typing
| yet, so I would like it to be an entirely optional module.
| Also, we
| want to keep the composition layer 'honest'; that is, we do not want
| aspects of syntax leaking in to the node graph in the way that in XML
| namespace prefixes and scopes are visible in the XML infoset.

Right. This is exactly the goal. People will break the model, but as
long as they know they are cheating, and don't expect other tools to
cheat in the same way, then it's, well, OK.

| I have outlined the interface in Python; I hope the equivalent C
| interface (using a lot of void* pointers and passing a struct
| containing function pointers at the start) is fairly obvious.

I posted a "C" interface for something like the above on 26-SEP-2003:

  http://sourceforge.net/mailarchive/forum.php?thread_id=3194151&forum_id=1771

I also have a very, very old "C" interface you can view:

  http://cvs.sourceforge.net/viewcvs.py/yaml/libyaml/yaml.h?view=markup

| So why am I posting this? Well, obviously I'd like to know if the YEP
| interface looks plausible and encodes the correct information. Also, I
| wonder if a similar pattern of callbacks is used in the existing YAML
| implementations and whether we want to try to standardize this...

Syck has a completely different model. Instead of pre-order traversal,
it uses post-order. It's quite clever, since it actually enforces the
model perfectly.

    --- { a: b, c: d }

is passed (roughly) as 5 events:

  - create scalar 'a'
  - create scalar 'b'
  - create scalar 'c'
  - create scalar 'd'

For each scalar, you return an integer handle (id() in Python); then,
for the mapping, it passes you a full mapping that you can iterate
over, so you have no way to actually get at the original key order in
the document.

  - create mapping [1:2,3:4]

etc.

Clark

--
Clark C. Evans
Prometheus Research, LLC.
http://www.prometheusresearch.com/
office: +1.203.777.2550
mobile: +1.203.444.0557

Prometheus Research: Transforming Data Into Knowledge
  - Research Exchange Database
  - Survey & Assessment Technologies
  - Software Tools for Researchers
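
P.S. To make the contrast between the two models concrete, here is a
minimal sketch in Python. It is not Syck's API or the YEP interface:
the names Kind, Event, PostOrderBuilder and preorder_to_postorder are
all hypothetical, and the handles are simple counters starting at 1.
It replays a pre-order event stream (begin-map, scalars, end-map) as
post-order "create" calls, so that { a: b, c: d } arrives as four
scalar calls followed by one mapping call.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class Kind(Enum):
    # Hypothetical event kinds; the thread only names begin-map and end-map.
    BEGIN_MAP = auto()
    END_MAP = auto()
    SCALAR = auto()


@dataclass
class Event:
    kind: Kind
    value: Optional[str] = None  # extra data; only scalars carry a value here


class PostOrderBuilder:
    """Receives post-order 'create' calls and returns integer handles."""

    def __init__(self):
        self.nodes = []  # node for handle h is stored at index h - 1
        self.calls = []  # log of calls, in the order they were made

    def create_scalar(self, value):
        self.nodes.append(value)
        self.calls.append(("scalar", value))
        return len(self.nodes)

    def create_mapping(self, pairs):
        # pairs maps key handles to value handles. A Python dict happens to
        # keep insertion order; the model described above promises no order.
        self.nodes.append(pairs)
        self.calls.append(("mapping", pairs))
        return len(self.nodes)


def preorder_to_postorder(events, builder):
    """Replay pre-order events as post-order calls; return the root handle."""
    stack = []  # one list of child handles per mapping that is still open
    root = None
    for event in events:
        if event.kind is Kind.BEGIN_MAP:
            stack.append([])
            continue
        if event.kind is Kind.SCALAR:
            handle = builder.create_scalar(event.value)
        else:  # END_MAP: all children exist, so the mapping can be created
            children = stack.pop()
            handle = builder.create_mapping(dict(zip(children[0::2], children[1::2])))
        if stack:
            stack[-1].append(handle)  # child of the enclosing mapping
        else:
            root = handle
    return root


# The document  --- { a: b, c: d }  as a pre-order event stream.
events = [Event(Kind.BEGIN_MAP)]
events += [Event(Kind.SCALAR, text) for text in "abcd"]
events.append(Event(Kind.END_MAP))

builder = PostOrderBuilder()
root = preorder_to_postorder(events, builder)
for call in builder.calls:
    print(call)
```

The builder never sees a begin-map or end-map: by the time it is asked
for a mapping, every key and value already exists as a handle, which is
why the post-order model cannot be fed a half-built node.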