From: Clark C . E. <cc...@cl...> - 2001-06-05 12:35:53
|
About a week ago I worked a bit more on yaml.h. Seeing that we
have another participant (Jason), I figured I'd post what I have;
perhaps it will help spur an API discussion. This is a bit simpler
than the API proposed earlier... but not that much simpler. Also,
it does not include asScalar(), although the implementation of this
function would be rather trivial. In particular, this API does
not reflect our recent talks about the Class as implemented
through colors.

Best,

Clark

...

/*************************************************************************
** RCS-ID:      $Id: yaml.h,v 1.3 2001/05/27 16:03:13 cce Exp $
** Copyright:   (c) 2001 Clark Evans (cc...@cl...)
** License:     You may use under the Python, Perl, or LGPL license
** Credits:     Brian Ingerson and many people on the SML-DEV
**              including Oren Ben-Kiki and Sjoerd Visscher
** Version:     This is a prototype API and is not yet
**              ready for implementing
************************************************************************/

#ifndef __YAML_H
#define __YAML_H
#include <stddef.h>

/*************************************************************************
 * GENERAL DEFINITIONS
 ************************************************************************/

#ifdef YAML_UTF16
typedef unsigned short yaml_char_t;
#else
typedef char yaml_char_t;
#endif

struct yaml_node;
typedef struct yaml_node yaml_node_t;
struct yaml_node {
    enum { YAML_LIST, YAML_MAP,
           YAML_REF,  YAML_SCALAR }      type;
    enum { YAML_AUTO, YAML_SIMPLE, YAML_QUOTED,
           YAML_MULTLINE, YAML_BLOCK }   style;
    const yaml_char_t *cls;     /* class attributes */
    const yaml_char_t *key;     /* key if node has parent map */
    const yaml_char_t *anchor;  /* anchor used for referencing */
    const yaml_node_t *parent;  /* pointer to the parent node */
};

typedef enum {
    YAML_ERROR   = -1,
    YAML_SUCCESS = 0,
} yaml_result_t;

/*************************************************************************
 * VISITOR API (EMITTER)
 ************************************************************************/

struct yaml_visitor;
typedef struct yaml_visitor yaml_visitor_t;

typedef /* When entering a node, returns visitor for children */
  yaml_visitor_t *(*yaml_begin_t)( const yaml_visitor_t *self,
                                   const yaml_node_t *node );

typedef /* When exiting a node (map, list, or scalar) */
  void (*yaml_end_t)( const yaml_visitor_t *self,
                      const yaml_node_t *node );

typedef /* For scalar content, may be called several times */
  size_t (*yaml_write_t)( const yaml_visitor_t *self,
                          const void *ptr, const size_t size );

struct yaml_visitor {
    yaml_begin_t begin;  /* function to handle start node event */
    yaml_end_t   end;    /* function to handle finish node event */
    yaml_write_t write;  /* function to handle scalar content */
};

/*************************************************************************
 * ITERATOR API (PARSER)
 ************************************************************************/

struct yaml_iterator;
typedef struct yaml_iterator yaml_iterator_t;

typedef /* pulls first child, with iterator for subsequent children */
  yaml_iterator_t* (*yaml_first_t)( const yaml_iterator_t *self,
                                    yaml_node_t **node );

typedef /* iterates through list of siblings in a list or map */
  void (*yaml_next_t)( const yaml_iterator_t *self,
                       yaml_node_t **node );

typedef /* allows consumer to pull scalar data, returns amount read */
  size_t (*yaml_read_t)( const yaml_iterator_t *self,
                         void *ptr, const size_t size );

struct yaml_iterator {
    yaml_next_t  next;   /* function to ask for the next node */
    yaml_first_t first;  /* function to ask for first child */
    yaml_read_t  read;   /* function to handle scalar content */
};

/*************************************************************************
 * PARSER/EMITTER OPEN/EXEC/CLOSE
 ************************************************************************/

typedef /* generic read function for an input stream */
  size_t (*yaml_sread_t)( void *ptr, size_t size, size_t nobj, void *fp );
typedef /* generic write function for an output stream */
  size_t (*yaml_swrite_t)( const void *ptr, size_t size, size_t nobj,
                           void *fp );

typedef struct {
    void         *fp;
    yaml_sread_t  read;
} yaml_input_t;

typedef struct {
    void          *fp;
    yaml_swrite_t  write;
} yaml_output_t;

yaml_iterator_t* yaml_open_parser( yaml_input_t *in );
yaml_visitor_t*  yaml_open_emitter( yaml_output_t *out );
void yaml_exec( yaml_iterator_t *i, yaml_visitor_t *v );
void yaml_close_parser( yaml_iterator_t *i );
void yaml_close_emitter( yaml_visitor_t *v );

/*
 *  yaml_input_t  in;
 *  yaml_output_t out;
 *  in.fp  = fopen("file.in","r");
 *  out.fp = fopen("file.out","w");
 *  in.read   = fread;
 *  out.write = fwrite;
 *
 *  yaml_iterator_t *parser  = yaml_open_parser(&in);
 *  yaml_visitor_t  *emitter = yaml_open_emitter(&out);
 *  yaml_exec( parser, emitter );
 *  yaml_close_parser(parser);
 *  yaml_close_emitter(emitter);
 */

#endif
|
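To make the visitor half of the header concrete, here is a minimal sketch of a visitor that only counts what it is shown. It assumes that scalar content is written to the visitor returned by begin() and that end() is called on the parent visitor; the header does not state either point. The type counting_visitor_t and the functions counting_visitor_make, emit_scalar and count_demo are hypothetical names invented for this illustration, and the yaml.h types are re-declared locally in trimmed form because the header is only a prototype.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Trimmed local copies of the prototype types from yaml.h. */
typedef char yaml_char_t;
typedef struct yaml_node yaml_node_t;
struct yaml_node {
    enum { YAML_LIST, YAML_MAP, YAML_REF, YAML_SCALAR } type;
    const yaml_char_t *key;      /* key if node has parent map */
};

typedef struct yaml_visitor yaml_visitor_t;
typedef yaml_visitor_t *(*yaml_begin_t)(const yaml_visitor_t *self,
                                        const yaml_node_t *node);
typedef void (*yaml_end_t)(const yaml_visitor_t *self,
                           const yaml_node_t *node);
typedef size_t (*yaml_write_t)(const yaml_visitor_t *self,
                               const void *ptr, const size_t size);
struct yaml_visitor { yaml_begin_t begin; yaml_end_t end; yaml_write_t write; };

/* Hypothetical: per-visitor state lives in a struct whose FIRST member
 * is the visitor, so a visitor pointer can be cast back to it. */
typedef struct {
    yaml_visitor_t base;
    size_t nodes;      /* number of begin() calls seen */
    size_t bytes;      /* total scalar bytes written */
    int depth;         /* current nesting depth */
    int max_depth;     /* deepest nesting seen */
} counting_visitor_t;

static counting_visitor_t *as_counter(const yaml_visitor_t *self) {
    assert(self != NULL);
    return (counting_visitor_t *)self;   /* the object itself is not const */
}

static yaml_visitor_t *count_begin(const yaml_visitor_t *self,
                                   const yaml_node_t *node) {
    counting_visitor_t *c = as_counter(self);
    (void)node;
    c->nodes++;
    if (++c->depth > c->max_depth) c->max_depth = c->depth;
    return &c->base;                     /* same visitor handles children */
}

static void count_end(const yaml_visitor_t *self, const yaml_node_t *node) {
    (void)node;
    as_counter(self)->depth--;
}

static size_t count_write(const yaml_visitor_t *self,
                          const void *ptr, const size_t size) {
    (void)ptr;
    as_counter(self)->bytes += size;
    return size;
}

counting_visitor_t counting_visitor_make(void) {
    counting_visitor_t c = { { count_begin, count_end, count_write }, 0, 0, 0, 0 };
    return c;
}

/* Push one scalar child through a visitor. */
static void emit_scalar(yaml_visitor_t *v, const char *key, const char *text) {
    yaml_node_t n;
    yaml_visitor_t *child;
    n.type = YAML_SCALAR;
    n.key = key;
    child = v->begin(v, &n);
    if (child) child->write(child, text, strlen(text));
    v->end(v, &n);
}

/* Walk the map { name: yaml, version: 1.3 } by hand. */
counting_visitor_t count_demo(void) {
    counting_visitor_t c = counting_visitor_make();
    yaml_node_t root;
    yaml_visitor_t *kids;
    root.type = YAML_MAP;
    root.key = NULL;
    kids = c.base.begin(&c.base, &root);
    if (kids) {
        emit_scalar(kids, "name", "yaml");
        emit_scalar(kids, "version", "1.3");
    }
    c.base.end(&c.base, &root);
    return c;
}
```

Because the callbacks receive only a const visitor pointer and no user-data argument, embedding the visitor in a larger struct is one way for an implementation to carry state; other designs are possible.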
From: Jason D. <ja...@in...> - 2001-06-05 16:23:40
|
Hi, Clark. (Thanks to you and to Oren for the welcome, by the way.)

I found some time to do some exploratory coding last night in C# to see
how simple YAML really is. Needless to say, I was pleasantly surprised.
Within a few hours I had a working parser (with tests) that correctly
parses the three major node types. I haven't written much more than a
pretty printer with it yet but I suspect deserializing into
general-purpose strings, lists, and maps should be easy and then after
that, into native objects.

I found some issues with some of the productions and have some other
comments in general but I wanted to post the API that came out of last
night's session before I headed off to work.

Comments would be most appreciated. (I'm particularly not too fond of
the Level property. My thoughts were that clients of the interface could
notice when Level was decremented in order to know when a list or map
ended. It was either that or add artificial EndList and EndMap nodes
which seem like a better idea--I'll find out as I use the API more.)

By the way, it should look familiar if you've worked with Microsoft's
pull-based XmlReader parser. Any class can implement the interface below
and become a YAML data source. So far, I've only implemented
YamlTextReader (which accepts a Stream or a URI in its constructor) but
I think that it might be fun to code up a YamlXmlReader (which would
accept an XmlReader) so that we can work with XML as YAML without doing
any sort of external translation. Neat, huh?

I'll take a closer look at yaml.h after work and see how I can harmonize
this interface with yours, if possible.

...

namespace YAML
{
    /// <summary>Represents a forward-only stream of YAML nodes.</summary>
    public interface IYamlReader
    {
        /// <summary>Reads the next node from the stream.</summary>
        /// <returns>true if a node was successfully read,
        /// false if there are no more nodes to read</returns>
        bool Read();

        /// <summary>Indicates the type of the current node.</summary>
        /// <value>Scalar, List, Map, or Reference</value>
        /// <seealso cref="YAML.YamlNodeType"/>
        YamlNodeType NodeType { get; }

        /// <summary>The current node's logical level in the document.</summary>
        /// <value>0 for the root (default) node,
        /// otherwise, a positive integer.</value>
        /// <remarks>This is not a literal count of how many tabs or spaces
        /// were used to indent the current node.</remarks>
        int Level { get; }

        /// <summary>The name of the current node within its
        /// parent map node.</summary>
        /// <value>null for list elements, otherwise, a non-null string.</value>
        string Name { get; }

        /// <summary>The index of the current node within its
        /// parent list node.</summary>
        /// <value>0 for non-list elements,
        /// otherwise, a positive integer.</value>
        int Index { get; }

        /// <summary>The value of the current node.</summary>
        /// <value>null for list and map nodes,
        /// otherwise, a non-null string.</value>
        string Value { get; }
    }

    /// <summary>Represents a YAML node.</summary>
    public enum YamlNodeType
    {
        Scalar, List, Map, Reference
    }
}

...

(I suspect I'll be adding an Anchor property and a BinaryValue property
but I don't foresee much more than this. Maybe Class.)

Bye,

Jason.
|
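The idea of detecting the end of a list or map by watching Level decrease can be sketched in C as follows. This is only an illustration over a pre-flattened array of nodes, not the C# implementation described in the message; flat_node_t, flat_reader_t, flat_read, count_closed and demo_closed are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { NODE_SCALAR, NODE_LIST, NODE_MAP, NODE_REFERENCE } node_type_t;

/* One entry of a forward-only node stream: its type and logical level. */
typedef struct {
    node_type_t type;
    int level;                 /* 0 for the root node */
} flat_node_t;

typedef struct {
    const flat_node_t *nodes;
    size_t count;
    size_t pos;
    int started;
} flat_reader_t;

/* Mirrors Read(): advances to the next node, returns 1 while a node is current. */
int flat_read(flat_reader_t *r) {
    assert(r != NULL);
    if (!r->started) r->started = 1;
    else if (r->pos < r->count) r->pos++;
    return r->pos < r->count;
}

/* Counts how many containers ended, judged only by drops in level.
 * A drop from level p to level q means (p - q) containers were closed;
 * whatever is still open at end of stream is closed there. */
size_t count_closed(flat_reader_t *r) {
    size_t closed = 0;
    int prev = 0;
    while (flat_read(r)) {
        int level = r->nodes[r->pos].level;
        if (level < prev) closed += (size_t)(prev - level);
        prev = level;
    }
    return closed + (size_t)prev;
}

/* Stream for: a map holding a scalar, a two-item list, and another scalar. */
size_t demo_closed(void) {
    static const flat_node_t doc[] = {
        { NODE_MAP, 0 },
        { NODE_SCALAR, 1 },
        { NODE_LIST, 1 },
        { NODE_SCALAR, 2 },
        { NODE_SCALAR, 2 },
        { NODE_SCALAR, 1 }
    };
    flat_reader_t r = { doc, sizeof doc / sizeof doc[0], 0, 0 };
    return count_closed(&r);
}
```

One limitation of this sketch: an empty list or map followed by a sibling produces no drop in level, so its end goes unnoticed. That may be one reason explicit EndList and EndMap nodes looked attractive.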
From: Clark C . E. <cc...@cl...> - 2001-06-05 20:39:32
|
On Tue, Jun 05, 2001 at 09:23:29AM -0700, Jason Diamond wrote:
| I found some issues with some of the productions and have some other
| comments in general but I wanted to post the API that came out of last
| night's session before I headed off to work.

Great. Thank you so much for proposing an alternative.
I'd like to hear your issues with the productions...
it came from my head (after discussion) to paper...
without review. So I'm sure there are problems!
I'd love to hear them.

| namespace YAML
| {
|     public interface IYamlReader
|     {
|         bool Read();
|         YamlNodeType NodeType { get; }
|         int Level { get; }
|         string Name { get; }
|         int Index { get; }
|         string Value { get; }
|     }
|     public enum YamlNodeType
|     {
|         Scalar, List, Map, Reference
|     }
| }

Ok. This is understandable. I specifically like the
idea of having the "node" attached to the iterator
instead of through the Read() function.

With this insight... I've modified the proposed "C" API
by moving the current node pointer, which was returned
from the "first" and "next" functions, into the iterator
structure. Also, I've moved the parent pointer from the
node structure (where it didn't make sense, since YAML
is a graph) to the iterator structure (where it does make
sense due to nested iterators).

So... with the insight gained from your proposal... here
is a modified "C" API.... much simpler! Thanks! ;)

struct yaml_node {
    enum { YAML_LIST, YAML_MAP, YAML_REF, YAML_SCALAR } type;
    enum { YAML_AUTO, YAML_SIMPLE, YAML_QUOTED, YAML_MULTLINE,
           YAML_BLOCK } style;
    const yaml_char_t *key;     /* key if node has parent map */
    const yaml_char_t *anchor;  /* anchor used for referencing */
};

typedef yaml_iterator_t* (*yaml_first_t)( const yaml_iterator_t *self );
typedef bool_t           (*yaml_next_t) ( const yaml_iterator_t *self );
typedef size_t           (*yaml_read_t) ( const yaml_iterator_t *self,
                                          void *ptr, const size_t size );

struct yaml_iterator {
    yaml_next_t      next;    /* function to ask for the next node */
    yaml_first_t     first;   /* function to ask for first child */
    yaml_read_t      read;    /* function to handle scalar content */
    yaml_node_t     *curr;    /* the current node */
    yaml_iterator_t *parent;  /* the parent iterator */
};

Specific points include:

(a) A node is separated from the iterator so that the
    interface may be directly used over an in-memory structure.
    The node contains the node type, the "style" used to
    serialize (if a scalar), the key (if used in a map),
    and the anchor (if provided).

(b) The next() method of the iterator advances the
    "curr" node to the next in its sequence (a map or list).
    At the end of a map or list, it returns false, and
    sets the curr pointer to NULL.

(c) There is a first() method to get the first child
    of a map or list. This returns a subordinate iterator,
    which will have the "curr" value filled in with the first
    item in the list or map. If the map or list is empty,
    then a NULL iterator is returned. Calling "next()"
    on a list or map node invalidates any subordinate
    iterators and makes all children unreachable. Thus,
    by not calling "first()", children of a sequence can
    be skipped if desired.

(d) There is a read() method to read in the scalar value
    of a given node. Note that the read method is operable
    for a map or a list that has not been iterated, as well
    as for a scalar. For a list, read() calls read() on the
    list's first child. For a map, it calls read() on
    the map entry having the key of "=". On a map or list,
    first() and read() are exclusive options. Since
    this is a "C" API, the amount of data recovered by the read
    must be specified by the caller. Thus "read" behaves like
    most unix-style "C" read functions, returning the number
    of bytes moved from the input stream into the buffer.

(e) An in-memory version of the node and iterator structure
    could be implemented fairly easily by adding additional
    methods and/or data values, such as a way to create an
    iterator to visit the children of a map or list node!

Thus, overall, this requires a bit more work on the
implementer of the parser:

(1) A stack of iterators (and subsequent nodes) must
    be maintained by the parser.

(2) A smart read() must be implemented to comply with
    the "substitutability principle".

Although it may seem more complicated, this reduces
the burden on the application developer:

(3) Child nodes can be skipped easily if not needed.
(4) Substitutability is easy.. if not totally grokkable.
(5) When processing a node, the entire ancestor stack
    is available (holding its content).

Best,

Clark
|
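Points (b) and (c) can be sketched over an in-memory tree as follows. This is a rough illustration under stated assumptions, not the proposed implementation: the callbacks take a non-const self so that next() can actually advance curr, the subordinate iterator is heap-allocated and freed by the caller, and mem_node_t, mem_next, mem_first, count_nodes and demo_count are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef enum { YAML_LIST, YAML_MAP, YAML_REF, YAML_SCALAR } yaml_type_t;

/* Hypothetical in-memory node: children form a sibling-linked list. */
typedef struct mem_node {
    yaml_type_t type;
    const char *key;             /* key if the parent is a map */
    struct mem_node *child;      /* first child, or NULL */
    struct mem_node *sibling;    /* next sibling, or NULL */
} mem_node_t;

typedef struct yaml_iterator yaml_iterator_t;
struct yaml_iterator {
    int (*next)(yaml_iterator_t *self);               /* advance curr */
    yaml_iterator_t *(*first)(yaml_iterator_t *self); /* descend */
    mem_node_t *curr;                                 /* the current node */
    yaml_iterator_t *parent;                          /* the parent iterator */
};

/* (b): move to the next sibling; at the end return 0 and leave curr NULL. */
static int mem_next(yaml_iterator_t *self) {
    assert(self != NULL);
    if (self->curr) self->curr = self->curr->sibling;
    return self->curr != NULL;
}

/* (c): return a subordinate iterator positioned on the first child,
 * or NULL if the current node has no children. */
static yaml_iterator_t *mem_first(yaml_iterator_t *self) {
    yaml_iterator_t *sub;
    assert(self != NULL);
    if (!self->curr || !self->curr->child) return NULL;
    sub = malloc(sizeof *sub);
    if (!sub) return NULL;
    sub->next = mem_next;
    sub->first = mem_first;
    sub->curr = self->curr->child;
    sub->parent = self;
    return sub;
}

/* Count nodes, descending at most `depth` levels below the iterator.
 * Children are skipped simply by not calling first(). */
size_t count_nodes(yaml_iterator_t *it, int depth) {
    size_t n = 0;
    if (!it || !it->curr) return 0;
    do {
        n++;
        if (depth > 0) {
            yaml_iterator_t *sub = it->first(it);
            if (sub) {
                assert(sub->parent == it);
                n += count_nodes(sub, depth - 1);
                free(sub);
            }
        }
    } while (it->next(it));
    return n;
}

/* Tree: a list holding a scalar, a map { a, b }, and another scalar. */
size_t demo_count(int depth) {
    mem_node_t b    = { YAML_SCALAR, "b",  NULL, NULL };
    mem_node_t a    = { YAML_SCALAR, "a",  NULL, &b   };
    mem_node_t s2   = { YAML_SCALAR, NULL, NULL, NULL };
    mem_node_t m    = { YAML_MAP,    NULL, &a,   &s2  };
    mem_node_t s1   = { YAML_SCALAR, NULL, NULL, &m   };
    mem_node_t root = { YAML_LIST,   NULL, &s1,  NULL };
    yaml_iterator_t top = { mem_next, mem_first, &root, NULL };
    return count_nodes(&top, depth);
}
```

With a depth limit of 1 the map is counted but its two entries are never visited, which is the skipping behaviour described in point (3).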
From: Clark C . E. <cc...@cl...> - 2001-06-05 20:47:07
|
(oops! key also belongs inside the iterator)

On Tue, Jun 05, 2001 at 04:39:48PM -0500, Clark C . Evans wrote:
| struct yaml_node {
|     enum { YAML_LIST, YAML_MAP, YAML_REF, YAML_SCALAR } type;
|     enum { YAML_AUTO, YAML_SIMPLE, YAML_QUOTED, YAML_MULTLINE,
|            YAML_BLOCK } style;
|     const yaml_char_t *anchor;  /* anchor used for referencing */
| };
|
| yaml_iterator_t* (*yaml_first_t)( const yaml_iterator_t *self);
| bool_t           (*yaml_next_t) ( const yaml_iterator_t *self );
| size_t           (*yaml_read_t) ( const yaml_iterator_t *self,
|                                   void * ptr, const size_t size );
|
| struct yaml_iterator {
|     yaml_next_t      next;    /* function to ask for the next node */
|     yaml_first_t     first;   /* function to ask for first child */
|     yaml_read_t      read;    /* function to handle scalar content */
|     yaml_node_t     *curr;    /* the current node */
|     yaml_iterator_t *parent;  /* the parent iterator */
      yaml_char_t     *key;     /* key if node is in a parent map */
| };

P.S. Thanks again for your proposed API. It made me see
     the problems in the previous "C" API.

Clark
|
From: Clark C . E. <cc...@cl...> - 2001-06-05 23:36:51
|
(sorry for so much bandwidth, but I figured I'd make one more
pass at the "C" interface before I go into hibernation)

typedef enum { YAML_LIST, YAML_MAP, YAML_REF, YAML_SCALAR } yaml_type_t;
typedef enum { YAML_AUTO, YAML_SIMPLE, YAML_QUOTED,
               YAML_BLOCK, YAML_BINARY } yaml_style_t;

typedef yaml_iterator_t* (*yaml_first_t)( const yaml_iterator_t *self );
typedef bool_t           (*yaml_next_t) ( const yaml_iterator_t *self );
typedef size_t           (*yaml_read_t) ( const yaml_iterator_t *self,
                                          void *ptr, const size_t size );

struct yaml_event {
    yaml_type_t   type;    /* the node type */
    yaml_style_t  style;   /* scalar style, if any */
    yaml_char_t  *anchor;  /* anchor if node is part of ref */
    yaml_char_t  *key;     /* key if node is in a parent map */
    int           index;   /* position if in parent list */
    yaml_node_t  *node;    /* in memory repr, if any */
};

struct yaml_iterator {
    yaml_event_t     curr;    /* the current node event (by value) */
    yaml_iterator_t *parent;  /* the parent iterator, if any */
    yaml_next_t      next;    /* function to ask for the next node */
    yaml_first_t     first;   /* function to ask for first child */
    yaml_read_t      read;    /* function to handle scalar content */
};

typedef yaml_visitor_t* (*yaml_begin_t)( const yaml_visitor_t *self,
                                         const yaml_event_t *event );
typedef void            (*yaml_end_t)  ( const yaml_visitor_t *self,
                                         const yaml_event_t *event );
typedef size_t          (*yaml_write_t)( const yaml_visitor_t *self,
                                         const void *ptr, const size_t size );

struct yaml_visitor {
    yaml_begin_t begin;  /* function to notify begin of node */
    yaml_write_t write;  /* function to deliver scalar chunk */
    yaml_end_t   end;    /* function to notify end of node */
};

struct yaml_node {
    // random-access node with following functions:
    // Type, Value, Child(key), Child(index),
    // Size(), Children(), Parents(), etc.
};

...

Older points made about the iterator interface...

(a) A node is separated from the iterator so that the
    interface may be directly used over an in-memory structure.
    The node contains the node type, the "style" used to
    serialize (if a scalar), the key (if used in a map),
    and the anchor (if provided).

(b) The next() method of the iterator advances the
    "curr" node to the next in its sequence (a map or list).
    At the end of a map or list, it returns false, and
    sets the curr pointer to NULL.

(c) There is a first() method to get the first child
    of a map or list. This returns a subordinate iterator,
    which will have the "curr" value filled in with the first
    item in the list or map. If the map or list is empty,
    then a NULL iterator is returned. Calling "next()"
    on a list or map node invalidates any subordinate
    iterators and makes all children unreachable. Thus,
    by not calling "first()", children of a sequence can
    be skipped if desired.

(d) There is a read() method to read in the scalar value
    of a given node. Note that the read method is operable
    for a map or a list that has not been iterated, as well
    as for a scalar. For a list, read() calls read() on the
    list's first child. For a map, it calls read() on
    the map entry having the key of "=". On a map or list,
    first() and read() are exclusive options. Since
    this is a "C" API, the amount of data recovered by the read
    must be specified by the caller. Thus "read" behaves like
    most unix-style "C" read functions, returning the number
    of bytes moved from the input stream into the buffer.

(e) An in-memory version of the node and iterator structure
    could be implemented fairly easily by adding additional
    methods and/or data values, such as a way to create an
    iterator to visit the children of a map or list node!
|
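The read() behaviour in point (d) can be sketched as below: a chunked read that, when asked for the value of a list, delegates to the list's first child, and for a map delegates to the entry keyed "=". This is one reading of the substitutability rule over an in-memory tree, not the proposed parser; val_node_t, val_reader_t, scalar_for, val_read, read_all and demo_read are hypothetical names, and references are simply not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { YAML_LIST, YAML_MAP, YAML_REF, YAML_SCALAR } yaml_type_t;

/* Hypothetical in-memory node carrying scalar text. */
typedef struct val_node {
    yaml_type_t type;
    const char *key;             /* key if the parent is a map */
    const char *text;            /* scalar content, NULL otherwise */
    struct val_node *child;      /* first child, or NULL */
    struct val_node *sibling;    /* next sibling, or NULL */
} val_node_t;

/* Find the scalar a read() on `n` would deliver:
 * list -> its first child, map -> its "=" entry, applied repeatedly. */
static const val_node_t *scalar_for(const val_node_t *n) {
    while (n && n->type != YAML_SCALAR) {
        const val_node_t *c = n->child;
        if (n->type == YAML_MAP) {
            while (c && !(c->key && strcmp(c->key, "=") == 0)) c = c->sibling;
        } else if (n->type != YAML_LIST) {
            return NULL;         /* references are out of scope here */
        }
        n = c;
    }
    return n;
}

typedef struct {
    const val_node_t *node;      /* node being read */
    size_t offset;               /* bytes already delivered */
} val_reader_t;

/* Unix-style read: copies at most `size` bytes, returns the number copied,
 * and returns 0 once the value is exhausted. */
size_t val_read(val_reader_t *r, void *ptr, size_t size) {
    const val_node_t *s;
    size_t len, n;
    assert(r != NULL && ptr != NULL);
    s = scalar_for(r->node);
    if (!s || !s->text) return 0;
    len = strlen(s->text);
    if (r->offset >= len) return 0;
    n = len - r->offset;
    if (n > size) n = size;
    memcpy(ptr, s->text + r->offset, n);
    r->offset += n;
    return n;
}

/* Drain a node's value through a deliberately small 4-byte chunk buffer. */
size_t read_all(const val_node_t *node, char *out, size_t cap) {
    val_reader_t r = { node, 0 };
    char chunk[4];
    size_t total = 0, got;
    assert(cap > 0);
    while ((got = val_read(&r, chunk, sizeof chunk)) > 0 && total + got < cap) {
        memcpy(out + total, chunk, got);
        total += got;
    }
    out[total] = '\0';
    return total;
}

/* A list whose first child is the map { note: x, "=": hello yaml }. */
size_t demo_read(char *out, size_t cap) {
    val_node_t eq   = { YAML_SCALAR, "=",    "hello yaml", NULL,  NULL };
    val_node_t note = { YAML_SCALAR, "note", "x",          NULL,  &eq  };
    val_node_t map  = { YAML_MAP,    NULL,   NULL,         &note, NULL };
    val_node_t list = { YAML_LIST,   NULL,   NULL,         &map,  NULL };
    return read_all(&list, out, cap);
}
```

The caller sees the same chunked interface whether the node is a scalar, a list, or a map, which is the point of the substitutability rule.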
From: Clark C . E. <cc...@cl...> - 2001-06-05 21:39:16
|
(rambling thoughts)

On Tue, Jun 05, 2001 at 09:23:29AM -0700, Jason Diamond wrote:
| Within a few hours I had a working parser (with tests) that
| correctly parses the three major node types. I haven't written
| much more than a pretty printer with it yet but I suspect
| deserializing into general-purpose strings, lists,
| and maps should be easy and then after that, into native objects.

Way Cool!

| I'm particularly not too fond of the Level property. My
| thoughts were that clients of the interface could notice
| when Level was decremented in order to know when a list or
| map ended. It was either that or add artificial EndList
| and EndMap nodes which seem like a better idea--I'll
| find out as I use the API more.

I was contemplating a nested interface, so that Next()
moves to the subsequent sibling and does not dive into
the children.

| I think that it might be fun to code up a YamlXmlReader
| (which would accept an XmlReader) so that we can work
| with XML as YAML without doing any sort of external
| translation. Neat, huh?

Yes. The very first spec had an attempted pass at the
XML binding. I was thinking that each "element" becomes
a map, with attributes as map entries, and with the special
attribute named "=" which is the list of children. It
may not *look* nice via YAML, but it should be very easy
to grok the injection.

...

Here is a first-pass take on a C# (I don't know _any_ C#)
interface which has both nested Iterators and Visitors and
a separate Node class to allow for use of the Iterator and
Visitor over an in-memory YAML graph.

namespace YAML
{
    public interface IYamlIterator
    {
        bool Next();
        IYamlIterator First();
        IYamlIterator Parent { get; }
        string Name { get; }
        string Value { get; }
        IYamlNode Node { get; }
    }

    public interface IYamlVisitor
    {
        IYamlVisitor Begin( IYamlNode node, string name );
        void Write( string value );
        void End();
    }

    public enum YamlNodeType
    {
        Scalar, List, Map, Reference
    }

    public interface IYamlNode
    {
        YamlNodeType Type { get; }
        string Anchor { get; }
    }

    //
    // When the YAML tree is in memory...
    //
    public interface IYamlRamNode : IYamlNode
    {
        string Value { get; }
        IYamlIterator Children();
        IYamlIterator Parents();
        void VisitChildren( IYamlVisitor visitor );
        void VisitParents( IYamlVisitor visitor );
        // the next property throws an exception
        // if there is more than one parent.
        IYamlRamNode Parent { get; }
    }

    public interface IYamlMapNode : IYamlRamNode
    {
        IYamlRamNode Child( string name );
    }

    public interface IYamlListNode : IYamlRamNode
    {
        IYamlRamNode Child( int index );
        int Size { get; }
    }
}

Note: IYamlVisitor::Begin returns a subordinate
visitor. If children are not wanted, then
NULL can be returned. I'm not sure how the
Visitor can signal that it wants the value
for a map or list... i.e., support for substitutability.

I hope the above doesn't seem too complex... without
the use case of in-memory nodes, we do not need to
separate the node from the visitor or iterator interface.

Do we need an in-memory node interface?

Best,

Clark
|
From: Clark C . E. <cc...@cl...> - 2001-06-05 22:44:48
|
Jason,

I don't want to step on toes... so let me know
if I am... but, anyway, I spent some time
thinking this through a bit more....

...

First, what a simple "printer" would look like
from the application programmer's perspective.
Simply connect the dots, i.e., get the reader and
writer interfaces and then connect them together.
Large processing pipelines can be built using
something like this...

    IYamlIterator reader = new YamlReader(inputstream);
    IYamlVisitor  writer = new YamlWriter(outputstream);
    YamlConnect(reader, writer);

Second, here is the new improved YAML interface
proposal... this separates out the items common
to the Iterator and Visitor and puts them into
a single "Event" class...

namespace YAML
{
    public interface IYamlEvent
    {
        string Key { get; }         /* if parent is a map */
        int Index { get; }          /* if parent is a list */
        string Anchor { get; }
        YamlNodeType Type { get; }
        IYamlNode Node { get; }     /* in-memory repr, if any */
    }

    public interface IYamlIterator : IYamlEvent
    {
        bool Next();
        IYamlIterator First();
        IYamlIterator Parent { get; }
        string Read();
    }

    public interface IYamlVisitor
    {
        IYamlVisitor Begin( IYamlEvent yamlEvent );
        void Write( string chunkOfValue );
        void End( IYamlEvent yamlEvent );
    }

    public enum YamlNodeType
    {
        Scalar, List, Map, Reference
    }

    //
    // in-memory directed graph nodes...
    //
    public interface IYamlNode
    {
        YamlNodeType Type { get; }
        string Value { get; }
        IYamlNode Child( string key );   /* maps only */
        IYamlNode Child( int index );    /* lists only */
        int Size { get; }                /* lists only */
        IYamlIterator Children();
        IYamlIterator Parents();
        void VisitChildren( IYamlVisitor visitor );
        void VisitParents( IYamlVisitor visitor );
    }
}

Third, here is what the push to pull connect
function would look like. This is a generic
Iterator to Visitor converter which would be
part of the standard YAML library.

void YamlConnect( IYamlIterator reader, IYamlVisitor writer )
{
    do
    {
        IYamlVisitor childWriter = writer.Begin(reader);
        if (childWriter != null)
        {
            switch (reader.Type)
            {
                case YamlNodeType.Map:
                case YamlNodeType.List:
                    IYamlIterator childReader = reader.First();
                    if (childReader != null)
                        YamlConnect(childReader, childWriter);
                    break;
                case YamlNodeType.Scalar:
                    childWriter.Write(reader.Read());
                    break;
            }
        }
        writer.End(reader);
    }
    while (reader.Next());
}

This may look complicated, because the "first" and
"begin" functions return subordinate processors.
However, this gives greater flexibility, since at each
level of the processing hierarchy a subordinate
Iterator or Visitor can hold state information.

Best,

Clark
|
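For comparison with the "C" interface discussed earlier in the thread, the connect function above might look roughly like this in C. It is a sketch under several assumptions: the iterator is a small caller-allocated struct rather than an object returned by first(), the visitor simply brackets every node with parentheses so the call order is visible, and node_t, iter_t, visitor_t, yaml_connect and demo_connect are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { YAML_LIST, YAML_MAP, YAML_REF, YAML_SCALAR } yaml_type_t;

typedef struct node {            /* hypothetical in-memory node */
    yaml_type_t type;
    const char *text;            /* scalar content, else NULL */
    struct node *child, *sibling;
} node_t;

typedef struct iter {            /* pull side */
    const node_t *curr;
    struct iter *parent;
} iter_t;

static int iter_next(iter_t *it) {
    if (it->curr) it->curr = it->curr->sibling;
    return it->curr != NULL;
}

/* Fills `sub` with an iterator over the children; returns 0 if there are none. */
static int iter_first(iter_t *it, iter_t *sub) {
    if (!it->curr || !it->curr->child) return 0;
    sub->curr = it->curr->child;
    sub->parent = it;
    return 1;
}

typedef struct visitor visitor_t;   /* push side */
struct visitor {
    visitor_t *(*begin)(visitor_t *self, const node_t *n);
    size_t (*write)(visitor_t *self, const void *ptr, size_t size);
    void (*end)(visitor_t *self, const node_t *n);
    char *out;                   /* output buffer, always NUL-terminated */
    size_t len, cap;
};

/* Append n bytes if they fit; silently drops them otherwise. */
static void put(visitor_t *v, const char *s, size_t n) {
    if (v->len + n < v->cap) {
        memcpy(v->out + v->len, s, n);
        v->len += n;
        v->out[v->len] = '\0';
    }
}
static visitor_t *p_begin(visitor_t *self, const node_t *n) {
    (void)n; put(self, "(", 1); return self;
}
static size_t p_write(visitor_t *self, const void *ptr, size_t size) {
    put(self, ptr, size); return size;
}
static void p_end(visitor_t *self, const node_t *n) {
    (void)n; put(self, ")", 1);
}

/* Generic iterator-to-visitor driver, one sibling sequence per call. */
void yaml_connect(iter_t *reader, visitor_t *writer) {
    assert(reader != NULL && writer != NULL);
    if (!reader->curr) return;
    do {
        const node_t *n = reader->curr;
        visitor_t *kids = writer->begin(writer, n);
        if (kids) {
            if (n->type == YAML_SCALAR) {
                if (n->text) kids->write(kids, n->text, strlen(n->text));
            } else if (n->type == YAML_LIST || n->type == YAML_MAP) {
                iter_t sub;
                if (iter_first(reader, &sub)) yaml_connect(&sub, kids);
            }
        }
        writer->end(writer, n);
    } while (iter_next(reader));
}

/* Tree: a list holding the scalar "a" and a map with one scalar "b". */
const char *demo_connect(char *buf, size_t cap) {
    node_t b    = { YAML_SCALAR, "b",  NULL, NULL };
    node_t m    = { YAML_MAP,    NULL, &b,   NULL };
    node_t a    = { YAML_SCALAR, "a",  NULL, &m   };
    node_t root = { YAML_LIST,   NULL, &a,   NULL };
    iter_t top = { &root, NULL };
    visitor_t w = { p_begin, p_write, p_end, buf, 0, cap };
    assert(cap > 0);
    buf[0] = '\0';
    yaml_connect(&top, &w);
    return buf;
}
```

Keeping the subordinate iterator on the caller's stack avoids allocation in this sketch; an implementation that returns iterators from first(), as proposed in the thread, would need to decide who owns and releases them.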
From: Jason D. <ja...@in...> - 2001-06-06 00:38:15
|
Hi, Clark. Don't worry about stepping on toes. The one thing that I know for a fact is that I don't know all the answers and so will gladly take input from anybody and everybody who offers it. One thing that I think that you might be missing, though, is the layering of APIs that we see in our XML development. I think it's possible to layer the API that you're trying to describe on top of mine. I'll try to explain where I'm coming from but try to keep in mind that I'm not the Microsoft zealot I'm about to sound like once I start explaining this. :-) (At work, I'm actually the resident anti-Microsoft, pro-Open Source zealot.) I modelled IYamlReader after Microsoft's XmlReader which is the lowest level of the .NET XML stack. XmlReader is an abstract class that defines a low-level interface to an XML document. In C++ STL terms, you can think of it as modelling the InputIterator concept. .NET ships with two implementations of XmlReader: XmlTextReader and XmlNodeReader. The next layer up in the stack is what Microsoft calls the XmlDocument. This is basically their version of the W3C DOM. There's a plethora of other classes in the XmlDocument family: XmlNode, XmlElement, XmlAttribute, XmlText, etc. Unlike XmlReader, though, XmlDocument and friends are not abstract. The reason for this is because XmlDocument defines a Load method that accepts an XmlReader as a parameter. By iterating over the "nodes" in the stream represented by the XmlReader, the XmlDocument can instantiate whatever XmlNode derivatives it needs to in order to accurately represent the XML document being read into memory. XmlTextReader parses a stream of characters. XmlNodeReader "parses" an XmlNode. So one XmlDocument can load itself by instantiating an XmlNodeReader to read an entirely different XmlDocument. The real power comes when you realize that you can implement XmlReader to read ANY data source. There's a XmlRegistryReader out there. There's also an XmlFileSystemReader, an XmlZipArchiveReader, etc. 
I actually implemented what I called an XmlReflectionReader. It "parsed" a .NET class (using reflection) and represented it as XML. By loading a DOM with my custom XmlReader, I was then able to transform it (using the .NET XslTransform class) into HTML. Voila! Instant documentation for any class I might come across.

We could do the same with YAML. How about a YamlDirectoryReader? That could be the basis for YAR.

The same layering occurs in Java. Every DOM implementation that I know of can populate itself from a sequence of SAX callbacks. The difference between Microsoft and SAX is that Microsoft's XmlReader is pull-based whereas SAX (and expat and libxml) are push-based. They're both equally valid approaches but the pull-based parsers are much easier to use (though harder to implement).

So what I'm trying to get down to is this: IYamlReader (and the associated YamlTextReader) are what I'd like to see as the bottom of the YAML stack. Since it's a forward-only, streaming interface, it's not only easy to implement but it's efficient. If a higher level wants to cache the nodes encountered while reading a YAML input stream, they're more than welcome to. Or they might just process it right then and thus keep the working set of the application down, if possible.

I imagine that this is how YAR could work. The contents of a YAR file could probably be serialized to the filesystem as the YAR file is being parsed. Random access doesn't seem necessary and so no DOM (YOM?) should be required.

Ugh, sorry for going on for so long. Do you think that this kind of model would suit what you had envisioned for YAML?

By the way, I've revised (but haven't had time to re-implement yet) my original low-level interface. Thinking about how YAR would be processed, I realized that it wouldn't be efficient to return scalar values back as strings or byte arrays (since a file could contain megs of text or megs of binary data).

    public interface IYamlReader
    {
        YamlNodeType NodeType { get; }
        int Depth { get; }
        string Key { get; }
        int Index { get; }
        int Anchor { get; }

        bool Read();
        void Skip();

        int ReadCharacter();
        int ReadCharacters(char[] buffer, int offset, int length);
        string ReadCharacters();

        int ReadByte();
        int ReadBytes(byte[] buffer, int offset, int length);
        byte[] ReadBytes();
    }

I typed this up this morning when I got to work. But then I saw your First and Next methods and that they might be very handy. But I think it would still be prudent to keep a plain Read method which, when it reaches the end of a map or scalar, continues on to the next node at a higher depth. This makes it possible to process an entire YAML document with a single loop:

    while (yamlReader.Read())
    {
        // do stuff with yamlReader...
    }

When processing XML documents using XmlReader, I often have an outer loop like so which then delegates to other methods with their own "inner" loops. The trick is making sure that those other methods stop Read()ing when they're supposed to (by watching out for EndTag nodes with the correct name). A ReadNext would have made that much easier!

By the way, the pretty printer that I wrote last night actually consists of just this one loop with a call to a simple YamlTextWriter. I defined an overloaded method on the IYamlWriter interface to accept an IYamlReader. The reader gets queried for the current node type, name, etc, and then outputs it.

Again, sorry for babbling. I can't wait to hear your thoughts.

Jason.

> -----Original Message-----
> From: yam...@li...
> [mailto:yam...@li...]On Behalf Of Clark C .
> Evans
> Sent: Tuesday, June 05, 2001 4:45 PM
> To: yam...@li...
> Subject: Re: [Yaml-core] updated yaml.h
>
>
> Jason,
>
> I don't want to step on toes... so let me know
> if I am... but, anyway, I spent some time
> thinking this through a bit more....
>
> ...
>
> First, what a simple "printer" would look like
> from the application programmer's perspective.
> Simply connect the dots, i.e., get the reader
> and writer interfaces and then connect them together.
> Large processing-pipelines can be built using
> something like this...
>
>     IYamlIterator reader = new YamlReader(inputstream);
>     IYamlVisitor writer = new YamlWriter(outputstream);
>     YamlConnect(writer, reader);
>
> Second, here is the new improved YAML interface
> proposal... this separates out the items
> common to the Iterator and Visitor and puts
> them into a single "Event" class...
>
> namespace YAML
> {
>     public interface IYamlEvent
>     {
>         string Key { get; }          /* if parent is a map */
>         int Index { get; }           /* if parent is a list */
>         string Anchor { get; }
>         YamlNodeType Type { get; }
>         IYamlNode Node { get; }      /* in-memory repr, if any */
>     }
>     public interface IYamlIterator : IYamlEvent
>     {
>         bool Next();
>         IYamlIterator First();
>         IYamlIterator Parent { get; }
>         string Read { get; }
>     }
>     public interface IYamlVisitor
>     {
>         IYamlVisitor Begin( IYamlEvent evt );
>         void Write( string ChunkOfValue );
>         void End( IYamlEvent evt );
>     }
>     public enum YamlNodeType
>     {
>         Scalar,
>         List,
>         Map,
>         Reference
>     }
>     //
>     // in-memory directed graph nodes...
>     //
>     public interface IYamlNode
>     {
>         YamlNodeType Type { get; }
>         string Value { get; }
>         IYamlNode Child(string key); /* maps only */
>         IYamlNode Child(int index);  /* lists only */
>         int Size { get; }            /* lists only */
>
>         IYamlIterator Children();
>         IYamlIterator Parents();
>         void VisitChildren(IYamlVisitor visitor);
>         void VisitParents(IYamlVisitor visitor);
>     }
> }
>
> Third, here is what the push to pull connect
> function would look like. This is a generic
> Iterator to Visitor converter which would be
> part of the standard YAML library.
>
> void YamlConnect(
>     IYamlVisitor writer,
>     IYamlIterator reader
> )
> {
>     do {
>         IYamlVisitor childWriter = writer.Begin(reader);
>         if (childWriter != null) {
>             switch (reader.Type) {
>                 case YamlNodeType.Map:
>                 case YamlNodeType.List:
>                     IYamlIterator childReader = reader.First();
>                     if (childReader != null)
>                         YamlConnect(childWriter, childReader);
>                     break;
>                 case YamlNodeType.Scalar:
>                     childWriter.Write(reader.Read);
>                     break;
>             }
>         }
>         writer.End(reader);
>     } while (reader.Next());
> }
>
>
> This may look complicated, because the "first"
> and "begin" function return subordinate processors.
> However, this gives greater flexibility since
> at each level of the processing hierarchy a
> subordinate Iterator or Visitor can hold
> state information.
>
> Best,
>
> Clark
>
> _______________________________________________
> Yaml-core mailing list
> Yam...@li...
> http://lists.sourceforge.net/lists/listinfo/yaml-core
>
|
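The connect function quoted in the message above can be tried out in a few lines. What follows is only a rough Python sketch of that idea, not the proposed library: `NodeIterator`, `RecordingVisitor`, and `yaml_connect` are hypothetical names, and nested Python dicts and lists stand in for a parsed YAML document. It assumes that a container never falls through to the scalar branch.

```python
from typing import Optional


class NodeIterator:
    """Pull side: walks the items of ONE container (hypothetical stand-in for IYamlIterator)."""

    def __init__(self, items):
        # items: list of (key_or_index, value) pairs belonging to a single map or list
        self._items = items
        self._pos = 0

    @property
    def key(self):
        return self._items[self._pos][0]

    @property
    def value(self):
        return self._items[self._pos][1]

    @property
    def type(self) -> str:
        v = self.value
        return "map" if isinstance(v, dict) else "list" if isinstance(v, list) else "scalar"

    def first(self) -> Optional["NodeIterator"]:
        """Iterator over the children of the current node, or None if there are none."""
        v = self.value
        if isinstance(v, dict):
            pairs = list(v.items())
        elif isinstance(v, list):
            pairs = list(enumerate(v))
        else:
            pairs = []
        return NodeIterator(pairs) if pairs else None

    def read(self) -> str:
        return str(self.value)

    def next(self) -> bool:
        self._pos += 1
        return self._pos < len(self._items)


class RecordingVisitor:
    """Push side: records begin/write/end calls (hypothetical stand-in for IYamlVisitor)."""

    def __init__(self, log=None):
        self.log = [] if log is None else log

    def begin(self, event):
        self.log.append(("begin", event.key, event.type))
        return RecordingVisitor(self.log)  # subordinate visitor shares the same log

    def write(self, chunk):
        self.log.append(("write", chunk))

    def end(self, event):
        self.log.append(("end", event.key))


def yaml_connect(writer, reader):
    """Drive a visitor from an iterator, one container level per call."""
    while True:
        child_writer = writer.begin(reader)
        if child_writer is not None:
            if reader.type in ("map", "list"):
                child_reader = reader.first()
                if child_reader is not None:
                    yaml_connect(child_writer, child_reader)
            else:  # scalar
                child_writer.write(reader.read())
        writer.end(reader)
        if not reader.next():
            break


doc = {"name": "yaml", "tags": ["a", "b"]}
visitor = RecordingVisitor()
yaml_connect(visitor, NodeIterator([(None, doc)]))
```

Because `begin` hands back a subordinate visitor and `first` a subordinate iterator, each level of the recursion could keep its own state, which is the flexibility the quoted message argues for.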
From: Clark C . E. <cc...@cl...> - 2001-06-06 04:01:36
|
On Tue, Jun 05, 2001 at 05:38:10PM -0700, Jason Diamond wrote:
| The next layer up in the stack is what Microsoft ...
| The same layering occurs in Java...
| So what I'm trying to get down to is this: IYamlReader (and
| the associated YamlTextReader) are what I'd like to see as
| the bottom of the YAML stack. Since it's a forward-only,
| streaming interface, it's not only easy to implement but
| it's efficient....
| Ugh, sorry for going on for so long. Do you think that this kind of model
| would suit what you had envisioned for YAML?...

Ok. There are three issues here. The first one is hierarchical iterator vs a hierarchy of iterators. The second one is all about layering. And the third one is about push vs pull interfaces and building a sequential processing pipeline.

Before I get started, let me preface by saying that I'm very well versed in the DOM/SAX/MSXML APIs and have put a good deal of thought (and prior implementations) into this. Also, thus far, my experience has shown two types of "stream" interfaces, push and pull. And Oren has pointed out their connection to the visitor and iterator design patterns respectively.

Issue #1: Hierarchical Iterator vs Hierarchy of Iterators
~~~~~~~~~

There are two approaches to representing a hierarchical stream. You can either use a hierarchy of iterators, where each iterator *only* covers a list of nodes for a given container (map/list). Or you can use a single iterator with some sort of "begin/end" or "depth" notion.
Please compare and contrast

    public interface IYamlHierarchicalIterator
    {
        YamlNodeType NodeType { get; }
        string Key { get; }
        int Read(...);

      * int Depth { get; }
      * bool Read();
      * void Skip();
    }

with

    public interface IYamlFlatIterator
    {
        YamlNodeType NodeType { get; }
        string Key { get; }
        int Read(...);

      * bool Next();
      * IYamlFlatIterator FirstChild();
      * IYamlFlatIterator ParentIterator();
    }

I strongly feel that the latter is much more "intuitive" and easy to read: since the iterator is flat, it only moves over a single container, whereas the former is hierarchical and the user must constantly watch for the "depth" to change.

In the latter, I can have *different* iterators over different sub-trees, giving the underlying implementation a great deal of flexibility. In the former, this flexibility would imply using the Envelope or Adapter pattern; thus it comes at a higher cost.

In the latter the user doesn't have to think about "skipping" nodes. If the user doesn't want them, they don't ask for them and just call Next() -- it is positive logic which is easy to understand.

One may say that the latter is harder to implement, perhaps, but not by much. The only "additional" complexity on the parser end is keeping the parser state on a stack. This will, actually, make the parser implementation easier to write... try it.

The speed difference is negligible. In fact, for common use cases where the user of the interface requires the stack... it is faster since the user need not try to set up their own structures and copy the keys...

The memory difference is not much. And given modern processors... negligible. The only thing that needs to be kept is a stack of (key, anchor, parent-pointer) tuples. This is trivial memory usage. Not even worth the mental effort in the big scheme of things. In fact... you probably have to keep this stack *anyway* for your parsing! Why not expose it!

The "Read" method and "FirstChild" are exclusive.
In the latter, the Read() method is defined as:

  (a) the characters if the node is a scalar,
  (b) the Read() of the first item if the node is a list,
  (c) the Read() of the map item having a key "=" if the node is a map.

This behavior is very clean and allows for "substitutability" as defined in earlier e-mails.

You mentioned that one advantage of the hierarchical iterator is that you can process a document in a single loop. True enough. However, I have yet to see a single case (with reasonable complexity) where a single loop works well.

In short... there is a difference. The hierarchy of iterators may seem less efficient, but this is negligible. And from a usability perspective... the hierarchy of iterators wins hands down...

Even if you don't buy the layering argument to follow, I hope you consider this. It isn't the "norm"... however, the norm isn't necessarily the 80/20 rule. The 80/20 rule says that you want to keep "some" part of the stream in memory, but only 20% of it (the ancestor stack).

Issue #2: Layering
~~~~~~~~~

I believe that you are introducing layering where it is not required. Certainly layering is often good to keep complexity in check, however, too much layering just adds additional "unneeded" interfaces, code, function calls, etc.

The Reader API should be a sequential interface to YAML data. Where this information comes from, via text file or visiting an object node in a tree or rummaging over native maps and lists in memory, should be largely irrelevant. By calling your interface "Reader", I think you get off on the wrong foot, and limit yourself only to the case of reading from a text stream. This is an unnecessary, and indeed harmful limitation. What is required is a pull based sequential access interface!

First, if this interface is too optimized for one medium, then every medium should have an interface. By this logic, we need at least three interfaces -- a TextReader, a NodeReader, and a NativeObjectReader. Ok. So I'm getting stupid here.
But the point is clear: if we have more than one interface, then we *multiply* the code; each thing we need will require a way to do it with that interface, or adapters will be required. We are already going to need two adapters, push to pull and pull to push, based on real-life concerns. Why make up extra ones?

Second, given the two interfaces compared above, the Hierarchical vs the Flat iterator, I bet you'd say that the Hierarchical is "lower level" than the Flat iterator. I argue that since I can implement *either* with each other that they are equivalent and on the same semantic level. Why? Fundamentally, they are both forward-only sequential access interfaces.

We should be looking for an interface that will work equally well over all types of data sources. In short, the Reader should be one concrete implementation of the Iterator interface.

Issue #3: Push vs Pull
~~~~~~~~~

I'm glad you understand the need for a pull interface. This is great. I hope you understand the need for a push interface as well, right? The printer (emitter) should be using the Visitor interface. Is this clear?

Anyway, the only real point to be made here was the "Event" class in the API that I was putting forward. Its sole purpose is to put all of those "common" things that both the Iterator and Visitor interface have into a single interface for re-use. It may look complicated, but it really helps when mixing the iterator and the visitor interfaces.

This Event class also provides a nice place to "stuff" away an optional node pointer... in case the iterator or visitor is looping over an in-memory structure with random access.

| I typed this up this morning when I got to work. But then
| I saw your First and Next methods and that they might be
| very handy. But I think it would still be prudent to keep
| a plain Read method which, when it reaches the end
| of a map or scalar, continues on to the next node at
| a higher depth.
| This makes it possible to process an
| entire YAML document with a single loop:
|
|     while (yamlReader.Read())
|     {
|         // do stuff with yamlReader...
|     }

I hope the above addresses this. From my experience, the above use case is minor.... and it can easily be handled by rather flat recursion.

    handle(IYamlIterator reader)
    {
        while(reader.Read())
        {
            // to recurse call handle(reader);
        }
    }

All in all, you'll find it is not much more code! In fact, you can *strip* all of the code having to deal with managing depth and let the program stack do it for you. The depth will always be pretty small, so it's no real big deal.

| When processing XML documents using XmlReader, I often have an outer loop
| like so which then delegates to other methods with their own "inner" loops.
| The trick is making sure that those other methods stop Read()ing when
| they're supposed to (by watching out for EndTag nodes with the correct
| name). A ReadNext would have made that much easier!

Yep! This is the *primary* use case... which motivates the "nested iterator".

| By the way, the pretty printer that I wrote last
| night actually consists of just this one loop with
| a call to a simple YamlTextWriter. I defined an
| overloaded method on the IYamlWriter interface to
| accept an IYamlReader. The reader gets queried for
| the current node type, name, etc, and then outputs it.

I'd like to hear about your IYamlWriter interface!

| Again, sorry for babbling. I can't wait to hear your thoughts.

No problem. I babble too! I hope my thoughts aren't too negative. I'm still "playing" a bit... as you can see. But, really I'd like, if at all possible to only have *two* very simple and very very usable interfaces exposed to users. I'd rather not have the "parser" and the "iterator" interface be separate.

Anyway... thank you *so* much for your feedback.

Best,

Clark

P.S. Oren and Brian, you are welcome to jump in here and argue for/against what I'm saying. Three or Four guts are better than two...

P.S.S.
Do I still get to keep the dictator hat, or must I give it up to the implementers now? (Although I *will* be implementing a C version... unless someone does it first and does it very well! *evil grin* )
|
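For reference, the three-case Read() rule from the message above, cases (a) to (c), is small enough to write down directly. This is only a hedged Python sketch over in-memory values: `read_value` is a hypothetical name, Python dicts and lists stand in for YAML maps and lists, and the `ValueError` for an empty list or a map without an "=" key is an assumption, since the thread does not say what should happen there.

```python
def read_value(node):
    """Return the text a node 'reads' as, following cases (a)-(c)."""
    if isinstance(node, dict):
        # (c) a map reads as the Read() of its item keyed "="
        if "=" not in node:
            raise ValueError('map has no "=" item to read')  # assumed behaviour
        return read_value(node["="])
    if isinstance(node, list):
        # (b) a list reads as the Read() of its first item
        if not node:
            raise ValueError("empty list has no first item")  # assumed behaviour
        return read_value(node[0])
    # (a) a scalar reads as its own characters
    return str(node)


plain = read_value("12")
as_list = read_value(["12", "13"])
as_map = read_value({"=": "12", "unit": "cm"})
```

All three calls yield the same text, which is the "substitutability" the message refers to: a consumer that only wants the scalar does not need to care whether the producer later wrapped it in a list or a map.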
From: Brian I. <briani@ActiveState.com> - 2001-06-06 04:54:01
|
"Clark C . Evans" wrote: > > P.S. Oren and Brian, you are welcome to jump in here > and argue for/against what I'm saying. Three or > Four guts are better than two... Sorry, but I'm pretty much staying offline until after my YAPC conference. I'll be back on June 18th. I'm not really any help in this low level stuff anyway. Just do something good! BTW, I'm happy with where we ended up syntax-wise. Is everything pretty much settled on that front? Cheers, Brian -- perl -le 'use Inline C=>q{SV*JAxH(char*x){return newSVpvf ("Just Another %s Hacker",x);}};print JAxH+Perl' |
From: Jason D. <ja...@in...> - 2001-06-06 06:44:02
|
> Ok. There are three issues here. The first one is hierarchical
> iterator vs a hierarchy of iterators. The second one is all
> about layering. And the third one is about push vs pull
> interfaces and building a sequential processing pipeline.

Thanks for breaking it down like this. And sorry if I came across as too haughty. That certainly wasn't my intention. I absolutely value your experience and insight and so I am extremely excited by this discussion!

> Issue #1: Hierarchical Iterator vs Hierarchy of Iterators
> ~~~~~~~~~

> public interface IYamlFlatIterator
> {
>     YamlNodeType NodeType { get; }
>     string Key { get; }
>     int Read(...);
>
>   * bool Next();
>   * IYamlFlatIterator FirstChild();
>   * IYamlFlatIterator ParentIterator();
> }

Given the following example code, how can you possibly implement this interface so that it's forward only over the YAML document?

    IYamlFlatIterator document = new YamlParser("foo.yaml");
    IYamlFlatIterator child1 = document.FirstChild();
    IYamlFlatIterator parent = child1.ParentIterator();
    IYamlFlatIterator child2 = parent.FirstChild();

Yes, it's true that each iterator can only move forward and then stop. But I obviously have random access over the tree as a whole. As such, the interface implies the existence of a tree. I was proposing a streaming interface.

In the above code, how could the implementation be expected to be anything but tree-based? If not, it would have to fseek back to the location of the first child and parse it a second time! Of course, this wouldn't even be possible if we were reading from a URL.

If your answer is that the parser built an in-memory tree when it parsed it the first time and that no re-parsing was necessary then you've successfully combined the two layers that I was trying to keep separate.

Parsing is always done with at least two layers. Lexical analysis and then syntactic analysis. Usually we add a third layer--semantic analysis--but that's application specific so we won't go there.
Lexical analysis (scanning, tokenizing, whatever) is always (to my knowledge) exposed through a forward-only sequence of tokens (not counting lookahead). Syntactic analysis (often referred to as just parsing) usually builds the tree so that it can be processed later. These are basically the two layers that I'm talking about.

What I'm trying to ensure is that we have the flexibility to build the kind of tree that we want to build without building some intermediary and sometimes unnecessary tree in the middle.

For example, deserializing an XML document into an object graph can be done without loading it into a DOM first. I've done this (I'm sure you have, too) many times with SAX/expat (push) and am now doing it with much more pleasure using XmlReader (pull). The same thing can be said for YAML. The object graph becomes our parse tree which we then process according to the semantics of our application. So to speak.

Another example is YAR. If we examine the sequence of YAML nodes as they appear in a YAR document, we can re-create a directory tree on disk without first constructing an in-memory YAML tree.

I see your Iterator interface as a slight variation of the W3C's DOM's Node interface. It contains the nextSibling, firstChild, and parentNode properties which I see as being analogous to your Next(), FirstChild(), and ParentIterator() methods. The only difference is that the state of the node is changing using your iterator whereas the DOM makes your application become the iterator.

I'm not saying that there's anything wrong with this. I agree with you that your flat iterator is a much more logical approach to iterating over tree structures. When I do use the DOM, I use next, first, and parent properties as they're not only more efficient than using NodeList but also more logical.

The problem that I have is that it mandates that the implementation use a tree structure whereas my reader or hierarchical iterator does not.
It can iterate over both a stream or a tree with the exact same interface.

> Issue #2: Layering
> ~~~~~~~~~
>
> I believe that you are introducing layering where
> it is not required. Certainly layering is often
> good to keep complexity in check, however, too
> much layering just adds additional "unneeded"
> interfaces, code, function calls, etc.

Did my tokenizer/parser analogy above make sense? You can't parse without tokenizing. You can't build a YAML tree without knowing what type of node you're currently pointing at in the middle of a character stream. We need both of these layers, regardless. If we expose the lower level interface, we give the developers the ability to choose how they process YAML data. Some data can be processed more efficiently as a stream and some as a tree. We can't make that decision for them.

> The Reader API should be a sequential interface
> to YAML data. Where this information comes from,
> via text file or visiting a object node in a
> tree or rummaging over native maps and lists
> in memory, should be largely irrelevant. By
> calling your interface "Reader", I think you
> get off on the wrong foot, and limit yourself
> only to the case of reading from a text stream.
> This is an unnecessary, and indeed harmful
> limitation. What is required is a pull based
> sequential access interface!

This is exactly what the Reader is. There's nothing in the interface that implies that it has to come from a character stream. Except the name of course. But that can easily be changed. I was just following Microsoft's lead.

For my own pull-based XML parser (in Java), I didn't use the word Reader since it's too heavily associated with character streams. I didn't use Iterator (which I wanted to) since I assumed experienced developers would assume it implements the Iterator interface. Naming is tricky.

To me, though, Iterator implies that you can move in only one direction: forward.
I would gladly change the interface name to use Iterator instead of Reader.

> First, if this interface is too optimized for
> one medium, then every medium should have an
> interface. By this logic, we need at least
> three interfaces -- a TextReader, a NodeReader,
> and a NativeObjectReader. Ok. So I'm getting
> stupid here. But the point is clear, if we have
> more than one interface, then we *multiply* the
> code, each thing we need will require a way to
> do it with that interface or adapters will be
> required. We are already going to need two
> adapters, push to pull and pull to push, based
> on real live concerns. Why make up extra ones?

I absolutely agree with this and would go so far as to say that if more than one interface was required to read or iterate a stream of YAML nodes of different types then the design is broken. But iterating a stream isn't as capable as navigating a tree and so requires a restricted interface.

The lack of a Parent property says nothing about the data source--just how you can access it. It could be a character stream. Or it could be an in-memory tree. Or it could come from some sort of OODB or any other source.

> Second, given the two interfaces compared above,
> the Hierarchical vs the Flat iterator, I bet you'd
> say that the Hierarchical is "lower level" than
> the Flat iterator. I argue that since I can
> implement *either* with each other that they
> are equivalent and on the same semantic level.
> Why? Fundamentally, they are both forward-only
> sequential access interfaces.

The ParentIterator() method effectively makes it a random access iterator when looking at the document as a whole. Am I missing something?

The hierarchical iterator is a lower level interface simply because it does not impose a higher level structure on the data source--not because it's more efficient or easier to implement.
> We should be looking for an interface that will
> work equally well over all types of data sources
> In short the Reader should be one concrete
> implementation of the Iterator interface.

I would love to agree but can't because a tree-based interface does not make it possible to efficiently implement a stream based data source (like a character stream--which will probably be the most oft used data source for YAML documents).

We need a minimum of two interfaces: stream and tree. Often times, the stream-based interface will be used to construct a tree-based data source which we can then use the tree-based interface to traverse. But sometimes we won't need to go up to that level where a tree is needed.

> Issue #3: Push vs Pull
> ~~~~~~~~~
>
> I'm glad you understand the need for a pull
> interface. This is great. I hope you understand
> the need for a push interface as well, right?
> The printer (emitter) should be using the Visitor
> interface. Is this clear?

Yes and I have no problems with it. But it operates at a higher level than I was focusing on. The Visitor pattern is definitely my favorite from the GoF (probably because that was the one that took me the longest to finally figure out.)

> Anyway, the only real point to be made here
> was the "Event" class in the API that I
> was putting forward. Its sole purpose
> is to put all of those "common" things
> that both the Iterator and Visitor interface
> have into a single interface for re-use.
> It may look complicated, but it really helps
> when mixing the iterator and the visitor
> interfaces.

I need to look at your Event class more as I don't quite grasp how it fits into things as of yet.

> | Again, sorry for babbling. I can't wait to hear your thoughts.
>
> No problem. I babble too! I hope my thoughts aren't too
> negative. I'm still "playing" a bit... as you can see.
> But, really I'd like, if at all possible to only have *two*
> very simple and very very usable interfaces exposed to
> users.
> I'd rather not have the "parser" and the "iterator"
> interface be separate.

I, too, only wish to see two simple interfaces exposed. The names that we're using are unfortunate, however, and it may be clouding the issue. I prefer to think of them as the stream-based interface and the tree-based interface.

If we wanted to give these interfaces names, Iterator strikes me as being forward only and appropriate for streams whereas something like Navigator might be more appropriate for an "iterator" over a tree. All it would take to turn an Iterator into a Navigator would be to extend it with a single Parent property--thus enabling random access to the whole tree.

> Anyway... thank you *so* much for your feedback.

No, thank you for allowing me to try to contribute.

Jason.
|
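The YAR argument in the message above (re-creating a directory tree from the node sequence without building an in-memory YAML tree) can be illustrated with a single forward pass. This is a hedged Python sketch only: `events` and `extract_paths` are hypothetical names, the `(depth, key, type, value)` tuple is an assumed shape for what a depth-based reader would report, and a generator over a Python dict stands in for a real text parser.

```python
def events(doc, key=None, depth=0):
    """Yield (depth, key, node_type, value) in document order (stand-in for a parser)."""
    if isinstance(doc, dict):
        yield depth, key, "map", None
        for k, v in doc.items():
            yield from events(v, k, depth + 1)
    elif isinstance(doc, list):
        yield depth, key, "list", None
        for i, v in enumerate(doc):
            yield from events(v, i, depth + 1)
    else:
        yield depth, key, "scalar", str(doc)


def extract_paths(event_stream):
    """Single loop over the stream; only the current ancestor path is kept."""
    path = []  # keys of the current node and its ancestors, root excluded
    out = []
    for depth, key, node_type, value in event_stream:
        # A node at depth d keeps d - 1 ancestor keys; anything deeper is finished.
        del path[max(depth - 1, 0):]
        if depth > 0:
            path.append(str(key))
        if node_type == "scalar":
            out.append(("/".join(path), value))
    return out


archive = {
    "src": {"main.c": "int main(){}", "util.h": "#pragma once"},
    "README": "hello",
}
files = extract_paths(events(archive))
```

The consumer never holds more than the list of ancestor keys, which is the same (key, ...) stack that the flat-iterator proposal would expose explicitly.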
From: Clark C . E. <cc...@cl...> - 2001-06-06 12:46:58
|
On Tue, Jun 05, 2001 at 11:43:57PM -0700, Jason Diamond wrote:
|
| > public interface IYamlFlatIterator
| > {
| >     YamlNodeType NodeType { get; }
| >     string Key { get; }
| >     int Read(...);
| >
| >   * bool Next();
| >   * IYamlFlatIterator FirstChild();
| >   * IYamlFlatIterator ParentIterator();
| > }
|
| Given the following example code, how can you possibly implement this interface
| so that it's forward only over the YAML document?
|
|     IYamlFlatIterator document = new YamlParser("foo.yaml");
|     IYamlFlatIterator child1 = document.FirstChild();
|     IYamlFlatIterator parent = child1.ParentIterator();
|     IYamlFlatIterator child2 = parent.FirstChild();

Ahh. Yes, there is an implicit assumption that FirstChild() can only be called once on a Map or List. If it is called on a Scalar, or a second time, then a FunctionSequenceError, NodeNotSequence, or ForwardOnlyException would be raised... Sorry for not making this clear. This is very similar to the fact that ReadCharacters() is called until it returns 0 (no more characters), right?

| Parsing is always done with at least two layers. Lexical analysis and then
| syntactic analysis. Usually we add a third layer--semantic analysis--but
| that's application specific so we won't go there.
|
| Lexical analysis (scanning, tokenizing, whatever) is always (to my
| knowledge) exposed through a forward-only sequence of tokens (not counting
| lookahead). Syntactic analysis (often referred to as just parsing) usually
| builds the tree so that it can be processed later. These are basically the
| two layers that I'm talking about.

Right. We are on the same page... this interface should not provide "random access". If you add this limitation, then the above is sequential access (plus the current ancestor stack).

| What I'm trying to ensure is that we have the flexibility to build the kind
| of tree that we want to build without building some intermediary and
| sometimes unnecessary tree in the middle.

Yep! We are on the same page here!
| Did my tokenizer/parser analogy above make sense? You can't parse without
| tokenizing. You can't build a YAML tree without knowing what type of node
| you're currently pointing at in the middle of a character stream. We need
| both of these layers, regardless. If we expose the lower level interface, we
| give the developers the ability to choose how they process YAML data. Some
| data can be processed more efficiently as a stream and some as a tree. We
| can't make that decision for them.

Once again, we agree. However, I'd like to have the interface "identical". Perhaps you could "allow" FirstChild() to be called more than once if the input source had random access. So... FirstChild() could be allowed to throw a ForwardOnly exception, but wouldn't be mandated to do so. Hmmm.

| To me, though, Iterator implies that you can move in only
| one direction: forward. I would gladly change the interface
| name to use Iterator instead of Reader.

Iterator has the same implication for me as well. ;)

|
| I absolutely agree with this and would go so far to say
| that if more than one interface was required to read or
| iterate a stream of YAML nodes of different types then
| the design is broken. But iterating a stream isn't as
| capable as navigating a tree and so requires a restricted
| interface.

Right.

| > Why? Fundamentally, they are both forward-only
| > sequential access interfaces.
|
| The ParentIterator() method effectively makes it a random access iterator
| when looking at the document as a whole. Am I missing something?

Yep. FirstChild() is only callable once... *smile*

| The hierarchical iterator is a lower level interface simply because it does
| not impose a higher level structure on the data source--not because it's
| more efficient or easier to implement.

The only thing which the Hierarchy of Iterators implies is that the stack of (Type, Index/Key, Anchor) is available. This is a rather minimal amount of information.
| We need a minimum of two interfaces: stream and tree.

Ok. First, YAML isn't a tree, it's a graph. Thus a "Parent" object may exist on a given node, but it would have to throw a "MultipleParent" exception if the node had two incoming arrows. The "Parent" property *only* makes sense in the context of a given iterator.

Second, by designing our Iterator well, we can merge both interfaces... so that the random access interface is an *extension* of the sequential access interface.

| > Issue #3: Push vs Pull
| > ~~~~~~~~~
| >
| > I'm glad you understand the need for a pull
| > interface. This is great. I hope you understand
| > the need for a push interface as well, right?
| > The printer (emitter) should be using the Visitor
| > interface. Is this clear?
|
| Yes and I have no problems with it. But it operates
| at a higher level than I was focusing on.

SAX implements the visitor pattern. It is at the *same* level that you are focusing on, only that it is a push interface instead of a pull interface. The difference between push or pull is who has the "while loop". In push, it is the producer, in pull it is the consumer.

| The Visitor pattern is definitely my favorite from
| the GoF

It's a great book, isn't it!

| I need to look at your Event class more as I don't quite
| grasp how it fits into things as of yet.

Yes!

| I, too, only wish to see two simple interfaces exposed.
| The names that we're using are unfortunate, however, and
| it may be clouding the issue. I prefer to think of them
| as the stream-based interface and the tree-based interface.

Understood. I'm trying to have a *single* interface that can be used to (forward-only) *iterate* over a random access structure as well as over an incoming text stream.

| If we wanted to give these interfaces names, Iterator strikes me as being
| forward only and appropriate for streams whereas something like Navigator
| might be more appropriate for an "iterator" over a tree.
All it would take | to turn an Iterator into a Navigator would be to extend it with a single | Parent property--thus enabling random access to the whole tree. The problem isn't the parent property, which only gives you access to the ancestor stack. The problem is that the FirstChild() method appears as if it can be called twice on the same sequence (map or list) node. Certainly the (Type, Key/Index, Anchor) tuple on the stack may take some memory... but not enough to be concerned about. And it certainly doesn't give random access! | No, thank you for allowing me to try to contribute. Well... a lot of implementers implement instead of humoring fellas like me. I implement too... but often after I've talked the subject to death. Kind Regards, ;) Clark |
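The push/pull distinction discussed in this message (who owns the "while loop") can be sketched in a few lines. This is only an illustration in Python, not part of the proposed API; `PullSource`, `Collector`, and `push_nodes` are hypothetical names invented for the example.

```python
class PullSource:
    """Pull side: the consumer asks for the next item when it wants one."""

    def __init__(self, items):
        self._items = list(items)
        self._pos = 0

    def next(self):
        """Return the next item, or None once the source is exhausted."""
        if self._pos >= len(self._items):
            return None
        item = self._items[self._pos]
        self._pos += 1
        return item


class Collector:
    """Push side: a minimal visitor that the producer calls into."""

    def __init__(self):
        self.seen = []

    def visit(self, item):
        self.seen.append(item)


def push_nodes(items, visitor):
    """Push: the producer owns the loop and drives the visitor."""
    for item in items:
        visitor.visit(item)


# Pull: the loop lives in the consumer.
source = PullSource(["a", "b", "c"])
pulled = []
while True:
    node = source.next()
    if node is None:
        break
    pulled.append(node)

# Push: the loop lives in the producer (inside push_nodes).
collector = Collector()
push_nodes(["a", "b", "c"], collector)
```

Both halves see the same nodes in the same order; the only difference is which side contains the loop.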
From: Clark C . E. <cc...@cl...> - 2001-06-06 12:59:40
|
On Tue, Jun 05, 2001 at 11:43:57PM -0700, Jason Diamond wrote:
| > Issue #1: Hierarchical Iterator vs Hierarchy of Iterators

Duh! I know what went wrong, it's not a "Hierarchy of Iterators", it is a "Stack of Iterators". Stupid me.

| > public interface IYamlFlatIterator
| > {
| >     YamlNodeType NodeType { get; }
| >     string Key { get; }
| >     int Read(...);
| >
| > * bool Next();
| > * IYamlFlatIterator FirstChild();
| > * IYamlFlatIterator ParentIterator();
| > }

A few rules go with this (that I left out).

1. You can't call FirstChild() on a map/list node more than once.
2. Once Next() is called on a parent iterator, all subordinate iterators are invalid.

Perhaps this is what was causing the confusion. By saying "Hierarchy of Iterators", you read "Tree of Iterators", aka Random Access. I'm very sorry. Thank you for putting up with my stupidness... Could you re-think with this in mind?

Kind Regards,

Clark |
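The two rules above can be made concrete with a small sketch. This is a Python illustration over plain dicts and lists, not the proposed C# interface; `NestedIterator`, `InvalidIterator`, and `_entries_of` are hypothetical names, and raising an exception on a second `first_child()` call is just one possible way to enforce rule 1.

```python
class InvalidIterator(Exception):
    """Raised when one of the two rules of this sketch is broken."""


def _entries_of(value):
    """Return (key, value) pairs for a map or list, or None for a scalar."""
    if isinstance(value, dict):
        return list(value.items())
    if isinstance(value, list):
        return list(enumerate(value))
    return None


class NestedIterator:
    def __init__(self, entries, parent=None):
        self._entries = entries
        self._pos = 0
        self._parent = parent
        # Remember where the parent stood when this iterator was created.
        self._parent_pos = parent._pos if parent is not None else None
        self._child_taken = False

    def _check(self):
        # Rule 2: once any ancestor has moved on, this iterator is invalid.
        if self._parent is not None:
            self._parent._check()
            if self._parent._pos != self._parent_pos:
                raise InvalidIterator("a parent iterator has moved on")

    def _current(self):
        self._check()
        if self._pos >= len(self._entries):
            raise InvalidIterator("iterator is exhausted")
        return self._entries[self._pos]

    @property
    def key(self):
        return self._current()[0]

    @property
    def value(self):
        return self._current()[1]

    def next(self):
        self._check()
        self._pos += 1
        self._child_taken = False
        return self._pos < len(self._entries)

    def first_child(self):
        # Rule 1: only one child iterator per map/list node.
        if self._child_taken:
            raise InvalidIterator("first_child() was already called here")
        children = _entries_of(self.value)
        if not children:
            return None
        self._child_taken = True
        return NestedIterator(children, parent=self)


doc = {"name": "yaml", "tags": ["a", "b"], "year": 2001}
top = NestedIterator(_entries_of(doc))
top.next()                  # now positioned on "tags"
tags = top.first_child()    # iterator over the list's items
first_tag = tags.value
```

Calling `top.first_child()` again at this point raises `InvalidIterator`, and so does touching `tags` after `top.next()` has been called.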
From: Clark C . E. <cc...@cl...> - 2001-06-06 04:19:25
|
I updated the interface proposal below with Jason's ReadCharacter() variants... I'm sure I'm totally botching C# by now... so forgive me. But this is a simpler way to express stuff than the C interface.

namespace YAML
{
  public function IYamlIterator Parser(input-stream);
  public function IYamlVisitor Printer(input-stream);

  public interface IYamlEvent
  {
    string Key { get; }          /* if parent is a map */
    int Index { get; }           /* if parent is a list */
    string Anchor { get; }
    YamlNodeType Type { get; }
    IYamlNode Node { get; }      /* optional in-memory repr */
  }

  public interface IYamlValueIterator
  {
    int ReadCharacter();
    int ReadCharacters(char[] buffer, int offset, int length);
    string ReadCharacters();
    int ReadByte();
    int ReadBytes(byte[] buffer, int offset, int length);
    byte[] ReadBytes();
  }

  public interface IYamlIterator : extends IYamlEvent, IYamlValueIterator
  {
    bool Next();
    IYamlIterator First();
    IYamlIterator Parent { get; }
  }

  public interface IYamlValueVisitor
  {
    int WriteCharacter();
    int WriteCharacters(char[] buffer, int offset, int length);
    int WriteCharacters(string);
    int WriteByte();
    int WriteBytes(byte[] buffer, int offset, int length);
    int WriteBytes(byte[]);
  }

  public interface IYamlVisitor : extends IYamlValueVisitor
  {
    IYamlVisitor Begin( IYamlEvent event );
    void End( IYamlEvent event);
  }

  public enum YamlNodeType { Scalar, List, Map, Reference }

  //
  // in-memory directed graph nodes...
  //
  public interface IYamlNode
  {
    YamlNodeType Type { get; }
    string Characters() { get; }
    byte[] Bytes() { get; }
    IYamlNode Child(string key);  /* maps only */
    IYamlNode Child(int index);   /* lists only */
    int Size() { get; }           /* lists only */
    IYamlIterator Children();
    IYamlIterator Parents();
    void VisitChildren(IYamlVisitor);
    void VisitParents(IYamlVisitor);
  }

  void YamlConnect( IYamlVisitor writer, IYamlIterator reader )
  {
    do
    {
      IYamlVisitor childWriter = writer->Begin(reader);
      if(childWriter)
      {
        switch(reader.Type)
        {
          case Map:
          case List:
            IYamlIterator childReader = reader->First();
            if(childReader)
              YamlConnect(childWriter,childReader);
            break;
          case Scalar:
            childWriter->WriteBytes(reader->ReadBytes());
            break;
        }
      }
      writer->End(reader);
    } while(reader.Next());
  }
}
|
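For readers who want to see the pull-to-push connection run, here is a rough Python rendering of the same idea as YamlConnect. It is a sketch under assumptions, not the proposed API: `Iterator`, `Printer`, and `connect` are hypothetical stand-ins, the data source is a plain dict/list structure, and the reader is assumed to start on a non-empty sequence.

```python
class Iterator:
    """Hypothetical pull iterator over (key, value) entries."""

    def __init__(self, entries):
        self._entries = entries
        self._pos = 0

    @property
    def key(self):
        return self._entries[self._pos][0]

    @property
    def value(self):
        return self._entries[self._pos][1]

    def next(self):
        self._pos += 1
        return self._pos < len(self._entries)

    def first(self):
        """Return an iterator over the children, or None if there are none."""
        value = self.value
        if isinstance(value, dict):
            children = list(value.items())
        elif isinstance(value, list):
            children = list(enumerate(value))
        else:
            return None
        return Iterator(children) if children else None


class Printer:
    """Hypothetical push visitor that records the events it receives."""

    def __init__(self, out=None, depth=0):
        self.out = out if out is not None else []
        self.depth = depth

    def begin(self, event):
        self.out.append("  " * self.depth + "begin " + str(event.key))
        return Printer(self.out, self.depth + 1)

    def write(self, text):
        self.out.append("  " * self.depth + "write " + text)

    def end(self, event):
        self.out.append("  " * self.depth + "end " + str(event.key))


def connect(writer, reader):
    """Drive a push visitor from a pull iterator (the do/while above)."""
    while True:
        child_writer = writer.begin(reader)
        if child_writer is not None:
            child_reader = reader.first()
            if child_reader is not None:
                connect(child_writer, child_reader)
            elif not isinstance(reader.value, (dict, list)):
                child_writer.write(str(reader.value))
        writer.end(reader)
        if not reader.next():
            break


printer = Printer()
connect(printer, Iterator([("root", {"a": "1", "b": ["x", "y"]})]))
```

After the call, `printer.out` holds one indented line per begin, write, and end event, in document order.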
From: Jason D. <ja...@in...> - 2001-06-06 08:14:23
|
> I updated the interface proposal below with > Jason's ReadCharacter() variants... I'm sure > I'm totally botching C# by now... so forgive > me. But this is a simpler way to express stuff > than the C interface. Your C# is pretty damn close. We can pretend it's Java if you like. ;-) Perhaps I'm being too simplistic but I think that the following two interfaces are all that's needed to traverse both stream and tree-based data sources: public interface IYamlIterator { YamlNodeType NodeType { get; } string Key { get; } int Anchor { get; } string ReadCharacters(); // ... byte[] ReadBytes(); // ... bool Move(); // MoveToNextNodeInDocument bool MoveNext(); // MoveToNextSiblingNode bool MoveIn(); // MoveToFirstChildNode } public interface IYamlNavigator : IYamlIterator { bool MoveOut(); // MoveToParentNode IYamlNavigator Clone(); } (I'm perfectly open to suggestions for better names...) We can always add more properties/methods to the navigator like ChildCount, MoveLast (to the previous sibling), etc, if we thought we needed it. I tried to reach a compromise between our two approaches. It's supposed to be both a hierarchical and flat iterator. But instead of returning a new iterator like FirstChild(), MoveIn() simply changes it's internal state to point to the first child if there is one. MoveNext() is analogous to your Next() method. Once it reaches the last element of a list or pair of a map it will continue to return false until Move() is called which moves it to the next node after the current node's parent. This is perfect for stream-based data sources. Note that calling MoveNext() instead of MoveIn() on a node with child nodes is like calling your Next() or my Skip(). It skips the children. But if you were on a node with children and called Move(), it would move to the first child. Was I able to accurately capture both our approaches? The derived interface, IYamlNavigator adds MoveOut(). 
This moves up to the current node's parent like your ParentIterator() but changes the internal state rather than returning a new Iterator or Navigator. The Clone() method can be used to clone the navigator so that you can have multiple navigators over the same tree. This is possible with tree-based data sources but not with stream-based ones. Note that I didn't add anything related to the Visitor pattern. I thought about adding a Visit method to Iterator since it could be possible for the iterator to call MoveIn and MoveNext on itself and invoke the Begin and End methods on the visitor as it did that but this didn't strike me as being code specific to a data source. I envision that these interfaces are implemented per data source but the visit code (which you implemented in your YamlConnect function) can apply to all types of data sources that expose the iterator interface (you don't have to be a tree to visit, no?) Given that, I thought that your YamlConnect function should live outside these interfaces. If we were defining abstract classes instead, I can see that we would implement the method inside iterator and let derivatives inherit it but I prefer to use interfaces and not use up the only inheritance slot in both Java and C#. What I really like about this approach is that it's possible for a data source to implement IYamlNavigator and since that derives from IYamlIterator, processors that only need sequential access through a document will be able to consume both stream and tree-based data sources without having a clue as to what it is they're really communicating with. Anyways, it's time for bed. I suspect that I'm still way off base but it's been fun trying! Jason. |
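The single-cursor design described in this message can be sketched over an in-memory structure as follows. This is a Python illustration with hypothetical names (`YamlIterator`, `YamlNavigator`, `_children`); a stream-backed implementation would have to realise the same methods without random access, and what `move()` leaves behind after it fails is an assumption of this sketch.

```python
def _children(value):
    """Return (key, value) pairs for a map or list, or [] for a scalar."""
    if isinstance(value, dict):
        return list(value.items())
    if isinstance(value, list):
        return list(enumerate(value))
    return []


class YamlIterator:
    """One cursor that changes its own state instead of returning new ones."""

    def __init__(self, root_entries):
        self._stack = [[list(root_entries), 0]]

    @property
    def key(self):
        entries, pos = self._stack[-1]
        return entries[pos][0]

    @property
    def value(self):
        entries, pos = self._stack[-1]
        return entries[pos][1]

    def move_next(self):
        """Move to the next sibling, skipping any children."""
        level = self._stack[-1]
        if level[1] + 1 >= len(level[0]):
            return False
        level[1] += 1
        return True

    def move_in(self):
        """Move to the first child, if there is one."""
        children = _children(self.value)
        if not children:
            return False
        self._stack.append([children, 0])
        return True

    def move(self):
        """Move to the next node in document order."""
        if self.move_in():
            return True
        while not self.move_next():
            if len(self._stack) == 1:
                return False      # end of document
            self._stack.pop()     # continue after the parent
        return True


class YamlNavigator(YamlIterator):
    """Random-access extension: adds move_out() and clone()."""

    def move_out(self):
        if len(self._stack) == 1:
            return False
        self._stack.pop()
        return True

    def clone(self):
        copy = YamlNavigator(())
        copy._stack = [list(level) for level in self._stack]
        return copy


doc = [("root", {"a": "1", "b": ["x", "y"], "c": "3"})]
cursor = YamlIterator(doc)
keys = [cursor.key]
while cursor.move():
    keys.append(cursor.key)
```

Because `YamlNavigator` derives from `YamlIterator`, code written against the iterator methods alone works unchanged when handed a navigator.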
From: Clark C . E. <cc...@cl...> - 2001-06-06 13:17:11
|
On Wed, Jun 06, 2001 at 01:14:18AM -0700, Jason Diamond wrote: | public interface IYamlIterator | { | YamlNodeType NodeType { get; } | | string Key { get; } | int Anchor { get; } | | string ReadCharacters(); // ... | byte[] ReadBytes(); // ... | | bool Move(); // MoveToNextNodeInDocument | bool MoveNext(); // MoveToNextSiblingNode | bool MoveIn(); // MoveToFirstChildNode | } | | public interface IYamlNavigator : IYamlIterator | { | bool MoveOut(); // MoveToParentNode | IYamlNavigator Clone(); | } Hmm. Interesting. | I tried to reach a compromise between our two approaches. It's supposed to | be both a hierarchical and flat iterator. But instead of returning a new | iterator like FirstChild(), MoveIn() simply changes it's internal state to | point to the first child if there is one. MoveNext() is analogous to your | Next() method. Once it reaches the last element of a list or pair of a map | it will continue to return false until Move() is called which moves it to | the next node after the current node's parent. This is perfect for | stream-based data sources. | | Note that calling MoveNext() instead of MoveIn() on a node with child nodes | is like calling your Next() or my Skip(). It skips the children. But if you | were on a node with children and called Move(), it would move to the first | child. | | Was I able to accurately capture both our approaches? This is certainly an interesting mix of the two interfaces. Let me mull this over some. I have some day-job work to accomplish, so I won't get to this till tonight. | The derived interface, IYamlNavigator adds MoveOut(). This moves up to the | current node's parent like your ParentIterator() but changes the internal | state rather than returning a new Iterator or Navigator. | | The Clone() method can be used to clone the navigator so that you can have | multiple navigators over the same tree. This is possible with tree-based | data sources but not with stream-based ones. Right. 
| Note that I didn't add anything related to the Visitor pattern... | I envision that these interfaces are implemented per data source | but the visit code (which you implemented in your YamlConnect | function) can apply to all types of data sources that expose the | iterator interface (you don't have to be a tree to visit, no?) | Given that, I thought that your YamlConnect function should live | outside these interfaces. Yep. Connect is a bad name though. Also, this is a pull->push converter. We will also need a push->pull converter, and this other converter is tougher to write and requires two threads. | What I really like about this approach is that it's possible for a data | source to implement IYamlNavigator and since that derives from | IYamlIterator, processors that only need sequential access through a | document will be able to consume both stream and tree-based data sources | without having a clue as to what it is they're really communicating | with. Right -- "one" interface. Thank you so much... Clark |
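The push->pull direction mentioned above can be sketched with a worker thread and a bounded queue. This is one possible arrangement in Python, not something specified in the thread; `push_to_pull` and `push_producer` are hypothetical names, and the sketch makes no attempt to propagate producer exceptions or to stop the producer if the consumer gives up early.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the pushed stream


def push_to_pull(produce):
    """Turn a push-style producer into a pull-style iterator.

    `produce` is any callable that takes a callback and invokes it once
    per event. It runs in a second thread, while the caller pulls events
    from the returned generator in its own thread.
    """
    events = queue.Queue(maxsize=1)  # producer waits for the consumer

    def run():
        try:
            produce(events.put)
        finally:
            events.put(_DONE)

    threading.Thread(target=run, daemon=True).start()
    while True:
        event = events.get()
        if event is _DONE:
            return
        yield event


def push_producer(callback):
    """A stand-in for a push parser: it owns the loop."""
    for event in ("begin map", "scalar a", "end map"):
        callback(event)


pulled_events = list(push_to_pull(push_producer))
```

The bounded queue keeps the producer at most one event ahead, so the consumer effectively regains control of the loop.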
From: Clark C . E. <cc...@cl...> - 2001-06-06 13:59:11
|
On Wed, Jun 06, 2001 at 09:17:31AM -0500, Clark C . Evans wrote:
| On Wed, Jun 06, 2001 at 01:14:18AM -0700, Jason Diamond wrote:
| | public interface IYamlIterator
| | {
| |     YamlNodeType NodeType { get; }
| |
| |     string Key { get; }
| |     int Anchor { get; }
| |
| |     string ReadCharacters(); // ...
| |     byte[] ReadBytes(); // ...
| |
| |     bool Move(); // MoveToNextNodeInDocument
| |     bool MoveNext(); // MoveToNextSiblingNode
| |     bool MoveIn(); // MoveToFirstChildNode
| | }
| |
| | public interface IYamlNavigator : IYamlIterator
| | {
| |     bool MoveOut(); // MoveToParentNode
| |     IYamlNavigator Clone();
| | }

Two suggested changes:

1. Remove the "Move" method.

   As I understand it... it seems to break the hierarchical context. Consider the case where you are three levels in and MoveNext() returns false. You call Move. Where are you? Level one or level two?

2. Re-define the behavior of MoveNext().

   When you reach the end of a list or map, MoveNext() returns false. In this state, the only method which can be called is MoveNext(); all other methods give a function sequence error.

   When MoveNext() is called a second time, its context is up one level, thus it may move the iterator to the next sibling, or, it could also return false.

   When MoveNext() has returned false as many times as MoveIn() was called, then the stream is complete.

The above may seem complicated, but it is what you need... basically, for each subordinate loop, you keep going till MoveNext() returns false, and then the user of the interface moves to an outer loop and continues.

...

With this change there are a few differences between this and the nested iterator stack proposal:

1. The iterator stack explicitly keeps the ancestor hierarchy. While this proposal does not... although there is nothing saying that an extension of this interface giving access to the ancestor stack couldn't be made.

2. Where the nested iterators have multiple objects, this has a single object.

3. Where the nested iterator is in an invalid state after Next() returns, this iterator is in a partially-invalid state after MoveNext(), hmmm...

4. This iterator doesn't have the problem of the user attempting to call FirstChild multiple times, where the nested iterator does (it must throw an exception).

Hmm. ok. I must really get to work.

Bye!

;) Clark |
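The redefined MoveNext() behaviour can be sketched as a small state machine. This is a Python illustration over dicts and lists with hypothetical names (`Cursor`, `SequenceError`, `walk`); in particular, counting the top-level sequence as a level of its own is an assumption of the sketch, not something the message specifies.

```python
class SequenceError(Exception):
    """Raised when a method other than move_next() is used at end of level."""


def _children(value):
    if isinstance(value, dict):
        return list(value.items())
    if isinstance(value, list):
        return list(enumerate(value))
    return []


class Cursor:
    def __init__(self, entries):
        self._stack = [[list(entries), 0]]
        self._at_end = False  # True after move_next() has returned False

    def _node(self):
        if self._at_end:
            raise SequenceError("only move_next() may be called here")
        entries, pos = self._stack[-1]
        return entries[pos]

    @property
    def key(self):
        return self._node()[0]

    def move_in(self):
        children = _children(self._node()[1])
        if not children:
            return False
        self._stack.append([children, 0])
        return True

    def move_next(self):
        if self._at_end:
            if len(self._stack) == 1:
                return False          # the stream is complete
            self._stack.pop()         # context moves up one level
            self._at_end = False
        level = self._stack[-1]
        if level[1] + 1 < len(level[0]):
            level[1] += 1
            return True
        self._at_end = True
        return False


def walk(cursor, seen, depth=0):
    """One loop per level: run until move_next() returns False, then return."""
    while True:
        seen.append((depth, cursor.key))
        if cursor.move_in():
            walk(cursor, seen, depth + 1)
        if not cursor.move_next():
            return


cursor = Cursor([("root", {"a": "1", "b": ["x", "y"], "c": "3"})])
seen = []
walk(cursor, seen)
```

Each nested call to `walk` is one of the "subordinate loops": it ends on a False from `move_next()`, and the enclosing loop's next `move_next()` call is the one that steps the context up a level.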
From: Clark C . E. <cc...@cl...> - 2001-06-06 14:17:21
|
On Wed, Jun 06, 2001 at 09:59:33AM -0500, Clark C . Evans wrote: | On Wed, Jun 06, 2001 at 09:17:31AM -0500, Clark C . Evans wrote: | | On Wed, Jun 06, 2001 at 01:14:18AM -0700, Jason Diamond wrote: | | | public interface IYamlIterator | | | { | | | YamlNodeType NodeType { get; } | | | | | | string Key { get; } | | | int Anchor { get; } | | | | | | string ReadCharacters(); // ... | | | byte[] ReadBytes(); // ... | | | | | | bool Move(); // MoveToNextNodeInDocument | | | bool MoveNext(); // MoveToNextSiblingNode | | | bool MoveIn(); // MoveToFirstChildNode | | | } | | | | | | public interface IYamlNavigator : IYamlIterator | | | { | | | bool MoveOut(); // MoveToParentNode | | | IYamlNavigator Clone(); | | | } | | Two suggested changes: | | 1. Remove the "Move" method. | | As I understand it... it seems to break | the hierarchical context. Consider the | case where you are three levels in and | MoveNext() returns false. You call Move. | Where are you? Level one or level two? | | 2. Re-define the behavior of MoveNext(). | | When reach the end of a list or map, | MoveNext() returns false. In this state, | the only method which can be called | is MoveNext(), all other methods give | a function sequence error. | | When MoveNext() is called a second time, | it's context is up one level, thus it | may move the iterator to the next sibling, | or, it could also return false. | | When MoveNext() has returned false as | many times as MoveIn() was called, then | the stream is complete. | | The above may seem complicated, but it is | what you need... basically, for each | subordinate loop, you keep going till | MoveNext() returns false, and then the | user of the interface moves to an outer | loop and continues. | | ... | | With this change there are a few differences | between this and the nested iterator stack | proposal: | | 1. The iterator stack explicitly keeps | the ancestor hierarchy. While this | proposal does not... 
although there | is nothing saying that an extension | of this interface giving access to | the ancestor stack couldn't be made. | | 2. Where the nested iterators have multiple | objects, this has a single object. | | 3. Where the nested iterator is invalid state | after Next() returns, this iterator is | in a partially-invalid state after MoveNext(), | hmmm... | | 4. This iterator doesn't have the problem | of the user attempting to call FirstChild | multiple times, where the nested iterator | does (it must throw an exception). | 5. The nested iterator lets the user read only the first child, and then continue with the next sibling of the parent. This iterator forces the user to visit all or none of a given set of children. | Hmm. ok. I must really get to work. | | Bye! | | ;) Clark |
From: Jason D. <ja...@in...> - 2001-06-06 17:03:53
|
> 5. The nested iterator lets the user > read only the first child, and then > continue with the next sibling of the > parent. This iterator forces the > user to visit all or none of a given > set of children. How about a MoveOver() method? (Just kidding. ;-) I actually prefer your method names: in Iterator: Next() FirstChild() in Navigator: Parent() I try to make my method names be verbs or verb phrases but it's hard to do that without becoming too verbose or too cute (in which case it's also usually too vague). If we wanted to be able to process only the first couple children, then I would probably add SkipSiblings() to Iterator. I think that our two approaches are converging into one. I'm not convinced yet that it's necessarily a good thing. I think that a system with only two sets of interfaces (three if we want to add modifying a document in memory) is certainly manageable. The one major advantage to the single iterator approach is that it's typesafe (in statically typed languages). You can never invoke Parent() or FirstChild() when you can't or shouldn't. We can also add a whole slew of other methods to Navigator that would only be appropriate for a random access iterator and clients wouldn't have to worry about invoking any of them because they would know that since they had a navigator and not just an iterator, they were free to invoke these methods without worry. With the multiple iterator approach, there's an implied state that you _sometimes_ have to watch out for and this can't be checked by the compiler. When iterating over a tree-based data source, it should be perfectly acceptable to call FirstChild() multiple times. Why would we want to stop them? But this isn't acceptable for a stream-based data source and the compiler can't help here. He can't tell the difference because both types of data sources implement the same interface. The one major advantage to the multiple iterator approach is that you have access to the parent stack. 
I don't see how this could be added to the single iterator without introducing the type safety issues I referred to above. But I also don't see why I need access to the stack. Maybe you can convince me otherwise. I keep referring to the document as a tree. I know that it represents a graph but it's serialized as a tree and probably should be traversed like one (to avoid infinite loops). The Navigator interface would probably need a Dereference() method which could be invoked on a reference node and return another Navigator positioned on the referenced node. Or it positions the current navigator itself on the referenced node. MoveToAnchor()? Jason. |
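The point about traversing a graph as a tree can be illustrated briefly. In this Python sketch, which uses hypothetical names (`Ref`, `walk`, `dereference`) and a plain dict as the anchor table, reference nodes are reported but never followed automatically, so a self-referencing document still terminates; following a reference is an explicit step.

```python
class Ref:
    """A reference node that points at an anchored node by name."""

    def __init__(self, anchor):
        self.anchor = anchor


def walk(node, visit):
    """Traverse as a tree: references are visited but not followed."""
    visit(node)
    if isinstance(node, dict):
        for child in node.values():
            walk(child, visit)
    elif isinstance(node, list):
        for child in node:
            walk(child, visit)


def dereference(ref, anchors):
    """Explicitly jump to the node a reference points at."""
    return anchors[ref.anchor]


# A document whose "self" entry refers back to the root (a cycle).
root = {"name": "loop"}
root["self"] = Ref("top")
anchors = {"top": root}

visited = []
walk(root, visited.append)
target = dereference(root["self"], anchors)
```

Here the walk visits three nodes (the map, the scalar, and the reference) and stops, while `dereference` returns the root map itself.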
From: Clark C . E. <cc...@cl...> - 2001-06-06 17:36:05
|
On Wed, Jun 06, 2001 at 10:03:48AM -0700, Jason Diamond wrote: | in Iterator: | Next() | FirstChild() That's cool. | If we wanted to be able to process only the first couple children, then I | would probably add SkipSiblings() to Iterator. Ok. | I think that our two approaches are converging into one. I'm not convinced | yet that it's necessarily a good thing. I think that a system with only two | sets of interfaces (three if we want to add modifying a document in memory) | is certainly manageable. We will have two, a push (Visitor) and a pull (Iterator) interface. ;) | | The one major advantage to the single iterator approach is that it's | typesafe (in statically typed languages). You can never invoke Parent() or | FirstChild() when you can't or shouldn't. Let's consider Parent gone from the nested proposal. And if you call FirstChild() a second time, then, just like ReadCharacters() it returns null. You can only call read once... correct? | We can also add a whole slew of other methods to Navigator that | would only be appropriate for a random access iterator and | clients wouldn't have to worry about invoking any of | them because they would know that since they had a navigator | and not just an iterator, they were free to invoke these methods | without worry. I like *extending* the iterator to add a random-access version. | With the multiple iterator approach, there's an implied state that you | _sometimes_ have to watch out for and this can't be checked by the | compiler. This is true in both cases... | When iterating over a tree-based data source, it should be perfectly | acceptable to call FirstChild() multiple times. Why would we want to stop | them? But this isn't acceptable for a stream-based data source and the | compiler can't help here. He can't tell the difference because both types of | data sources implement the same interface. I'm not so sure that we'd want multiple calls to FirstChild anyway...
I like the idea of keeping the Navigator interface separate (perhaps as an extension). | I keep referring to the document as a tree. I know that it represents a | graph but it's serialized as a tree and probably should be traversed like | one (to avoid infinite loops). The Navigator interface would probably need a | Dereference() method which could be invoked on a reference node and return | another Navigator positioned on the referenced node. Or it positions the | current navigator itself on the referenced node. MoveToAnchor()? Right. The Navigator method could have all kinds of "goodies". Clark |