Re: [Yaml-core] updated yaml.h

On Tue, Jun 05, 2001 at 09:23:29AM -0700, Jason Diamond wrote:
| I found some issues with some of the productions and have some other
| comments in general but I wanted to post the API that came out of last
| night's session before I headed off to work.

Great.  Thank you so much for proposing an alternative.
I'd like to hear your issues with the productions... it came
from my head (after discussion) to paper... without review.
So I'm sure there are problems!  I'd love to hear them.

| namespace YAML
| {
|   public interface IYamlReader
|   {
|     bool Read();
|     YamlNodeType NodeType { get; }
|     int Level { get; }
|     string Name { get; }
|     int Index { get; }
|     string Value { get; }
|   }
|   public enum YamlNodeType
|   {
|     Scalar, List, Map, Reference
|   }
| }

Ok.  This is understandable.  I specifically like the idea
of having the "node" attached to the iterator instead of
through the Read() function.  With this insight...

I've modified the proposed "C" API by moving the current
node pointer, which was returned from the "first" and "next"
functions, into the iterator structure.  Also, I've moved
the parent pointer from the node structure (where it didn't
make sense, since YAML is a graph) to the iterator structure
(where it does make sense due to nested iterators).  So...
with the insight gained from your proposal... here is a
modified "C" API....  much simpler!  Thanks! ;)

typedef struct yaml_node     yaml_node_t;
typedef struct yaml_iterator yaml_iterator_t;

struct yaml_node {
  enum { YAML_LIST, YAML_MAP, YAML_REF, YAML_SCALAR } type;
  enum { YAML_AUTO, YAML_SIMPLE, YAML_QUOTED, YAML_MULTLINE,
         YAML_BLOCK  } style;
  const yaml_char_t *key;     /* key if node has parent map  */
  const yaml_char_t *anchor;  /* anchor used for referencing */
};

typedef yaml_iterator_t* (*yaml_first_t)( const yaml_iterator_t *self );
typedef bool_t           (*yaml_next_t) ( const yaml_iterator_t *self );
typedef size_t           (*yaml_read_t) ( const yaml_iterator_t *self,
                                          void * ptr, const size_t size );

struct yaml_iterator {
  yaml_next_t      next;     /* function to ask for the next node */
  yaml_first_t     first;    /* function to ask for first child   */
  yaml_read_t      read;     /* function to handle scalar content */
  yaml_node_t     *curr;     /* the current node                  */
  yaml_iterator_t *parent;   /* the parent iterator               */
};

Specific points include:

  (a) A node is separated from the iterator so that the
      interface may be directly used over an in-memory structure.
      The node contains the node type, the "style" used to
      serialize (if a scalar), the key (if used in a map), 
      and the anchor (if provided).

  (b) The next() method of the iterator advances the 
      "curr" node to the next in its sequence (a map or list).
      At the end of a map or list, it returns false, and
      sets the curr pointer to NULL.  

  (c) There is a first() method to get the first child
      of a map or list.  This returns a subordinate iterator,
      which will have the "curr" value filled in with the first
      item in the list or map.  If the map or list is empty, 
      then a NULL iterator is returned.  Calling "next()"
      on a list or map node invalidates any subordinate 
      iterators and makes all children unreachable.  Thus,
      by not calling "first()", the children of a map or list
      can be skipped if desired.

  (d) There is a read() method to read in the scalar value
      of a given node.  Note that the read method is operable
      for a map or a list that has not been iterated, as well 
      as for a scalar.  For a list, read() calls read() on the 
      list's first child.  For a map, it calls read() on 
      the map entry having the key of "=".  On a map or list,
      first() and read() are mutually exclusive options.  Since
      this is a "C" API, the amount of data recovered by the read 
      must be specified by the caller.  Thus "read" behaves like 
      most unix-style "C" read functions, returning the number 
      of bytes moved from the input stream into the buffer.

  (e) An in-memory version of the node and iterator structure
      could be implemented fairly easily by adding additional
      methods and/or data values, such as a way to create an 
      iterator to visit the children of a map or list node!
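
To make (b), (c) and (e) a bit more concrete, here is a rough
sketch of what an in-memory version might look like.  None of
this is in yaml.h: mem_node_t, mem_next, mem_first, mem_root
and count_nodes are names I made up, char and bool stand in
for yaml_char_t and bool_t, the "style" field and read() are
left out, and "self" is non-const because next() has to
update "curr".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-ins for types the post leaves to yaml.h (assumptions). */
typedef char yaml_char_t;
typedef bool bool_t;

typedef struct yaml_node     yaml_node_t;
typedef struct yaml_iterator yaml_iterator_t;

struct yaml_node {
  enum { YAML_LIST, YAML_MAP, YAML_REF, YAML_SCALAR } type;
  const yaml_char_t *key;     /* key if node has parent map  */
  const yaml_char_t *anchor;  /* anchor used for referencing */
};

/* Hypothetical in-memory node: the public node plus tree links. */
typedef struct mem_node {
  yaml_node_t      base;     /* kept first so the casts below are valid */
  struct mem_node *child;    /* first child of a map or list            */
  struct mem_node *sibling;  /* next entry in the same map or list      */
} mem_node_t;

struct yaml_iterator {
  bool_t           (*next) (yaml_iterator_t *self);
  yaml_iterator_t *(*first)(yaml_iterator_t *self);
  yaml_node_t      *curr;    /* the current node    */
  yaml_iterator_t  *parent;  /* the parent iterator */
};

/* (b): advance curr; at the end return false and set curr to NULL. */
static bool_t mem_next(yaml_iterator_t *self) {
  assert(self != NULL);
  if (self->curr == NULL) return false;
  mem_node_t *n = ((mem_node_t *)self->curr)->sibling;
  self->curr = n ? &n->base : NULL;
  return self->curr != NULL;
}

/* (c): subordinate iterator on the first child, or NULL if empty. */
static yaml_iterator_t *mem_first(yaml_iterator_t *self) {
  assert(self != NULL && self->curr != NULL);
  mem_node_t *c = ((mem_node_t *)self->curr)->child;
  if (c == NULL) return NULL;
  yaml_iterator_t *it = malloc(sizeof *it);
  if (it == NULL) return NULL;
  it->next   = mem_next;
  it->first  = mem_first;
  it->curr   = &c->base;
  it->parent = self;
  return it;
}

/* Root iterator positioned on a single top-level node. */
static yaml_iterator_t mem_root(mem_node_t *root) {
  yaml_iterator_t it = { mem_next, mem_first, &root->base, NULL };
  return it;
}

/* Depth-first walk: counts it->curr, its later siblings, and all
   of their descendants.  Leaves it->curr set to NULL. */
static int count_nodes(yaml_iterator_t *it) {
  assert(it != NULL && it->curr != NULL);
  int n = 0;
  do {
    n++;
    if (it->curr->type == YAML_LIST || it->curr->type == YAML_MAP) {
      yaml_iterator_t *kids = it->first(it);
      if (kids != NULL) { n += count_nodes(kids); free(kids); }
    }
  } while (it->next(it));
  return n;
}
```

An application that does not care about the children of some
map or list would simply call next() without ever calling
first(), which is the skipping behaviour described in (c).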

Thus, overall, this requires a bit more work from the
implementer of the parser:

  (1) A stack of iterators (and subsequent nodes) must
      be maintained by the parser.

  (2) A smart read() must be implemented to comply with
      the "substitutability principle".
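
As a rough idea of what such a "smart" read() could look like
over an in-memory tree, here is a minimal sketch.  It is only
an illustration under assumptions: mem_node_t, mem_cursor_t,
resolve_scalar and mem_read are invented names, the cursor
replaces the iterator to keep the example short, plain char
stands in for yaml_char_t, and references are not followed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { YAML_LIST, YAML_MAP, YAML_REF, YAML_SCALAR } yaml_type_t;

/* Hypothetical in-memory node (not part of the proposed yaml.h). */
typedef struct mem_node {
  yaml_type_t      type;
  const char      *key;      /* key if the parent is a map       */
  const char      *text;     /* scalar content, NULL otherwise   */
  struct mem_node *child;    /* first child of a map or list     */
  struct mem_node *sibling;  /* next entry in the same container */
} mem_node_t;

/* Hypothetical read cursor: the node being read plus a byte offset. */
typedef struct {
  const mem_node_t *node;
  size_t            offset;
} mem_cursor_t;

/* Find the scalar that stands in for a node:
   scalar -> itself, list -> first child, map -> entry keyed "=". */
static const mem_node_t *resolve_scalar(const mem_node_t *n) {
  while (n != NULL && n->type != YAML_SCALAR) {
    if (n->type == YAML_LIST) {
      n = n->child;
    } else if (n->type == YAML_MAP) {
      const mem_node_t *e = n->child;
      while (e != NULL && (e->key == NULL || strcmp(e->key, "=") != 0))
        e = e->sibling;
      n = e;
    } else {
      n = NULL;  /* references are not resolved in this sketch */
    }
  }
  return n;
}

/* Unix-style read: copy up to size bytes into ptr and return the
   number of bytes copied; returns 0 once the content is exhausted. */
static size_t mem_read(mem_cursor_t *cur, void *ptr, size_t size) {
  assert(cur != NULL && ptr != NULL);
  const mem_node_t *s = resolve_scalar(cur->node);
  if (s == NULL || s->text == NULL) return 0;
  size_t len = strlen(s->text);
  if (cur->offset >= len) return 0;
  size_t n = len - cur->offset;
  if (n > size) n = size;
  memcpy(ptr, s->text + cur->offset, n);
  cur->offset += n;
  return n;
}
```

The caller loops until mem_read() returns 0, exactly as with a
unix read(), and does not need to know whether the node was a
scalar, a list, or a map with an "=" entry.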

Although it may seem more complicated, this reduces
the burden on the application developer:

  (3) Child nodes can be skipped easily if not needed.
  (4) Substitutability is easy.. if not totally grokkable.
  (5) When processing a node, the entire ancestor stack
      is available (holding its content).
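
To illustrate (5), here is a small sketch of how an
application might use the parent chain.  The structures are
trimmed to just the fields needed, and iter_depth and
iter_path are hypothetical helpers of mine, not part of the
proposed API; list items (which have no key) are shown as "-".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Trimmed-down versions of the proposed structures (assumption:
   only the fields this example needs, with char for yaml_char_t). */
typedef struct yaml_node {
  const char *key;               /* key if node has parent map */
} yaml_node_t;

typedef struct yaml_iterator {
  yaml_node_t          *curr;    /* the current node    */
  struct yaml_iterator *parent;  /* the parent iterator */
} yaml_iterator_t;

/* Number of ancestors above this iterator (0 for the root). */
static int iter_depth(const yaml_iterator_t *it) {
  assert(it != NULL);
  int d = 0;
  for (it = it->parent; it != NULL; it = it->parent) d++;
  return d;
}

/* Write "/key/key/..." from the root down to it->curr into buf.
   The root iterator itself is the empty path.  Returns false if
   buf is too small to hold the whole path. */
static bool iter_path(const yaml_iterator_t *it, char *buf, size_t size) {
  assert(it != NULL && buf != NULL && size > 0);
  if (it->parent == NULL) { buf[0] = '\0'; return true; }
  if (!iter_path(it->parent, buf, size)) return false;
  size_t used = strlen(buf);
  const char *k = (it->curr != NULL && it->curr->key != NULL)
                    ? it->curr->key : "-";
  int n = snprintf(buf + used, size - used, "/%s", k);
  return n >= 0 && (size_t)n < size - used;
}
```

Because every subordinate iterator keeps its parent, a handler
deep inside the document can report where it is without the
application maintaining its own stack.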

Best,

Clark