Thread: [Yaml-core] schemas and pull parsers


I am starting to think about how I would write a validating YAML parser on top
of PyYaml.  There are two major issues here--how do you define the schemas, and
how do you implement the parser.  Just to get the discussion rolling, let me
throw out a really simple example:

SCHEMA:

allowed:
    - type: map
      keys:
        - type: string
          name: FirstName
        - type: string
          name: LastName
          validation: |
              (value in ['Evans', 'Smith', 'Jones'])
        - type: string
          seq: 1
          name: Children
          max: 10

CONFORMING YAML:

---
FirstName: John
LastName: Smith
Children:
  - Billy
  - Mary
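
To make the schema semantics a bit more concrete, here is a minimal sketch of
checking an already-loaded document against the key rules above. Everything in
it is an assumption for illustration: the names SCHEMA_KEYS and validate are
made up, "seq: 1" is read as "a sequence of the given type", "max" as the
maximum number of items, and every key is treated as optional. The
"validation" expression is run through eval() with "value" bound to the item,
which is only reasonable for trusted schemas.

```python
# Hypothetical sketch: check an already-loaded document against the key rules.
# The field semantics (seq, max, validation) are assumptions, not a spec.

SCHEMA_KEYS = [
    {"type": "string", "name": "FirstName"},
    {"type": "string", "name": "LastName",
     "validation": "(value in ['Evans', 'Smith', 'Jones'])"},
    {"type": "string", "seq": 1, "name": "Children", "max": 10},
]


def validate(document, keys):
    """Return a list of error strings; an empty list means it conforms."""
    if not isinstance(document, dict):
        return ["top-level node is not a map"]
    errors = []
    known = {rule["name"] for rule in keys}
    for name in document:
        if name not in known:
            errors.append("unexpected key: %s" % name)
    for rule in keys:
        name = rule["name"]
        if name not in document:
            continue  # assumption: every key is optional
        value = document[name]
        if rule.get("seq"):
            if not isinstance(value, list):
                errors.append("%s: expected a sequence" % name)
                continue
            if "max" in rule and len(value) > rule["max"]:
                errors.append("%s: more than %d items" % (name, rule["max"]))
            items = value
        else:
            items = [value]
        for item in items:
            if rule["type"] == "string" and not isinstance(item, str):
                errors.append("%s: %r is not a string" % (name, item))
            elif "validation" in rule:
                # eval() is only acceptable for trusted schemas.
                ok = eval(rule["validation"], {"__builtins__": {}},
                          {"value": item})
                if not ok:
                    errors.append("%s: %r failed validation" % (name, item))
    return errors


document = {"FirstName": "John", "LastName": "Smith",
            "Children": ["Billy", "Mary"]}
print(validate(document, SCHEMA_KEYS))  # prints []
```

This version works on a fully loaded document, so it only illustrates the
schema side of the problem, not the parser side.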

A validating parser would want to have a pull interface to the base parser.  (I
don't believe any existing YAML implementations provide such an interface.  Most
implementations load the whole document at once.)  On the particular data set
that I've given, the validating layer might make the following calls into a pull
parser:

>>> parser.isStartMap()
1
>>> parser.getKey()
'FirstName'
>>> parser.getScalar()
'John'
>>> parser.getKey()
'LastName'
>>> parser.getScalar()
'Smith'
>>> parser.getKey()
'Children'
>>> parser.isStartMap()
0
>>> parser.isStartList()
1
>>> parser.getScalar()
'Billy'
>>> parser.getScalar()
'Mary'
>>> parser.getScalar()
StopIteration exception thrown
>>> parser.getKey()
StopIteration exception thrown
>>> parser.getKey()
StopIteration exception thrown
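
For what it's worth, the calls above could be backed by something like the
following toy class. This is only a sketch under assumptions: ToyPullParser and
events_from are hypothetical names, the "parser" walks an in-memory structure
rather than YAML text, the isStart* tests consume the start event when they
succeed, and a get call that hits the end of the enclosing list or map consumes
that end marker before raising StopIteration.

```python
# Hypothetical sketch of a pull interface over a pre-built event stream.

def events_from(node):
    """Flatten nested dicts/lists/scalars into (kind, value) events."""
    if isinstance(node, dict):
        yield ("start_map", None)
        for key, value in node.items():
            yield ("key", key)
            yield from events_from(value)
        yield ("end_map", None)
    elif isinstance(node, list):
        yield ("start_list", None)
        for item in node:
            yield from events_from(item)
        yield ("end_list", None)
    else:
        yield ("scalar", node)


class ToyPullParser:
    def __init__(self, data):
        self._events = list(events_from(data))
        self._pos = 0

    def _peek(self):
        if self._pos >= len(self._events):
            return ("end_stream", None)
        return self._events[self._pos]

    def _accept(self, kind):
        """Consume the next event only if it has the given kind."""
        if self._peek()[0] == kind:
            self._pos += 1
            return 1
        return 0

    def _get(self, kind, end_kind):
        event_kind, value = self._peek()
        if event_kind == kind:
            self._pos += 1
            return value
        if event_kind == end_kind:
            self._pos += 1  # step past the end of the list or map
            raise StopIteration
        if event_kind == "end_stream":
            raise StopIteration
        raise ValueError("expected %s, found %s" % (kind, event_kind))

    def isStartMap(self):
        return self._accept("start_map")

    def isStartList(self):
        return self._accept("start_list")

    def getKey(self):
        return self._get("key", "end_map")

    def getScalar(self):
        return self._get("scalar", "end_list")


parser = ToyPullParser({"FirstName": "John", "LastName": "Smith",
                        "Children": ["Billy", "Mary"]})
print(parser.isStartMap(), parser.getKey(), parser.getScalar())
# prints: 1 FirstName John
```

A schema mismatch (say, asking for a scalar where a key comes next) raises
ValueError here, so that StopIteration stays reserved for "nothing more at
this level".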

Does this make sense so far?

Once you have implemented all the methods for a pull parser, you would probably
want to build the all-the-data-at-once yaml.load() method on top of them.  Since
the generic load call would not have a schema to drive the parsing process, you
would need some sort of node-testing method(s) so the loader can ask what kind
of node comes next.

>>> parser.getNodeType()
'map'
>>> parser.getNodeType()
'scalar'
>>> parser.getScalar()
'FirstName'
>>> parser.getNodeType()
'scalar'
>>> parser.getScalar()
'John'
# etc.
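
Here is a rough sketch of how a generic load could sit on top of those two
calls. The names ToyNodeParser, build and load are hypothetical, and two
details are my own assumptions because the trace above does not settle them:
getNodeType() steps into a map or list when it reports one, and it returns
'end' when the current container is finished. Keys come back as ordinary
scalars, as in the trace.

```python
# Hypothetical sketch: a schema-less load() driven by getNodeType().

class ToyNodeParser:
    """Toy pull parser over an in-memory structure (not real YAML text)."""

    def __init__(self, data):
        self._events = list(self._flatten(data))
        self._pos = 0

    @classmethod
    def _flatten(cls, node):
        if isinstance(node, dict):
            yield ("map", None)
            for key, value in node.items():
                yield ("scalar", key)  # keys are delivered as scalars
                yield from cls._flatten(value)
            yield ("end", None)
        elif isinstance(node, list):
            yield ("list", None)
            for item in node:
                yield from cls._flatten(item)
            yield ("end", None)
        else:
            yield ("scalar", node)

    def getNodeType(self):
        kind, _ = self._events[self._pos]
        if kind != "scalar":
            self._pos += 1  # entering or leaving a container
        return kind

    def getScalar(self):
        kind, value = self._events[self._pos]
        if kind != "scalar":
            raise ValueError("next node is not a scalar")
        self._pos += 1
        return value


def build(parser, kind):
    """Build the node whose type was just reported as `kind`."""
    if kind == "scalar":
        return parser.getScalar()
    if kind == "list":
        items = []
        while True:
            child = parser.getNodeType()
            if child == "end":
                return items
            items.append(build(parser, child))
    if kind == "map":
        result = {}
        while True:
            child = parser.getNodeType()
            if child == "end":
                return result
            key = build(parser, child)  # scalar keys only in this sketch
            result[key] = build(parser, parser.getNodeType())
    raise ValueError("unexpected node type: %s" % kind)


def load(parser):
    return build(parser, parser.getNodeType())


data = {"FirstName": "John", "LastName": "Smith",
        "Children": ["Billy", "Mary"]}
print(load(ToyNodeParser(data)) == data)  # prints True
```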

Thanks,

Steve
