At 04:36 PM 8/2/2001 -0400, Clark C . Evans wrote:
  1.  I think that a scalar node should be defined not as
      a string of zero or more characters, but rather as
      any object that can be _serialized_ as a tuple:

        (type name, sequence of zero or more characters)

Are you saying that in the info model the scalar does not have an associated type, while in the serialization it does?

In the model I posted, a scalar is a possible value of an optionally typed node.  I think the model already gives you that tuple.

Or are you saying that you'd rather not have the type factored out of the individual nodes, so the model looks like this:

A "node" is one of the following:
- A "map"
- A "list"
- A "scalar"
- A "null"

A "map" consists of the following:
- A "type name"
- An unordered set of zero or more "pairs"

A "list" consists of the following:
- A "type name"
- An ordered set of zero or more "nodes"

A "scalar" consists of the following:
- A "type name"
- A string of zero or more "characters"

A "null" consists of the following:
- A "type name"

This is informationally equivalent to what I posted, just a different way of portraying it.
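
A rough Python sketch of that equivalence follows. The class and function names are made up for illustration, and treating the type name as optional (None) is an assumption carried over from the "optionally typed node" wording. Each node kind carries its own type name, and factor()/unfactor() convert to and from a (type name, untyped value) pair without losing anything.

```python
from dataclasses import dataclass
from typing import Optional


# Portrayal with the type name stored on every node kind.
# All names here are hypothetical, chosen only for this sketch.
@dataclass(frozen=True)
class ScalarNode:
    type_name: Optional[str]
    chars: str            # string of zero or more characters


@dataclass(frozen=True)
class NullNode:
    type_name: Optional[str]


@dataclass(frozen=True)
class ListNode:
    type_name: Optional[str]
    nodes: tuple          # ordered, zero or more nodes


@dataclass(frozen=True)
class MapNode:
    type_name: Optional[str]
    pairs: frozenset      # unordered, zero or more (key node, value node) pairs


def factor(node):
    """Split a node into (type name, untyped value): the 'typed node' portrayal."""
    if isinstance(node, NullNode):
        return (node.type_name, ("null",))
    if isinstance(node, ScalarNode):
        return (node.type_name, ("scalar", node.chars))
    if isinstance(node, ListNode):
        return (node.type_name, ("list", node.nodes))
    if isinstance(node, MapNode):
        return (node.type_name, ("map", node.pairs))
    raise TypeError(f"not a node: {node!r}")


def unfactor(type_name, value):
    """Inverse of factor(): rebuild the node that carries its own type name."""
    kind = value[0]
    if kind == "null":
        return NullNode(type_name)
    if kind == "scalar":
        return ScalarNode(type_name, value[1])
    if kind == "list":
        return ListNode(type_name, value[1])
    if kind == "map":
        return MapNode(type_name, value[1])
    raise ValueError(f"unknown kind: {kind!r}")


doc = MapNode("invoice", frozenset({
    (ScalarNode(None, "id"), ScalarNode("int", "34843")),
    (ScalarNode(None, "tags"), ListNode(None, (ScalarNode(None, "a"), NullNode(None)))),
}))

# Going through the factored form and back yields an equal node.
round_tripped = unfactor(*factor(doc))
```

Since the round trip is lossless in both directions, choosing one portrayal over the other changes nothing about what information the model holds.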

  2.  Characters must be defined with a one to one match
      with Unicode.  In particular, 0x0 through 0xFFFFFFFF
      is too broad.  The character code point range
      should be limited to...

      [#x0-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]

Ah, but this is not the YAML text file.  YAML's eu8 production allows this.
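
For reference, the range quoted above can be written as a predicate on code points. This is only a sketch that restates the quoted proposal; the function names are hypothetical, and it makes no claim about what the eu8 production accepts.

```python
def in_proposed_range(code_point: int) -> bool:
    """True if code_point lies in the range quoted in point 2."""
    return (
        0x0 <= code_point <= 0xD7FF
        or 0xE000 <= code_point <= 0xFFFD
        or 0x10000 <= code_point <= 0x10FFFF
    )


def scalar_in_proposed_range(text: str) -> bool:
    """True if every character of a scalar's string passes the check."""
    return all(in_proposed_range(ord(ch)) for ch in text)
```

The gaps in the quoted range are the surrogate block (0xD800 through 0xDFFF), the two code points 0xFFFE and 0xFFFF, and everything above 0x10FFFF.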

  3.  We need also need a sequential information model
      which extends the core information model by...

      a) adding an anchor to each node
      b) introducing the reference node.

Is that part of the information model?  The information model describes what YAML means to represent, not how YAML represents it.

A purely sequential API may need to make this distinction, but I argue that this is an implementation detail.  If multiple nodes contain the same node, is it informational to know which node ended up getting the serialization and which ones got the reference?  The distinction didn't exist pre-serialization; it doesn't round-trip.
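
A toy Python sketch of the round-trip argument, under loud assumptions: the event tuples, serialize() and load() are invented for this illustration and are not a YAML API. Two parents hold the same child; whichever parent is visited first gets the anchor and the other gets the reference, yet both event streams load back to the same graph.

```python
def serialize(parents):
    """Flatten (name, children) pairs into events; the first visit of a child gets the anchor."""
    anchors = {}  # id(child) -> anchor number
    events = []
    for name, children in parents:
        for child in children:
            key = id(child)
            if key in anchors:
                events.append((name, "ref", anchors[key]))
            else:
                anchors[key] = len(anchors) + 1
                events.append((name, "anchor", anchors[key], child))
    return events


def load(events):
    """Rebuild {parent name: [children]}; a reference resolves to the anchored object."""
    by_anchor = {}
    graph = {}
    for event in events:
        name, kind, number = event[0], event[1], event[2]
        if kind == "anchor":
            by_anchor[number] = list(event[3])  # fresh copy, as a real loader would build
        graph.setdefault(name, []).append(by_anchor[number])
    return graph


shared = ["shared scalar"]  # a list, so that object identity is meaningful

# Same graph, two traversal orders: only the anchor/reference placement differs.
a_first = serialize([("a", [shared]), ("b", [shared])])
b_first = serialize([("b", [shared]), ("a", [shared])])

graph_1 = load(a_first)
graph_2 = load(b_first)
```

The two event streams differ, but the loaded graphs compare equal and each preserves the sharing. Which parent carried the anchor is not recoverable from the loaded graph, which is the sense in which it does not round-trip.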