[Yaml-core] type family -> conversion operator?

I just spent some time on the phone with Brian.  Oren is right, 
it definitely is coming down to perspective.  To start, let's note
that Brian's "loader" has two stages:

  1. Generic loading, where the graph is constructed into
     native Hashtable, Array, and Scalar objects (the kind)
     plus a family/format pair.

  2. Binding, where the application provides a bunch of converters 
     identified by the family.  The converters process their node
     and return a replacement.  In this stage you can think of 
     the tuple (kind,family,format,content) being converted into
     an opaque application- or language-specific object with an
     identity and a type.
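
The two stages above can be sketched roughly as follows.  This is only
an illustration, not Brian's actual loader: the Node class, the
CONVERTERS table, the "hex" format name, and the tag:example.org URI
are all hypothetical names made up for the example.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict

# Stage 1 output: a generic node carrying (kind, family, format, content).
@dataclass
class Node:
    kind: str      # "scalar", "sequence", or "mapping"
    family: str    # URI that selects a converter ("" if none)
    format: str    # hypothetical format name, e.g. "hex"
    content: Any   # str for scalars, list of Node for sequences,
                   # list of (Node, Node) pairs for mappings

# A converter receives the whole tuple and returns a replacement object.
Converter = Callable[[str, str, str, Any], Any]

INT_URI = "tag:example.org,2002:int"  # hypothetical family URI

def convert_int(kind: str, family: str, format: str, content: Any) -> int:
    # How format is interpreted is entirely up to the converter.
    return int(content, 16 if format == "hex" else 10)

CONVERTERS: Dict[str, Converter] = {INT_URI: convert_int}

def bind(node: Node) -> Any:
    """Stage 2: replace each generic node with a native object."""
    if node.kind == "sequence":
        content = [bind(child) for child in node.content]
    elif node.kind == "mapping":
        content = {bind(k): bind(v) for k, v in node.content}
    else:
        content = node.content
    converter = CONVERTERS.get(node.family)
    if converter is None:
        return content  # no converter registered: keep the generic value
    return converter(node.kind, node.family, node.format, content)

root = Node("mapping", "", "", [
    (Node("scalar", "", "", "count"), Node("scalar", INT_URI, "hex", "1f")),
])
native = bind(root)
```

Note that the bound result is a plain dict holding a plain int; the
family URI was consumed while choosing the converter and is not stored
anywhere in the native object.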

A few observations:

- In the Native context (and even in the Filter context) the
  family URI is _gone_.  In the Filter context there is probably
  a way to obtain the family URI algorithmically, but the family 
  URI itself is just not there.

- There is nothing stopping the Binding process from deriving its
  final type based upon kind, family, format, or content.   And it
  may be unwise to think of it in other terms.

So.  I think calling the URI a "family" or "type family" or 
implying somehow that it is a type or represents a type is
problematic.  Really, that URI is used by the loader to find
an appropriate conversion function which binds the generic
YAML data into a native representation.   

Given the reality of the implementation, it's quite clear to 
me that the URI is an identifier for a *conversion operation*
and not an identifier for a given type or type family.  Then a 
few things fall into place:

1.  It's clear that the URI is not part of the Native model,
    as the conversion operator is not the "type" of the node.

2.  How the conversion operator deals with the factors of
    format, context, and kind is its business.

How does this sound?

Best,

Clark

P.S.  Here is some brainstorming I did to arrive here which
you may all find useful...   The following are contexts which
may or may not be equivalent to what we currently call model.

Syntax: Information found in a YAML serialization.
Parser: Information provided by an event-based parser API.
Loader: Information used to build language/application bindings.
Filter: Information that must be preserved for round-tripping.
Native: Information stored natively by a non-roundtrip application.
Tool  : Information used by the YAML toolset (YPATH, YSCHEMA, etc.)

The Parser context is based on the Syntax, minus information regarding
style, comments, and lots of other syntax-level details.  The Loader 
context is based on the Parser, minus information such as key ordering, 
alias positioning, and other details associated with sequential-access
limitations.  The Filter context is based on the Loader, minus information
which is not required to round-trip, such as format.  The Native context
may contain any information in the Loader context.  The Tool context is
identical to the Loader context.
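
One way to picture this layering is as sets of information, each
derived from the previous one by subtraction.  The sketch below encodes
only the paragraph above; the field names are labels I made up, and the
sets are not exhaustive (the "lots of other details" are left out).  It
also does not capture the separate point from the main message that the
family URI is gone in the Filter and Native contexts.

```python
# Each context is modeled as the set of information items it carries.
# Field names are illustrative labels, not terms from any spec.
SYNTAX = frozenset({
    "style", "comments",            # syntax-level details
    "key_order", "alias_position",  # sequential-access details
    "format", "kind", "family", "content",
})

PARSER = SYNTAX - {"style", "comments"}
LOADER = PARSER - {"key_order", "alias_position"}
FILTER = LOADER - {"format"}
TOOL = LOADER  # stated to be identical to the Loader context

def is_valid_native(fields) -> bool:
    """Native may keep any subset of what the Loader context has."""
    return set(fields) <= LOADER

# Each context is a strict subset of the one it is based on.
chain_is_strict = FILTER < LOADER < PARSER < SYNTAX
```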

Ok.  The "graph" model is a mix of the Loader, Native, and Filter contexts
which is why we are having a difficult time talking about it.  Here
are alternative statements about what the "graph" model is:

   1.  It defines what information found in the serialization
       may be used to construct a native object and thus excludes
       information such as the style and key ordering since these
       elements are not found in the graph model  (aka the Loader). 

   2.  It defines what information must be preserved by the 
       application for round-tripping data, and thus excludes
       information such as the format (aka Filter).

   3.  It provides an abstraction of the application's native data
       as it relates to YAML (aka Native).

   4.  It describes the interface or abstractions which YAML tools
       operate upon, such as YPATH, etc.  (aka Tools)

These four contexts sometimes contradict each other.  For instance,
if your loader does not resolve leaf nodes, and passes that job on to
the application, then the Loader context must contain the "format".
However, since format should not be round-tripped, format is not in
the graph model.

The question on the table is where does "kind" exist?

Parser:  "My events exude kind, there is no way around it".
Loader:  "I need kind so that I may unpack the events."
Filter:  "I must round-trip the kind".
Native:  "Kind for me is wrapped up with the data type."
Tool:    "My API depends on kind for its operation".

So, it appears that "kind" must be available in every context
up to but not including the Native.   Format is somewhat similar:

Parser:  "I pass on the format, to me it is an opaque thingy".
Loader:  "I need the format to unpack leaf nodes".
Filter:  "Ahh, format, not needed."
Native:  "This is something for the loader to worry about..."
Tool:    "I need format to dispatch which regex to use or
          I need the loader to normalize my strings for me."

So, it seems format and kind are very similar.  Note that
the "transfer vector" from the parser to the loader consists
of family, kind, format, and content; where content is either
characters or a sequence/mapping of children per the kind.
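
To make the comparison concrete, here is a small tabulation of the two
lists of quotes above.  The True/False values are my reading of each
quote (for example, I count the Parser as carrying format because it
passes it on, and Native as not needing kind separately because it is
folded into the data type), so treat the table as an assumption.

```python
# Which contexts need "kind" and "format", read off the quotes above.
NEEDS = {
    "Parser": {"kind": True,  "format": True},
    "Loader": {"kind": True,  "format": True},
    "Filter": {"kind": True,  "format": False},
    "Native": {"kind": False, "format": False},
    "Tool":   {"kind": True,  "format": True},
}

def contexts_needing(field: str) -> list:
    """Contexts in which the given field must be available."""
    return [ctx for ctx, needs in NEEDS.items() if needs[field]]

def differing(a: str, b: str) -> list:
    """Contexts where the two fields are treated differently."""
    return [ctx for ctx, needs in NEEDS.items() if needs[a] != needs[b]]

kind_contexts = contexts_needing("kind")
format_contexts = contexts_needing("format")
mismatch = differing("kind", "format")
```

Under this reading the two fields differ only in the Filter context,
where kind must be round-tripped and format need not be.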

Hmm