On Fri, Aug 03, 2001 at 02:02:15PM +0200, Oren Ben-Kiki wrote:
| I'm going to play the "purist" here. What I want is for the *core* info
| model to be a *tree* of map, list, null and text scalar.
| Instead, we'll ensure the YAML APIs will allow for a *graph* whose nodes
| include "native data object" as a valid node. Note that this node may be a
| non-leaf node, references and cycles are allowed, etc.
Ok. So we are going to make two information models then,
one which is layered over the other?
My primary concern:
I don't want a short-hand mechanism where what appears
to be a scalar in the file is actually a map when loaded
into memory. In other words, I don't want:
    tag: !class value
| How is it possible to reconcile these two opposing statements? The answer is
| (star wars music) "Trust the Color, Luke!". The color idiom is built on two
| concepts. One, the ability to put on "colored glasses" and see only keys of
| a certain "color". Two, the ability to easily and efficiently construct an
| application by layering different modules. These two reinforce each other.
| (Think of each "color" as a regexp pattern. If the key matches the pattern,
| it has this color. In XML, for example, a color would typically be
| 'some-namespace-prefix:.*' - let's not delve any deeper there :-).
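To check my understanding of the "colored glasses" idea, here is a minimal sketch in Python. It assumes a color is simply a regular expression applied to map keys; the function names and the sample patterns are hypothetical and are not part of any YAML API.

```python
import re

def colored_view(mapping, color):
    """Return only the entries whose key matches the color pattern."""
    pattern = re.compile(color)
    return {k: v for k, v in mapping.items() if pattern.fullmatch(k)}

def uncolored_view(mapping, colors):
    """Return the entries whose key matches none of the given colors."""
    patterns = [re.compile(c) for c in colors]
    return {
        k: v
        for k, v in mapping.items()
        if not any(p.fullmatch(k) for p in patterns)
    }

# Hypothetical colors: single indicator characters, and one namespace prefix.
INDICATOR = r"[!#&]"
NAMESPACE = r"xsl:.*"

node = {"!": "com.clarkevans.DateRange", "to": "2001-3-23", "xsl:template": "x"}
```

With these definitions, `colored_view(node, INDICATOR)` sees only the `!` entry, while `uncolored_view(node, [INDICATOR, NAMESPACE])` sees only `to`.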
A layered mechanism for type determination of
scalar values based on regular expression matching.
This mechanism plus the base information model
gives a "typed" model.
Coloring only applies to scalars, and thus only
a scalar can be typed with this mechanism?
A mechanism where a map or list can be treated
as a scalar. This provides for forward-compatible
changes by allowing programs expecting a scalar
to be given a map/list.
Note that this mechanism depends upon the
knowledge of what the application program is
expecting, and thus it is a "pull" property and
is not supported directly via language bindings.
| The core YAML layer (parser/printer) is as trivial as you can get - just
| handling a tree of map/list/string.
| A different layer would handle mapping from this core data model to native
| objects. This layer could use specially colored keys to control this
| mapping: which native type to use, how to parse the value, how to identify
| references, etc. It could also use patterns on the values ("this looks like
| an integer"). It could use schema information ('delivery' is expected to be
| of type 'date'). It could use some combination of the above.
| Since this layer is not part of the core, I don't care much. So if Clark's
| application needs the concept of reference-by-key, Joe's needs
| reference-by-path, and Brian's needs reference-by-anchor, there's no problem
| involved. We can all do it without messing up our simple, clean YAML core.
| In fact, it may be possible to mix and match their implementations in one
| application, as long as they are well behaved.
Ok. I think I got it. We could have a type layer which
parses the scalar
    key: !java.lang.Date 2001-3-23
as a Date object for 23 Mar 2001. To give a map a class
name we could, at this layer, have the convention of
using the "!" key. Therefore...
    ! : com.clarkevans.DateRange
    to : 2001-3-23
Which would be a map, having the given class. In both of these
cases, the YAML is "stupid", but the higher layer that knows
about objects can take hold.
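A rough sketch of such a type layer, in Python, assuming the core hands over a plain tree of dict/list/str. The registry, the function names, and the choice to return a (class name, fields) pair for a typed map are all my own assumptions for illustration, not a proposed API.

```python
import datetime
import re

# A scalar of the form "!type value", as the "stupid" core would see it.
SCALAR_TAG = re.compile(r"!(\S+)\s+(.*)")

def parse_date(text):
    """Parse 'YYYY-M-D' into a date object."""
    year, month, day = (int(part) for part in text.split("-"))
    return datetime.date(year, month, day)

# Hypothetical registry mapping type names to scalar constructors.
SCALAR_TYPES = {"java.lang.Date": parse_date}

def apply_types(node):
    """Walk the core tree and build typed values where a type is named."""
    if isinstance(node, str):
        match = SCALAR_TAG.fullmatch(node)
        if match and match.group(1) in SCALAR_TYPES:
            return SCALAR_TYPES[match.group(1)](match.group(2))
        return node  # untyped or unknown type: leave the scalar alone
    if isinstance(node, list):
        return [apply_types(item) for item in node]
    if isinstance(node, dict):
        fields = {k: apply_types(v) for k, v in node.items() if k != "!"}
        if "!" in node:
            return (node["!"], fields)  # class name plus its typed fields
        return fields
    return node
```

Here `apply_types({"key": "!java.lang.Date 2001-3-23"})` yields a map holding a date object, and a map carrying a `!` key comes back paired with its class name. The core tree itself is never modified.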
Similarly, at this layer, Brian's types can be handled...
| - The YAML APIs should be defined so that they will work for a graph
| containing native data structures. This means storing all the state of a
| visitor/iterator in the visiting/iteration context as opposed to within the
| nodes themselves. It should be possible to apply a YAML visitor/iterator to
| and even in Java (with more effort). In C++ etc. we'll have to require the
| native data structure to cooperate somewhat - implement some interface etc.
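As I read the point about keeping visitor state in the context rather than in the nodes, it amounts to something like the following Python sketch. The `visit` function and its callback are hypothetical; the only assumption is that containers are native dicts and lists, possibly shared or cyclic.

```python
def visit(root, on_node):
    """Call on_node once per reachable node of a graph of dict/list/scalars.

    All traversal state (the 'seen' set and the work stack) lives here in
    the visiting context; nothing is written into the nodes themselves, so
    native structures with shared references or cycles can be walked as-is.
    """
    seen = set()    # identities of containers already visited
    stack = [root]  # nodes still to be visited
    while stack:
        node = stack.pop()
        if isinstance(node, (dict, list)):
            if id(node) in seen:
                continue  # shared or cyclic reference: do not revisit
            seen.add(id(node))
        on_node(node)
        if isinstance(node, dict):
            stack.extend(node.values())
        elif isinstance(node, list):
            stack.extend(node)

# A small cyclic graph: 'a' refers to itself and appears twice in 'b'.
a = {"name": "a"}
a["self"] = a
b = [a, a, "x"]

collected = []
visit(b, collected.append)
```

The walk terminates despite the cycle, and each container is reported once even though `a` is reachable by several paths.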
Ok. Thus... a coloring layer for references
would intercept scalar values having & and *
via regular expression matching and re-writing.
    val: &001 23.45
would add the "anchor" to the mix, and produce
a de-normalized view:
Ok? But this, once again, is done in a higher
layer. So the base YAML would see the above
as it is... complete with the & and *.
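A minimal Python sketch of how I picture that reference layer, assuming the core delivers scalars with the & and * still in place. The function names are hypothetical, the `ref: *001` entry is my own assumed counterpart to the `val` line, and only scalar anchors are handled.

```python
import re

ANCHOR = re.compile(r"&(\S+)\s+(.*)")  # "&001 23.45" -> name "001", value "23.45"
ALIAS = re.compile(r"\*(\S+)")         # "*001"       -> name "001"

def collect_anchors(node, anchors):
    """First pass: record every anchored scalar value by its anchor name."""
    if isinstance(node, str):
        match = ANCHOR.fullmatch(node)
        if match:
            anchors[match.group(1)] = match.group(2)
    elif isinstance(node, dict):
        for value in node.values():
            collect_anchors(value, anchors)
    elif isinstance(node, list):
        for item in node:
            collect_anchors(item, anchors)

def resolve(node, anchors):
    """Second pass: strip anchors and replace aliases with anchored values."""
    if isinstance(node, str):
        match = ANCHOR.fullmatch(node)
        if match:
            return match.group(2)
        match = ALIAS.fullmatch(node)
        if match:
            return anchors[match.group(1)]  # KeyError if the anchor is unknown
        return node
    if isinstance(node, dict):
        return {k: resolve(v, anchors) for k, v in node.items()}
    if isinstance(node, list):
        return [resolve(item, anchors) for item in node]
    return node

def denormalize(tree):
    """Return a de-normalized copy of the tree; the input is left untouched."""
    anchors = {}
    collect_anchors(tree, anchors)
    return resolve(tree, anchors)
```

Given the core tree `{"val": "&001 23.45", "ref": "*001"}`, `denormalize` returns a view in which both `val` and `ref` hold `"23.45"`, while the base YAML layer still sees the original strings.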
| - We need to define the color pattern for the mapping keys. We can define
| just one such pattern (today: single character indicator keys), or we can
| delve into the issue of how to solve the general problem of managing
| globally unique application specific colors (or ids in general).
| - We need to have a way to make YAML files readable even when colored keys
| are used. At minimum, we have to ensure they are readable when the mapping
| layer's colored keys are used. This is too basic a use case to leave it to
| the full map syntax. We may also choose to try to provide a general
| mechanism for more readable application specific colored keys.
I'm not sure you can limit coloring to any particular notion
of "keys" as if the colors may be stripped... I'm skeptical
that this would work. Take the reference example below.
Stripping the colors would leave "ref" as a blank string
and this is clearly not acceptable.
| The first issue (API) is technically difficult. However it is certainly
| possible to solve. Also, it has no bearing whatsoever on the core YAML
| format spec (it would affect the core API spec).
| My proposal: Let's table it for a while. This doesn't affect Brian or anyone
| working on the high-level load/save API.
I'm not sure that we can just ignore this issue
as I'm not convinced that layering can work without
making N YAMLs (one for each possible combination
of layers). And N YAMLs is bad.
| The second issue (Color patterns scheme) is politically difficult - or at
| least the general problem is.
| Historically, XML has chosen a horribly complex way to do it
| (document-defined mapping from prefixes to URIs). Java has chosen a simple
| but verbose way to do it (reverse DNS strings); a shorthand mechanism
| ('import ...') battles verbosity. IANA has suggested a simple, terse but
| centralized way to do it (central registry of universal color patterns).
| My proposal: Let's keep the "single indicator character" pattern reserved
| for this. Define a set of such keys in DATA-1.0: '!' for type, '#' for
| comment, and '&' for anchor; reserve some for future use; and reserve some
| for application-specific use.
Ok. I'll respond to the rest under a separate cover.