Brian Ingerson wrote:
> I can live without class names for Scalars and Arrays, but Hashes are a
> must. Most perl objects map into Hashes.
> Even Array and Scalar ones can
> be serialized into Hashes.
Sounds interesting. How?
> Hashes work for all Python objects and
In C++ (or Java) you'd want to de-serialize map/list/scalar as standard
library hash/vector/string; this makes it difficult to provide an extra
"class" property to list/scalar.
> Classes will be in. I think we should allow these
> permutations since the syntax seems to allow them.
> We will need a disclaimer stating that not every
> YAML system will be able to preserve the round-trip
> information for some combinations, namely:
> a) Having & and * on the same line.
> b) Using # with @ or with a scalar
I'd rather say - "NO strict YAML 1.0 system" instead of "Not every YAML
system". These constructs simply aren't YAML 1.0, period. Otherwise, people
will be very surprised when they write:
x: #float 1.2
y: #float 3.4
And notice the YAML pretty printer strips away the '#float's.
Let's think a sec about *why* we want to add classes. It seems they are
intended to allow a de-serialization into application-specific native data
types (that is, something other than Hash/Vector/String).
I think that 99% of the time, the application specific data type is an
object class (like "point", above). In which case it is represented in YAML
as a map, with a key for each data member. It follows that the data type for
each key is already determined by the map's class; e.g., class "point"
expects float coordinate values. So there's no need to explicitly declare
the type of each value.
Can anyone come up with a use case where it makes sense to assign a class to
something which isn't a data member of an object, and isn't an object by
itself? The only thing I can think of is top-level keys - when the whole
YAML file is de-serialized into a single object and there's no top-level
element to declare the type in. I think that declaring a top-level element
is good form in such a case, and that it isn't onerous to require that.
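To make the argument above concrete, here is a rough Python sketch (not a proposal for any real YAML API). It assumes the parser has already produced plain maps and strings, and that the top-level element names the class. The names CLASSES, member_types and load_object are made up for illustration.

```python
class Point:
    # The class itself knows the type of each data member, so the
    # serialized data never has to say "#float" on 'x' or 'y'.
    member_types = {"x": float, "y": float}

    def __init__(self, x, y):
        self.x = x
        self.y = y


# Hypothetical registry mapping a top-level element name to a class.
CLASSES = {"point": Point}


def load_object(doc):
    """Build a native object from a one-key map {class_name: members}."""
    (name, members), = doc.items()
    cls = CLASSES[name]
    # Convert each member using the type the class expects.
    kwargs = {key: convert(members[key])
              for key, convert in cls.member_types.items()}
    return cls(**kwargs)


# What a parser might hand us for a file with a single top-level 'point'.
p = load_object({"point": {"x": "1.2", "y": "3.4"}})
```

The only class declaration needed is the top-level element; everything below it is typed by the class.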
In the remaining 1% of the cases, there is a workaround. Use the good old
color idiom - that's exactly the sort of problem we invented it for, after
all. So, taking my "# :" syntax, using the "default value" concept Clark put
in YAML for this explicit purpose (:-), and spicing it with using '=' as the
key for the "default value" (pretty intuitive), you get:
OK, that's more verbose, but remember we are talking about 1% of the cases.
Besides, we should eat our own dog food - either we believe in the color
idiom, or we don't.
A word on the color idiom: This is a concept which was raised a while back
in SML-DEV, in relation to schema evolution, meta-data vs. data, etc. Take
the simple 'point' example:
That's a simple 'object' with two data members, and you can write YAML code
handling it. Then, one day, you decide you want to add an accuracy
designation to each. Or maybe you round-trip data from XML people, and you
want to round-trip the information of whether 'x' and 'y' were attributes or
sub-elements. At any rate, you want to add meta-data to 'x' and 'y'.
The color idiom says: "color" the 'x' and 'y' elements with sub-elements
containing this meta data. Take the original value and place it in a
sub-element as well (this would be the "default" sub-element Clark talked
about; if you are stumped for a name for it, just use '=' by convention).
The point is that using YAML's "default" rules, old code will continue to
work unbroken on this new data structure. The value of 'x' is still '1.2'.
If you aren't interested in accuracy, you just ignore it.
have to think of this a bit). In Perl, for example, you'll have to define a
conversion to scalar context which retrieves the default value (this is
possible, right? It is a bit beyond my Perl mastery).
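For what it's worth, here is how the same trick might look in Python, as a sketch only. It assumes the colored element arrives as a map with the original value under '='. The Colored class, the color() helper and the accuracy figures are all invented for the example.

```python
class Colored(float):
    """A float that also carries its 'color' (the extra sub-elements)."""

    def __new__(cls, value, **meta):
        obj = super().__new__(cls, value)
        obj.meta = meta  # meta-data rides along, ignored by old code
        return obj


def color(node):
    """Turn a map with a '=' default into a Colored value; pass others through."""
    if isinstance(node, dict) and "=" in node:
        meta = {k: v for k, v in node.items() if k != "="}
        return Colored(node["="], **meta)
    return node


def old_code(point):
    # Written before anyone thought of accuracy; knows only plain numbers.
    return point["x"] + point["y"]


old_point = {"x": 1.2, "y": 3.4}
new_point = {
    "x": color({"=": 1.2, "accuracy": 0.01}),
    "y": color({"=": 3.4, "accuracy": 0.05}),
}

same_result = old_code(new_point) == old_code(old_point)
x_accuracy = new_point["x"].meta["accuracy"]
```

Old code keeps seeing 1.2 for 'x', while newer code can look at .meta when it cares about accuracy.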
It turns out the "color idiom" solves a great many problems - schema
evolution, layering of processing modules, round-tripping to other syntax
forms, versioning, you name it. So Clark and I are rather fond of it :-)