Thread: [Yaml-core] Summary of the IRC session


We've agreed to get rid of #TAB. That was easy.

As for timestamps, we got dragged into the whole problem of implicit types,
whether they can be extended, and so on. This is heavy stuff and debate was
very lively :-)

I raised a proposal that was tentatively accepted as "worth pursuing". The
core of it is to accept that type family is optional (i.e., a node may have
*no* type family, only a *kind*). It is the loader's duty to convert a
generic node - which may or may not have an associated type family, and/or a
format - into some native data structure (and the dumper's job to do the
reverse).
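
A minimal sketch of what such a loader might look like, purely as an illustration. The `GenericNode` class, the `load` function, the "int" type family name, and the "hex" format name are all hypothetical and are not taken from any spec or implementation. A node always has a kind; type family and format may be absent, and the loader decides what native value to produce:

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class GenericNode:
    kind: str                          # "scalar", "sequence", or "mapping"
    value: Any                         # str, list of nodes, or list of (key, value) node pairs
    type_family: Optional[str] = None  # optional under the proposal
    format: Optional[str] = None       # optional hint on how the value is written


def load(node: GenericNode) -> Any:
    """Convert a generic node into a native Python value."""
    if node.kind == "sequence":
        return [load(child) for child in node.value]
    if node.kind == "mapping":
        return {load(key): load(val) for key, val in node.value}
    # Scalars: the (hypothetical) "int" family is mapped to a native int,
    # with the format selecting the base.
    if node.type_family == "int":
        base = 16 if node.format == "hex" else 10
        return int(node.value, base)
    # No type family (or one this loader does not know): keep the string.
    return node.value


doc = GenericNode("mapping", [
    (GenericNode("scalar", "port"), GenericNode("scalar", "1F", "int", "hex")),
    (GenericNode("scalar", "host"), GenericNode("scalar", "example.org")),
])
native = load(doc)
```

Here the "host" node has only a kind, so this particular loader leaves it as a plain string; a different loader would be free to choose another native representation. A dumper would perform the reverse mapping.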

This is almost, but not quite, completely unlike our current "implicit
typing".

I'll need to write this down "properly" - what are the exact effects on the
various data models, processing, consequences for generic tools, the schema
language, and so on (most of these issues were discussed a bit in the IRC
session, but not all the way through). This will take me some time so the
earliest I'll be able to post this would be Sunday.

I've already started thinking about formalizing this, and here are some very
preliminary notions:

- Maybe the native model shouldn't be defined in terms of type family at
all; instead maybe it should have the concept of a "native data type", a
"native value", and a "kind". "Type family" and "Format" would only exist in
the generic model, with the "Viewer" responsible for the mapping. (The
Viewer is used by the Loader and Dumper - take a look at the data models
diagrams).

- In this view type family and format are merely instructions to the Viewer
(OK, Loader) on how to map the generic node to a native one (i.e., it is a
"transfer method" - what a coincidence :-).

- Would containers benefit from format? It seems to me they very well might
(admittedly rarely), and given the above view, forbidding format for
containers is an arbitrary and needless exception. It would be simpler to
allow them.

- I'm impressed by the fact that this is almost identical to Perl's type
system - and we arrived at it independently. Either Larry Wall was extremely
lucky, he had a working crystal ball that told him this type system would be
good for Yaml, or this approach is "right" in some deep way (for scripting
languages) and he "merely" arrived at it as an "inevitable" result. Of
course this made our life easier because we had it in front of us, while he
more-or-less invented it from scratch (AFAIK).

I don't want to start a language war here or anything, and Parrot is
supposed to run Python and Ruby programs as well anyway... I'm just rather
surprised by this result. If you had asked me a year ago, my bet
would have been that we'd end up being more "traditional" and Perl would be
the "odd man out". In fact when I first encountered "bless" I thought it was
a horrible hack; now I want to bless Larry for getting it right.

Either way, YAML makes a *perfect* fit for Parrot now. Way to go!

- As for timestamps... I think we had better leave them out of the core
spec. The use cases we have today (logging etc.) don't require timestamp as
a type family. They are all happy using strcmp on two different values (for
==, >= etc.). They aren't different in any way from the use cases for using
URLs in YAML, or IP addresses, or E-mail addresses, etc. In all these cases,
simply thinking of them as a string and letting the application worry about
its internal format is the right way to go. And in all these cases, there
are standards external to YAML that specify how these strings should be
formatted (in the case of dates, there's ISO as well as other de-facto
standards).

A time data type is only actually _needed_ when the above isn't enough
(e.g. you need generic YAML tools to provide operators on these values). But
then a simple timestamp type also isn't enough (e.g., due to time zone
issues). We should start work on a separate spec that would cover both
time/date and currency (A "Recommended YAML type families for business
applications" spec). Clark could take the lead there. We'll cover fun stuff
such as time zones and time periods and currency conversion rates and so on.
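
To illustrate both halves of the argument, here is a small sketch. It assumes ISO 8601 strings in a fixed-width layout, and the sample values are made up. Plain string comparison (the moral equivalent of strcmp) orders timestamps correctly when they share one format and one time zone, but gives the wrong answer once the offsets differ:

```python
from datetime import datetime

# Same format, same zone: lexicographic order matches chronological order.
a = "2002-09-03T10:15:00Z"
b = "2002-09-03T18:30:00Z"
string_order_ok = a < b

# Different offsets: c is 17:00 UTC and d is 15:00 UTC, so c is the later one,
# yet a plain string comparison claims c comes first.
c = "2002-09-03T12:00:00-05:00"
d = "2002-09-03T15:00:00+00:00"
string_says_c_earlier = c < d
real_c_earlier = datetime.fromisoformat(c) < datetime.fromisoformat(d)
```

With these values the string comparison and the time-zone-aware comparison disagree on `c` and `d`, which is the kind of issue a dedicated time/date spec would have to settle.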

There may also be a similar separate spec for URLs and E-mail addresses and
IP addresses and domain names (A "Recommended YAML type families for network
applications" spec) - The Ruby people may want to drive this one, as it
matches some of their built-in types. Maybe another spec for units (A
"Recommended YAML type families for engineering" spec), and so on.

The core spec should only contain core "_language_ data types" (as opposed
to core "_application_ data types"), which means all the types we have today
minus the timestamp.

Thoughts?

Have fun,

	Oren Ben-Kiki
