On Thu, Sep 05, 2002 at 11:28:12AM +0300, Oren Ben-Kiki wrote:
| We've agreed to get rid of #TAB. That was easy.
| As for timestamps, we got dragged into the whole problem of implicit types,
| whether they can be extended, and so on. This is heavy stuff and debate was
| very lively :-)
Indeed. Extra flexibility at a cost of complexity; although our
end goal is user readability... so perhaps this isn't so bad.
The biggest problem with this approach is that it will take several
months to flesh out and explore. Although it does seem to be
the "right way" to do it.
| I raised a proposal that was tentatively accepted as "worth pursuing". The
| core of it is to accept that type family is optional (i.e., a node may have
| *no* type family, only a *kind*). It is the loader's duty to convert a
| generic node - which may or may not have an associated type family, and/or a
| format - into some native data structure (and the dumper's job to do the
Right. Aka a "scalar" value but not necessarily a string, integer, etc.
| I'll need to write this down "properly" - what are the exact effects on the
| various data models, processing, consequences for generic tools, the schema
| language, and so on (most of these issues were discussed a bit in the IRC
| session, but not all the way through). This will take me some time so the
| earliest I'll be able to post this would be Sunday.
| - Maybe the native model shouldn't be defined in terms of type family at
| all; instead maybe it should have the concept of a "native data type", a
| "native value", and a "kind". "Type family" and "Format" would only exist in
| the generic model, with the "Viewer" responsible for the mapping. (The
| Viewer is used by the Loader and Dumper - take a look at the data models
| - In this view type family and format are merely instructions to the Viewer
| (OK, Loader) on how to map the generic node to a native one (i.e., it is a
| "transfer method" - what a coincidence :-).
Ok. So in this formulation there isn't a need to distinguish
between type family and format, they can be rolled together
as a single entity; a transfer method?
| - Would containers benefit from format? It seems to me they very well might
| (admittedly rarely), and given the above view forbidding format for
| containers is arbitrary and a needless exception. It would be simpler to
| allow them.
| - I'm impressed by the fact that this is almost identical to Perl's type
| system - and we arrived at it independently. Either Larry Wall was extremely
| lucky, he had a working crystal ball that told him this type system would be
| good for Yaml, or this approach is "right" in some deep way (for scripting
| languages) and he "merely" arrived at it as an "inevitable" result. Of
| course this made our life easier because we had it in front of us, while he
| more-or-less invented it from scratch (AFAIK).
Ok. So, in the generic model each node then is either
a scalar, list or mapping. Further, each node may have a
!transfer|method but this is optional. Let us call "implicit
typing" the process whereby transfer methods are either added or
stripped. The question becomes _where_ does this typing occur.
We have several choices:
(a) it is done by the parser
(b) it is done in a step between the parser and the loader
(c) it is done by the loader
(d) it is done after the loader
In my opinion, for greatest compatibility, it should not be
done in the loader, as this would force each language to have
its own implicit typing mechanism, duplicating code and
raising interoperability concerns. Indeed, this
step could be done in a shared "C" libyaml which all native
Ideally then, this "implicit typing" should be expressed not
in code but as a YAML document which fills in implied
!transfer|methods at either the parser level or right before
the loader. The loader would then be responsible for finding
an appropriate binding for the given transfer|method.
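
To make the split concrete, here is a minimal sketch in Python. It is
only an illustration under my own assumptions: the Node class, the rule
table, and the transfer method names ("int", "float", "str") are made up,
and the rule table is a Python list here although the idea above is that
it would be a YAML document.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical generic-model node: a kind plus an optional transfer method."""
    kind: str                       # "scalar", "list" or "mapping"
    value: object                   # str, list of Node, or list of (Node, Node) pairs
    transfer: Optional[str] = None  # None means no transfer method was given


# Assumed rule table (data, not code): the first matching pattern wins.
IMPLICIT_RULES = [
    ("int", re.compile(r"[-+]?[0-9]+")),
    ("float", re.compile(r"[-+]?[0-9]+\.[0-9]+")),
]


def add_implicit_types(node):
    """Shared step before the loader: fill in missing transfer methods."""
    if node.kind == "scalar":
        if node.transfer is None:
            for name, pattern in IMPLICIT_RULES:
                if pattern.fullmatch(node.value):
                    node.transfer = name
                    break
    elif node.kind == "list":
        for child in node.value:
            add_implicit_types(child)
    else:  # mapping
        for key, child in node.value:
            add_implicit_types(key)
            add_implicit_types(child)
    return node


# Language-specific part: the loader only binds transfer methods to native types.
BINDINGS = {"int": int, "float": float, "str": str}


def load(node):
    if node.kind == "scalar":
        return BINDINGS.get(node.transfer, str)(node.value)
    if node.kind == "list":
        return [load(child) for child in node.value]
    return {load(key): load(child) for key, child in node.value}


doc = Node("mapping", [
    (Node("scalar", "count"), Node("scalar", "42")),
    (Node("scalar", "ratio"), Node("scalar", "0.5")),
    (Node("scalar", "id"), Node("scalar", "7", transfer="str")),  # explicit wins
])
native = load(add_implicit_types(doc))
```

Only load() and BINDINGS would differ per language; add_implicit_types()
and its table could be shared by every implementation.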
Further, there are two ways typing can occur:
(a) regular expression
(b) by path
The output of the parser is the serial model; thus it appears
as if this typing should be expressible in this model (and
not the generic model). Thus, a YPATH restricted to sequential
access would be sufficient for path-based access.
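
A rough sketch of path-based typing over a serial event stream, in
Python. Everything here is assumed for illustration: the event tuples,
the slash-separated path syntax (this is not real YPATH), the glob-style
matching, and the restriction to scalar mapping keys. The point is only
that one forward pass with a stack is enough; no random access is needed.

```python
from fnmatch import fnmatchcase


def _lookup(path, rules):
    """Return the transfer method of the first rule whose pattern matches."""
    for pattern, transfer in rules:
        if fnmatchcase(path, pattern):
            return transfer
    return None


def _value_done(stack):
    """Tell the enclosing container that one value has been fully read."""
    if stack:
        top = stack[-1]
        if top["kind"] == "map":
            top["want_key"] = True   # the next scalar is a key again
        else:
            top["key"] += 1          # the next list item gets the next index


def type_by_path(events, rules):
    """Yield (path, text, transfer) for every scalar value, in stream order."""
    stack = []  # one frame per currently open container
    for event in events:
        name = event[0]
        if name in ("map_end", "seq_end"):
            stack.pop()
            _value_done(stack)
            continue
        top = stack[-1] if stack else None
        if top and top["kind"] == "map" and top["want_key"]:
            # Assumption: mapping keys are always scalars.
            top["key"], top["want_key"] = event[1], False
            continue
        path = "/" + "/".join(str(frame["key"]) for frame in stack)
        if name == "scalar":
            yield (path, event[1], _lookup(path, rules))
            _value_done(stack)
        elif name == "map_start":
            stack.append({"kind": "map", "want_key": True, "key": None})
        else:  # seq_start
            stack.append({"kind": "seq", "key": 0})


events = [
    ("map_start",),
    ("scalar", "invoice"), ("scalar", "34843"),
    ("scalar", "items"), ("seq_start",),
    ("map_start",),
    ("scalar", "date"), ("scalar", "2002-09-05"),
    ("scalar", "qty"), ("scalar", "4"),
    ("map_end",),
    ("seq_end",),
    ("map_end",),
]
rules = [
    ("/invoice", "int"),
    ("/items/*/date", "timestamp"),
    ("/items/*/qty", "int"),
]
typed = list(type_by_path(events, rules))
```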
How about making the type family _mandatory_ in the generic
model and _optional_ in the serial model?
If you agree up to here, then this leaves us a question
as to how we keep this "implicit typing" process open for
specification in the future while allowing us to finalize
I was thinking that the #SCHEMA or #DOMAIN directive could
provide for this escape hatch. In any case, once we fix
the models nothing else really needs to be done here.
| - As for timestamps... I think we had better leave them out of the core
| spec. The use cases we have today (logging etc.) don't require timestamp as
| a type family. They are all happy using strcmp on two different values (for
| ==, >= etc.).
This only works if everyone uses an even stricter subset, namely
a full explicit ISO 8601 with T and specified up to N digits for
the fraction of a second. IMHO, this is just too ugly to be workable.
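
A small Python illustration of how plain string comparison breaks once
the format is not pinned down completely. The sample values are my own;
both are valid ISO 8601.

```python
from datetime import datetime

# Same day, different time zone offsets.
early = "2002-09-05T11:28:12+03:00"   # 08:28:12 UTC
late = "2002-09-05T09:00:00+00:00"    # 09:00:00 UTC

string_says_earlier = early < late    # compares characters only
time_says_earlier = datetime.fromisoformat(early) < datetime.fromisoformat(late)

# Same instant, different number of fraction digits.
x = "2002-09-05T11:28:12.5"
y = "2002-09-05T11:28:12.50"
fmt = "%Y-%m-%dT%H:%M:%S.%f"

strings_equal = x == y
instants_equal = datetime.strptime(x, fmt) == datetime.strptime(y, fmt)
```

Here the string comparison gets both the ordering and the equality
wrong, while the parsed values agree with the actual instants.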
| They aren't different in any way from the use cases for using
| URLs in YAML, or IP addresses, or E-mail addresses, etc.
Time is different. URLs and such can be compared for equality
directly by string comparison and various operators aren't defined.
For timestamp it is more complicated.
| in the case of dates, there's ISO as well as other de-facto standards
And lots of ways to write a date in those standards. Ick. This
is exactly the problem that a YAML timestamp solves.
| When a time data type is actually _needed_ it is when the above isn't enough
| (e.g. you need generic YAML tools to provide operators on these values). But
| then a simple timestamp type also isn't enough (e.g., due to time zone
zoneinfo is available almost everywhere and is sufficient
| The core spec should only contain core "_language_ data types" (as opposed
| to core "_application_ data types"),
Agreed. And I use a lot of relational database programming languages,
which have TIMESTAMP as a core language data type.