Thread: RE: [Yaml-core] What is a type-family anyway?

Hi guys, I've been on vacation these last few days: lots of sun, water, and
temperatures over 100 in the shade - if one could find some, that is :-)

At any rate, I've just caught up with the messages. I have mixed feelings
about Clark's proposal. I can see the bright side, of course, but also
several potential shady issues (pardon the analogy - I do hope the sun
hasn't muddled my brain too much :-).

Here are some concerns, in no particular order:

- HTTP only?

Isn't this a bit restrictive? There *are* other protocols one can use to
fetch web documents (e.g., ftp). And others might be added later on...
What's wrong with keeping our shortcut notation, and merely limiting the
URIs to URLs?

- Optional end of rainbow => URI allowed?

Clark mentioned making it optional (an obvious necessity for private types).
If it is optional, what is the problem with using URIs, exactly? If one
chooses to use an 'isbn:...' type family (Ugh), he's merely opted out of
ever supplying a pot of gold at the end of the rainbow, forever. Which he is
allowed to do anyway...
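
To make the two points above concrete, here is a rough sketch of how a tool might decide whether a type family could have anything at the end of the rainbow at all. The function name, the set of schemes and the sample URIs are assumptions for illustration only; nothing in the proposal fixes them.

```python
from urllib.parse import urlsplit

# Assumption: which schemes count as "fetchable". The discussion only names
# http and ftp; a real list would be a policy decision.
FETCHABLE_SCHEMES = {"http", "ftp"}


def may_have_pot_of_gold(type_family: str) -> bool:
    """Return True if the type family looks like a URL one could fetch.

    Anything else (for example an 'isbn:' style URI) is still a perfectly
    good unique ID; it has simply opted out of ever having a document.
    """
    parts = urlsplit(type_family)
    return parts.scheme.lower() in FETCHABLE_SCHEMES and bool(parts.netloc)


if __name__ == "__main__":
    # Hypothetical sample values, not taken from any spec.
    for family in ("http://type.yaml.org/int",
                   "ftp://example.org/types/point",
                   "isbn:0000000000"):
        print(family, may_have_pot_of_gold(family))
```

A processor that never calls such a function loses nothing; the check only matters to tools that want to go looking for meta-data.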

- XML namespaces

Would this mean giving up on using XML namespaces? That would make
converting XML schemas to YAML that much harder. Now, we had this idea of
*constructing* YAML type family names from the pair {namespace-URI,
local-name}. If we could do that in a reasonable way - say,
namespace-URI/local-name or whatever - we could probably find a way to
preserve this "dual-personality" schema/namespace option for implementers.
It may be vital for YAML gaining acceptance in the world.
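
A minimal sketch of the namespace-URI/local-name idea, assuming '/' as the separator (the '$' alternative would work the same way). Both function names and the example namespace are hypothetical.

```python
def type_family_from_qname(namespace_uri: str, local_name: str,
                           sep: str = "/") -> str:
    """Build a type family name from an XML {namespace-URI, local-name} pair."""
    # Strip a trailing separator so we never produce a doubled one.
    return namespace_uri.rstrip(sep) + sep + local_name


def qname_from_type_family(type_family: str, sep: str = "/"):
    """Split a type family name back into (namespace-URI, local-name).

    Splitting at the last separator is safe for '/' because an XML local
    name cannot contain that character.
    """
    namespace_uri, _, local_name = type_family.rpartition(sep)
    return namespace_uri, local_name


if __name__ == "__main__":
    family = type_family_from_qname("http://example.org/ns/po", "item")
    print(family)
    print(qname_from_type_family(family))
```

Note that this mapping is not fully reversible: a namespace URI that ends in the separator comes back without it, which is one of the details a real convention would have to settle.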

- Fragments.

Clark has ruled them out, which is sensible. Supposing that any use of a
fragment in an XML namespace means the author has gone the way of "no gold
at the end of the rainbow, ever", and assuming that the above issue of
mapping XML to YAML is resolved somehow (is '/' allowed in a fragment? have
to check. Maybe we'll have to revert to '$')... anyway, assuming all this,
can we view 'format' as a 'fragment'?

    int: !int#dec 7 # http://type.yaml.org/int#dec

Presumably each format could be a top-level key so the above would make some
sense... That's rather nicer than using '|' for this purpose. Again, only
assuming the XML issues are resolved!
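
As a sketch of the "format as fragment" reading, assuming the shorthand expands against the http://type.yaml.org/ base shown in the comment of the example above, one could split a type into family and format like this. The function names are made up for illustration.

```python
from urllib.parse import urldefrag

# Base URL taken from the comment in the example above.
TYPE_BASE = "http://type.yaml.org/"


def expand_shorthand(tag: str) -> str:
    """Expand a shorthand such as '!int#dec' into a full URL."""
    if not tag.startswith("!"):
        raise ValueError("expected a shorthand starting with '!': %r" % tag)
    return TYPE_BASE + tag[1:]


def split_format(type_url: str):
    """Return (type family URL, format); format is None when no fragment."""
    url, fragment = urldefrag(type_url)
    return url, (fragment or None)


if __name__ == "__main__":
    full = expand_shorthand("!int#dec")
    print(full)                # http://type.yaml.org/int#dec
    print(split_format(full))  # ('http://type.yaml.org/int', 'dec')
```

The format could then be used as the top-level key to look up in the fetched document, if one exists.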

- Minimal key set

What's the minimum one would have to put in the "pot of gold"? I mean, if it
is there at all. An empty document would presumably be legal (after all,
this "pot of gold" *is* optional). But what must a non-empty document
contain? *Should* there be a minimum set of keys, or just "recommended
keys"?

- Relationship with RDDL and other meta-data standards...

Probably someone should set up some way they can be simply cut&pasted into
this scheme, using appropriate top-level key(s) and structure. We should at
least check that this is feasible/reasonable...

- Requirement from a YAML processor.

I think that accessing the "pot of gold" should be *very* optional, and it
should be *crystal clear* it is perfectly possible to handle YAML without
ever thinking about it.

Further, I think it should also be made very clear that it is *very*
unrealistic to expect, in the general case, that anything other than, say,
schema validation, would be possible to achieve for a type family that is
solely "known" by its "pot of gold". That is, any expectation that a YAML
application will magically understand "the semantics" of a type through its
"pot of gold" is, to say it kindly, naive.

Of course people could attach code in any interpreted language - scratch
that, in *any* language (Windows X86 DLLs included - Ugh) - to the "pot of
gold", thereby allowing dynamic loading of type family semantics. I find it
to be more of a scary thought than a comforting one :-)

- Relationship with the schema mechanism.

Having a "pot of gold" document accessible via the type family immediately
suggests that this should be "the way" the schema would be fetched. That's
nice and all, but... is it practical to chop the schema into multiple
physical documents this way (one per type family used)? Putting aside
efficiency issues, what about problems like version control and being easy
to read/write?

Keep in mind that a collection type family will constrain its contained
sub-nodes to the n-th degree regardless of their type - or, I should say,
*in addition* to the generic restrictions specified by their type.

- The risk factor.

There is a giant leap between "type families are unique IDs with
human-readable definition" and this proposal. And unlike everything else
we've done with YAML, this would be exploring new lands because nobody
I know of has done anything like it.

Here we are speculating about what may or may not prove useful to
application developers, and I for one do not have the personal experience in
such dynamic-loading extensible-from-the-web-yet-strongly-typed systems to
say whether this makes sense or not.

Actually, I have a great deal of experience (as do we all) in one such
system, the HTML browser. And it is a terrible mess, a failure of standards
to achieve anything like a sane system - the most we can learn from it is
what *not* to do.

It may make more sense to use a DNS-like mechanism. Or just make direct use of
DNS. Or LDAP. Or WebDav. Or something. It may make more sense to have each
top-level key reside in its own physical document. I have no idea, because I
don't have a good grasp of the use case. Speaking of which...

- The use case?

What *is* the use case (other than being able to answer the newbie about
"what does a type family point to")? What is the class of applications that
want to be schema-aware but not schema-specific? If the answer is
"validating parsers and authoring tools" then I think that this proposal is
a serious overkill. A simple schema language would do the trick for both.

Is it something like "web services"? I have strong doubts about whether
something like this proposal is actually useful for such services (given a
schema language exists). Services require a much stronger knowledge of
"semantics" than would be offered by the "pot of gold". IMVVHO, that is -
since nobody ever saw "web services" actually working as hyped, that's all
anybody has to offer, I'm afraid. On the other hand, using "point-to-point"
or "client-to-server" schema-specific XML-RPC/SOAP/etc. *is* working in
practice. Again this only requires a schema language (if that).

- Effects on the spec?

If we agree the "pot of gold" is optional, and if we make it easy to look at
a URL and say whether it is a "pot of gold" or not (simplest way: give it a
distinctive mime type), is there really any reason to change the spec? It
seems to me we can safely define this whole thing in a separate spec - "A
convention for using YAML type families as URLs for fetching meta-data". We
can start by giving some meta-data for our core type families as *an
example*.
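
The mime type test could be as simple as the sketch below. The media type string is purely hypothetical (no such type has been proposed or registered), and the function works on a Content-Type header value so that it does not depend on how the document was fetched.

```python
from typing import Optional

# Hypothetical media type; a real one would have to be agreed upon.
POT_OF_GOLD_MIME = "application/x-yaml-type-meta"


def is_pot_of_gold(content_type_header: Optional[str]) -> bool:
    """Decide from a Content-Type header whether a response is meta-data."""
    if not content_type_header:
        return False
    # Drop parameters such as '; charset=utf-8' and compare case-insensitively.
    media_type = content_type_header.split(";", 1)[0].strip().lower()
    return media_type == POT_OF_GOLD_MIME


if __name__ == "__main__":
    print(is_pot_of_gold("application/x-yaml-type-meta; charset=utf-8"))
    print(is_pot_of_gold("text/html"))
```

Anything served with another type (an HTML page describing the type, say) would simply be treated as human-readable documentation.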

If people like it and build on it - great. If it is useless for 99% of the
people in the world (my suspicion at this point - feel free to set me
right), no great loss, either. We'd have merely over-formalized a bit how we
define type families.

Minor changes to the spec may still result (specifically, handling of
fragments and formats - and mentioning that there *is* an *optional*
convention for meta-data planned/available in a separate spec). I would be
more than happy to discuss them under such an approach.

- Effects on time table?

I suspect it will take ages to settle the issues this proposal raises. I'm
less than enthused at the thought of wording such a chunk of functionality
into our core spec. From the narrow point of view of "let's get a spec out
the door", this proposal seems to be a serious problem.

I could be wrong here - especially if it is worded as something optional,
and would be rather loosely defined. But still, at this point, my vote is to
steer away from this whole thing in the YAML 1.0 *CORE* spec.
Let's create a separate YAML 1.0 *META* spec for this instead. Our current
spec is big enough as it is anyway...

Have fun,

    Oren Ben-Kiki
