Re: [Yaml-core] Proposal to add implicit typing

On Thu, Feb 05, 2004 at 03:37:55PM -0800, Sean O'Dell wrote:
| > What we have got is a direction, not much more. We think it would make
| > sense to adapt Relax-NG approach to YAML. This means discarding tons of
| > XML-isms from it, and adding a few things... In general the idea is that
| > a schema is something like a BNF syntax whose "tokens" are YAML
| > primitives. The schema "BNF productions" will each describe a single
| > tag. In this approach, explicit tags would be very rare - except for a
| > tag for the root of the document, indicating the "start production". If
| > not using private tags, it is trivial to merge several schemas together
| > and mix-and-match their elements... And so on.
| 
| Blah, I didn't like Relax-NG at all.  I'm not even sure that a document 
| format schema is the right place to approach implicit typing schemas;
| although the value pattern matching situation is similar.

Well, we think this approach is a good starting point.  Murata San is
a very capable thinker and I'd like to reuse his work if possible.  That
said, YAML is very different from XML, so at some level (a pretty low
level) we will be quite distinct.

| You know what?  If you put a statement in the specification that basically 
| said "implicit typing schemas may change the type tag of a scalar, but it is 
| up to the individual parser implementations to determine which native data object 
| is used in place of the scalar" I would be really happy.  No matter what 
| happens with schemas, I think that's a safe statement to make.  I mean, the 
| alternative is "typing schemas strictly govern which native language data 
| objects are loaded in place of scalars" which, I think, would be sort of 
| crazy, eh?

As I see it, you can think of data typing as having two stages:

  * The first stage is looking at YAML nodes _lacking_ a tag,
    and filling in the tag. Let us call this 'tagging'.

  * The second stage is finding the appropriate native data
    type for each node based solely upon its tag. Let us call
    this 'resolution'.

The restrictions upon 'tagging' are as follows:

   - you can use the node's 'context', that is, 
     look at its parents to make up your mind

   - you can use the value of a node, that is, look at
     its children or its text value

   - you can use the node's kind, that is, whether it is
     a scalar, sequence, or mapping

   - in the very special case of a scalar, you can use
     whether or not it is a plain scalar.
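
To make the two stages concrete, here is a rough sketch in Python.
Everything in it is made up for illustration: the Node class, the
tag names ('int', 'str', 'map', 'seq', 'labels') and the integer
pattern are assumptions, not part of any YAML parser or of the spec.
The point is only that tag_node() looks at nothing beyond context,
value, kind and the plain flag, and that resolve() looks at nothing
beyond the tag.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical node model; not any real YAML library's API.
@dataclass
class Node:
    kind: str                  # 'scalar', 'sequence', or 'mapping'
    value: object              # text for a scalar, children otherwise
    tag: Optional[str] = None  # None means the node lacks a tag
    plain: bool = False        # only meaningful for scalars

def tag_node(node, parent_tag=None):
    """Stage one, 'tagging': fill in a missing tag."""
    if node.tag is not None:
        return node.tag                      # explicit tags are left alone
    if node.kind != 'scalar':                # uses the node's kind
        return 'map' if node.kind == 'mapping' else 'seq'
    if parent_tag == 'labels':               # uses context (assumed schema rule)
        return 'str'
    # uses the text value and the plain / non-plain distinction
    if node.plain and re.fullmatch(r'[-+]?[0-9]+', node.value):
        return 'int'
    return 'str'

# Stage two, 'resolution': each implementation picks its own native types.
NATIVE = {'int': int, 'str': str}

def resolve(tag, text):
    return NATIVE[tag](text)
```

With this sketch a plain 42 is tagged 'int' and resolves to the
native integer 42, while a quoted "42" is tagged 'str'. A different
implementation could swap the NATIVE table without touching tagging.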

You may not use serialization or presentation attributes
(plain scalar hack excepted) during tagging.  These include,
but are not restricted to:

   - distinctions between styles other than plain, 
     for instance single vs double quoted
   - the order of mapping keys
   - comments, specific spacing, etc.

The reason for these restrictions is, of course, to make YAML
information more consistent across implementations.  In this 
way, someone is free to convert a single quoted scalar into
a double quoted scalar or reorder keys without worrying about
changing the information being presented.
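
A small Python sketch of that invariance, under assumed names: the
tagger below is hypothetical and is handed only the scalar's text and
whether it was plain, so the quoting style and the key order never
reach it.

```python
import re

def tag_scalar(text, plain):
    # Hypothetical tagger: sees only the text and the plain flag.
    if plain and re.fullmatch(r'[-+]?[0-9]+', text):
        return 'int'
    return 'str'

def tag_mapping(pairs):
    # pairs: (key_text, value_text, value_style) triples, where
    # value_style is 'plain', 'single', or 'double'.  Only the
    # plain / non-plain distinction is passed on to the tagger.
    return {key: tag_scalar(value, style == 'plain')
            for key, value, style in pairs}

original  = [('port', '8080', 'plain'), ('name', 'web', 'single')]
# Same information: keys reordered, single quotes turned into double.
rewritten = [('name', 'web', 'double'), ('port', '8080', 'plain')]

assert tag_mapping(original) == tag_mapping(rewritten)
```

The final assertion holds because the result is compared as an
unordered mapping and the 'single' versus 'double' style is dropped
before tagging.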

Hope this helps.

Clark