Thread: [Yaml-core] #9 - Ambiguous Tags


Ok.  Here is a suggestion, call it #9. It incorporates several ideas
floating around:

  - It uses the Python/Ruby style of name resolution, as suggested
    by T.Onoma and Why.  That is, you check for a local (aka private)
    package first, next you check built-in packages, and failing that,
    an exception is raised.

  - It incorporates David's suggestion of limiting built-in types to
    only words (but allowing the '/').  This helps reduce the chance of
    collisions: you can be sure that resolution of built-in packages
    will always fail if you use names like "Perl::Package" or
    "com.company.JavaPackage", etc.

  - It also incorporates David's suggestion of using 'implicit-plain'
    and 'implicit-not-plain' tags to make implicits easier to grok;
    this happens to put some very nice makeup on an ugly wart.

  - It follows T.Onoma's request that he be able to specify a
    private tag that is _not_ subject to default %TAG cooking.
    It makes it possible to _expressly_ disable cooking no matter
    what %TAGs are present.

  - It allows people to use YAML tags in most cases without problem;
    but if they really want to be super-safe, they would need
    to use explicit %TAG based typing.

  - It provides a model for Brian's notion that the Application
    is the final authority of what each node's tag is; that is,
    the proposal formalizes ambiguity.  

  - It incorporates, for the first time, a rationalization of
    how implicit typing should be done; which is still poorly
    defined and explained in the specification.

First, let me review/define the types of 'serialization' tags:

  - Global tags are those that are globally unique; traditionally,
    these have been URIs; that is, they start with a word followed by a
    colon and use only URI characters. Strictly speaking, Perl::Packages
    happen to match this production, so they could also be considered
    global even though they are not URIs.

  - Private tags are those that have meaning local only to a given
    processing environment.  They are convenient to use, but may conflict
    with other uses.  Therefore, they should be used carefully but, in
    99% of cases, there just isn't a problem with collisions.

  - Magical tags are those which are explicitly provided, but happen
    to be neither Global nor Private.  It is not necessary that magical
    tags be used, as a combination of global or private tags would
    suffice for many purposes.

  - Missing tags are those that are not provided in the YAML
    syntax.  These have traditionally been called "implicit" tags,
    but please use "missing" instead, as it is far more clear.

Then, we define a process, called 'Cooking', which is done by the parser
and is purely a syntax-only operation on a Document's tags.  The cooking
process uses the %TAG directive to change magical tags into either
Global tags, or Ambiguous tags (defined below).  This is done without
any application involvement and is completely defined by the YAML
specification.

  - Ambiguous tags are Magical tags which do not become 'Global' during
    the cooking process.  They are also Missing tags, with the following
    names (provided by the Cooking process):

      plain scalar        ->  !implicit-plain
      non-plain scalar    ->  !implicit-scalar
      mapping             ->  !implicit-mapping
      sequence            ->  !implicit-sequence

    Therefore, the result of the 'Cooking' process is a non-empty
    tag having either Global, Private, or Ambiguous tags.  While
    it is not strictly necessary to give mappings and sequences
    non-empty tags, it is done for consistency.
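
To make the cooking rules a bit more tangible, here is a rough Python
sketch.  It is only an illustration, not part of the proposal: the
function name cook(), its parameters, the regex used to spot Global
tags, and the treatment of an unknown 'handle^' prefix (left Ambiguous)
are all my assumptions.  Tags are passed without their leading '!', so
'!!private' arrives as '!private' and a missing tag arrives as None.

```python
import re

# Assumption: "a word followed by a colon" is enough to call a tag Global.
# This matches URIs and, incidentally, names like Perl::Package.
GLOBAL_RE = re.compile(r"^[A-Za-z]\w*:")

# Names given to Missing tags, keyed by the kind of node.
IMPLICIT = {
    "plain": "implicit-plain",        # plain scalar
    "scalar": "implicit-scalar",      # non-plain scalar
    "mapping": "implicit-mapping",
    "sequence": "implicit-sequence",
}


def cook(tag, kind, default_prefix=None, handles=None):
    """Return (category, cooked_tag); category is 'G', 'P' or 'A'."""
    handles = handles or {}
    if tag is None:                       # Missing -> Ambiguous
        return ("A", IMPLICIT[kind])
    if tag.startswith("!"):               # '!!private' -> Private
        return ("P", tag[1:])
    if GLOBAL_RE.match(tag):              # already Global
        return ("G", tag)
    # Anything else is Magical; %TAG decides whether it becomes Global.
    if "^" in tag:
        handle, suffix = tag.split("^", 1)
        if handle in handles:
            return ("G", "tag:" + handles[handle] + suffix)
        return ("A", tag)                 # assumption: unknown handle stays Ambiguous
    if default_prefix is not None:
        return ("G", "tag:" + default_prefix + tag)
    return ("A", tag)
```

With no %TAG at all, cook("int", "plain") stays Ambiguous; with a
default prefix of 'clarkevans.com,2004:' it cooks to
'tag:clarkevans.com,2004:int'.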

Then, we have another process, called 'Resolution', which converts Ambiguous
tags into either Global or Private tags.  Unlike cooking, this is an
application-directed process; probably carried out by the YAML Processor
via given instructions.   The information used by the resolution process
is restricted to that provided in the YAML Representational Model.  In
particular, 'Resolution' should be viewed as a transformation of the
YAML graph; the result of resolution _is_ a different YAML document,
albeit one that will typically be directly related to the source
document plus schema information.  Note that 'Resolution' does not in
any way affect Global or Private tags. Thus, one can provide a private
or global tag, and no matter how the resolution process is defined, it
will be passed through unchanged.

The last stage of processing, 'Recognition', usually happens during
loading, where each node's tag is used to "find" an appropriate native
data type and construct the appropriate binding.  If a tag is not
'recognized' during this process, it is an error.
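
As a rough illustration of recognition (again just a sketch: the
CONSTRUCTORS table, the recognize() name, and the two tags listed in it
are hypothetical, not something the specification defines), a loader
might keep a table from resolved tags to native constructors and refuse
anything it cannot find:

```python
class UnrecognizedTagError(Exception):
    """Raised when no native type is registered for a tag."""


# Hypothetical registry: resolved tag -> native constructor.
CONSTRUCTORS = {
    "tag:yaml.org,2004:str": str,
    "tag:yaml.org,2004:int": int,
}


def recognize(tag, value, constructors=CONSTRUCTORS):
    """Build the native value for a node, or fail if the tag is unknown."""
    try:
        construct = constructors[tag]
    except KeyError:
        # An unrecognized tag is an error, not something to guess at.
        raise UnrecognizedTagError(tag) from None
    return construct(value)
```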

states:    { O: Original, C: Cooked, R: Resolved }
category:  { G: Global, P: Private, _: Missing, 
             M: Magic, A: Ambiguous, '*': Depends }

In a more concrete form,

  ---                  # OCR   After-Cooking
  - !http://yaml.org   # GGG   http://yaml.org
  - !Perl::Package     # GGG   Perl::Package
  - !!private          # PPP   private
  -                    # _A*   implicit-plain
  - ''                 # _A*   implicit-scalar
  - !int               # MA*   int
  ...

  %TAG clarkevans.com,2004:   #default namespace
  ---                  # OCR   After-Cooking
  - !http://yaml.org   # GGG   http://yaml.org
  - !Perl::Package     # GGG   Perl::Package
  - !!private          # PPP   private
  -                    # _A*   implicit-plain
  - ''                 # _A*   implicit-scalar
  - !int               # MGG   tag:clarkevans.com,2004:int
  ...

  %TAG clarkevans.com,2004: cce
  ---                  # OCR   After-Cooking                 Resolve?
  - !http://yaml.org   # GGG   http://yaml.org               No
  - !Perl::Package     # GGG   Perl::Package                 No
  - !!private          # PPP   private                       No
  -                    # _A*   implicit-plain                Yes
  - ''                 # _A*   implicit-scalar               Yes
  - !cce^int           # MGG   tag:clarkevans.com,2004:int   No
  - !int               # MA*   int                           Yes
  ...

Basically, this proposal, which we can call #9 if you wish,
is much like #8, only that the default is not private; instead,
resolution is the process of:
  - check for private matches, if not,
  - check for any 'regex' based matches
  - use matches from tag:yaml.org,2004,
    namely !str, !map, !seq for implicit-s
  - raise an exception.
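
A minimal Python sketch of that lookup order, for Ambiguous tags only.
The names here (resolve(), UnresolvedTagError, the 'private', 'regexes'
and 'builtins' parameters) are hypothetical, and two details are my
assumptions rather than settled parts of the proposal: regexes are
tried only against the value of plain scalars, and built-in names are
prefixed with 'tag:yaml.org,2004:'.

```python
import re


class UnresolvedTagError(Exception):
    """Raised when an Ambiguous tag matches nothing."""


YAML_PREFIX = "tag:yaml.org,2004:"

# Fallback built-ins for the implicit-* tags.
IMPLICIT_DEFAULTS = {
    "implicit-plain": "str",
    "implicit-scalar": "str",
    "implicit-mapping": "map",
    "implicit-sequence": "seq",
}


def resolve(tag, value=None, private=(), regexes=(),
            builtins=("str", "map", "seq")):
    """Turn an Ambiguous tag into ('P', tag) or ('G', tag), else raise."""
    # 1. Private matches win.
    if tag in private:
        return ("P", tag)
    # 2. Regex-based matches (assumption: plain scalars only).
    if tag == "implicit-plain" and value is not None:
        for pattern, target in regexes:
            if re.fullmatch(pattern, value):
                return ("G", target)
    # 3. Built-in yaml.org types, with !str, !map, !seq for implicit-s.
    name = IMPLICIT_DEFAULTS.get(tag, tag)
    if name in builtins:
        return ("G", YAML_PREFIX + name)
    # 4. Nothing matched.
    raise UnresolvedTagError(tag)
```

For instance, with a single rule mapping r"[-+]?[0-9]+" to
'tag:yaml.org,2004:int', a plain scalar '42' resolves to that int tag,
while 'hello' falls through to 'tag:yaml.org,2004:str'.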

So, it attempts to blend the 'implicit' mechanism with the 
!unambiguous tags.  If people use !ambiguous tags... well,
that's their choice; possibly enough rope so they can do
cool things; or, perhaps enough rope to hang themselves,
but, in any event, using ambiguous tags (implicit, or
non-private non-global tags) _is_ recognized as a transformation
of the YAML document and treated appropriately.

Cheers!

Clark
