On Wed, Jun 9, 2010 at 12:19 PM, William Spitzak <spitzak@rhythm.com> wrote:


Thank you for your time and ideas.

Ingy dot Net wrote:

"!!timestamp 2001-12-15T02:59:43.1Z"

I do not like this at all, because now you have to make up rules for how the value is quoted and you need to implement a nested parser for this. Leading spaces? Quotes in it? Backslashes in it? In fact you might as well just put the entire YAML file into one quoted string and call that a solution. I think any real solution has to split things into the final units at the JSON level, with the only post-processing being the removal of single bytes from the starts of returned strings.

The rules of JSYNC are quite simple, and the parser you speak of is a couple of simple regexes. Adding a wrapper structure to every annotated node was exactly the solution I wanted to avoid. I prefer to annotate the three kinds of nodes, using their native facilities.

But to be honest, the hard work is not in either syntax method, but in figuring out how to add the YAML stack over current JSON parser/emitters in the wild. You could easily try both syntaxes and see how they compare.

Your approach is simpler in having one way to do it for everything, but it changes the overall in-memory structure of the graph. Mine has 3 rules instead of one, but it preserves the original structure. Either way is worth a try though, because they are effectively equivalent.

To be clear:

1) For mappings: look for 2 special keys: "&" and "!".
2) For arrays: inspect the first entry for a string beginning with "&" or "!".
3) For a string: look for a leading "&" and/or "!".

Since neither tags nor anchors can contain a space char, you can parse them out by matching from the start to the next space. Trivial.
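
To make the three rules concrete, here is a rough Python sketch. It is not taken from any JSYNC implementation: the function names (split_annotations, read_annotations) are made up for illustration, it assumes each sigil appears at most once per node, it stores names without their leading sigil, and it leaves out the leading-dot escaping entirely.

```python
import re

# One sigil ("!" or "&"), then everything up to the next space or end of string.
_ANNOTATION = re.compile(r"([!&])([^ ]*)(?: |$)")


def split_annotations(text):
    """Return (tag, anchor, rest) for a string such as '!tag &anchor rest'."""
    found = {}
    pos = 0
    while pos < len(text):
        m = _ANNOTATION.match(text, pos)
        # Stop at the first token that is not an annotation, or at a repeated sigil.
        if m is None or m.group(1) in found:
            break
        found[m.group(1)] = m.group(2)
        pos = m.end()
    return found.get("!"), found.get("&"), text[pos:]


def read_annotations(node):
    """Apply the three rules to a decoded JSON node; returns (tag, anchor, payload)."""
    if isinstance(node, dict):
        # Rule 1: mappings carry the two special keys "!" and "&".
        payload = {k: v for k, v in node.items() if k not in ("!", "&")}
        return node.get("!"), node.get("&"), payload
    if isinstance(node, list):
        # Rule 2: arrays carry the annotations in their first entry.
        if node and isinstance(node[0], str):
            tag, anchor, rest = split_annotations(node[0])
            if (tag is not None or anchor is not None) and rest == "":
                return tag, anchor, node[1:]
        return None, None, node
    if isinstance(node, str):
        # Rule 3: strings carry the annotations as a space-separated prefix.
        return split_annotations(node)
    return None, None, node
```

With this sketch, the string from the top of this message splits into the tag "!timestamp" (one sigil stripped) and the payload "2001-12-15T02:59:43.1Z".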

More below...

A value with an anchor:
       ["&anchor", <value>]

{"&": "anchor", ...}
["&anchor", ...]
"&anchor ..."

A value with a tag (the %XX syntax is decoded so !!a%20b is written as "!!a b"):
       ["!tag", <value>]

A value with both an anchor and tag:

       ["&anchor", "!tag", <value>]
       ["!tag", "&anchor", <value>]

{"!": "tag", "&": "anchor", ...}
["!tag &anchor", ...]
"!tag &anchor ..."

A value that is a reference:


A map entry with a key that is a string with no tag or anchor:

       "key": <value>

A map entry with a key that is a reference:

       "*anchor": <value>

A map entry where the key is any object, including numbers, null, true, false, arrays, maps, and values with anchors or tags. Here 'A' is a string generated by the converter that does not conflict with any other anchors (note that the parser can distinguish these anchors because they are declared as keys, not as part of an array value):

       # YAML: <key>: <value>
       "&A": <key>,
       "*A": <value>

This is a fantastic method. Way better than my proposal. I will fully adopt it for the forthcoming JSYNC specification. Thank you.
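
For anyone who wants to try the "&A" / "*A" key scheme, here is a small Python sketch of the decoding side. The name decode_pairs is hypothetical. It returns a list of pairs rather than a dict because a decoded key may be an array or a map, and it only resolves "*A" keys whose "&A" is declared in the same object; any other "*anchor" key, and all leading-dot unescaping, is left untouched.

```python
def decode_pairs(obj):
    """Turn a decoded JSON object into (key, value) pairs.

    Entries named "&A" declare a complex key; the entry named "*A" in the
    same object supplies the value that belongs to that key.
    """
    def is_declaration(name):
        # A bare "&" is the anchor of the mapping itself, not a key declaration.
        return name.startswith("&") and len(name) > 1

    declared = {name[1:]: key for name, key in obj.items() if is_declaration(name)}

    pairs = []
    for name, value in obj.items():
        if is_declaration(name):
            continue  # consumed above
        if name.startswith("*") and name[1:] in declared:
            pairs.append((declared[name[1:]], value))
        else:
            pairs.append((name, value))
    return pairs


pairs = decode_pairs({"&1": 55, "*1": "I can't drive", "awesome": ".!!!"})
```

Here pairs comes out as the key 55 paired with "I can't drive", followed by the ordinary string key "awesome" with its still-escaped value.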

All string values that start with 1 or more '.' characters followed by any one of '&*!@' have a single '.' removed to get the actual value:

       ".!foo" -> "!foo"
       "...!foo" -> "..!foo"
       ".1" -> ".1" # NOTE NO CHANGE!
       "..." -> "..." # AGAIN NO CHANGE!

I have agreed with this logic, since you mentioned it. My only worry is that it doesn't future-proof well. You might want to escape other leading sigils.

I have updated the http://jsync.org example to reflect this.
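
The leading-dot rule quoted above fits in one regex each way. This is only a sketch under the rule exactly as stated (sigils limited to '&*!@'); the function names escape and unescape are made up, and the sigil set would need to grow if other leading characters get reserved later.

```python
import re

# One or more dots followed by a reserved sigil: strip exactly one dot.
_ESCAPED = re.compile(r"\.+[&*!@]")
# Zero or more dots followed by a reserved sigil: needs one more dot when emitted.
_NEEDS_ESCAPE = re.compile(r"\.*[&*!@]")


def unescape(s):
    """Remove a single leading '.' if the string is an escaped sigil string."""
    return s[1:] if _ESCAPED.match(s) else s


def escape(s):
    """Inverse of unescape: protect strings that would look like annotations."""
    return "." + s if _NEEDS_ESCAPE.match(s) else s
```

Strings such as ".1" and "..." match neither pattern, so they pass through both functions unchanged, which is the "no change" behaviour in the examples.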

Errors: Strings starting with '&' or '!' at unexpected places are an error. Strings starting with '@' are always an error. Arrays containing strings starting with '&' or '!' must conform to one of the patterns above, otherwise they are an error.

   I guess the ultimate awkward corner case would be:

   A tagged custom type, which is also given an anchor, and which is
   created from a map with some non-string keys and some
   (suitably-escaped) reserved keys.

--- !custom &anchor
55: I can't drive
awesome: '!!!'

 ["!custom", "&anchor",
 {"&1": 55,
  "*1": "I can't drive",
  "awesome": ".!!!"}]

 {"!": "custom",
  "&": "anchor",
  "&1": 55,
  "*1": "I can't drive",
  "awesome": ".!!!"}

Cheers, Ingy