On Wed, Jun 9, 2010 at 5:03 AM, Matthew Willson <matthew.willson@gmail.com> wrote:
* A tagged custom type which is constructed from (say) a string rather than from a map - my example was {"$tag": "!!timestamp",  "$value": "2001-12-15T02:59:43.1Z"}

My apologies, I just spotted this in your example on a second look ("!!date March 2, 1962") and from that I could have guessed what you intended for an anchored string too.

I guess there's a trade-off to be made here between conciseness and speed/ease of parsing. One of my goals was to get as much mileage as possible out of fast native JSON parsers, so I tended towards letting the JSON parser do as much of the work as possible, rather than encoding things in strings which I would then need to parse myself in JavaScript. The only overhead it adds in the common case is traversing the parsed JSON tree for objects and checking for at most 3 special key names on each object encountered, whereas yours additionally needs to visit each string with a regexp.
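
For illustration only, the object-key approach described above could be sketched roughly as follows. This is Python rather than JavaScript, the function name find_special is hypothetical, and the key names are just the ones mentioned in this thread, so treat it as a sketch under those assumptions and not as an actual implementation.

```python
import json

# Key names as mentioned in this thread; any real scheme might differ.
SPECIAL_KEYS = ("$tag", "$anchor", "$alias")

def find_special(node, path=()):
    """Yield (path, obj) for every parsed object that carries a special key."""
    if isinstance(node, dict):
        # The per-object cost: at most three key lookups.
        if any(key in node for key in SPECIAL_KEYS):
            yield path, node
        for key, value in node.items():
            yield from find_special(value, path + (key,))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from find_special(value, path + (index,))
    # Strings, numbers, booleans and null are never inspected.

doc = json.loads(
    '{"when": {"$tag": "!!timestamp", "$value": "2001-12-15T02:59:43.1Z"},'
    ' "n": [1, 2]}'
)
hits = list(find_special(doc))
```

Note that the walk still visits every container in the tree; it only skips looking inside the scalar values.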

I wouldn't design a language based on a premature optimization of an implementation detail. With all the code needed to make a JSYNC implementation, I doubt that detail would make any noticeable difference. On the other hand, making the language overly complex surely would be noticed.

A couple more points, if you aren't convinced. First, anchored regexes are very fast operations. Second, you haven't avoided traversing the entire tree. In your example, every leaf node would need to be examined to see if it was one of your special objects. And speaking of objects, in your example you use an object with 2 pairs (4 strings), where I use a single string. So not only is your serialization longer, requiring more memory, but your in-memory representation is huge compared to a simple string. If this causes more mallocs, you can throw out any savings you _might_ get avoiding a regexp.
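
To make the anchored-regexp point concrete, here is a minimal sketch in Python. The pattern is an assumption based only on the "!!date March 2, 1962" and "!!int 55" strings seen in this thread (a "!"-prefixed tag, one space, then the scalar text); it is not taken from any JSYNC specification, and split_tag is a hypothetical name.

```python
import re

# Assumed shape: "!"-prefixed tag, a single space, then the scalar text.
TAGGED = re.compile(r"(!\S*) (.*)", re.DOTALL)

def split_tag(text):
    """Return (tag, value) for a tagged string, or (None, text) otherwise."""
    # re.match only attempts a match at position 0, so the pattern is
    # anchored: a string not starting with "!" fails on its first character.
    found = TAGGED.match(text)
    if found is None:
        return None, text
    return found.group(1), found.group(2)

tagged = split_tag("!!date March 2, 1962")
plain = split_tag("I can't drive")
```

Because the match is anchored, an untagged string is rejected after looking at one character, which is why the per-string cost stays small.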

To me, none of that matters. The real issue is creating a nice, adoptable language.


But then again my proposal is less concise when it comes to serializing. Using "!", "&", "*" keys (as per your proposal) rather than "$tag", "$anchor", "$alias" etc would help a bit with that though. 

   "!!int 55": "I can't drive",

Presume you also have a plan for the case where the key is an object too? :)

I do. It was talked about in an earlier discussion on this list. It's not pretty, but it works. And objects as keys are not very common so that's OK with me.

{
    " ": {
        "JSYNC": "1.0",
        "TAG": {
            "!": "tag:ingy.net,2005:"
        },
        "keys": {
            "&001": [2,5],
            "&002": [6,6]
        }
    },
    "!": "diceRolls",
    "*001": 14,
    "*002": 21
}

%TAG ! tag:ingy.net,2005:
--- !diceRolls
[2, 5]: 14
[6, 6]: 21
...

Basically a key containing a single space, points to an object that can contain extra YAML information. It can also contain objects to be used as keys later, by reference. That's the only way I can think of to do complex keys in JSON.
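
A rough sketch of how a loader might resolve such keys, assuming exactly the layout of the example above (a " " header with a "keys" table of "&name" entries, and "*name" keys referring back to them). The names freeze and resolve_keys are hypothetical, it is written in Python for brevity, and it only handles sequences as keys: lists are turned into tuples because Python dict keys must be hashable.

```python
import json

def freeze(value):
    """Turn lists into tuples so they can be used as Python dict keys."""
    if isinstance(value, list):
        return tuple(freeze(item) for item in value)
    return value

def resolve_keys(doc):
    """Replace "*name" keys with the objects anchored in the " " header."""
    header = doc.get(" ", {})
    anchors = {
        name[1:]: freeze(value)
        for name, value in header.get("keys", {}).items()
        if name.startswith("&")
    }
    result = {}
    for key, value in doc.items():
        if key in (" ", "!"):      # header and top-level tag are not data
            continue
        if key.startswith("*"):    # alias: look up the anchored key object
            key = anchors[key[1:]]
        result[key] = value
    return result

doc = json.loads("""
{
    " ": {
        "JSYNC": "1.0",
        "keys": {"&001": [2, 5], "&002": [6, 6]}
    },
    "!": "diceRolls",
    "*001": 14,
    "*002": 21
}
""")
rolls = resolve_keys(doc)
```

With the example data, rolls maps the tuple (2, 5) to 14 and (6, 6) to 21, mirroring the YAML mapping shown above.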

Also, I guess an alternative approach (or perhaps one which is undertaken in parallel?) would be to propose an extension to json with new syntax for references and tags, to match the YAML model, but only the minimal extra syntax required to do so, with the goal of keeping it as easy to parse as possible.

Explain this more. I don't really see what you are proposing...

I was wondering about defining (as a parallel effort) some minimal syntax extensions to JSON which add the features (tags, anchors etc) required for the YAML model, but without going as far as full yaml syntax, eg something like:

{
  "foo": *anchor,
  "bar": !date &anchor "2010-01-01",
  123: 456,
  ...
}

It would require an extended JSON parser to deal with it of course, but still a fair bit easier to parse than full YAML syntax. Perhaps a bit of a half-baked idea, but it might complement the goals of JSYNC somewhat, defining something which is easier to parse but supports the full YAML model with tags/aliases, and also offering a version of it which is embedded within plain JSON as an alternative for cases where you want to be able to use a plain JSON parser.

The above is valid YAML[1], so effectively you are defining another subset of YAML. I would advise against this. Other people have defined non-standard subsets of YAML in the name of simplicity (YAML::Tiny in Perl, for example). I think this just muddies the waters and confuses people. It would be better to get all the YAML implementations working properly and in harmony, with similar APIs. I think JSYNC would facilitate that.

Cheers, Ingy

[1] It is actually invalid YAML because the alias precedes the anchor.
 

-Matt