Re: [Yaml-core] source control for YAML implementations

Steve Howell (sh...@zi...) wrote:
> > round trip: [ ruby ]
> 
> I wouldn't use the inline array syntax for the "round trip" specification,
> because some implementations (like the pure Python implementation) might not
> support it, and we want to get folks bootstrapped with the tests as soon as
> possible.

Yeah, definitely.  The tests should be composed of implicit sequences and 
mappings.  I've even wondered if document parsing is a bit much for starters.  
I didn't add document parsing until version 0.19.  Then again, I guess it's 
not too difficult in most languages to split the file on the '---' and 
handle each test individually.
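
A rough sketch of that splitting step in Python, assuming each test is its
own document and that a separator is a line containing only '---'.  The
function name split_documents is made up for illustration and is not part
of any existing implementation.

```python
def split_documents(text):
    """Split a multi-document test file into one string per document.

    Assumption: a separator is a line that contains only '---'
    (trailing whitespace ignored). Real YAML allows more after the
    marker; this sketch deliberately does not handle that.
    """
    docs, current = [], []
    for line in text.splitlines():
        if line.rstrip() == "---":
            # Close the document collected so far, if any.
            if current:
                docs.append("\n".join(current))
            current = []
        else:
            current.append(line)
    if current:
        docs.append("\n".join(current))
    return docs


if __name__ == "__main__":
    sample = "---\nname: first\n---\nname: second\n"
    for doc in split_documents(sample):
        print(repr(doc))
```

A new parser could then feed each returned string to its own mapping
parser, one test at a time, without supporting multiple documents itself.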

We should also put a limit on the character set used in the unquoted strings.
Many new parsers may not be able to handle:

  ---
  name: YAML Specification 1.0: URI Escaping

I'd say that if we want to use such characters, we have to put the string
in a block.  
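
One way to enforce such a limit is a small check run over the test files.
The sketch below is in Python, and the allowed character set (letters,
digits, space, '.', '_' and '-') is purely an assumption for illustration,
since no set has been agreed on.  The name is_conservative_plain is
hypothetical as well.

```python
import re

# Assumed "safe" set for unquoted strings: starts with a letter or digit,
# then letters, digits, spaces, '.', '_' or '-'. This set is illustrative,
# not something the list has decided.
_CONSERVATIVE = re.compile(r"[A-Za-z0-9][A-Za-z0-9 ._-]*")


def is_conservative_plain(value):
    """Return True if value uses only the assumed safe characters."""
    return _CONSERVATIVE.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ("YAML Specification 1.0",
                      "YAML Specification 1.0: URI Escaping"):
        print(candidate, "->", is_conservative_plain(candidate))
```

Under that assumed set, the second string is rejected because of the colon,
so it would have to go into a block instead.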

> We need to decide whether round tripping is the exception or the rule.  I
> think it might make more sense to have a "no_round_trip" specifier, for the
> following reasons:
> 
> 1) When you're not round tripping, it's more clearly documented in the tests.
> 2) When you get round tripping working, you get the satisfaction of removing
> the line from the test.
> 3) The test suites stay smaller.

If we do use it, I'm going to stick with 'round trip' over 'no round trip' for 
these reasons:

1) New authors wouldn't have to go through and flag every test they can't round
trip, which is bound to be most of them.
2) When you get round tripping working, you get the satisfaction of adding
the line into the test. ;) I'm joking.  Sorry.
3) For me, round tripping is an exception.  I don't plan on being able to
round trip all of the tests.  I would simply have to store too much parsing data
in the structures returned to the user.

Perhaps we could just flag tests that are intended to be round tripped.  For
example, Clark (in the discussion about an ID element) mentioned that anchors
and aliases are not designed to be round-trip-able.

The spec says:
"An alias node only exists in the syntax and serial models. When converted to 
the generic model, an alias node becomes a second occurrence of the anchored 
node."

So, examples with anchors and aliases shouldn't round trip.  Conversely, I'd
find it very interesting to see which implementations can round trip private
types and domain types.  Seeing how the different implementations handle
this, and whether they can both dump and load a domain type, would be quite
cool.

> We should define round tripping precisely.  When I say things round trip, all
> I mean is you can take the python, export it to YAML, read it back into
> python, and it's the same python.  The round tripping aspect of a test is
> often somewhat orthogonal to the "YAML parsing" aspect of the test.
> 
> Round tripping tests an implementation to be a pure serializer, but it
> doesn't test its ability to deal with multiple forms of human-entered input.

This is a very good point.  I think you're right about the round trip aspect
being orthogonal to the parsing.  I definitely think there are implementations
which could skip the round trip altogether.  I'd want the tests to be usable
for someone who is just writing a simple YAML parser.

I think the whole idea of the round trip field is just for interest's sake.
Just like the 'brief' field, it's a bit of extra information that seems
worthy of belonging in our central file cabinet.

> > I've noticed that Brian also has notes in his tests, which give reasons why
> > the test won't round trip.  It would be nice for implementation authors to
> > have room for notes.  Not sure where it would go.
> 
> I think we could just do this ad-hoc for now.  Add a "notes" key at first if
> you want notes, and then if different implementations want to note different
> things, we can work out a system later.

Yeah, sounds good.

> > Also, let's say different implementations crop up for different languages.
> > Maybe it would be nice to accommodate a section for each implementation
> > instead, with notes.
> >
> 
> I don't think naming conflicts will be a big issue.  I could see a breakdown
> like this:
> 
> perl - Brian's perl
> ruby - Why's ruby
> python - Steve's python
> libyaml_perl - Neil/Brian's perl
> libyaml_python - Neil/Clark's python
> (etc.)

Also, agreed.  

_why