Oren Ben-Kiki wrote:
>
> > > 1. Brian stated that he would investigate Oren's Syntax
> > > and get back with us if it meets Perl's serialization
> > > requirements for hard references. If not, specify
> > > what alternatives we can use.
> >
> > I don't think it's that important to investigate. It will probably
> > always be a moot point. I will let Data::Denter use its current scheme
> > to deterministically round-trip all Perl data structures. YAML.pm
> > probably will have no need for this. It's all academic and I have no
> > spare time for academics for three more months. (My guess is, yes it
> > could be made to work, but would be suboptimal for Perl people) Let's
> > leave it at that for now.
>
> Does that mean we are giving up on Denter using YAML syntax (extended to
> handle pointer-to-pointer)?
Just for the record, the Perl component for YAML is called YAML.pm.
Data::Denter is only of interest to Perl programmers from this point on.
It may fondly be remembered as the catalyst for YAML 1.0. And it may
keep a greedy eye on the YAML project's treasures, but that's of no
concern here.
>
> I'm going to go over it with a fine-tooth comb, just to see what is involved
> in making YAML a superset of it. I guess I'll also have to look at MIME
> while I'm at it, with the same comb :-)
Beware of the nits! Nasty buggers. ;)
>
> > On 4 & 5. I don't really like the blank line at the beginning thing
> > because people will mess it up or not understand it. And we have many
> > heuristic options.
> >
> > A) Parse lookahead for X-YAML-Version
> > B) Option-A rarely needed because as soon as we see a key that is *not*
> > RFC822 compliant, we assume YAML. 99% of the time this is the first
> > line!
> > C) If there is no whitespace allowed before the colon in RFC822, we
> > simply make it a requirement in YAML. Or does this break your RFC
> > compatibility rules?
> >
> > Just for my own edification, would you please explain the rationale
> > behind making YAML RFC822 compliant. And do so with one or more specific
> > examples. Thanks :)
>
> Well, for example, suppose that YAML was a "good enough" superset of RFC822.
> Then we could just adopt my idea that "blank lines separate top-level maps"
> and we wouldn't have to say anything further about RFC822 headers, period.
> If one wants to read/write a mail message as a YAML document, then it will
> simply work (as long as he sticks to the "safe" constructs there). If one
> wants to have a YAML document that has nothing to do with RFC822, that also
> works. No need for any special statement about them. I like this approach
> best.
I think that sounds right, if I understand it correctly.
My only contention above was the very first blank line, not the ones
separating documents.
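
For what it's worth, heuristics A through C quoted above could be
sketched roughly as follows. This is Python purely for illustration;
the function name guess_format and the exact field-name pattern are
assumptions made for this sketch, not anything we have agreed on. It
treats a header key as RFC822 compliant only if it has no whitespace
before the colon.

```python
import re

# RFC 822 field-name: one or more printable ASCII characters other than
# space and ':', followed immediately by the colon (no whitespace before it).
RFC822_FIELD = re.compile(r"^[\x21-\x39\x3b-\x7e]+:")

def guess_format(text):
    """Hypothetical helper: guess 'yaml' or 'rfc822' from the leading lines."""
    for line in text.splitlines():
        if not line.strip():
            break                      # a blank line ends the header block
        if line[0] in " \t":
            continue                   # continuation of the previous header
        if line.lower().startswith("x-yaml-version:"):
            return "yaml"              # heuristic A: explicit version header
        if not RFC822_FIELD.match(line):
            return "yaml"              # heuristics B/C: key is not RFC822 style
    return "rfc822"

print(guess_format("key1 : value\n"))
print(guess_format("From: a@example.org\nSubject: hi\n\nbody\n"))
```

With keys written as 'key : value', the very first line already fails
the RFC822 test, so the lookahead for X-YAML-Version is rarely reached.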
> >
> > " this is the hash\n key for this example :-) " : #class :
>
> I assume the trailing ':' is a typo?
>
No. See my earlier post for the reasoning.
> > |# My Perl Subroutine
> > |
> > | sub version {
> > | if ($_[0] =~ /\n/) {
> > | return \ "\to sender";
> > | }
> > | }
> >
> > Sorry for overloading this example with so many weird things. I'll just
> > comment on the multiline semantics:
> >
> > A) Trailing whitespace is preserved if the transporter preserves it.
> > B) The content can always be encoded before transport anyway.
> > C) Nothing is escaped. The content is truly verbatim. A '\' is a '\'.
> > D) An implicit newline is assumed to be at the end of every line.
>
> We have to decide what our position is about them, BTW. Is a newline a "\n"
> or a "\r\n" - the answer may be different in-memory and in the text file
> (and thank you, O nameless DOS/CPM programmer, for inflicting this on us :-)
Bastard of Bastards. :(
But I think the heuristic is quite simple. Since the newline is
implicit, just replace whatever is there with the system's native
choice.
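
A minimal sketch of that heuristic, in Python for illustration only
(the name normalize_newlines is made up here): whatever line ending
appears in the text is replaced by the platform's native one.

```python
import os
import re

def normalize_newlines(text, native=os.linesep):
    """Replace CRLF, lone CR, or lone LF with the native line ending."""
    # CRLF is listed first so it is treated as one newline, not two.
    return re.sub(r"\r\n|\r|\n", native, text)

print(repr(normalize_newlines("one\r\ntwo\rthree\nfour", native="\n")))
```

The native parameter is only there so the behaviour can be checked on
any platform; by default it follows os.linesep.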
>
> > E) Note that the '|' is one column back from the actual indentation
> > level. This is intentional. And it will work even if the indent width
> > is set to one character wide. (not mandatory, but I like it.)
>
> Under Python indentation rules, there's no problem indenting the "label"
> line by 4 characters and the text lines by 7, or whatever. What you say
> about one character indentation, however, implies that the following would
> be legal:
Yes. It would be legal.
>
> text:
> |multi-line
> |text
>
> I'm not certain I like it. I think Clark should make the call here -
> indentation is his baby.
I actually don't like it for another subtle reason. Tabs. You couldn't
use them properly with this scheme. So let's scrap the backing up one
space requirement. And yes, that's my final answer ;)
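
To make the verbatim semantics (points C and D above) concrete, here
is a rough Python sketch. The function name read_verbatim_block is
hypothetical, and the sketch assumes the '|' may sit at any column,
since the back-up-one-space rule is being dropped. Nothing after the
'|' is unescaped, and each line gets an implicit newline.

```python
def read_verbatim_block(lines):
    """Collect consecutive '|' lines into one string, exactly as written."""
    pieces = []
    for line in lines:
        stripped = line.lstrip(" \t")      # the column of '|' does not matter
        if not stripped.startswith("|"):
            break                          # first non-'|' line ends the block
        pieces.append(stripped[1:])        # keep everything after '|' verbatim
    # Point D: an implicit newline follows every line of the block.
    return "".join(piece + "\n" for piece in pieces)

block = [
    "   |sub version {",
    r"   |    return \ 1;",
    "   |}",
    "key : value",
]
print(read_verbatim_block(block), end="")
```

A backslash in the input stays a backslash in the result; the only
thing the reader adds is the newline at the end of each line.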
>
> I started thinking about it and hit on an issue which Brian may already have
> thought about - or will have to very soon, if he's covering YAML.pm :-) The
> problem is we haven't defined the data model (or, viewing it differently,
> the round-tripping issue).
>
> In "dynamic" languages such as Perl, JavaScript, Python (and to some extent,
> Java), it is natural to map a YAML map to the native hash, a list to a
> vector/array, and a scalar value to a simple string. That works admirably
> well, as long as the YAML entity hasn't been annotated with an ID or a class
> name.
>
> If one wants to provide a stable round-tripping utility (e.g., suppose I
> want to write a YAML pretty printer), where am I to store the ID of a scalar
> value? The class of a map? For this use case, it seems my best course of
> action is to wrap the native construct (map/list/scalar) in an object which
> has an "id", a "class", and a "value".
>
> There are several options:
>
> A) Use the native constructs when possible, and only use "wrapper" objects
> when there's a need. That makes the access pattern unpredictable: do I write
> map{key} or map{key}.value?
That's my idea.
>
> B) Always use wrapper objects, and give up on de-serializing YAML into
> arbitrary native data structures. Big hit on usefulness - if we do this,
> Brian will just give up on us :-)
You're getting to know me pretty well ;)
>
> C) Declare that IDs may be re-written arbitrarily, even by pretty printers.
> That is, banish them from the data model.
I think I agree...
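
As an illustration of options A and C taken together, here is a small
Python sketch. Everything in it (the Wrapped class, load_node, unwrap)
is invented for the example and is not a proposed API. Native values
are returned whenever there is no class, a wrapper is used only when a
class is attached, and IDs are simply dropped from the data model.

```python
from dataclasses import dataclass
from typing import Any, Optional

@dataclass
class Wrapped:
    """Hypothetical wrapper, used only when a node carries a class."""
    value: Any
    cls: Optional[str] = None

def load_node(value, cls=None, node_id=None):
    # Option C: node_id is deliberately ignored, so IDs may be rewritten freely.
    if cls is None:
        return value                 # Option A: plain native construct
    return Wrapped(value, cls)       # wrapper only when there is a need

def unwrap(node):
    """Hide the map{key} versus map{key}.value difference from callers."""
    return node.value if isinstance(node, Wrapped) else node

doc = {
    "plain": load_node("my dog has fleas", node_id="0001"),
    "classy": load_node("I hate fleas", cls="class1"),
}
print(unwrap(doc["plain"]), "/", unwrap(doc["classy"]))
```

The unwrap helper is one way to soften the unpredictable access
pattern Oren points out, at the cost of an extra call at every read.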
>
> That leaves "class" as the only problematic issue. We explicitly decided not
> to talk about it in the conference call. It seems to me like there's no way
> around requiring that this data will survive round-trips, but I also don't
> see how it is possible to de-serialize "scalar value" into a normal "Java
> String" if someone attached an "unknown" class to it.
I've read through this briefly, but don't have time to comment yet.
Let's stick with the original syntax for now.
In general, keep in mind that YAML 1.0 will *not* be the final YAML
spec. It will evolve to YAML 2.0 and so on. For now, let's strive for
maximum syntactic simplicity. I think we can special case the semantics
of 1.0 without needing to change the current syntax.
> > >
> > > 12. Brian mentioned that he'd show YAML to one of
> > > his Perl friends. (sorry I didn't catch his name)
> >
> > Damian Conway http://www.csse.monash.edu.au/~damian/
>
> His input will be greatly appreciated.
Emailed Damian last night. He's preparing for an 11-week world speaking
tour. I'll see him in June at the YAPC (Yet Another Perl Conference) in
Montreal and I'll be sure to pin him down about YAML. BTW, I mentioned
to Clark that I'll probably be speaking about YAML at YAPC :)
> > > 15. Clark agreed to write up the "single vs multi"
> > > line controversy and post to the list so that
> > > it is clearly understood.
>
> I thought we settled this... Every scalar value is potentially multi-line.
> It doesn't seem to cost us anything, or does it?
I agree but see below.
>
> > > 16. We made little progress on the scalar indicator
> > > for lists, to colon or not to colon. It wasn't
> > > agreed, but Clark thinks this is someone else's
> > > monkey. If Oren and Brian can't agree within
> > > 7 days, Clark will put on the dictator cap.
> >
> > We traded in the '$' for the ':'. '$' as the last character in a line
>
> I thought ':' was the first one; it is "as if" it is a normal header, with
> the key "just happening" to be empty. This seems more consistent.
>
> > meant a multiline scalar was to follow. Converting this semantic to the
> > ':' leaves us with these representations:
> >
> > key1 : @
> > single line
> > :
> > classless folded
> > multi line
> > another single line
> > and another
> > #class &0001 :
>
> : #class &0001
No, not a mistake.
>
> > classed multi
> > line
> > #class &0002 classed single line
> > %
> > key : value
> > @
>
> This is an empty list, right?
Yup. Just to keep you on your toes :)
>
> > ~
>
> And this is a null?
Indeed.
>
> > #classy %
> > key : value
> > : even this multi line on the same line
> > as a colon thingy works because there is
> > a little bit of indentation imposed by
> > colon. (Although I don't love it)
>
> This means the following:
>
> : single line
>
> Will also work, even though you *really* dislike it. I like them :-)
Noted :-)
>
> > : "Another thingy like above that meets"
> > "RFC822 wackiness"
> > :
> > | 1
> > | 1 1
> > | 1 1 1
> > |Just for completeness :-)
>
> I think we've said everything there's to be said about this, and whether or
> not you find either:
>
> list:
> : One
> : Two
> : Three
> and Four
>
> Or:
>
> list:
> One
> Two
> :
> Three
> and Four
>
> To be beautiful or ugly is, when all is said and done, a matter of taste. To
> you, the extra ':'s are an eyesore; to me it seems strange that the
> multi-line value is "more indented"; it seems as though there's structure
> involved, when there isn't. I also like being able to do /^:/ in VI to get
> to the next entry.
While your comment on aesthetics may be true, there is a major
distinction between what you think a ':' means and my intent.
1) A ':' is always a key value separator. We agree on that, but we each
want it to have one other meaning.
2) You want colon to be a "list bullet" in list context.
3) I want ':' to mean '$' for scalar values. And I want it to almost
always be optional (unless there is ambiguity)
4) That said. We can make it the canonical/default form for emitters if
we wish.
Consider the following four examples.
1) Fully qualified with '$'.
key1 : $ my dog has fleas
key2 : $ "$40.00 for veterinarian exam"
key3 : $
The vet said, "Yes Ingy,
Your dog has fleas."
key4 : $ Ingy said, "Wow,
my dog has fleas!"
key5 : #class1 $ I hate fleas
key6 : #class2 $
What is your viewpoint
about fleas?
key7 : #class3 @
$ Tom the flea
$ Dick the flea
$ Harry the flea
is not really hairy
%
foo : bar
#class4 %
FOO : BAR
#class5 $ A very classy flea
#class6 $
|My favorite fleas:
| Jim
| Bob
2) Fully qualified with ':'. The only real gain here is that $40.00 no
longer needs quoting.
key1 : : my dog has fleas
key2 : : $40.00 for veterinarian exam
key3 : :
The vet said, "Yes Ingy,
Your dog has fleas."
key4 : : Ingy said, "Wow,
my dog has fleas!"
key5 : #class1 : I hate fleas
key6 : #class2 :
What is your viewpoint
about fleas?
key7 : #class3 @
: Tom the flea
: Dick the flea
: Harry the flea
is not really hairy
%
foo : bar
#class4 %
FOO : BAR
#class5 : A very classy flea
#class6 :
|My favorite fleas:
| Jim
| Bob
3) Minimal
key1 : my dog has fleas
key2 : $40.00 for veterinarian exam
key3 :
The vet said, "Yes Ingy,
Your dog has fleas."
key4 : Ingy said, "Wow,
my dog has fleas!"
key5 : #class1 I hate fleas
key6 : #class2
What is your viewpoint
about fleas?
key7 : #class3 @
Tom the flea
Dick the flea
: Harry the flea
is not really hairy
%
foo : bar
#class4 %
FOO : BAR
#class5 A very classy flea
#class6
|My favorite fleas:
| Jim
| Bob
Note that the only required ':' (besides the key/value ones) is the one
in ': Harry the flea'.
4) Suggested canonical form:
key1 : my dog has fleas
key2 : $40.00 for veterinarian exam
key3 :
The vet said, "Yes Ingy,
Your dog has fleas."
key4 : Ingy said, "Wow,
my dog has fleas!"
key5 : #class1 : I hate fleas
key6 : #class2 :
What is your viewpoint
about fleas?
key7 : #class3 @
: Tom the flea
: Dick the flea
: Harry the flea
is not really hairy
%
foo : bar
#class4 %
FOO : BAR
#class5 : A very classy flea
#class6 :
|My favorite fleas:
| Jim
| Bob
So in this last example we always use the optional scalar indicator ':'
for all scalars in a list (by default). Note that a #class or &id
*always* comes before a %, @, or :. It's just that the ':' is usually
optional.
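
The ordering rule can be sketched in a few lines of Python. This is
only an illustration under my reading of the rule; the function name
parse_value_header and the returned tuple are assumptions of the
sketch. It reads whatever follows the key/value colon (or starts a
list entry): an optional #class, then an optional &id, then an
optional %, @, or : indicator, with ':' implied when no indicator is
present.

```python
def _pop(text):
    """Split off the first whitespace-delimited token."""
    parts = text.split(None, 1)
    if not parts:
        return "", ""
    return parts[0], (parts[1] if len(parts) > 1 else "")

def parse_value_header(text):
    """Return (class, id, indicator, rest) for the text after 'key :'."""
    cls = node_id = None
    rest = text.strip()
    head, tail = _pop(rest)
    if len(head) > 1 and head.startswith("#"):      # #class comes first
        cls, rest = head[1:], tail
        head, tail = _pop(rest)
    if len(head) > 1 and head.startswith("&"):      # then &id
        node_id, rest = head[1:], tail
        head, tail = _pop(rest)
    if head in ("%", "@", ":"):                     # explicit indicator
        return cls, node_id, head, tail
    return cls, node_id, ":", rest                  # ':' is implied

for sample in ("my dog has fleas", "#class1 : I hate fleas",
               "#class1 I hate fleas", "#class3 @", "#class &0001 :"):
    print(parse_value_header(sample))
```

Note that a plain scalar whose text happens to start with something
like '#foo' would be misread as a class here. That is the kind of
ambiguity where the explicit ':' stops being optional.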
The things I don't allow are:
key1 : @
: %
: @
: : a scalar
: #class %
: #class @
: #class : a scalar
The problem with ':' as a "list bullet" is that it could not be
optional. And that's too restrictive just to satisfy a personal
aesthetic.
, Brian
--
perl -le 'use Inline C=>q{SV*JAxH(char*x){return newSVpvf
("Just Another %s Hacker",x);}};print JAxH+Perl'