From: Oren Ben-K. <or...@ri...> - 2001-11-13 16:51:48
|
Clark C . Evans wrote:
> | - Scalar forms. I summarized the new proposal, which requires 5
> |   separate scalar types. We can go with it, keep things as they are
> |   today, or try to simplify it further.
>
> How about... [block, folded and "in-line" - the latter being quoted,
> text or implicit]

I thought about it. The problem is our requirements:

- We need a 'block' type.
- We need to handle RFC822 style: in-line-folded.
- We need a most general form for unprintables etc.
- We need this most general form to be valid in keys. Quoted does it.

In theory this gives us three types so far. But...

- We need a simple folded form in a next-line style for top-level.
  \<eol> does that.
- We need the most general form to have a next-line style for
  top-level. \\<eol> (escaped) does that.

Back to 5 styles.

Now, I don't see how we can merge any of the first three styles in a
meaningful way. But we may be able to merge the last two with their
in-line (key) counterparts. After all, just being next-line vs. being
in-line shouldn't be such a big deal, right?

So, I thought we could do the following:

- A folded scalar has a '\<eol>' or '\<space>' indicator. The latter is
  optional if the scalar starts with something reasonable (not a \, not
  an indicator, not a space, etc.).

    So: \ this starts with a space
    And: \ \this starts with a '\'.
    While: this is what you'd expect.
    Note: \ *all* are implicitly typed.

  Otherwise we didn't save a style (IMHO).

This brings us back to 4 styles. It also provides a nice way to handle
in-line folded starting with line noise. Of course, if we do that there
would be no good reason not to allow it for blocks too (at this point
Brian says 'ouch', I expect - but what's the harm? Besides, we could
choose not to if you really feel bad about it).

Next, escaped vs. quoted. I couldn't come up with a better alternative
than today's escaped:

    \\
    No need to escape " here.

However, I think we can simply word the spec in such a way that it would
be "the same style" as "quoted". Just call all of single/double-quoted
and next-line escaped an "escaped" style. After all, if we can merge all
the 'folded' sub-styles together, we can do the same for 'escaped'; and
having to escape " or ' in some sub-styles isn't _that_ much worse than
not being able to use a : in an in-line folded key. Right?

If we go this way it may make sense to change the syntax for chomped
block into '||' instead of '|-' (just like we have \ and \\ we'll have |
and || - I wonder how Brian is taking this so far :-).

Summary:

    block_indicator  ::= '|' '|'? ( #20          -- not in top-level
                                  | eol indent(n) )
    block            ::= block_indicator block-data-as-today
    folded_indicator ::= '\' ( #20               -- not in top-level
                             | eol indent(n) )
                       | nice-char               -- not in top-level
    folded_data      ::= folded_indicator any-folded-data-as-today
    escaped          ::= '"' double-escaped '"'  -- not in top-level
                       | ''' single-escaped '''  -- not in top-level
                       | \\ eol any-escaped

What do you think? Almost the same as today but we can weasel it into
being just three styles.

BTW the productions would be *way* easier if the separation between
"text" and "implicit typed" was at the wording level (``the first regexp
in the parser's implicit types list *must* be "^alpha" for
"org.yaml.text"''). That would save us about half the folded
productions...

> | - References. OK, Alias != Pointer. Fine. Now, how does YAML serialize
> |   "\$a->[0]"? I posted an example showing why this is hard (and how
> |   Data::Dumper gets it wrong). I also showed how it can be done given
> |   some creative use of the '!ref' type. We need to decide on this.
>
> Since it appears Perl is kinda "broken" here... Brian?
> The Perl version could just treat it as one
> would expect from serializing C,

No help. You can do the same thing in C as in Perl. Serializing this is a
bitch, but it is the same problem in Perl and C.

> This one is squarely in Brian's domain. :-)

I think my proposal is the only one which is "safe".

> | - Info model. If Clark comes up with a reformulation of
> |   the info model that JAP can grok, I'd be happy to include
> |   it into the spec, and more than happy to re-work the simpler
> |   productions :-)
>
> Great.

I read Clark's latest on this and I agree. Now, Clark, if you could just
re-draft section 2.1 (Information model) along these lines :-) You are
doing a good job and I think you can cut & paste some of your last post
directly into it - especially the examples!

> | - Pipelining. Let's leave it out for now - right?
>
> Base64 is the only use case... it'd be nice to have
> a cleaner solution for this though. Perhaps it's best
> just to include base64 methods and let the user handle
> the issue...

Right.

> | - YPATH, APIs. Let's not go there yet.
>
> Ok. However I don't think we can move to "release candidate"
> till some of the APIs are worked out, in case there is
> some inconsistency we need to fix-up.

I guess we would just have a "release candidate candidate" until then :-)

> | At any rate, we are very close. Let's finish it.
>
> +1

So, Brian's work is required on the references, Clark's 2.1 section is
needed for the info model, and I need your thoughts on my 3-styles scalar
format proposal above... IRC today?

Have fun,

Oren Ben-Kiki
From: Oren Ben-K. <or...@ri...> - 2001-11-13 17:01:59
|
Clark C . Evans wrote:
> | I like the syntax & semantics of these 5 types *exactly* as you
> | posted.
>
> Agreed.

OK but consider my latest as well.

> | > - References. OK, Alias != Pointer. Fine. Now, how does YAML
> | >   serialize "\$a->[0]"?
> |
> | I'll handle Perl with what we have. Enough said.
>
> Ok.

I'm curious (said the cat). How will you do that?

> | > - Info model. If Clark comes up with a reformulation of the info
> | >   model that JAP can grok, I'd be happy to include it into the spec,
> | >   and more than happy to re-work the simpler productions :-)
> |
> | OK
>
> It'll take me a few days.. I have day job work to do.

No rush.

> | > - Pipelining. Let's leave it out for now - right?
> |
> | I just want to know how to serialize:
> |
> | print YAML->Emit(bless [], "Foo::Bar");

We keep underestimating the power the existing building blocks give us.
The answer is, of course, you serialize it as:

    --- !!Foo::Bar
    =: !seq

:-)

Come to think of it, this means one can safely write:

    --- !gif
    = : !binary \
    [= base64 =]

No pipelining syntax needed! What I hate about 'pipelining' is that it
messes up the info model. If YAML has to round trip the pipeline I think
the above syntax reflects it in a better way.

BTW, should we allow ':' in private types? Currently it is banned (you
can only use alnum, '-' and '.'). Should we just allow any printables?
How about DNS-based names: should the part after the DNS name be
anything, or restricted to (alnum '-' '.') as today?

> | Hey, final note. What's going to keep us from mucking around with it
> | when it's finished? I think we should all agree on some kind of frozen
> | period, where nobody is allowed to suggest major changes until we get
> | a few implementations completed and through an alpha test phase.
>
> Sounds good.

Agreed.

Have fun,

Oren Ben-Kiki
From: Clark C . E. <cc...@cl...> - 2001-11-13 18:01:45
|
On Tue, Nov 13, 2001 at 07:02:48PM +0200, Oren Ben-Kiki wrote:
| > | print YAML->Emit(bless [], "Foo::Bar");
|
| We keep underestimating the power the existing building blocks give us.
| The answer is, of course, you serialize it as:
|
| --- !!Foo::Bar
| =: !seq
|
| :-)
|
| Come to think of it, this means one can safely write:
|
| --- !gif
| = : !binary \
| [= base64 =]

Neat. Actually... this is kinda cool.

Speaking of which, as I recall, depending on the size of the binary
object (if it's not on a 32 bit boundary), a base64 value may end with
0, 1, 2, or 3 equal signs (or something like that). Thus the trailing
equal sign is part of the base64 content and may not be there in all
cases...

    [= base64 ]

I humbly suggest that this implicit syntax addition be dropped... it
certainly doesn't help to clarify things. Note that we can always add it
later: if the input to the base64 module sees a [= it knows that it came
from an implicit source.

| No pipelining syntax needed! What I hate about 'pipelining' is that it
| messes up the info model. If YAML has to round trip the pipeline I
| think the above syntax reflects it in a better way.

Neat o.

| BTW, should we allow ':' in private types? Currently it is banned (you
| can only use alnum, '-' and '.'). Should we just allow any printables?
| How about DNS-based names: should the part after the DNS name be
| anything, or restricted to (alnum '-' '.') as today?

I think the private type should allow any printable.

Best,

Clark
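Editorial note on the padding question above: a quick check, sketched here in Python rather than Perl, suggests the count is a little different from what Clark recalls. Standard base64 (RFC 4648) encodes the input in 3-byte groups, so the encoded text ends with 0, 1 or 2 '=' characters depending on the input length modulo 3, and never with 3. The helper name `padding_count` is made up for this illustration and is not part of any YAML tool.

```python
import base64


def padding_count(payload: bytes) -> int:
    """Return how many '=' characters end the standard base64 encoding."""
    encoded = base64.b64encode(payload).decode("ascii")
    return len(encoded) - len(encoded.rstrip("="))


# Show the encoded form and padding for a few short inputs.
for size in range(7):
    payload = b"x" * size
    text = base64.b64encode(payload).decode("ascii")
    print(size, repr(text), padding_count(payload))
```

Either way the conclusion in the thread stands: the trailing '=' characters belong to the base64 data itself, so they cannot double as a closing delimiter.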
From: Brian I. <in...@tt...> - 2001-11-13 19:35:16
|
On 13/11/01 19:02 +0200, Oren Ben-Kiki wrote:
> Clark C . Evans wrote:
> > | I like the syntax & semantics of these 5 types *exactly* as you
> > | posted.
> >
> > Agreed.
>
> OK but consider my latest as well.

Change the wording if you like. Please leave the syntax & semantics as
is.

> > | > - References. OK, Alias != Pointer. Fine. Now, how does YAML
> > | >   serialize "\$a->[0]"?
> > |
> > | I'll handle Perl with what we have. Enough said.
> >
> > Ok.
>
> I'm curious (said the cat). How will you do that?

I need more than "\$a->[0]" to answer the question. I am not going to try
to guess what you mean. Please give a full Perl example, and please keep
in mind the things I pointed out previously.

The above is a reference to a scalar that just happens to be part of an
array. I would not attempt to indicate anything about the array unless
that were being serialized too. Is that what you want?

> > | > - Pipelining. Let's leave it out for now - right?
> > |
> > | I just want to know how to serialize:
> > |
> > | print YAML->Emit(bless [], "Foo::Bar");
>
> We keep underestimating the power the existing building blocks give us.
> The answer is, of course, you serialize it as:
>
> --- !!Foo::Bar
> =: !seq

Cool!

> BTW, should we allow ':' in private types? Currently it is banned (you
> can only use alnum, '-' and '.'). Should we just allow any printables?
> How about

I'd like ':', but I wouldn't just open up the door so soon. How about on
an as needed basis?

Cheers,

Brian
From: Clark C . E. <cc...@cl...> - 2001-11-13 19:44:28
|
| > > | > serialize
| > > | > "\$a->[0]"?
|
| The above is a reference to a scalar that just happens to be part of an
| array. I would not attempt to indicate anything about the array unless
| that were being serialized too. Is that what you want?

This satisfies my curiosity.

| > --- !!Foo::Bar
| > =: !seq
|
| Cool!

Yes. Nice insight Oren. Strike "pipelining" from the todo list.

| > BTW, should we allow ':' in private types? Currently it is banned
| > (you can only use alnum, '-' and '.'). Should we just allow any
| > printables? How about
|
| I'd like ':', but I wouldn't just open up the door so soon. How about
| on an as needed basis?

I think fixing the production to allow anything other than whitespace
after a private marker (!!) will be good enough. It's private, they can
do whatever they wish here. ;)

Clark
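Editorial sketch of the two candidate rules for private type names discussed above, written in Python for illustration only. Both regular expressions are assumptions read off the thread's wording: the restricted form allows only alnum, '-' and '.' after the '!!' marker, while the relaxed form allows anything other than whitespace. Neither is the spec's actual production, and `is_private_type` is a hypothetical helper name.

```python
import re

# Assumption: the rule "as today" allows only alnum, '-' and '.' after '!!'.
RESTRICTED_PRIVATE = re.compile(r"!![A-Za-z0-9.-]+")

# Assumption: the relaxed rule allows any run of non-whitespace after '!!'.
RELAXED_PRIVATE = re.compile(r"!!\S+")


def is_private_type(tag: str, relaxed: bool = True) -> bool:
    """Check a type tag against the relaxed or the restricted rule."""
    pattern = RELAXED_PRIVATE if relaxed else RESTRICTED_PRIVATE
    return pattern.fullmatch(tag) is not None


# The Perl package name from the thread passes only under the relaxed rule.
print(is_private_type("!!Foo::Bar"))
print(is_private_type("!!Foo::Bar", relaxed=False))
```

Under these assumptions `!!Foo::Bar` is accepted only by the relaxed rule, which is the case that prompted the question about ':' in the first place.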
From: Oren Ben-K. <or...@ri...> - 2001-11-13 18:09:01
|
Clark C . Evans wrote:
> | --- !gif
> | = : !binary \
> | [= base64 =]
>
> Neat. Actually... this is kinda cool.

Yeah!

> Speaking of which, as I recall, depending on the size of the
> binary object (if it's not on a 32 bit boundary), a base64 value
> may end with 0, 1, 2, or 3 equal signs (or something like that).
> Thus the trailing equal sign is part of the base64 content
> and may not be there in all cases...
>
> [= base64 ]

Oh.

> I humbly suggest that this implicit syntax addition be dropped...
> it certainly doesn't help to clarify things. Note that we
> can always add it later: if the input to the base64 module
> sees a [= it knows that it came from an implicit source.

So no trailing ']' either, right? Just the pure base64 data. I'm OK with
that.

> | BTW, should we allow ':' in private types? Currently it is banned
> | (you can only use alnum, '-' and '.'). Should we just allow any
> | printables? How about
> | DNS-based names: should the part after the DNS name be anything, or
> | restricted to (alnum '-' '.') as today?
>
> I think the private type should allow any printable.

In that case I move we go back to the previous wording: (from memory)
``type names beginning with a reverse DNS entry are reserved for the DNS
domain's owners; type names starting with an IANA type are reserved for
that type's owners; type names starting with '!' are private...'' - and
simplify the productions back to a single one using in_line_non_char+.
OK?

Have fun,

Oren Ben-Kiki
From: Brian I. <in...@tt...> - 2001-11-13 19:35:35
|
On 13/11/01 20:09 +0200, Oren Ben-Kiki wrote:
> Clark C . Evans wrote:
> > | BTW, should we allow ':' in private types? Currently it is banned
> > | (you can only use alnum, '-' and '.'). Should we just allow any
> > | printables? How about
> > | DNS-based names: should the part after the DNS name be anything, or
> > | restricted to (alnum '-' '.') as today?
> >
> > I think the private type should allow any printable.
>
> In that case I move we go back to the previous wording: (from memory)
> ``type names beginning with a reverse DNS entry are reserved for the
> DNS domain's owners; type names starting with an IANA type are reserved
> for that type's owners; type names starting with '!' are private...'' -
> and simplify the productions back to a single one using
> in_line_non_char+. OK?

Fine with me. As long as we are covered for everything else. I was just
being cautious.

Cheers,

Brian
From: Brian I. <in...@tt...> - 2001-11-13 19:35:07
|
On 13/11/01 18:52 +0200, Oren Ben-Kiki wrote:
> Clark C . Evans wrote:
>
> - A folded scalar has a '\<eol>' or '\<space>' indicator. The latter is
> optional if the scalar starts with something reasonable (not a \, not
> an indicator, not a space, etc.).
>
> So: \ this starts with a space
> And: \ \this starts with a '\'.
> While: this is what you'd expect.
> Note: \ *all* are implicitly typed.
> Otherwise we didn't save a style (IMHO).

Don't like it. Don't want to put any weight on whitespace. And it's ugly
IMO.

> This brings us back to 4 styles. It also provides a nice way to handle
> in-line folded starting with line noise. Of course, if we do that there
> would be no good reason not to allow it for blocks too (at this point
> Brian says 'ouch', I expect - but what's the harm? Besides, we could
> choose not to if you really feel bad about it).

Ouch!

I'd really like to stick to the exact syntax and semantics. The wording
is completely up to you. Take it as a challenge :)

FWIW, I have no problem with 5 scalar types and 500 productions.

> So, Brian's work is required on the references, Clark's 2.1 section is
> needed for the info model, and I need your thoughts on my 3-styles
> scalar format proposal above...

What exactly is left to do on the references? You lost me here. I thought
we were all set.

Cheers,

Brian
From: Oren Ben-K. <or...@ri...> - 2001-11-14 06:38:13
|
Brian Ingerson wrote:
> > > | I like the syntax & semantics of these 5 types *exactly* as you
> > > | posted.
> > >
> > > Agreed.
> >
> > OK but consider my latest as well.
>
> Change the wording if you like. Please leave the syntax & semantics as
> is.

OK.

> > | > - References. OK, Alias != Pointer. Fine. Now, how does YAML
> > | > serialize "\$a->[0]"?
> ...
> I need more than "\$a->[0]" to answer the question. I am not going to
> try to guess what you mean. Please give a full Perl example, and please
> keep in mind the things I pointed out previously.

The case of interest is where Data::Dumper fails:

    my $t = 'text';
    $a = [];
    $a->[0] = $t;
    $a->[1] = \$t;
    $b = [ \$a->[0] ];
    YAML->Dump(somewhere, [ $b, $a ]);
    ($B, $A) = YAML->Load(somewhere);
    $a->[0] = 'new';
    $A->[0] = 'new';
    print "'", ${$b->[0]}, "' ?= '", ${$B->[0]}, "'\n";

I'm curious about how you are going to handle this case; you could always
fail it in the same way Data::Dumper does (try it there and see what I
mean). That's OK with me, I guess; after all people are happy with
Data::Dumper (formatting issues aside), so they would accept YAML if it
"fails in the same way". Perhaps it would even be a feature :-)

> > BTW, should we allow ':' in private types?
...
> I'd like ':', but I wouldn't just open up the door so soon.
> How about on an as needed basis?

I'm not certain what that means. Are you OK with just reverting to the
older wording? (start with DNS/IANA -> owned; start with '!' -> private;
anything else -> reserved; type string can contain anything).

Have fun,

Oren Ben-Kiki
From: Clark C . E. <cc...@cl...> - 2001-11-14 06:49:09
|
| > > | > - References. OK, Alias != Pointer. Fine. Now, how does YAML
| > > | > serialize "\$a->[0]"?
| > ...
| > I need more than "\$a->[0]" to answer the question. I am not going to
| > try to guess what you mean. Please give a full Perl example, and
| > please keep in mind the things I pointed out previously.
|
| The case of interest is where Data::Dumper fails:
|
| my $t = 'text';
| $a = [];
| $a->[0] = $t;
| $a->[1] = \$t;
| $b = [ \$a->[0] ];
| YAML->Dump(somewhere, [ $b, $a ]);
| ($B, $A) = YAML->Load(somewhere);
| $a->[0] = 'new';
| $A->[0] = 'new';
| print "'", ${$b->[0]}, "' ?= '", ${$B->[0]}, "'\n";
|
| I'm curious about how you are going to handle this case; you could
| always fail it in the same way Data::Dumper does (try it there and see
| what I mean). That's OK with me, I guess; after all people are happy
| with Data::Dumper (formatting issues aside), so they would accept YAML
| if it "fails in the same way". Perhaps it would even be a feature :-)

The question is, can you detect in Perl that \$t is \$a->[0]? I think
where Data::Dumper gets messed up is when it tries to get cute. If it
just did the "dumb" thing, wouldn't it work well?

Clark
From: Clark C . E. <cc...@cl...> - 2001-11-14 07:29:41
|
On Wed, Nov 14, 2001 at 02:01:24AM -0500, Clark C . Evans wrote:
| | > > | > - References. OK, Alias != Pointer. Fine. Now, how does YAML
| | > > | > serialize "\$a->[0]"?
| | > ...
| | > I need more than "\$a->[0]" to answer the question. I am not going
| | > to try to guess what you mean. Please give a full Perl example, and
| | > please keep in mind the things I pointed out previously.
| |
| | The case of interest is where Data::Dumper fails:
| |
| | my $t = 'text';
| | $a = [];
| | $a->[0] = $t;
| | $a->[1] = \$t;
| | $b = [ \$a->[0] ];
| | YAML->Dump(somewhere, [ $b, $a ]);
| | ($B, $A) = YAML->Load(somewhere);
| | $a->[0] = 'new';
| | $A->[0] = 'new';
| | print "'", ${$b->[0]}, "' ?= '", ${$B->[0]}, "'\n";
| |
| | I'm curious about how you are going to handle this case; you could
| | always fail it in the same way Data::Dumper does (try it there and
| | see what I mean). That's OK with me, I guess; after all people are
| | happy with Data::Dumper (formatting issues aside), so they would
| | accept YAML if it "fails in the same way". Perhaps it would even be a
| | feature :-)
|
| The question is, can you detect in Perl that \$t is \$a->[0]?
| I think where Data::Dumper gets messed up is when it
| tries to get cute. If it just did the "dumb" thing, wouldn't
| it work well?

*blush* Ignore what I wrote... I'm tired.

Clark