From: Brian I. <in...@tt...> - 2001-10-25 00:56:13

Oren and Clark,

I've read over the new proposals and rebuttals and I think we're onto
something. Of course, this has given me some new ideas of my own.
Here's my take:

inline1: unquoted
inline2: "quoted"
nextline1::
    unquoted. an ending colon after the first colon
    indicates that a quoted or unquoted scalar follows.
nextline2::
    "If the first character is a double quote (or maybe
    even a single quote???) then it is a quoted scalar\n"
block:|
    a trailing pipe or backslash indicates block form.
    next line intentionally left blank:

    Something you should be aware of at this point
    is that the separator is ': ' or ':| '
    instead of ':' or ':|'. Where the ' ' is
    really /\s+/
    one or more whitespace (including a newline)
no newline block:\
    In other words, the scalar value cannot immediately
    follow the separator. This may not be 100% necessary but
    makes things clearer visually, and fewer edge cases.
    See classnames and references below.
parse error:
    expected map but no colon found
    at this indentation level
inline empty string:
also empty string::
null: ~
map:
    foo: bar
    who is that
    foo who is::
        sitting at the bar?
list:
    : foo
    : bar
    :100: sparse
    ::
        "This string is element 101 in the list."
inline negative number: -42
nextline negative number::
    -42
list2:
    - dash can now easily be used
    - as a list bullet.
    -100: this looks a bit strange
    -:
        But it's a rarer case
    - Just thinking.
object:!org.yaml.foo
    attr1: value
    attr2: value
referenced object:!org.yaml.bar:&001
    attr1: value
    self: *001
scalar based object:!.foo:
    Now we can accommodate scalar based objects.
    This is one because of the trailing colon.
    Also, any class '.foo' can be an abbreviation
    for 'org.yaml.foo'. A single node classname
    like 'foo' should mean that it is private
    (ie user defined).
standalone reference:*001
empty map:!.map
empty list:!.list
another null:!.null
another empty string:!.string
unquoted string:!.unquoted "I like YAML", said the camel.
perl subroutine:!.block
    sub greeting {
        my $name = shift;
        return "Hello $name";
    }
another block:!.chomped
    Look ma:
    NO TRAILING NEWLINE
adjustments from other emails:
    :
        existing:!com.clarkevans.Person
            first: Clark
            last: Evans
        missing:!com.clarkevans.Person
    :
        That's it actually :)

This seems to handle everything in a quite pleasing way.
I'm sold. You?

Cheers, Brian

There are still other things I want to talk about like:

things:
    : implicit types
    : special keys
    : starting productions
    : production separation
    : indentation characters (tabs anybody?)
    : round-tripping comments
    : throw-away comments
    : DWIM modes
    : preprocessors/filters
    : YAML standard library (MIME etc)

But one major change at a time :)

From: Clark C . E. <cc...@cl...> - 2001-10-25 03:22:40

Ok. I must say that I like the "packed" :? pattern
where ? is an indicator. This is compact and kinda neat.
Summary of "new" indicators (immediately following :):

  | This is a block scalar with a trailing new line
  \ This is a block scalar without a trailing new line
  : This is a multi-line scalar

I have a few suggestions:

  A. Strengthen the packing requirement, making it
     mandatory that the indicator follow the colon.

  B. We strengthen the : indicator to specifically
     mean an un-quoted, un-escaped multi-line scalar.

  C. For multi-line keys, we strictly use the
     quoted form. (for readability only)

If I use the same keys, they are meant as the proper
replacement.

| inline1: unquoted
| inline2: "quoted"
| nextline1::
|     unquoted. an ending colon after the first colon
|     indicates that a quoted or unquoted scalar follows.
| nextline2::
|     "If the first character is a double quote (or maybe
|     even a single quote???) then it is a quoted scalar\n"|

nextline2: "\
    Multi-line quoted scalars are rather easy to do. It \
    is simply a quoted \"inline\" scalar which happens \
    to extend multiple lines. It needs no special treatment. \
    And it removes the exceptional case from the multi-line \
    unquoted, unescaped, : scalar.\n"

| block:|
|     a trailing pipe or backslash indicates block form.
|     next line intentionally left blank:
|
|     Something you should be aware of at this point
|     is that the separator is ': ' or ':| '
|     instead of ':' or ':|'. Where the ' ' is
|     really /\s+/
|     one or more whitespace (including a newline)
| no newline block:\
|     In other words, the scalar value cannot immediately
|     follow the separator. This may not be 100% necessary but
|     makes things clearer visually, and fewer edge cases.
|     See classnames and references below.
| parse error:
|     expected map but no colon found
|     at this indentation level
| inline empty string:
| also empty string::
| null: ~

null-note: IMHO, null: ~ should also be a syntax error.
null:~

| map:
|     foo: bar
|     who is that
|     foo who is::
|         sitting at the bar?

map:
    foo: bar
    "who is that \
    foo who is"::
        sitting at the bar?
map-note::
    I find the above more readable since the " mechanism
    is clearly a multi-line thingy."

| list:
|     : foo
|     : bar
|     :100: sparse
|     ::
|         "This string is element 101 in the list."

list-note::
    Having the sparse list is kinda neat. Given
    the trailing colon it is very understandable.

| inline negative number: -42
| nextline negative number::
|     -42
| list2:
|     - dash can now easily be used
|     - as a list bullet.
|     -100: this looks a bit strange
|     -:
|         But it's a rarer case
|     - Just thinking.

list2:~

| object:!org.yaml.foo
|     attr1: value
|     attr2: value
| referenced object:!org.yaml.bar:&001
|     attr1: value
|     self: *001

another:!org.yaml.zar:&001
    attr1:!org.yaml.integer:&002 34
    attr2:*002
    attr3:&002|
        A block
        that is given an anchor
    attr4:*002

| scalar based object:!.foo:
|     Now we can accommodate scalar based objects.
|     This is one because of the trailing colon.
|     Also, any class '.foo' can be an abbreviation
|     for 'org.yaml.foo'. A single node classname
|     like 'foo' should mean that it is private
|     (ie user defined).

comment: nice

| standalone reference:*001
| empty map:!.map
| empty list:!.list
| another null:!.null

thoughts::
    I'm uncomfortable with this since I'd like
    each node to have a kind and a type in the
    information model. Kind = map, list, scalar
    And type is user defined. More thoughts here.
    I have a hunch that merging them is dangerous.

| another empty string:!.string
| unquoted string:!.unquoted "I like YAML", said the camel.
| perl subroutine:!.block
|     sub greeting {
|         my $name = shift;
|         return "Hello $name";
|     }
| another block:!.chomped
|     Look ma:
|     NO TRAILING NEWLINE

danger::
    This I'm very uncomfortable with. The "style"
    used (the type of scalar) is orthogonal to
    type -- for instance I may have a "org.yaml.perl"
    type but a "yaml.org.block" style...

| adjustments from other emails:
|     :
|         existing:!com.clarkevans.Person
|             first: Clark
|             last: Evans
|         missing:!com.clarkevans.Person
|     :
|         That's it actually :)

notes::
    This is still problematic -- as missing is ambiguous.
    It could be a map, list, or single-line scalar.
    More thought on this is required.

| This seems to handle everything in a quite pleasing way.
| I'm sold. You?

Yep. Suggestions A and C are simple, and I hope you
like them. Suggestion B makes multi-line scalars
easy to grok (no exception for the colon) -- and I
don't mind what it does to quoted scalars. If you or
Oren don't like the suggestions, then we can go with this.

However, the last three blocks above are not going
to scale as I see it. We can't condense them into
a single construct... they are very different.

Best,

Clark

From: Brian I. <in...@tt...> - 2001-10-25 06:19:47

On 24/10/01 23:32 -0400, Clark C . Evans wrote:
> Ok. I must say that I like the "packed" :? pattern
> where ? is an indicator. This is compact and kinda neat.
> Summary of "new" indicators (immediately following :):

After reading your assessment below, I think that I did not explain the
delineator syntax clearly enough. (You are close but sometimes slightly
off) Here is an attempt to write a grammar for it:

delineator  ::= ':' descriptors? indicator lwsp+
descriptors ::= descriptor ( ':' descriptor )*
descriptor  ::= class | anchor | index
class       ::= '!' classname
anchor      ::= '*' anchorname
index       ::= '#' indexnumber   # possible syntax for sparse lists
indicator   ::= ''  |   # List or map or inline scalar
                ':' |   # quoted or unquoted nextline scalar
                '|' |   # block with trailing newline
                '\' |   # chomped block (no trailing nl)
                '~'     # undefined or null value
lwsp        ::= #x20 | #x09

Not perfect grammar I know, but hopefully clarifying...

>
>   | This is a block scalar with a trailing new line
>   \ This is a block scalar without a trailing new line
>   : This is a multi-line scalar

Multiline wasn't part of the equation. Nextline was the key issue.

>
> I have a few suggestions:
>
>   A. Strengthen the packing requirement, making it
>      mandatory that the indicator follow the colon.

Not sure I get you here. The indicator *is* mandatory. It just happens
that one possible indicator is ''.

>
>   B. We strengthen the : indicator to specifically
>      mean an un-quoted, un-escaped multi-line scalar.

What about its next-line property? I don't see the need for this.

>
>   C. For multi-line keys, we strictly use the
>      quoted form. (for readability only)

OK by me. (I think :) Oren?

>
> If I use the same keys, they are meant as the proper
> replacement.
>
> | inline1: unquoted
> | inline2: "quoted"
> | nextline1::
> |     unquoted. an ending colon after the first colon
> |     indicates that a quoted or unquoted scalar follows.
> | nextline2::
> |     "If the first character is a double quote (or maybe
> |     even a single quote???) then it is a quoted scalar\n"|
>
> nextline2: "\
>     Multi-line quoted scalars are rather easy to do. It \
>     is simply a quoted \"inline\" scalar which happens \
>     to extend multiple lines. It needs no special treatment. \
>     And it removes the exceptional case from the multi-line \
>     unquoted, unescaped, : scalar.\n"

Well, this will work anyway. So I still would like to be able to:

nextline2::
    "Multi-line quoted scalars are rather easy to do. It \
    is simply a quoted \"inline\" scalar which happens \
    to extend multiple lines. It needs no special treatment. \
    And it removes the exceptional case from the multi-line \
    unquoted, unescaped, : scalar.\n"

>
> | block:|
> |     a trailing pipe or backslash indicates block form.
> |     next line intentionally left blank:
> |
> |     Something you should be aware of at this point
> |     is that the separator is ': ' or ':| '
> |     instead of ':' or ':|'. Where the ' ' is
> |     really /\s+/
> |     one or more whitespace (including a newline)
> | no newline block:\
> |     In other words, the scalar value cannot immediately
> |     follow the separator. This may not be 100% necessary but
> |     makes things clearer visually, and fewer edge cases.
> |     See classnames and references below.
> | parse error:
> |     expected map but no colon found
> |     at this indentation level
> | inline empty string:
> | also empty string::
> | null: ~
>
> null-note: IMHO, null: ~ should also be a syntax error.
> null:~

I like this a lot. Added.

>
> | map:
> |     foo: bar
> |     who is that
> |     foo who is::
> |         sitting at the bar?
>
> map:
>     foo: bar
>     "who is that \
>     foo who is"::
>         sitting at the bar?
> map-note::
>     I find the above more readable since the " mechanism
>     is clearly a multi-line thingy."

Fine by me. Oren?

>
> | list:
> |     : foo
> |     : bar
> |     :100: sparse
> |     ::
> |         "This string is element 101 in the list."
>
> list-note::
>     Having the sparse list is kinda neat. Given
>     the trailing colon it is very understandable.

Funny you like the trailing colon. It's actually a mistake on my part.
Should be like:

list:
    :42 element number forty-two
    :55:
        element number fifty-five
    :99:&0001
        : sub list
        : which has an anchor

We could put a special 'index' character on the descriptor for
clarity/consistency:

list:
    :[42] element number forty-two
    :#55:
        element number fifty-five
    :(99):&0001
        : sub list
        : which has an anchor

>
> | inline negative number: -42
> | nextline negative number::
> |     -42
> | list2:
> |     - dash can now easily be used
> |     - as a list bullet.
> |     -100: this looks a bit strange
> |     -:
> |         But it's a rarer case
> |     - Just thinking.
>
> list2:~

Is this a clever way of saying "No thanks" for dash bullets?

>
> | object:!org.yaml.foo
> |     attr1: value
> |     attr2: value
> | referenced object:!org.yaml.bar:&001
> |     attr1: value
> |     self: *001
>
> another:!org.yaml.zar:&001
>     attr1:!org.yaml.integer:&002 34
>     attr2:*002
>     attr3:&002|
>         A block
>         that is given an anchor
>     attr4:*002

Yes. Correct. Good.

>
> | scalar based object:!.foo:
> |     Now we can accommodate scalar based objects.
> |     This is one because of the trailing colon.
> |     Also, any class '.foo' can be an abbreviation
> |     for 'org.yaml.foo'. A single node classname
> |     like 'foo' should mean that it is private
> |     (ie user defined).
>
> comment: nice
>
> | standalone reference:*001
> | empty map:!.map
> | empty list:!.list
> | another null:!.null
>
> thoughts::
>     I'm uncomfortable with this since I'd like
>     each node to have a kind and a type in the
>     information model. Kind = map, list, scalar
>     And type is user defined. More thoughts here.
>     I have a hunch that merging them is dangerous.

I'm inclined to agree with you. I was just showing that if you open this
up for empty lists and maps, you really need to extend it for all types.
So if we take this away, how do we do empty lists and maps? Oren?

>
> | another empty string:!.string
> | unquoted string:!.unquoted "I like YAML", said the camel.
> | perl subroutine:!.block
> |     sub greeting {
> |         my $name = shift;
> |         return "Hello $name";
> |     }
> | another block:!.chomped
> |     Look ma:
> |     NO TRAILING NEWLINE
>
> danger::
>     This I'm very uncomfortable with. The "style"
>     used (the type of scalar) is orthogonal to
>     type -- for instance I may have a "org.yaml.perl"
>     type but a "yaml.org.block" style...

Again, I agree. I was just poking.

>
> | adjustments from other emails:
> |     :
> |         existing:!com.clarkevans.Person
> |             first: Clark
> |             last: Evans
> |         missing:!com.clarkevans.Person
> |     :
> |         That's it actually :)
>
> notes::
>     This is still problematic -- as missing is ambiguous.
>     It could be a map, list, or single-line scalar.
>     More thought on this is required.

It looks like we need something like '%' and '@' to mean (*only*) empty
map or list.

normal map:
    foo: bar
empty map:%
normal list:
    : item1
empty list:@
missing map based object:!com.clarkevans.Person%
missing list based object:!com.clarkevans.Person@
missing string based object:!com.clarkevans.Person:

If this seems a tad weird, at least it is completely consistent with the
model thus far. Also remember, these empty thingies are infrequent.

>
> | This seems to handle everything in a quite pleasing way.
> | I'm sold. You?
>
> Yep. Suggestions A and C are simple, and I hope you
> like them. Suggestion B makes multi-line scalars
> easy to grok (no exception for the colon) -- and I
> don't mind what it does to quoted scalars. If you or
> Oren don't like the suggestions, then we can go with this.

Reassess A, given the grammar.
B I don't like. Oren?
C is fine by me. Oren's call.

>
> However, the last three blocks above are not going
> to scale as I see it. We can't condense them into
> a single construct... they are very different.

Yes. Agreed.

>
> Best,
>
> Clark

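Editorial sketch: the delineator grammar in the message above can be tried out with a few lines of Python. This is only an illustration under stated assumptions, not code from the thread. The names `parse_delineator`, `Delineator`, `KINDS` and `DESCRIPTOR` are hypothetical. The grammar writes '*' for anchors, but the examples use '&' to define an anchor and '*' to refer to one, so the sketch follows the examples. It also ignores quoted keys that contain a colon, and the later '%' and '@' proposal.

```python
import re
from dataclasses import dataclass, field

# Indicators that may close the packed part; '' (none) is the default.
INDICATORS = (":", "|", "\\", "~")
# Assumed sigils: '&' defines an anchor and '*' refers to one (per the examples).
KINDS = {"!": "class", "&": "anchor", "*": "reference", "#": "index"}
DESCRIPTOR = re.compile(r"([!&*#])([^:\s]+)")


@dataclass
class Delineator:
    key: str
    descriptors: list = field(default_factory=list)  # (kind, name) pairs
    indicator: str = ""
    inline: str = ""


def parse_delineator(line):
    """Split 'key:descriptors indicator value' at the first colon."""
    key, colon, rest = line.strip().partition(":")
    if not colon:
        raise ValueError("expected map but no colon found")
    # The packed part runs up to the first whitespace; the rest is inline text.
    packed, inline = re.match(r"(\S*)\s*(.*)", rest, re.DOTALL).groups()
    indicator = ""
    if packed.endswith(INDICATORS):
        indicator, packed = packed[-1], packed[:-1]
    descriptors = []
    for part in packed.split(":") if packed else []:
        m = DESCRIPTOR.fullmatch(part)
        if m is None:
            # e.g. 'foo:bar': the value may not directly follow the colon
            raise ValueError(f"bad descriptor {part!r} in {line!r}")
        descriptors.append((KINDS[m.group(1)], m.group(2)))
    return Delineator(key.strip(), descriptors, indicator, inline)


if __name__ == "__main__":
    for sample in ("inline1: unquoted", "attr3:&002|",
                   "referenced object:!org.yaml.bar:&001"):
        print(parse_delineator(sample))
```

Taking the indicator from the last character of the packed part keeps the descriptor list a plain colon-separated split, which is why `scalar based object:!.foo:` and `referenced object:!org.yaml.bar:&001` parse differently.
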
From: Clark C . E. <cc...@cl...> - 2001-10-25 03:27:56

| There are still other things I want to talk about like:
|
| things:
|     A: implicit types
|     B: special keys
|     C: starting productions
|     D: production separation
|     E: indentation characters (tabs anybody?)
|     F: round-tripping comments
|     G: throw-away comments
|     H: DWIM modes
|     I: preprocessors/filters
|     J: YAML standard library (MIME etc)

For implicit types ::

    My initial thought is that this only works
    with one-line, non-quoted scalars. There should
    be a registry with the REGEX for the implicit
    type and if it matches, the type is set.

    The open issue is if this REGEX list is a singleton
    (on the yaml.org site) or if it can be local. If
    it is local, then portability suffers as two vendors
    may have two different types for the same regex;
    or worse, overlapping regexes. In this case, yaml
    fragments from both vendors can't be mixed. This
    would greatly hurt YAML. Thus, IMHO, the REGEX
    list should be a singleton and published in a spec.

For the others ::

    I'm interested to hear what you say!

Clark

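Editorial sketch: a regex registry of the kind described above might look like the following Python. The thread does not define any patterns, so the three entries in `IMPLICIT_TYPES` are purely hypothetical, as are the names `resolve` and `IMPLICIT_TYPES`. Overlapping patterns raise an error here, to make the portability concern visible rather than silently picking a winner.

```python
import re

# Hypothetical single, shared registry: (type name, pattern, converter).
IMPLICIT_TYPES = [
    ("null", re.compile(r"~"), lambda text: None),
    ("int", re.compile(r"[-+]?\d+"), int),
    ("float", re.compile(r"[-+]?\d+\.\d+"), float),
]


def resolve(scalar):
    """Return (type_name, value) for a scalar, defaulting to a plain string."""
    # Implicit typing only applies to one-line, non-quoted scalars.
    if "\n" in scalar or scalar.startswith(('"', "'")):
        return ("str", scalar)
    matches = [(name, convert) for name, pattern, convert in IMPLICIT_TYPES
               if pattern.fullmatch(scalar)]
    if len(matches) > 1:
        # Overlapping regexes are the failure mode the message warns about.
        names = [name for name, _ in matches]
        raise ValueError(f"ambiguous implicit type for {scalar!r}: {names}")
    if not matches:
        return ("str", scalar)
    name, convert = matches[0]
    return (name, convert(scalar))


if __name__ == "__main__":
    for text in ("42", "-42", "~", '"42"', "I like YAML"):
        print(repr(text), "->", resolve(text))
```

With a single published list, two implementations resolve `42` to the same type; with local lists, the `ValueError` branch is where mixed fragments would start to disagree.
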
From: Brian I. <in...@tt...> - 2001-10-25 06:19:35

On 24/10/01 23:38 -0400, Clark C . Evans wrote:
> | There are still other things I want to talk about like:
> |
> | things:
> |     A: implicit types
> |     B: special keys
> |     C: starting productions
> |     D: production separation
> |     E: indentation characters (tabs anybody?)
> |     F: round-tripping comments
> |     G: throw-away comments
> |     H: DWIM modes
> |     I: preprocessors/filters
> |     J: YAML standard library (MIME etc)
>
> For implicit types ::
>
>     My initial thought is that this only works
>     with one-line, non-quoted scalars. There should
>     be a registry with the REGEX for the implicit
>     type and if it matches, the type is set.
>
>     The open issue is if this REGEX list is a singleton
>     (on the yaml.org site) or if it can be local. If
>     it is local, then portability suffers as two vendors
>     may have two different types for the same regex;
>     or worse, overlapping regexes. In this case, yaml
>     fragments from both vendors can't be mixed. This
>     would greatly hurt YAML. Thus, IMHO, the REGEX
>     list should be a singleton and published in a spec.

I agree with all of this. Oren?

>
> For the others ::
>
>     I'm interested to hear what you say!

Stay tuned then...

>
> Clark

From: Brian I. <in...@tt...> - 2001-10-25 06:20:00

While I'm at it:

On 24/10/01 17:56 -0700, Brian Ingerson wrote:
> There are still other things I want to talk about like:
>
> things:
>     : implicit types

I want them. These should be the same (on systems that support
integers :-)

implicit integer: 42
explicit integer:!.int 42

>     : special keys

These should be the same.

object1:!myclass
    foo: bar
object2:
    !: myclass
    foo: bar

>     : starting productions

I'd like a YAML document to be able to contain any number of any YAML
productions. Here is a document containing 5 top level productions.
They are a map, a list, a string, a block, and an object. They are
separated by multiple sequential unquoted newlines. The key point here
is that any "blank" line that is part of a multi-line string must have
the proper indentation.

foo: bar

: item1
: item2

"A scalar. Must be double quoted to be a standalone
Top level production
Otherwise blank lines will be a problem"

|
    Note one level of indentation.
    The following line *is* indented
    QTY DESC COST
    1 YAML email $0.00

!foo
    attr1: value

>     : production separation

Multiple sequential unquoted newlines. See above. I really like this
over the "\n----\n" separators. Especially because if you are
concatenating nodes onto the document, it doesn't matter whether you
pre or post (or both) separate.

BTW, leading and trailing newlines in the document would be ignored.

>     : indentation characters (tabs anybody?)

Use one tab, instead of four spaces. Someone else suggested this and it
makes a lot of sense in many ways. Why not just have one tab for each
indentation level? The obvious answer is that 8 columns sucks. This is
true. But a tab isn't 8 columns long. It's just *usually* represented
that way. That's about the *only* thing that sucks for tabs. Otherwise
they really make sense. They're not ugly to the computer. They're
easier to enter. They're less error prone for deep indentation. If you
can configure your devices to represent them as 2, 3 or 4 columns
you're in business. Of course, every YAML-enabled device would work
that way. And hey, now we have something that's configurable, for when
4 columns is just too much. Palm Pilots? I don't know.

BTW, I am still very much against allowing tabs *or* spaces for
indentation. We need to pick one. I just happen to be in favor of tabs
at the moment. OTOH, a spaces to tabs (or vice versa) converter could
be a standard filter. Or not ;)

>     : round-tripping comments

Something like:

this:
    #: a comment

same as:

this:# a comment

Thoughts? I probably care least about this point, but it still is
something we should agree on.

>     : throw-away comments

Unquoted lines beginning with '#' in column 1 should be ignored by the
parser. This will be so helpful, I consider it a slam-dunk. (People are
already requesting this) Could be implemented as a standard filter.

>     : DWIM modes

Maybe we don't need them with all this great new stuff.

>     : preprocessors/filters

    : space to tab indentation fixer
    : comment stripper
    : validator
    : blank line ignorer

>     : YAML standard library (MIME etc)

I guess this is a library of preprocessors, filters and postprocessors
that come standard. This lets you have a bunch of standard usage
options that aren't in the core info model, but that people like/need
nonetheless.

Cheers, Brian

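Editorial sketch: two of the filters proposed above, the comment stripper and blank-line production separation, are small enough to show in Python. The function names `strip_throwaway_comments` and `split_productions` are hypothetical. The sketch assumes, as the message requires, that any blank line inside a multi-line string or block carries its indentation, so only zero-length lines count as separators. It does not try to detect a '#' in column 1 inside a quoted top-level scalar.

```python
def strip_throwaway_comments(text):
    """Drop every line that starts with '#' in column 1."""
    return "\n".join(line for line in text.split("\n")
                     if not line.startswith("#"))


def split_productions(text):
    """Split a document into top-level productions on empty lines.

    Runs of empty lines act as one separator, and leading or trailing
    empty lines are ignored. Whitespace-only lines are kept, because a
    blank line inside a block is assumed to carry its indentation.
    """
    productions, current = [], []
    for line in text.split("\n"):
        if line == "":
            if current:
                productions.append("\n".join(current))
                current = []
        else:
            current.append(line)
    if current:
        productions.append("\n".join(current))
    return productions


if __name__ == "__main__":
    doc = "\n# a throw-away comment\nfoo: bar\n\n\n: item1\n: item2\n\n"
    for production in split_productions(strip_throwaway_comments(doc)):
        print(repr(production))
```

Because separators collapse, concatenating documents with a blank line before, after, or on both sides of each node yields the same list of productions.
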