From: Clark C . E. <cc...@cl...> - 2001-11-09 16:59:58
|
| > > 6. A version indicator is also optional
| > >    within the descriptors immediately
| > >    following the separator.
| >
| > --- ?1.0
| >
| > ideas?
|
|   --- YAML:1.0 ...

    --- ?YAML:1.0 ...

| Rationale: this allows us doing all sorts of other stuff in the future
| without requiring us to look like line noise:
|
|   key ::= alpha alnum*
|   value ::= linear_non_space*
|
| Unknown key:value following the separator should be ignored (with a
| warning). The YAML key's value is <major>.<minor> where the usual backward
| compatibility rules apply.
|
| This removes any confusion with indicators within the document. Being
| verbose is not an issue because it is only once per document.

If we add ? as an indicator before items like this and ?INDENT:0 we can
then have one-line scalars...

    --- ?YAML:1.0 2001-01-01
    --- !string this is a one line scalar
    --- 1.0

| > > 3. Comments must be indented to the
| > >    level that they apply. If they are
| > >    at a lower indentation, then the
| > >    indentation level is reset... so
| > >    a higher indented item will be an
| > >    error...
| >
| > Comments must be at the same level as the line following them.
|
| Seems you both have the same intent. I agree with it.

I like Brian's formulation. It's simpler.

| > > 4. Comments should be part of the emitter
| > >    and parser API so that they can be
| > >    round tripped at a low level. However,
| > >    they may be dropped at higher levels.
| >
| > I get the parser level. (Line numbers need to be preserved for error
| > reporting). But what about the emission level? Explain.
|
| There has to be a way to emit comments.

Exactly.

| I don't know whether this should be written into the spec, however.

Perhaps not. But it should be in the API description so that a YAML
processor will be familiar (same interface) no matter what the language.
This will take some work...

| > > 3. We will reserve an indicator valid at the
| > >    separator level that may be used to
| > >    set the number of spaces in a future version
| > >    if we find that it is necessary.
| >
| > So the indent level could be changeable, but only for an entire document.
| > This might be nice.
|
| Something like:
|
|   --- YAML:1.1 INDENT:4
|   invoice:
|       price: $12

    --- ?YAML:1.1 ?INDENT:4
    invoice:
        price: $12

And yes... let's not include this in YAML 1.0

| > Let's stick to trying 1 space for now.
|
| YES.
|
| > > Implicit scalars...
| > >
| > > 1. Implicit scalars should only be single-line.
|
| No (base64).

I don't mind making base64 explicit. It certainly doesn't have to
start on the first line. This one, for example, is a very very small jpg.

    image: !base64 \
        R0lGODlhGQAPAOMAAAICBDaanAJSVAISFP7+/Gb
        OzAJmZAIeHGbMzGbMzGbMzGbMzGbMzGbMzGbMzG
        bMzCH+Dk1hZGUgd2l0aCBHSU1QACH5BAEKAAYAL
        AAAAAAZAA8AQAR70EgZArlBWHw7Nts1gB6RGV0w
        CBMlkp4qlHJppkNoyW1r5SmcTeV6wUwrFI4VEul
        SMyRLchhYrYLq4MDKYrm9XuFQuIzLhALApm6VV+
        g44FBSHybokQGdnivNfhJ8enwFSR12eB4jcWZ3g
        HeCJQJycXSJEzaIc5SIWz0RADs=

Not to bring this up. But perhaps we should consider compound types...

    !jpg!base64

| > 1c. Implicit scalars have no whitespace characters. (brian proposed)
|
| No (base64, dates).

    timestamp: 2001-02-21 15:20:00+5

| > > 2. All implicit scalars should be part of the YAML
| > >    specification (perhaps a secondary document)
| > >    and third party implicit scalars are not allowed.
| > >    If a user wants their favorite scalar type implicit,
| > >    they can propose it on the YAML list and we may
| > >    bless it. Otherwise they can use explicit types.
|
| NO. I don't see implicit types as being any more special than
| explicit ones. The choice of types in my document is part of
| my document's schema, and it is my decision as a document
| author to either use public or private types.

The goal of YAML is to support interoperability among processes,
languages, etc. If a user defined type is going to be used, then
it should be plainly obvious.
|   key: !!foo bar

The above is "good enough" for this use case. If you want we can
revisit this at YAML 1.1 if there is enough outcry from people asking
for the feature. But I'd rather be strict for now.

| I expect to be able to use a private implicit type as well.
| A good point is that such private types should not collide
| with any public type.

Right. This is the problem.

| I propose that all private implicit types must start with `,
| just like all private explicit types must start with !
| (note that ` wouldn't be an indicator, it would just be
| the first character of an unquoted scalar). So:
|
|   key: `-=>bar<=-

Ok. This is an acceptable compromise.

| > > 3. (brian) We will reserve all regular expressions
| > >    up front and any thing not reserved will be
| > >    a string.
| >
| > (brian) We can reserve a whole lot of potential patterns by making them
| > implicit warnings. ie They throw a warning that they are reserved for
| > future use and should be quoted to eliminate the warning.
| >
| > > (clark) We reserve anything that does not
| > >    begin with an alphabetic character.
|
| No. Allocate all patterns matching "^`" for private usage and reserve all
| the rest for public implicit types. Simpler and better.

Ok. So those implicit types beginning with a ` tick are private
in nature. Nice compromise.

| > > This difference emerged after Brian gave the
| > > use case of putting a regular expression
| > > as a value. Clark's answer was "too bad",
| > > (a) use a multi-line scalar, or (b) quote
| > > the regular expression, or (c) let's introduce
| > > an implicit "regex" type starting with the ` tick,
| > > (d) or allow single line blocks: | regex
|
|   regexp: !text ... line noise here ...
|
| Is currently a legal way of doing this. I think "line noise
| as text value" is rare enough that an explicit '!text' annotation
| is acceptable (even in Perl :-). The in-line block syntax seems
| more trouble than it is worth.

Not bad.
Given your suggested "private" implicit type area, one could even use
the ` tick... if one wanted. But then they'd have to register their
implicit type handler or it would be a warning. No? Alternatively,
we could have an org.yaml.regex explicit type.

    match-string: !regex ... regular expression ...

This would be neat since Java and Python have regular expression objects.

| My proposal for changing the reference syntax from '*' to '->'
| still stands, however.

Hmm. This is *ok* with me, but I like * better. How does this
work with more complicated references?

| You may have noted I changed the base64 syntax from [...data...]
| to [=...data...=], due to the same reason: using a single prefix
| character is like using a type A IP network or a TLD; it should
| only be done for an extremely good reason.

Actually, it might be good to just use an explicit !base64 and
not bother with an implicit type for this use case.

| I feel '=' is a case where a one-char type is acceptable,
| and also '~'. Both are actually single-char values rather
| than single-char prefixes; so in theory '~/=...data...'
| implicit types could still be used, though of course
| their relationship with '='/'~' would become an issue.
| At any rate, I would definitely draw the line there.
| Wasting '*' seems like, well, a waste :-) Besides,
| '->' may be more readable to a newbie.

Hmm. I disagree here. Back references are going to be very
common. Let's make it simple, either * or ^, no?

| Oh, just find some prefix to attach to regexps and be done
| with. Tick would have been nice, only I just used it for
| private types. '*' would also have been nice, only it is
| taken by references. '?/' seems best:
|
|   regexp: ?/regexp here/

I like the idea of an explicit type for regular expressions.

    search: !regex \/regular expression

| I didn't use an unadorned '/' because I think in general prefix-based
| implicit types should be at least two characters long, to allow a larger set.
| So, how about we change references from '*' to '->' while we are at it?
|
|   anchor: &12
|   reference: ->12

    reference: *12

Hmm. Brian?

| > > Random notes:
| > >
| > > 1. Brian likes !seq and !map as the abbreviations
| > >    for the default map and sequence types. I'm
| > >    assuming that !string is the default scalar
| > >    type (in the absence of explicit/implicit types).
|
| How about we keep calling it !text rather than switching to !string?
| I think that "text" better reflects the intent. I find string to be
| a bit more of an implementation term (like 'list' and 'array' are
| for 'sequence').

Perfect. I like !seq, !map, and !text as the default types.
That a !text maps to a java.lang.String on the Java platform is
a platform specific detail.

Summary....

| I propose key:value as above. And:

I modified this as ?key:value

| C) Implicit types: use ` for private ones?

Nice.

| Should I write it up as the next draft? Delivery on Sunday
| morning as usual :-)

Please!

;) Clark
|
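
For what it's worth, the ?key:value descriptor idea from this thread can be
sketched in a few lines of Python. This is only an illustration under stated
assumptions, not anything from a YAML implementation: the function name
parse_separator, the KNOWN_KEYS set, and the choice to hand back the rest of
the line as a one-line scalar are all hypothetical. The only parts taken from
the discussion are the grammar (key ::= alpha alnum*, value ::=
linear_non_space*), the "ignore unknown keys with a warning" rule, and the
<major>.<minor> form of the YAML value.

```python
import re
import warnings

# Grammar from the thread:  key ::= alpha alnum*   value ::= linear_non_space*
# The leading '?' is the indicator proposed for descriptors.
DESCRIPTOR = re.compile(r"\?([A-Za-z][A-Za-z0-9]*):(\S*)")

# Assumption: only these two keys are understood by this sketch.
KNOWN_KEYS = {"YAML", "INDENT"}


def parse_separator(line):
    """Split a '---' separator line into (descriptors, trailing text).

    Hypothetical helper: the trailing text stands in for a one-line
    scalar such as '--- !string this is a one line scalar'.
    Runs of whitespace in the trailing text are collapsed to one space.
    """
    tokens = line.split()
    if not tokens or tokens[0] != "---":
        raise ValueError("not a separator line: %r" % line)

    descriptors = {}
    rest = []
    for i, token in enumerate(tokens[1:], start=1):
        match = DESCRIPTOR.fullmatch(token)
        if match is None:
            # First non-descriptor token: everything from here on is content.
            rest = tokens[i:]
            break
        key, value = match.groups()
        if key not in KNOWN_KEYS:
            # Unknown key:value pairs are ignored, with a warning.
            warnings.warn("ignoring unknown descriptor %s:%s" % (key, value))
            continue
        descriptors[key] = value

    if "YAML" in descriptors:
        # The YAML key's value is <major>.<minor>.
        major, minor = descriptors["YAML"].split(".")
        descriptors["YAML"] = (int(major), int(minor))

    return descriptors, " ".join(rest)
```

With this sketch, parse_separator("--- ?YAML:1.1 ?INDENT:4") gives the version
as a (major, minor) tuple plus the INDENT value as a string, and a line like
"--- ?YAML:1.0 2001-01-01" leaves "2001-01-01" as the trailing scalar text.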