From: Clark C. E. <cc...@cl...> - 2005-01-10 19:08:59
On Mon, Jan 10, 2005 at 01:01:36PM +0100, Braňo Tichý wrote:
| firstly - what is the relations between YAML and UNICODE?
| is it possible to represent non-unicode string data? e.g. it is
| easy to write "\uFFFE", however code point #xFFFE is defined as
| noncharacter in Unicode. is it allowed or forbidden? (IMO should be
| forbidden)

It should be forbidden.  All scalar values should be representable
via the Unicode character set.  Binary data should be encoded using
Base64 or some Unicode equivalent.

| anchors, aliases and identity:
| ---
| - &ONE 1
| - &ONE 2
| - *ONE
| ---
| def:
|   - &ONE 1
|   - &TWO 1
| use:
|   *ONE: "one"
|   *TWO: "two"
| ...
| is the first document equivalent with [ 1, 2, 2 ]?

Yes.

| is the second invalid? (if yes, in what stage does it blow up - i
| think composition, and the error is 'not valid')

Yes; this is invalid.  Duplicate keys should be detected and reported
as soon as possible.  You are correct that at least some form of
random access is required to detect duplicate keys; ideally, however,
this particular case should be caught by the parser if possible.

There is another case of key-uniqueness violation.  This happens when
the tag/content is different for two nodes, but once the tag is
"Recognized", the two nodes turn out to have an identical value.  This
sort of error can only be caught at the end of composition.

| concatenating documents to streams:
| (just checking) when reading stream and % is found unescaped/unquoted,
| it is either a directive before next document or error. either way
| processing of current document is stopped and directives are read
| until next '---'. but what about BOM? it is valid Unicode character, so in:
| ---
| doc1stringstringstring
| stringstringstring
| BOM
| ---
| doc2
| ...
| is BOM part of the first document or optional BOM before the
| second document?

Good question.  Per the last discussion, the first document is illegal
since the content is not indented -- right?
It was nasty cases like this that we wanted to avoid.

| anyway, the draft reads very well and the examples are really helpfull.
| I especially like the right-hand double quoted flow style equivalents.

Super!

Best,

Clark

-- 
Clark C. Evans                      Prometheus Research, LLC.
                                    http://www.prometheusresearch.com/
    o                               office: +1.203.777.2550
  ~/ ,                              mobile: +1.203.444.0557
 //
((   Prometheus Research: Transforming Data Into Knowledge
 \\  ,
   \/    - Research Exchange Database
   /\    - Survey & Assessment Technologies
   ` \   - Software Tools for Researchers
    ~ *
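
For illustration, here is a minimal Python sketch of the duplicate-key check discussed in the anchors/aliases part of this message. It is only a toy model under stated assumptions: anchors live in a plain dictionary, a mapping key is either a scalar or an alias name, and values are compared after they have been resolved. The names `resolve`, `check_unique_keys`, and `DuplicateKeyError` are hypothetical and do not come from the YAML draft or from any YAML library.

```python
# Hypothetical model, not a real YAML API:
# a key is either ("scalar", value) or ("alias", name), and
# `anchors` maps an anchor name to the value most recently anchored under it.


class DuplicateKeyError(ValueError):
    """Raised when two keys of one mapping resolve to the same value."""


def resolve(key, anchors):
    """Return the value a key stands for, following an alias if needed."""
    kind, payload = key
    if kind == "alias":
        if payload not in anchors:
            raise KeyError(f"undefined alias *{payload}")
        return anchors[payload]
    return payload


def check_unique_keys(keys, anchors):
    """Walk the keys in document order and fail on the first repeated value."""
    seen = set()
    for key in keys:
        value = resolve(key, anchors)
        if value in seen:
            raise DuplicateKeyError(f"duplicate mapping key: {value!r}")
        seen.add(value)
    return seen


# Second document from the question: &ONE 1 and &TWO 1 are defined,
# then *ONE and *TWO are used as keys of the same mapping.
anchors = {}
anchors["ONE"] = 1
anchors["TWO"] = 1
use_keys = [("alias", "ONE"), ("alias", "TWO")]

try:
    check_unique_keys(use_keys, anchors)
    result = "valid"
except DuplicateKeyError as exc:
    result = f"invalid: {exc}"

# First document from the question: the second &ONE rebinds the anchor,
# so the later *ONE refers to 2 and the sequence reads [1, 2, 2].
rebinding = {}
rebinding["ONE"] = 1
rebinding["ONE"] = 2
first_doc = [1, 2, rebinding["ONE"]]
```

Because the sketch compares resolved values, it corresponds to a check made once tags have been recognized, that is, at the end of composition. Catching the alias case earlier, in the parser, would need the anchored content to be available while the keys are being read.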