From: BlueGM <bl...@gm...> - 2009-09-04 21:32:45
We seem to be in agreement that the handling of this data type is library-specific, so I won't go too far into this. I did want to point out, though, that option 2 can be quite valuable. A well-behaved writer should always write a valid Unicode string using the normal YAML string type. That won't always happen, especially if users are typing any of this in manually, which they will be. Consequently, it makes sense for a library to expose an option so that, if the data can be converted safely to a native string, it is converted; otherwise the library throws an exception or raises an error that the application can handle. In this way, an application that only wants valid strings can still handle such a string even if it was marked with this data type.

However, that is specific to the library's interface to (and contract with) the application, so it is not necessary to specify. My key point was that the processor (usually in a library) and the application need to agree on how this data type is handled, and we agree on that. Discussing how this is handled is somewhat moot, beyond demonstrating the versatility of this data type when a library or application knows about it, which was my intention.

-----Original Message-----
From: Oren Ben-Kiki [mailto:or...@be...]
Sent: Friday, September 04, 2009 5:04 PM
To: Brad R
Cc: yam...@li...
Subject: Re: [Yaml-core] New invalid UTF-8 proposal using tags and %nn

On Fri, 2009-09-04 at 16:48 -0400, Brad R wrote:
> It is also entirely possible for a library to expose options to the
> application about how to handle the new data type. Options that a
> library might expose include:
>
> 1) Always return it as a byte array (I think this should be the
> default and required behavior)

This is the key point. Anything else is optional and completely up to the library. For example:

> 2) Throw exceptions/Raise errors if it can't convert it to a valid
> string (so if an application only wants to deal with strings, it can)

Sure, this is a possible API. IMO this defeats the whole purpose of this tag. It is a safe working assumption that if this tag is used, the string will not be valid (Unicode). But sure, a library could provide it.

> Is this something that the YAML spec needs to cover, though?

Most emphatically *no*. The YAML spec defines the data model. As long as the data is preserved, the implementation is compliant. In fact the whole !!utf-u tag is out of scope for the spec itself, just like !!binary and a zillion others.

Have fun,

Oren Ben-Kiki
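
For illustration only, here is a minimal sketch of how a library might expose the two options discussed in this thread: returning the raw bytes by default (option 1), and converting to a native string or raising an error on request (option 2). The names `load_tagged_bytes`, `strict_strings`, and `InvalidStringError` are hypothetical and are not taken from any existing YAML library; the sketch also assumes the tagged value has already been unescaped into a byte sequence.

```python
class InvalidStringError(ValueError):
    """Hypothetical error: the tagged bytes are not valid UTF-8."""


def load_tagged_bytes(raw: bytes, strict_strings: bool = False):
    """Return the value of a node carrying the invalid-UTF-8 tag.

    raw            -- the node's content, already unescaped to bytes (assumed)
    strict_strings -- hypothetical option: if True, return a native str
                      or raise InvalidStringError (option 2 in the thread)
    """
    if not strict_strings:
        # Option 1: always hand the application the byte array unchanged.
        return raw
    try:
        # Option 2: convert only when the bytes form a valid UTF-8 string.
        return raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        # Let the application decide how to handle the invalid data.
        raise InvalidStringError("tagged value is not valid UTF-8") from exc
```

An application that only wants strings would call `load_tagged_bytes(data, strict_strings=True)` and catch `InvalidStringError`; one that needs to preserve the data exactly would use the default and receive `bytes`.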