From: Clark C. E. <cc...@cl...> - 2002-09-10 02:53:14
According to our website, YAML is first and foremost a "native object" serialization language, especially for languages like Perl and Python. We should stick hard to that goal. That said, we can't stop users from relying on the "side effects" that our serialization format and parser create: ordered keys and duplicate keys. So we have two options: duck and ignore that they are doing it, or try to give them _some_ accommodation. As an added "bonus" we become much more XML-compatible-ish, and this could help adoption in those circles.

Now, we can either create another model for this, or we can phrase the "serial" model in such a way that it allows both of these effects to exist. Quite clearly, key order comes for free with the serial model. And since node equality can't be examined without type resolution at this layer, we could even allow duplicate keys. This has the added advantage of clarifying that parsers _must not_ try to eliminate duplicate keys. In this way, tools written for the serial model must worry about duplicate keys and maintain key order.

Then, at our more abstract level, the "graph" model, we simply reject those instances of the serial model which have duplicate keys, and we clearly state that key order is not preserved.

So. These things can co-exist (without creating a new model) just through clarifications of the existing model.

At this point, an Application has two choices. It can use the YAML Parser only and take its information from the serial model, or it can use the YAML Loader and restrict itself to maps without duplicates and where key order carries no information... implementations may even want to scramble the keys just to make it crystal clear.

I would be happy with this compromise, but it means following the policy Neil laid out:

  1. If an Application wants to use key order as information
     and/or accept duplicate keys, then it must use the Parser
     interface directly (which provides output in the serial
     model).

  2. If an Application agrees with the generic model, where key
     order is not significant and duplicate keys are not allowed,
     then it may use the Loader interface (which provides the
     graph model).

To implementers this means the following rule to help keep things straight:

  Your Loader interface should not in any way be using key order
  or allowing for duplicate keys in the stuff it generates.

Ideally, this means the loader should only pass the application or object type constructors a fully created random-access hashtable (ideally with pre-scrambled keys), and should not provide any loading "hooks" which break this process (and hence enable the user to accidentally violate the model).

Thus, people have it both ways. If they want to treat YAML as Syntax, they can use the Parser interface. If they want to use YAML as a Serialization Language (with its model), then they use the Loader interface. There will be reasons to use both interfaces...

I know it sounds draconian... but I think this is a good compromise position that allows for both models but still keeps YAML's vision as a serialization language for modern scripting languages pure.

Best,

Clark
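
To make the Parser/Loader split above concrete, here is a minimal Python sketch. It is an illustration only, not an existing YAML API: the names `load_mapping` and `DuplicateKeyError` are hypothetical, the serial model is assumed to arrive as a list of (key, value) pairs, and keys are assumed to be already resolved to plain strings so that they can be compared.

```python
import random


class DuplicateKeyError(ValueError):
    """Raised by the hypothetical loader when a mapping repeats a key."""


def load_mapping(pairs):
    """Turn serial-model pairs into a graph-model mapping.

    `pairs` is an ordered list of (key, value) tuples, as a parser
    might emit them; order and duplicates are still visible here.
    """
    seen = set()
    for key, _value in pairs:
        if key in seen:
            # The graph model rejects serial instances with duplicate keys.
            raise DuplicateKeyError(f"duplicate key: {key!r}")
        seen.add(key)

    # Scramble before building the dict so that callers cannot come
    # to depend on document order (Python dicts keep insertion order).
    shuffled = list(pairs)
    random.shuffle(shuffled)
    return dict(shuffled)


# Parser-level use: the application reads the pairs directly and may
# use both the order and the repeated key "b".
serial = [("b", 1), ("a", 2), ("b", 3)]
for key, value in serial:
    pass  # application-specific handling goes here

# Loader-level use: only duplicate-free mappings are accepted.
graph = load_mapping([("b", 1), ("a", 2)])
```

An application that needs order or duplicates stays with the list of pairs; one that calls `load_mapping` gets an ordinary hashtable and nothing else.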