From: Adam M. <ad...@me...> - 2004-12-27 03:21:34
|
Have you guys considered subtracting out the overlap with OGDL and phrasing YAML as a layer above OGDL? (types, etc). There's a clear need for the kind of stuff that YAML is doing, but it would be nice to have a more modular specification -- people who only need graph structure wouldn't have to move to a different file format. Sort of like the relationship between Unicode and XML (although one notch "up" the protocol stack). - a -- I wrote my own mail server and it still has a few bugs. If you send me a message and it bounces, please forward the bounce message to me...@gm.... Thanks! |
From: PA <pet...@gm...> - 2004-12-27 17:50:03
|
Hello, On Dec 27, 2004, at 02:53, Adam Megacz wrote: > Have you guys considered subtracting out the overlap with OGDL and > phrasing YAML as a layer above OGDL? (types, etc). I like this idea! There are so many commonalities between the two: http://ogdl.sourceforge.net/spec/ > There's a clear > need for the kind of stuff that YAML is doing, but it would be nice to > have a more modular specification -- people who only need graph > structure wouldn't have to move to a different file format. Sort of > like the relationship between Unicode and XML (although one notch "up" > the protocol stack). <shameless-plug> This is the approach I took with PL: http://alt.textdrive.com/pl/ The format itself is basically an updated version of NeXT ASCII Property Lists (aka plist): http://alt.textdrive.com/pl/3/pl-the-format And the types and extensions are built on top of it: http://alt.textdrive.com/pl/4/pl-the-types </shameless-plug> Cheers, PA. |
From: Oren Ben-K. <or...@be...> - 2004-12-27 22:05:11
|
On Dec 27, 2004, at 02:53, Adam Megacz wrote: > Have you guys considered subtracting out the overlap with OGDL and > phrasing YAML as a layer above OGDL? (types, etc). Not really. OGDL seems very similar to the first incarnations of YAML. YAML does more than merely add a typing layer; it also provides scalar styles that OGDL doesn't support, etc. We believe these features to be "needed" (or we wouldn't have added them :-). I doubt there's a clean cut-off point between two distinct layers here. Or, put another way, YAML already is "modular" in the sense you mean (having "syntax/presentation", "tree/serialization" and "graph/representation" models). On Monday 27 December 2004 19:50, PA wrote: > This is the approach I took with PL: Yes, PL is also along these lines, as well as many other formats: > The format itself is basically an updated version of NeXT ASCII > Property Lists (aka plist): Brian, for example, joined YAML when he was looking to update the Data::Denter format, which itself is a modification of Data::Dumper... and so it goes. We hope to make YAML "the" solution to the needs addressed by all these formats. Now that the spec (coming soon to a web site near you!) has solidified (we think :-), what it needs most is implementations. Have fun, Oren Ben-Kiki |
From: PA <pet...@gm...> - 2004-12-27 23:07:36
|
On Dec 27, 2004, at 23:05, Oren Ben-Kiki wrote: > We hope to make YAML "the" solution to the needs addressed by all these > formats. Oh, my... another all encapsulating, all singing, all dancing, universal solution: YAML, noun A magic elixir of legend, claiming to solve all problems while inevitably exacting an ironic cost. "Once we drink the YAML and take care of a few minor things -- parser, DTD, entification, well-formed-ness, validation, namespaces, I18N, transformations, schemas -- all will be peaceful in the kingdom!" http://www.eod.com/devil/archive/xml.html > Now that the spec (coming soon to a web site near you!) has > solidified (we think :-) What it needs most is implementations. A Java implementation would be nice, although I'm not quite sure if the cultural "brace" divide can be bridged :o) Cheers, PA. |
From: Oren Ben-K. <or...@be...> - 2004-12-28 17:09:36
|
On Tuesday 28 December 2004 01:07, PA wrote: > Oh, my... another all encapsulating, all singing, all dancing, > universal solution: You forgot it being "the best thing since the pre-sliced coke can!" > "Once we drink the YAML and take care of a few minor things -- parser, > DTD, entification, well-formed-ness, validation, namespaces, I18N, > transformations, schemas -- all will be peaceful in the kingdom!" Exactly! Seriously: YAML evolved out of SML-DEV which evolved out of XML-DEV, exactly to address the fact that a single universal solution (XML) doesn't work; XML is great for structured documents (where YAML sucks Dyson spheres through capillary tubings). YAML is _very_ focused on the specific problem of data serialization, where it is XML that sucks. Just to show we believe in "give XML what XML is due", the YAML spec is written in XML (DocBook). Well-formed-ness, namespaces, and I18N are all covered in the YAML spec. True, the parsers are "in the works" and we don't have a proper schema language yet. We will, and this covers DTDs and validation. It just goes to show how problematic the XML spec is that people would consider separating these issues into "a maze of twisty large specs, all XML". That only leaves entification. Sorry, we don't entify (which, unless I'm mistaken, is a verb only applying to Middle-Earth Huorns, due to lack of Entlings :-) > > What it [YAML] needs most is implementations. > > A Java implementation would be nice, although I'm not quite sure if > the cultural "brace" divide can be bridged :o) Given an ANSI C implementation it should be callable from Java; a pure Java port is definitely possible. As for using {} and [], YAML doesn't force you to use indentation-based blocks, it also allows you to use {} and [] in a way that a Java/C programmer would find intuitive. Have fun, Oren Ben-Kiki |
From: Clark C. E. <cc...@cl...> - 2004-12-28 19:07:35
|
Oh dear! I hate it when things don't start out nice-and-cheerful-like, as I stated in my last post, there is much to rejoice here! YAML could really use a nice Java binding... and it appears as if PA's loader would be a wonderful start. I'm still hopeful that there is much room for collaboration here. On Tue, Dec 28, 2004 at 12:07:43AM +0100, PA wrote: | Oh, my... another all encapsulating, all singing, all dancing, | universal solution I think you really mis-interpreted what Oren was saying. Our syntax is largely inconsequential. YAML is, in our mind, a very simple colored (aka typed) graph model where nodes are scalars, mappings, or sequences. This is a universal solution -- if the model fits your particular need. That you have "re-invented" this same model is actually reassurance that we (both YAML and PL) are on the right track. I'm extremely happy to see another implementation of this same model. | A Java implementation would be nice, although I'm not quite sure if the | cultural "brace" divide can be bridged :o) It isn't an either/or solution -- you need both. Depending on the data structure, editing requirements, and other "presentation" level needs, different solutions are welcome. I'd love if you'd consider my past email proposal, with a few small tweaks in PA, it could quite easily be a minimal-YAML. Even if you don't consider being a YAML syntax subset, we can and should collaborate at the API and higher levels. We would certainly provide the highest benefit to our respective user communities if we did so. Cheers, Clark |
From: PA <pet...@gm...> - 2004-12-28 20:17:48
|
Hi Clark, On Dec 28, 2004, at 20:07, Clark C. Evans wrote: > Oh dear! I hate it when things don't start out nice-and-cheerful-like, It was meant to be cheerful :) > as I stated in my last post, there is much to rejoice here! YAML > could really use a nice Java binding... and it appears as if PA's > loader would be a wonderful start. I'm still hopeful that there is > much room for collaboration here. Sure. The loader itself doesn't have any formal dependencies over the underlying format. Perhaps a fair amount of "assumptions" though... > I think you really mis-interpreted what Oren was saying. Our syntax > is largely inconsequential. YAML is, in our mind, a very simple > colored > (aka typed) graph model where nodes are scalars, mappings, or > sequences. > This is a universal solution -- if the model fits your particular > need. That you have "re-invented" this same model is actually > reassurance that we (both YAML and PL) are on the right track. I'm > extremely happy to see another implementation of this same model. > > | A Java implementation would be nice, although I'm not quite sure if > the > | cultural "brace" divide can be bridged :o) > > It isn't an either/or solution -- you need both. Depending on the > data structure, editing requirements, and other "presentation" level > needs, different solutions are welcome. I'd love if you'd consider > my past email proposal, with a few small tweaks in PA, it could > quite easily be a minimal-YAML. Even if you don't consider being a > YAML syntax subset, we can and should collaborate at the API and > higher levels. We would certainly provide the highest benefit to > our respective user communities if we did so. Sounds very reasonable :) Cheers, PA. |
From: Clark C. E. <cc...@cl...> - 2004-12-28 18:56:54
|
| http://alt.textdrive.com/pl/

This looks very nice! Unlike OGDL, it seems that YAML and PL share essentially the same information model -- there is much room for collaboration here if you wish. Before you comment, I'd ask you to spend some serious time reading our specification -- I don't have time to regurgitate every last detail of YAML.

Since our model is the same, there is no reason why we cannot share a higher-level infrastructure. For example, our parsers could have an API that is substantially similar, and the 'loader' into native data structures (which uses this API) could be a completely distinct component independent of YAML's and PL's syntax. In a similar manner, we could share a common type library, and schema language, path language, etc.

The PL syntax is largely a subset of the YAML syntax, very close to our "canonical" form. We both use "double-quotes" for scalar values, we both use curly braces for mappings. If your language is not too entrenched, the following changes would make the PL syntax a "kernel" of the YAML syntax:

  - use brackets [] for lists, instead of ()

  - use ':' to separate keys from values in a mapping pair, and "," to separate pairs in a mapping

  - use "C" style escaping in your quotes (see the YAML specification for the full list of escape codes)

  - use !date "2004-12-23" for your typed data, instead of "date:2004-12-23"

It's surprising how little difference there is between YAML's "flow" style and the PL syntax. It's a bit late to change the YAML syntax (it would have far too many implications) -- however, if PL is still young... it would be so wonderful to have a PL as a "simple subset" of YAML. I'd certainly support your efforts.

That said, even if unifying the syntax between our projects isn't in the cards... we can still collaborate on a slightly higher level as described earlier.

Kind Regards,

Clark |
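To make the four proposed changes concrete, here is a minimal sketch of an emitter that writes nested data using brackets for lists, ':' and ',' inside curly-brace mappings, backslash-escaped double-quoted strings, and a leading !date tag. The function name emit_flow is hypothetical, and using json.dumps for the escaping is an assumption: it produces C-style escapes but has not been checked against the full YAML escape list.

```python
import json
from datetime import date


def emit_flow(value):
    """Render dicts, lists, strings and dates in a YAML-flow-like syntax."""
    if isinstance(value, dict):
        # ':' separates key from value, ',' separates pairs
        pairs = (f"{emit_flow(k)}: {emit_flow(v)}" for k, v in value.items())
        return "{" + ", ".join(pairs) + "}"
    if isinstance(value, (list, tuple)):
        # brackets for lists, rather than PL's parentheses
        return "[" + ", ".join(emit_flow(v) for v in value) + "]"
    if isinstance(value, date):
        # the type lives in a separate token in front of the quoted scalar
        return f'!date "{value.isoformat()}"'
    if isinstance(value, str):
        # json.dumps yields a double-quoted string with backslash escapes
        return json.dumps(value)
    raise TypeError(f"unsupported value: {value!r}")


record = {"name": "PL", "released": date(2004, 12, 23), "tags": ["plist", "yaml"]}
print(emit_flow(record))
```

Only the four points from the list are modelled here; anchors, aliases and every other scalar style are left out.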
From: PA <pet...@gm...> - 2004-12-28 19:54:52
|
Hi Clark, On Dec 28, 2004, at 19:56, Clark C. Evans wrote: > | http://alt.textdrive.com/pl/ > > This looks very nice! Glad you like it :) Property list (aka plist) where introduced over a decade ago by NeXT [1] alongside OPENSTEP [2]. They are still widely used today [3] [4] [5]. PL simply "dust off" the format [6] [7] and add optional type information to it [8] [9]. > Unlike ODGL, it seems that YAML and PL share > essentially the same information model -- there is much room for > collaboration here if you wish. Before you comment, I'd ask you to > spend some serious time reading our specification -- I don't have > time to regurgitate every last detail of YAML. Yes. I'm going through it little by little. > Since our model is the same, there is no reason why we cannot share > a higher-level infrastructure. For example, our parsers could have > a API that is substantially similar, and the 'loader' into native > data structures (which uses this API) could be a completely distinct > component independent of YAML's and PL's syntax. Yes. This is how the current PL implementation is structured already. There is a basic Reader/Writer [10] [11] which only deals which the core structural elements. On top of that, there is an Object Input/Output [12] [13] which deals with de/serializing the objects themselves in terms of the core types. > In a similar manner, we could share a common type library, That would be great. I just made up my "standard" types as I was moving along :) > and schema language, > path language, etc. > > The PL syntax is largely a subset of the YAML syntax, very close to > our "canonical" form. We both use "double-quotes" for scalar values, > we both use curly braces for mappings. If your language is not too > entrenched, PL's core syntax is over ten years old (map, collection, string and binary). I would rather keep it the way it is as this provides me with "toll-free" backward compatibility. 
> the following changes would make the PL syntax a "kernel" > of the YAML syntax: > > - use brackets [] for lists, instead of () > > - use ':' to separate keys from values in a mapping pair, > and "," to separate pairs in a mapping > > - use "C" style escaping in your quotes (see the YAML specification > for the full list of escape codes. > > - use !date "2004-12-23" for your typed data, instead of > "date:2004-12-23" > > It's surprising how little difference between YAML's "flow" style and > the PL syntax is. It's a bit late to change the YAML syntax (it > would have far too many implications) -- however, if PL is still > young... It's both "young" (the types) and "old" (the format itself). > it would be so wonderful to have a PL as a "simple subset" > of YAML. I'd certainly support your efforts. > > That said, even if unifying the syntax between our projects isn't > in the cards... we can still collaborate on a slightly higher > level as described earlier. Yes. There seems to be a lot of scope for cooperation :) Cheers, PA. 
[ 1] http://en.wikipedia.org/wiki/NeXT [ 2] http://en.wikipedia.org/wiki/OPENSTEP [ 3] http://docs.sun.com/app/docs/doc/802-2112/6i63mn65o?a=view [ 4] http://www.gnustep.org/resources/documentation/Developer/Base/ Reference/NSPropertyList.html [ 5] http://developer.apple.com/documentation/Cocoa/Conceptual/ PropertyLists/Concepts/OldStylePListsConcept.html [ 6] http://alt.textdrive.com/pl/3/pl-the-format [ 7] http://alt.textdrive.com/assets/public/PL-ABNF.txt [ 8] http://alt.textdrive.com/pl/4/pl-the-types [ 9] http://alt.textdrive.com/pl/5/pl-the-implementation [10] http://cvs.sourceforge.net/viewcvs.py/zoe/ZOE/Frameworks/PL/ PLReader.java?view=markup [11] http://cvs.sourceforge.net/viewcvs.py/zoe/ZOE/Frameworks/PL/ PLWriter.java?view=markup [12] http://cvs.sourceforge.net/viewcvs.py/zoe/ZOE/Frameworks/PL/ PLObjectInputReader.java?view=markup [13] http://cvs.sourceforge.net/viewcvs.py/zoe/ZOE/Frameworks/PL/ PLObjectOutputWriter.java?view=markup |
From: Clark C. E. <cc...@cl...> - 2004-12-28 20:16:55
|
On Tue, Dec 28, 2004 at 08:54:53PM +0100, PA wrote: | Property list (aka plist) where introduced over a decade ago by NeXT | [1] alongside OPENSTEP [2]. They are still widely used today [3] [4] [5]. | PL simply "dust off" the format [6] [7] and add optional type | information to it [8] [9]. May I suggest that you don't put your type information inside the quotes... we did that in an earlier pass, and it turned out to be quite a mess. We ended up making the type specifier a completely different token, !type , that happens immediately before a given scalar value. Also, with YAML 1.1 we have an abbreviation mechanism borrowed from XML, but with clear disclaimers that the 'prefixes' are _not_ part of the information model and are just a presentation level thingy. | > Unlike ODGL, it seems that YAML and PL share | >essentially the same information model -- there is much room for | >collaboration here if you wish. Before you comment, I'd ask you to | >spend some serious time reading our specification -- I don't have | >time to regurgitate every last detail of YAML. | | Yes. I'm going through it little by little. Consider PL as just using the "flow" styles: [] sequence, {} mapping, and "scalar". | Yes. This is how the current PL implementation is structured already. | There is a basic Reader/Writer [10] [11] which only deals which the | core structural elements. On top of that, there is an Object | Input/Output [12] [13] which deals with de/serializing the objects | themselves in terms of the core types. Sure; and I'm quite sure we can share the Loader/Dumper (the higher level component). We didn't use input/output since this was too ambiguous -- words are everything. Also, we didn't use Reader/Writer because people seemed to use this pair of words to represent a lower-level abstraction for reading/writing to a socket or file. Instead we use Parser/Emitter. 
I'm not sure, but I think we (YAML group) coined the word 'emit' as the opposite of parse, but it's taken hold and I've seen other people use the same word as an antonym. | > In a similar manner, we could share a common type library, | | That would be great. I just made up my "standard" types as I was moving | along :) We've been documenting these at http://yaml.org/spec/type.html | PL's core syntax is over ten years old (map, collection, string and | binary). I would rather keep it the way it is as this provides me with | "toll-free" backward compatibility. Right. So, we should look to see how we can 'auto-detect' the syntax style differences so that a more 'unified' tool could provide the same API events without the user having to specify the syntax dialect. | It's both "young" (the types) and "old" (the format itself). Our type system is based on having a TAGURI for every type, and providing built-in syntax shorthands. We've been working on this for a few years (on and off) and the YAML 1.1 work, which should be posted soon, details our approach. Some last-minute tactical changes are still possible, but the overall strategic vision isn't. | Yes. There seems to be a lot of scope for cooperation :) Nice! Best, Clark |
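The split discussed above (a syntax-specific Parser producing events, and a shared Loader turning them into native data) might look roughly like the sketch below. The event names ("scalar", "seq_start", "map_start", "end") are invented for illustration; they are not taken from the YAML or PL APIs.

```python
def load(events):
    """Build native dicts, lists and strings from a flat stream of events."""
    events = iter(events)

    def build(event):
        kind = event[0]
        if kind == "scalar":
            return event[1]
        if kind == "seq_start":
            items = []
            for ev in events:
                if ev[0] == "end":
                    return items
                items.append(build(ev))
            raise ValueError("unterminated sequence")
        if kind == "map_start":
            result = {}
            for ev in events:
                if ev[0] == "end":
                    return result
                key = build(ev)  # keys are assumed to be scalars here
                result[key] = build(next(events))
            raise ValueError("unterminated mapping")
        raise ValueError(f"unexpected event {event!r}")

    return build(next(events))


# The same event stream could come from a YAML parser or a PL parser.
stream = [
    ("map_start",),
    ("scalar", "name"), ("scalar", "PL"),
    ("scalar", "types"),
    ("seq_start",), ("scalar", "int"), ("scalar", "date"), ("end",),
    ("end",),
]
print(load(stream))
```

Because the loader only sees events, auto-detecting the syntax dialect would be a concern of the parser layer alone.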
From: PA <pet...@gm...> - 2004-12-28 20:54:53
|
Hi Clark, On Dec 28, 2004, at 21:16, Clark C. Evans wrote: > May I suggest that you don't put your type information inside the > quotes... we did that in an earlier pass, and it turned out to be > quite a mess. I see. Got some pointers to that conversation? The fundamental reason why the types are inside the quotes is for backward compatibility with plist. Introducing a new token would break existing parsers. Something I would rather avoid. > We ended up making the type specifier a completely > different token, !type , that happens immediately before a given > scalar value. Do you have some references to the pros and cons of each approach? While adding the type information inside the quotes creates a theoretical ambiguity (e.g. is "int:123" really an 'int' or is it simply a string which happen to be formated like an 'int'), this shouldn't turn into a real problem in practice. Or did I miss something else? > Also, with YAML 1.1 we have an abbreviation > mechanism borrowed from XML, but with clear disclaimers that > the 'prefixes' are _not_ part of the information model and are > just a presentation level thingy. Not quite sure what this "abbreviation mechanism" is about... where is that defined? > Consider PL as just using the "flow" styles: [] sequence, {} > mapping, and "scalar". This is perhaps one of the fundamental things which somewhat confused me to no end: the variety of "styles" that YAML seems to allow. Why not pick one and stick to it? > Sure; and I'm quite sure we can share the Loader/Dumper (the higher > level component). We didn't use input/output since this was too > ambiguous -- words are everything. Yes and no. Sometime words are just words. In the case of PL's first implementation being in Java, it simply follows whatever names Sun has already given to their own classes [1] [2] [3] [4]. 
> Also, we didn't use > Reader/Writer because people seemed to use this pair of words to > represent a lower-level abstraction for reading/writing to a socket > or file. Instead we use Parser/Emitter. I'm not sure, but I think > we (YAML group) coined the word 'emit' as the opposite of parse, but > it's taken hold and I've seen other people use the same word as an > antonym. Ok. As long as we are talking about the same thing I don't mind one vernacular versus another :) > We've been documenting these at http://yaml.org/spec/type.html Thanks. Quick question: !merge seems to be more of an "operation/function" than a "type". Am I misunderstanding what it is suppose to represent? > Right. So, we should look to see how we can 'auto-detect' the syntax > style differences so that a more 'unified' tool could provide the > same API events without the user having to specify the syntax dialect. > > | It's both "young" (the types) and "old" (the format itself). > > Our type system is based on having a TAGURI for every type, and > providing built-in syntax shorthands. We've been working on this > for a few years (on and off) and the YAML 1.1 work, which should > be posted soon, details our approach. Some last-minute tactical > changes are still possible, but the overall strategic vision isn't. > > | Yes. There seems to be a lot of scope for cooperation :) > > Nice! Cheers, PA. [1] http://java.sun.com/j2se/1.4.2/docs/api/java/io/Reader.html [2] http://java.sun.com/j2se/1.4.2/docs/api/java/io/Writer.html [3] http://java.sun.com/j2se/1.4.2/docs/api/java/io/ObjectInput.html [4] http://java.sun.com/j2se/1.4.2/docs/api/java/io/ObjectOutput.html |
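The ambiguity raised above can be made concrete with a small sketch. It assumes a "name:value" convention inside the quotes, as in the "int:123" example; the reader and writer functions are hypothetical and are not PL's actual implementation.

```python
KNOWN_TYPES = {"int": int, "float": float}


def read_plain(text):
    """A legacy plist-style reader: every quoted value is just a string."""
    return text


def read_typed(text):
    """A type-aware reader: 'name:value' is decoded when 'name' is known."""
    name, sep, rest = text.partition(":")
    if sep and name in KNOWN_TYPES:
        return KNOWN_TYPES[name](rest)
    return text


def write_typed(value):
    """Write ints with a type prefix; strings go out unchanged (no escaping)."""
    if isinstance(value, int):
        return f"int:{value}"
    return str(value)


# The two readers disagree about the same serialized text.
print(repr(read_plain("int:123")))
print(repr(read_typed("int:123")))

# A genuine string that happens to look typed does not survive a round trip.
original = "int:123"
print(repr(read_typed(write_typed(original))))
```

Whether this matters in practice depends on how likely such strings are in the data being stored.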
From: Clark C. E. <cc...@cl...> - 2004-12-28 22:34:25
|
On Tue, Dec 28, 2004 at 09:54:59PM +0100, PA wrote:
| I see. Got some pointers to that conversation?

Earlier in the archives... hmm, I've got to split now, but I can dig later if you need.

| The fundamental reason why the types are inside the quotes is for
| backward compatibility with plist. Introducing a new token would break
| existing parsers. Something I would rather avoid.

Yes, that much is clear.

| While adding the type information inside the quotes creates a
| theoretical ambiguity (e.g. is "int:123" really an 'int' or is it
| simply a string which happen to be formated like an 'int'), this
| shouldn't turn into a real problem in practice. Or did I miss
| something else?

Two things:

  - an escape mechanism is needed, or you simply cannot serialize some sorts of strings that could normally be serialized with PList

  - the semantic interpretation of the serialized value is different depending upon which tool you use

In both cases you are changing the "meaning" of the content, and in my humble opinion, it's warts like this that tend to cause much bigger headaches down stream. For example, a schema language in the future may need to express "int:123" -- where the end result is _not_ an integer type. ;) Personally, I think a new token is merited, together with a small tool to "strip" the token for systems that can't handle the type information. That said... it's your dog!

| > Also, with YAML 1.1 we have an abbreviation
| >mechanism borrowed from XML, but with clear disclaimers that
| >the 'prefixes' are _not_ part of the information model and are
| >just a presentation level thingy.
|
| Not quite sure what this "abbreviation mechanism" is about... where is
| that defined?

I'll post the new spec soon, but in short, !int is a short-cut for

    tag:yaml.org,2002:int

unless you have

    %TAG ! http://bing/

at the start of your YAML file, and then it is cooked to

    http://bing/int

The purpose is to allow for third-party data types and schema language declarations.

| >Consider PL as just using the "flow" styles: [] sequence, {} | >mapping, and "scalar". | | This is perhaps one of the fundamental things which somewhat confused | me to no end: the variety of "styles" that YAML seems to allow. Why not | pick one and stick to it? Different sort of data "wants" to be written differently, but I got to split... more on this later if you wish. | Quick question: !merge seems to be more of an "operation/function" than | a "type". Am I misunderstanding what it is suppose to represent? It's awfuly hard to separate data type from its function. Oren? Best, Clark |
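The abbreviation mechanism described in this message (a short tag joined onto a default prefix, which a %TAG directive at the start of the file can replace) can be sketched as follows. This mirrors the behaviour as stated in the message rather than any published specification, and the function names are hypothetical.

```python
DEFAULT_PREFIX = "tag:yaml.org,2002:"


def read_tag_prefix(lines):
    """Return the prefix bound to '!' by a leading %TAG directive, if any."""
    for line in lines:
        parts = line.split()
        if len(parts) == 3 and parts[0] == "%TAG" and parts[1] == "!":
            return parts[2]
        if not line.startswith("%"):
            break  # directives are only looked for at the start of the file
    return DEFAULT_PREFIX


def expand_tag(shorthand, prefix=DEFAULT_PREFIX):
    """Expand a shorthand such as '!int' into a full tag."""
    if not shorthand.startswith("!"):
        raise ValueError(f"not a tag shorthand: {shorthand!r}")
    return prefix + shorthand[1:]


print(expand_tag("!int"))
custom = read_tag_prefix(["%TAG ! http://bing/", "--- !int 123"])
print(expand_tag("!int", custom))
```

The prefix is resolved once per file, so the rest of a loader only ever needs to see fully expanded tags.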
From: Oren Ben-K. <or...@be...> - 2004-12-29 06:15:15
|
On Wednesday 29 December 2004 00:34, Clark C. Evans wrote: > On Tue, Dec 28, 2004 at 09:54:59PM +0100, PA wrote: > | Quick question: !merge seems to be more of an "operation/function" > | than a "type". Am I misunderstanding what it is suppose to > | represent? > > It's awfuly hard to separate data type from its function. Oren? It boils down to the semantics of "semantics", I suppose; we could argue the point through a whole philosophy semester. So if someone insists that '!!merge' is an "abuse" of the type system, he's got a point, but he's missing another: Nobody forces you to use it. People seem to keep missing the fact that the types in the type repository are optional. They are _recommended_ for use in your schema _where appropriate_, to increase cross-application portability and the usefulness of "generic tools", but that's as far as it goes. So, for people who find '!!merge' to be useful, we provide a 'common' way of expressing their need, instead of having everyone invent his own variant. Have fun, Oren Ben-Kiki |
From: Clark C. E. <cc...@cl...> - 2004-12-29 17:31:30
|
On Tue, Dec 28, 2004 at 09:54:59PM +0100, PA wrote:
| >Consider PL as just using the "flow" styles: [] sequence, {}
| >mapping, and "scalar".
|
| This is perhaps one of the fundamental things which somewhat confused
| me to no end: the variety of "styles" that YAML seems to allow. Why not
| pick one and stick to it?

YAML makes a very important distinction between the representation of information for computation, and its presentation for humans. YAML defines a particular balance between these completely distinct and complementary forces. So that computations can be rigorously defined, a simple information model is required. However, for data to be readable, YAML allows for many 'styles' to encode this same information. See the specification for more details.

We allow for multiple styles since the kind and content of the data we wish to encode can vary quite substantially. A one-size-fits-all solution fails to take into account the presentation requirements necessary to maximize human readability.

For simple text that does not conflict with other YAML tokens, we have the plain style. Why force a user to quote something when it is not ambiguous -- given that this is the most common type of content?

For chunks of code with lots of significant whitespace we have the block style. Other than indenting each line, the block style places few other constraints on the embedded content. To include a code fragment, or even a YAML sub-document, one need only "indent" the content and include it in the file. Use of "sed 's/^/ /g'" is a common idiom for this sort of value.

Another sort of content is "text" like this email. In this case, escaping is also not a great idea, since one may want to use "quotes" without worrying about \" each time. For this situation, new lines are there _only_ for readability, and one may want a YAML pretty-printer to automatically word-wrap the content at a particular column width.
In this case, increasing the indentation of YAML content in an editor may cause the content to have a different set of carriage returns. Two carriage returns are then often used to separate paragraphs, and thus N carriage returns are converted into (N-1) carriage returns, and a single carriage return is converted into a single space. We call this the "folded" style.

Finally, we have two different quoting styles. The double-quoting style is exactly what people familiar with "C" derived programming languages use. The single-quote is more familiar to those using SQL (a double single quote represents a single quote). The single quote style is often perfect for regular expressions that would otherwise conflict with YAML parsing rules and thus can't be plain; further, double quoted forms would require double-escaping for common uses of the slash, dramatically reducing readability.

YAML's balance is rather clear. There is "one-true-way" to _represent_ information, however, we give the flexibility in the YAML file format to _present_ the information in such a way as to maximize its readability. With a "stylesheet" or with some simple heuristics, a YAML emitter could be extremely smart to choose the best style. The important aspect of the balance is that the representation model is *not* impacted by these presentation styles. That is, this extra complexity is totally optional at the parser/emitter level, and the application is free to ignore these sorts of things. For portability the application should also not distinguish between two values using different styles.

Of course, parser complexity is the price to pay. However, it is a constant price - once a parser/emitter is written, the cost can be amortized across the number of text files it is used to store. If YAML's presentation styles save even an hour of time per user per year, the benefit over a few years for all of YAML's users will easily outweigh the implementation difficulties created. At least, this is our bet.
While our styles have dramatically increased the implementation price of YAML, we think it is a good trade-off to make. Does this better explain the rationale for multiple styles? Cheers, Clark |
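The line-folding rule described above (a single line break becomes a space, and a run of N line breaks becomes N-1) can be written down in a few lines. This is only a sketch of that one rule; it ignores the other details a real parser has to handle, such as indentation and trailing line breaks, and the function name fold is hypothetical.

```python
import re


def fold(text):
    """Fold line breaks: one newline becomes a space, N newlines become N-1."""
    def replace(match):
        count = len(match.group())
        return " " if count == 1 else "\n" * (count - 1)

    return re.sub(r"\n+", replace, text)


wrapped = "This paragraph was\nwrapped by an editor.\n\nSecond paragraph."
print(fold(wrapped))
```

Re-wrapping the input at a different column width changes where the single line breaks fall but leaves the folded result the same, which is the point of the style.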
From: Clark C. E. <cc...@cl...> - 2004-12-28 18:33:38
|
Adam,

I'm a fan of OGDL. Rolf Veen has an excellent mind; his work, OGDL, has a very clever information model and syntax. Rolf Veen was one of the earlier YAML implementers, before he got disenchanted with the complexity of our solution: we had node typing and lots of syntax styles and uniqueness constraints -- all of which made implementation difficult. It's clear that Rolf came up with a different, implementable, and wonderfully succinct alternative. However, I'm not sure layering YAML on top of OGDL would be possible, nor a productive use of our time -- it is substantially different and I think addresses a completely different set of user requirements.

Our information models are substantially different. OGDL's model is G = (N, E) where N is an ordered bag of Unicode strings, and E is a NxN relation. YAML's model is a mapping of nodes onto nodes, where each node has a type color, and can be a Unicode string or a mapping; a special case mapping, the sequence, uses whole numbers for its domain. For starters, OGDL implies a total ordering among all scalar values; this isn't true for YAML. So, it's not even clear you can view YAML as an extension of OGDL. They are quite different, and I'm not convinced that the differences can be bridged cleanly without a serious rework of either language. That said, if I'm wrong -- please tell me so!

Our target market is different. OGDL seems very targeted towards configuration files, YAML is more targeted towards serialization of data from programming languages. Certainly there is some overlap, but it isn't as big as you might imagine. Certainly, one could use YAML for config files or OGDL for data serialization, but I think each one does better in their own domains, respectively.

Our syntax is vastly different. For OGDL's model, its syntax is just perfect. The examples given on OGDL's web site are quite compelling for the kind of data it is used to save. The result is quite readable.
However, I think for serialization data, it would quickly get harder to read. YAML has multiple styles to adequately address the variance in the sorts of data to be written, OGDL doesn't. This is perfectly great; YAML's syntax is vastly complicated by comparison. So, it's a cost-vs-benefit tradeoff. The YAML people are betting that with a very solid "C" implementation and tens of bindings, the actual syntax complexity will not be a long-term issue. However, the simpler OGDL syntax is very very nice.

So, the question is, how can we collaborate? I see a few different ways. First, we can steal ideas from each other! I absolutely love the idea of using a "path" for anchor/aliases. I'm sure Rolf has stolen everything he wanted from YAML, without making OGDL as complicated. Second, we can incorporate a "standard-mapping" of OGDL onto YAML, and perhaps vice versa. Third, if we are clever, we can make it so that a parser implementing both standards can read both formats. For this we should check our productions to ensure that the differences are "auto-detectable".

I hope this addresses your question; sorry that I was unable to answer sooner.

Cheers!

Clark |
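The two information models contrasted in this message can be sketched as plain data structures. This is one possible reading of the prose description, not code from either project; the class names and the tag strings "str", "int" and "seq" are placeholders.

```python
from dataclasses import dataclass


# Graph model as described: an ordered collection of strings plus a
# relation over them, here stored as pairs of positions in the node list.
graph_nodes = ["server", "host", "localhost"]
graph_edges = {(0, 1), (1, 2)}


@dataclass
class Scalar:
    tag: str    # the node's "type color"
    value: str  # a Unicode string


@dataclass
class Mapping:
    tag: str
    pairs: list  # (key node, value node) tuples: nodes mapped onto nodes


def sequence(tag, items):
    """A sequence as a special-case mapping keyed by whole numbers."""
    pairs = [(Scalar("int", str(i)), item) for i, item in enumerate(items)]
    return Mapping(tag, pairs)


seq = sequence("seq", [Scalar("str", "a"), Scalar("str", "b")])
for key, item in seq.pairs:
    print(key.value, "->", item.value)
```

In the second model every node carries a tag and keys may themselves be nodes, which is part of why one model does not reduce trivially to the other.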
From: PA <pet...@gm...> - 2004-12-29 18:47:48
|
Hi Clark,

On Dec 29, 2004, at 18:31, Clark C. Evans wrote:

> [snip in-depth explanation of YAML presentation styles]
>
> Does this better explain the rationale for multiple styles?

Yes, very much so. Thanks for taking the time to explain the rationale behind the different styles!

Question:

In light of the above and the publication of the YAML 1.1 draft (81 pages spanning 1 MB of PDF!!!), would it perhaps make some sense to split YAML into its constituent parts?

In other words, YAML could be viewed as an information model with several possible representations as well as different scalars and operations on top of it.

You already define a "level of compliance". Why not go one step further and define different YAML "profiles"?

I'm bringing this up because, as a whole, "The World According To YAML" is not for the fainthearted... If it were possible to "parcel" the specification into more digestible bites, it would make it more accessible. At least to me :)

The last point somehow reflects the short history of PL as well... I looked at YAML before "writing my own"... but... I frankly got overwhelmed by the sheer amount of "stuff" there... after all, I just wanted maps, sequences and scalars... not a new lifestyle...

The YAML "flow" profile does just what I need... but... unfortunately... it was buried in a sea of so much other "stuff", that I somehow overlooked it... so in the end, it took me less time to "write my own", than to understand your specification... oh, well... ironic, isn't it? :)

Cheers,

PA. |
From: Clark C. E. <cc...@cl...> - 2004-12-29 19:57:40
|
On Wed, Dec 29, 2004 at 07:47:56PM +0100, PA wrote:
| In light of the above and the publication of the YAML 1.1 draft (81
| pages spanning 1 MB of PDF!!!), would it perhaps make some sense to
| split YAML into its constituent parts?

Yes, it is quite large. I'm not sure how to break it up in a meaningful manner without losing one or more important aspects. The difficulty of organizing a specification like YAML is the level of interaction between the various facets of YAML. Do you have a suggestion in mind?

| Why not go one step further and define different YAML "profiles"?

This isn't a bad idea. I have long viewed YAML as a layered system:

Core "Canonical" YAML:
  - using only flow collections, {} and []
  - using only double-quoted scalar form
  - using !tags for type information
  - &anchor and *alias for managing graphs

Styled "Presentation" YAML:
  - directives, multiple documents in a stream
  - adding all the other styles

The advantage of a "Core" YAML is that it can be specified in a BNF-friendly set of productions; writing a parser for it is far simpler than for the "Styled" set of productions.

If we were to do this, how could the spec be re-organized to accomplish a more clearly better production? I'd like to keep it as a single specification, but perhaps in three parts?

Introduction
Preview
  - Core YAML
    * Double Quoted
    * Flow Collections
  - Styled YAML
    * Scalar Styles: Plain, Single Quoted, Literal, Folded
    * Collection Styles: Block Mapping, Block Sequence
Processing YAML Information (same)
Core Syntax
Styled Syntax

The primary work here is refactoring the productions to single out those that are a part of the core syntax. I'm not sure if this would accomplish what you are thinking though.

| The last point somehow reflects the short history of PL as well... I
| looked at YAML before "writing my own"... but... I frankly got
| overwhelmed by the sheer amount of "stuff" there... after all, I just
| wanted maps, sequences and scalars... not a new lifestyle...

Right. Ok.
So, perhaps the organization is more like:

Introduction
Core YAML
  - Preview: {}, [], and ""
  - Information Model
  - Syntax Productions
Styled YAML
  - Preview
  - Syntax Productions

This requires quite a bit more re-work, and I'm not sure if it would even work that well.

| The YAML "flow" profile does just what I need... but... unfortunately...
| it was buried in a sea of so much other "stuff", that I somehow
| overlooked it... so in the end, it took me less time to "write my own",
| than to understand your specification... oh, well... ironic, isn't it? :)

Any other ideas?

Best,

Clark

--
Clark C. Evans
Prometheus Research, LLC.
http://www.prometheusresearch.com/
office: +1.203.777.2550
mobile: +1.203.444.0557
Prometheus Research: Transforming Data Into Knowledge
  - Research Exchange Database
  - Survey & Assessment Technologies
  - Software Tools for Researchers |
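To illustrate the claim that a "Core" YAML would be far simpler to parse, here is a minimal recursive-descent sketch for a deliberately reduced subset: flow collections ({} and []) and double-quoted scalars only. It is an assumption-laden illustration, not the productions from the YAML draft: the names `parse_core` and `CoreYamlError` are hypothetical, !tags, &anchors and *aliases are left out, only four escapes are handled, and mapping keys are restricted to double-quoted scalars so they can be stored in a Python dict.

```python
class CoreYamlError(ValueError):
    """Raised when the input is outside the reduced subset sketched here."""


_ESCAPES = {'"': '"', "\\": "\\", "n": "\n", "t": "\t"}  # small assumed subset


def parse_core(text):
    """Parse one node made of {}, [] and "..." and return dict/list/str."""
    value, pos = _parse_node(text, _skip_ws(text, 0))
    pos = _skip_ws(text, pos)
    if pos != len(text):
        raise CoreYamlError(f"unexpected trailing input at offset {pos}")
    return value


def _skip_ws(text, pos):
    while pos < len(text) and text[pos] in " \t\r\n":
        pos += 1
    return pos


def _parse_scalar(text, pos):
    if pos >= len(text) or text[pos] != '"':
        raise CoreYamlError(f"expected double-quoted scalar at offset {pos}")
    out = []
    pos += 1
    while pos < len(text):
        ch = text[pos]
        if ch == '"':
            return "".join(out), pos + 1
        if ch == "\\":
            pos += 1
            if pos >= len(text) or text[pos] not in _ESCAPES:
                raise CoreYamlError(f"bad escape at offset {pos}")
            out.append(_ESCAPES[text[pos]])
        else:
            out.append(ch)
        pos += 1
    raise CoreYamlError("unterminated double-quoted scalar")


def _parse_node(text, pos):
    if pos >= len(text):
        raise CoreYamlError("unexpected end of input")
    ch = text[pos]
    if ch == '"':
        return _parse_scalar(text, pos)
    if ch == "[":
        return _parse_collection(text, pos + 1, "]", is_mapping=False)
    if ch == "{":
        return _parse_collection(text, pos + 1, "}", is_mapping=True)
    raise CoreYamlError(f"unexpected character {ch!r} at offset {pos}")


def _parse_collection(text, pos, closer, is_mapping):
    items = []
    pos = _skip_ws(text, pos)
    if pos < len(text) and text[pos] == closer:  # empty {} or []
        return ({} if is_mapping else []), pos + 1
    while True:
        pos = _skip_ws(text, pos)
        if is_mapping:
            key, pos = _parse_scalar(text, pos)  # scalar keys only (simplification)
            pos = _skip_ws(text, pos)
            if pos >= len(text) or text[pos] != ":":
                raise CoreYamlError(f"expected ':' at offset {pos}")
            value, pos = _parse_node(text, _skip_ws(text, pos + 1))
            items.append((key, value))
        else:
            value, pos = _parse_node(text, pos)
            items.append(value)
        pos = _skip_ws(text, pos)
        if pos < len(text) and text[pos] == ",":
            pos += 1
            continue
        if pos < len(text) and text[pos] == closer:
            return (dict(items) if is_mapping else items), pos + 1
        raise CoreYamlError(f"expected ',' or {closer!r} at offset {pos}")
```

For example, `parse_core('{"name": "YAML", "styles": ["flow", "block"]}')` yields a plain dict containing a list, while a plain (unquoted) scalar such as `[plain]` is rejected because it belongs to the "Styled" layer in this split.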
From: Oren Ben-K. <or...@be...> - 2004-12-29 20:58:13
|
On Wednesday 29 December 2004 21:57, Clark C. Evans wrote:
> If we were to do this, how could the spec be re-organized to
> accomplish a more clearly better production? I'd like to keep
> it as a single specification, but perhaps in three parts?
>
> Introduction
> Preview
>   - Core YAML
>     * Double Quoted
>     * Flow Collections
>   - Styled YAML
>     * Scalar Styles: Plain, Single Quoted, Literal, Folded
>     * Collection Styles: Block Mapping, Block Sequence
> Processing YAML Information (same)
> Core Syntax
> Styled Syntax
>
> The primary work here is refactoring the productions to single out
> those that are a part of the core syntax.

It is an interesting issue whether the "core" would use a subset of the style productions... Hmmm.

> I'm not sure if this would accomplish what you are thinking though.

I'm not certain I like breaking the YAML syntax into "core" and "styled" in the first place, spec effects aside. The spec is large mainly because it is chock-full of examples. They almost double the length... Should I take them out? <grin>

Have fun,

Oren Ben-Kiki |