From: Andrew K. <ku...@sf...> - 2002-04-09 03:21:36
Arggggghh!!!! (Rant, you may have guessed, follows.)

> Message: 3
> From: Oren Ben-Kiki <or...@ri...>
> Date: Sat, 6 Apr 2002 16:43:34 -0500
> Subject: [Yaml-core] New draft
>
> Is attached. It contains fixing the timestamp format issues and disallows
> anchors to alias nodes.
>
> Things seem to have stabilized... Our oldest version is from mid-may 2001.
> Shall we have a "release candidate" at our 1st anniversary?
>
> Clark C . Evans wrote:
> > | I use YAML a lot at work, and the only (real) complaint I have is that
> > | the YAML spec didn't make it a requirement for the #YAML:xxx tag to
> > | change with each release of the spec, before we settle down to an
> > | approved 1.0 status.
> >
> > Yes, this has been a problem. My apologies. I think we are very
> > close to a final version, do we want to call it 1.1?
>
> I'd rather not. If anything, I'd call the older versions 0.9 or something.
> How much of a problem is it in practice? There couldn't be too many files to
> convert...
>
> Have fun,
>
> Oren Ben-Kiki

This is a DREADFUL mistake! If you try to rename an old version and you only have to convert *one* file by one 7th grader in Saskatoon -- and 50 Yaml-core developers voted that it was OK -- still, the 50 would be wrong and the kid would be right and we'd all owe him an abject apology.

Every new version should have a new version number -- and no kidding! You've already broken this rule, but there's no need to break it again.

Really, there's no conceivable reason to break the faith of the user community just in order to put pretty numbers on the releases -- or, for that matter, to release on pretty dates. First things first. . . . puh-lease.

Andrew
From: Andrew K. <ku...@sf...> - 2002-04-09 04:16:45
Dear All,

Now that I've had a bit of a listen to your exchanges, I think that I see a problem: no philosophy document.

I say "Why no doc title-date?" and CCE says "No app data in the doc header. Why don't you just create a new top layer for your doc with a Header and Data field?"

I could say "It's kludgy" and no one could argue, but not because it's true; it's just that there's nothing around to say "This is what YAML is for and this is our approach to designing it." (Sorry, guys, but the bulleted list is not a philosophy document.)

I would say that it's reasonable not to have one at the beginning, when there wasn't a consensus among developers, but now (I hope) there is, and the public needs it to begin to understand why the spec contains all the funny decisions it does.

Also, on this particular issue, I think it would be helpful of the spec to contain recommendations about this top-level layer, even if they add nothing to YAML itself. That is, there should be a recommended structure for the top level to contain things like doc title and date, and (coming out of my objection re acyclic graph dumping) a flag indicating the existence of a "wrapper array" as proposed by Brian.

And I think the tab-vs-spaces question should be solved by a philosophy statement. IETF insists on an extremely simple format for its RFCs: plain ASCII. Do they say why? I hope so. Let's steal their explanation.

(BTW it would be easy to mandate that no generator emit tabs, even if the parser could recognize them. That would get us half-way there, anyway.)

And Neil says machine-generated YAML is pretty because it has to be. Is this part of the philosophy? If so, there's no question of letting an alternative, non-indenting format into the spec. It's against our philosophy.

Andrew
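For illustration, the "new top layer with a Header and Data field" idea mentioned above might look like the following Python sketch. The key names `header` and `data`, and the helper names `wrap` and `unwrap`, are assumptions made for this example; nothing in the thread or the spec defines them.

```python
# Hypothetical layout: the key names "header" and "data" are illustrative
# only; they are not defined by the YAML spec or by anyone in this thread.

def wrap(data, title, date):
    """Put application metadata beside the payload in an ordinary mapping."""
    return {"header": {"title": title, "date": date}, "data": data}


def unwrap(doc):
    """Return (header, data) from a document built by wrap()."""
    return doc["header"], doc["data"]


doc = wrap([1, 2, 3], title="Quarterly numbers", date="2002-04-09")
header, data = unwrap(doc)
print(header["title"], data)
```

Because the wrapper is a plain mapping, any serializer that can dump a mapping can carry the title and date without special support in the document header.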
From: Clark C . E. <cc...@cl...> - 2002-04-09 05:10:14
| I say "Why no doc title-date?" and CCE says "No app data in the
| doc header. Why don't you just create a new top layer for your
| doc with a Header and Data field?"

    4.3.2 Directive

    Directives are instructions to the YAML parser. Like throwaway
    comments, directives are not reflected in the tree or graph models.

I don't know how this can be said better. The # thingy is for parser stuff, like TABS and other syntax level issues. This is our "escape hatch" for the future. It is not a property of the application (as it is not in the tree or graph model). These directives will _not_ be reflected in the API. I just don't know how to word this any clearer...

| I could say "It's kludgy" and no one could argue, but not because
| it's true, it's just that there's nothing around to say "This is
| what YAML is for and this is our approach to designing it"

Let us say that we added Title or Date to the parse directives. Then let's say that we wanted the user to have access to these directives through the API. Thus, we have to put into our information model something which matches our parse directives, that is, a non-hierarchical mapping. However, I don't know of one programming language which has a non-hierarchical mapping... thus we would need a non-native data structure to expose our parse directives to the application, in direct violation of goal #3, "YAML uses the host language's native data structures."

| On this particular issue, I think it would be helpful of the
| spec to contain recommendations about this top-level layer,
| even if they add nothing to YAML itself.

YAML has no interest in defining stuff like Document "Title" or "Date" since these don't apply universally. Instead, this is something that an application specific schema (or industry schema) would cover. DocBook is a good example of an SGML/XML schema for book writing. Not that you'd ever want to author a book in YAML (since it doesn't have mixed content), but a similar YamlBook schema and appropriate documentation could be generated which provided general structural recommendations.

This is not stated in the goals... but it is rather implied by our domain. We are a "data serialization language", we are not a "format for saving documents with titles and dates". ;)

| a flag indicating the existence of a "wrapper array"
| as proposed by Brian.

Certainly we should have some general documents that talk about "Ways you can use YAML". But given that we don't even have the C parser done yet (I'm working on it again, thank god, because I need it soon), I think this is a much lower priority.

Also, it is pretty standard that for serialization, the top level data structure is a mapping. We didn't want to mandate it, but this is usually the case. See, for example, Python's __dict__ which provides a map from names to all objects in the current namespace.

| I think the tab-vs-spaces question should be solved by a
| philosophy statement. IETF insists on an extremely simple
| format for its RFCs: plain ASCII. Do they say why? I hope so.
| Let's steal their explanation.

Ok. We have the first goal, "YAML is readable by humans". For this we have used the "printer test": if I send the document to my printer, can I determine the structure? If, over coffee and a printed page, I can figure out what is going on with minimal work, then we've accomplished this goal.

tabs-vs-spaces is a fight between #1 (readable) vs #6 (expressive), some people like tabs as they make entering data "faster". ;(

| (BTW it would be easy to mandate that no generator emit tabs,
| even if the parser could recognize them. That would get us
| half-way there, anyway.)

The tabs vs spaces thing is a long drawn out concern. We've taken a "sufficient" way to handle them in the current spec. I was talking about a relaxation which is not only "sufficient" but "necessary"; in other words, there are a few cases where we are being overly restrictive in the current rules:

  - If I only use tabs for indentation (and nothing else)
    then I unnecessarily have to specify the #TAB directive

  - If I never mix tabs and spaces within siblings, but
    I do mix between top level items, then I also
    unnecessarily have to specify the #TAB directive

Anyway, since this is a relaxation, we can do it at any time without affecting current data. Thus, this is just me musing a bit... I don't see us acting on spaces any time soon.

| Neil says machine-generated YAML is pretty because it has to
| be. Is this part of the philosophy? If so, there's no question
| of letting an alternative, non-indenting format into the spec.
| It's against our philosophy.

Good point. Would you like to start on our philosophy document?

I hope this helps... ;)

Clark
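As a rough illustration of the tabs-vs-spaces cases discussed above, here is a small Python sketch that reports which kind of leading whitespace a text uses, plus a helper a generator could apply so that it never emits tabs in indentation. The function names and the default tab width of 8 are assumptions for this example; this is not the rule set of the spec, and it does not interpret the #TAB directive.

```python
def indent_style(text):
    """Classify leading whitespace over all non-blank lines.

    Returns 'none', 'spaces', 'tabs', or 'mixed'.
    """
    seen = set()
    for line in text.splitlines():
        stripped = line.lstrip(" \t")
        if not stripped:
            continue  # blank lines carry no indentation information
        lead = line[: len(line) - len(stripped)]
        if " " in lead:
            seen.add("spaces")
        if "\t" in lead:
            seen.add("tabs")
    if not seen:
        return "none"
    if len(seen) == 2:
        return "mixed"
    return seen.pop()


def detab_indent(text, width=8):
    """Expand tabs in leading whitespace only; tabs inside content are kept."""
    out = []
    for line in text.splitlines():
        stripped = line.lstrip(" \t")
        lead = line[: len(line) - len(stripped)]
        out.append(lead.expandtabs(width) + stripped)
    return "\n".join(out)
```

A generator that runs its output through something like `detab_indent` would satisfy the "no generator emits tabs" suggestion, while a parser could still use a check like `indent_style` to decide whether a text is tabs-only, spaces-only, or mixed.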
From: Neil W. <neilw@ActiveState.com> - 2002-04-09 06:21:05
Clark C . Evans [09/04/02 01:14 -0400]:
> YAML has no interest in defining stuff like Document "Title"
> or "Date" since these don't apply universally. Instead, this
> is something that an application specific schema (or industry
> schema) would cover. DocBook is a good example of an SGML/XML
> schema for book writing. Not that you'd ever want to author
> a book in YAML (since it doesn't have mixed content), but a
> similar YamlBook schema and appropriate documentation could
> be generated which provided general structural recommendations.

+1

I think YAML-Schema is what you really want, Andrew, although you might not know it yet. And, of course, it doesn't exist yet, so you're probably out of luck for the moment. Even so, consider the following example:

I have a data structure here that I need to serialize in YAML. Because it's a "mydoc" structure, I'll serialize it like so:

    # We'll have to pick a nice abbreviation for yamlschema :)
    --- #YAML:1.0 !ttul.org/~nwatkiss/mydoc.yamlschema
    title: "My Hairy Arse"
    date: 2002-03-04 10:11:12.527Z
    content:
        chapters:
            # Three very short chapters:
            - ~
            - ~
            - ~
    authors:
        - Neil Watkiss (That's me!)
        - Someone Else

There you have it! Now the document itself is stored in the "content" area, but there's a bunch of metadata required by the "mydoc" schema. I haven't defined the "mydoc" schema, but for the purposes of argument:

    --- #YAML:1.0 !yamlschema
    name: "http://ttul.org/~nwatkiss/mydoc.yamlschema"
    definition:
        title:
            type: leaf
            required: yes
            format: any
        date:
            type: leaf
            format: http://yaml.org/timestamp
            required: yes
        content:
            type: keyed
            format: any
            required: yes
            children:
                chapters:
                    type: series
                    format: any
                    required: yes
                    children:
                        type: leaf
                        format: any
                        required: yes
        authors:
            type: keyed
            format: any
            required: yes

That's a pretty simplistic attempt at YAML-Schema, but you get the idea. Now the document date and title are stored in the YAML content, but your document's content is still kept separately, as specified by the "mydoc" schema.

As soon as #YAML:1.0 is out, we're going to be faced with a burning need for this technology. We'll probably leverage XML-Schema, no?

Later,
Neil
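To make the schema idea a little more concrete, here is a minimal Python sketch of how a checker for a schema of this shape could work. It is not an implementation of any real YAML-Schema (none exists in this thread); the `validate` function, the Python-dict form of the schema, and the error wording are all assumptions for this example. It only looks at `type`, `required`, and `children`, ignores `format`, and leaves out the `authors` entry for brevity.

```python
# Hypothetical, reduced form of the "mydoc" schema as a Python dict.
# "required: yes" is written as True; "format" is omitted and not checked.
MYDOC = {
    "type": "keyed",
    "children": {
        "title": {"type": "leaf", "required": True},
        "date": {"type": "leaf", "required": True},
        "content": {
            "type": "keyed",
            "required": True,
            "children": {
                "chapters": {
                    "type": "series",
                    "required": True,
                    "children": {"type": "leaf", "required": True},
                },
            },
        },
    },
}


def validate(node, rule, path="doc"):
    """Return a list of problems found when checking node against rule."""
    problems = []
    kind = rule.get("type")
    if kind == "leaf":
        if isinstance(node, (dict, list)):
            problems.append(f"{path}: expected a leaf")
    elif kind == "series":
        if not isinstance(node, list):
            problems.append(f"{path}: expected a series")
        elif "children" in rule:
            # A series has one rule that applies to every item.
            for i, item in enumerate(node):
                problems += validate(item, rule["children"], f"{path}[{i}]")
    elif kind == "keyed":
        if not isinstance(node, dict):
            problems.append(f"{path}: expected a keyed node")
        else:
            for key, child in rule.get("children", {}).items():
                if key in node:
                    problems += validate(node[key], child, f"{path}.{key}")
                elif child.get("required"):
                    problems.append(f"{path}.{key}: required key is missing")
    return problems


good = {
    "title": "A short book",
    "date": "2002-03-04 10:11:12.527Z",
    "content": {"chapters": [None, None, None]},
}
print(validate(good, MYDOC))
```

Keys that the schema does not mention are simply ignored in this sketch; a stricter checker might reject them.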
From: Brian I. <in...@tt...> - 2002-04-09 08:20:35
On 08/04/02 23:20 -0700, Neil Watkiss wrote:
> Now the document date and title are stored in the YAML content, but your
> document's content is still kept separately, as specified by the "mydoc"
> schema.
>
> As soon as #YAML:1.0 is out, we're going to be faced with a burning need for
> this technology. We'll probably leverage XML-Schema, no?

FWIW, I am working on a proprietary large scale XML/YAML project. My team has developed two YAML Schema languages so far. The second is actually not that far from your attempt, Neil. We also have developed tools to go from XSD to YSD and from XML to YAML and back. We've leveraged this to do some amazing YAML to HTML transforms based on XML schemas.

I hope that some of this code eventually becomes OSS. At least I know the issues :)

Cheers,
Brian