From: Oren Ben-K. <or...@ri...> - 2001-07-31 15:59:07
|
Joe Lapp [mailto:jo...@bu...] wrote:
> Hi YAMLers!
>
> I've had some time to play with YAML and to digest the
> details of the spec. I thought I'd share the thoughts I'm
> having, some of which you'll like, some of which you won't.
> First, here's the good:
> ...
> Okay, that's the good. There's plenty of it. Now for the
> bad. I'll be more detailed with the bad since I'm expecting
> less agreement:
>
> (B1) Proliferation of cryptic identifiers. It's hard or even
> impossible to guess the meaning of a YAML file without first
> memorizing the identifiers, and there are more identifiers
> than I'm comfortable memorizing. I also have to memorize the
> placement syntax of each identifier. I recognize that many
> other data formats also have this problem, but I still don't like it.

Do you mean the keys (what would be, in XML, the tag names)? I agree
it is an issue, but I don't think humanity has ever found a solution
to that. Each "domain" tends to invent its own mini language. It is an
information-theory thing; when both sides share a common data base,
information may be more efficiently passed around and processed by
referring to it instead of spelling it out. Which means an outsider,
lacking the common base, has a hard time entering the domain.

Consider HTML - how would it look if you had to write:

<paragraph>....</paragraph>
....<hard-line-break/>
<keep-in-one-line>.....</keep-in-one-line>

Ugh. Sure,

<p>...</p>
....<br/>
<nobr>....</nobr>

is cryptic. That seems to be life...

> (B2) Lack of extensibility. Many syntactic features seem ad
> hoc, such as the top-most map separator, shorthand, block
> format indicators, anchors, and references. I'm not saying
> that the functionality seems ad hoc, just that the syntax
> seems ad hoc. How long can we grow the language in this way
> before it becomes impossibly complex? In order to enrich the
> language during its initial development, and in order to provide
> some of the inevitable features to come, YAML needs some
> inherent mechanism for extensibility.

I don't know whether I want YAML to be extensible. Supposing we
settle the serialization issue somehow, I think that anything
else should be layered on top of YAML-CORE, not "extended" in
the core itself. The current shorthand mechanism is an escape
hatch allowing us to extend YAML to some extent without quite
having to touch the core; it might be wise to leave it (or
something like it) in for that reason.

> (B3) Inflexible formatting. Being a file intended for user
> production and consumption, formatting is critical. The user
> should be able to choose the format that is easiest to enter,
> should data-entry have the highest priority, or the format
> that is easiest to read, should readability have the highest
> priority. It should be possible to format data in the strict
> structured way YAML describes, and it should be possible to
> run a file through a pretty-printer to force this format, but
> it should not be forced. While enforcement keeps users in
> line and guarantees that the data is readable, I think user
> convenience should take priority. (Note that flexible
> formatting wouldn't nullify (G4); it would just make (G4) optional.)

Python allows a more flexible whitespace based indentation scheme,
and so did YAML at first. The reason we gave it up was that it
allowed a drastic simplification of the syntax for all types of
scalars, while increasing their expressive power. It was thought
that any modern editor should be able to allow you to easily enter
valid YAML, even given the strict indentation. However, you say:

> (B4) Limits editors. I'm finding that creating and editing
> YAML is a pain-in-the-butt unless you're using a suitable
> editor. Early last week I reviewed a number of simple
> Windows editors and finally settled on one I liked. This
> week I'm finding that it's too hard to edit YAML with this
> editor. Because it's very hard to get users to change
> editors, those using a non-YAML-friendly editor will probably
> limit their use of YAML.

Can you be more specific about the difficulties you faced? I
(naively) expected that any modern editor would have a
user-tunable automatic indentation, and that this should be enough
for convenient YAML entry. Wrong?

> (B5) Serialization semantics in the core. I think it's
> reasonable to build serialization semantics into a syntax
> that is intended exclusively for serialization, but I don't
> think its baggage should weigh on those who won't use
> serialization. While a YAML processor need not implement
> serialization, it still needs to recognize the associated
> syntax, and users reading the spec are still forced to wade
> through serialization. Besides, if serialization is just one
> of many applications of YAML, shouldn't serialization be
> layered on top of YAML, like the other applications? If the
> problem is that there is otherwise no place to put
> serialization information, maybe we need to ignore
> serialization and focus on extensibility. (BTW, I think I
> can argue pretty strongly that anchors and references are
> neither needed nor desirable in the YAML core.)

Oh boy. One at a time:

- Serialization: is not at the core. What is in the core is an
  alternative syntax form which is useful for serialization (as
  well as other things). "By convention" certain keys of that
  form are used for serialization, others for comments, etc.

- Extensibility: the current shorthand syntax does allow some
  extensibility. Clark is making a case we should completely drop
  extensibility (I don't know if he's looking at it this way).
  We need to think about what we mean by "extensibility", anyway.
  Is it adding different syntax forms (e.g., more scalar formats)?
  Adding something like namespaces? XLink?

- References and anchors are vital for representing general
  graphs. We consider these important enough to support.

Now, it is an interesting notion to treat anchor and reference
syntax as a form of shorthand with a layer converting them to
native references...

one: [&12] data
two: [*12]

I don't know. References are as basic a data type in most
languages as are map, list and scalar. There's something to be
said for directly supporting them. On the other hand, there's
something to being able to manipulate the anchors and references
using normal map access operations.

> (B6) Lack of comments. I think any file format intended for
> user viewing and/or editing requires a clear and flexible
> means for comments. YAML allows me to add comments, but only
> in certain places, but even so the fact that it is a comment
> would not be universally recognized. I realize that adding
> comments could jeopardize (G5), though I'm not convinced it must.

OK, you got your wish. YAML allows comments just in certain places:
attached to maps. Simply use '#' as a key and type your comment in.
Problem solved? :-)

> So there you go. I realize that opinions on these matters
> will vary. But still, because I find the good very good and
> the bad very bad, I find myself in a love/hate relationship with YAML.

Life is full of compromises...

Have Fun,

    Oren Ben-Kiki
|
From: Oren Ben-K. <or...@ri...> - 2001-07-31 17:16:09
|
Switched mailers, Joe? Well, at least it will solve the pesky
line-wrapping issue :-)

Joe Lapp [mailto:jo...@bu...] wrote:

I think my reaction is just to the proliferation of cryptic
single-character identifiers. I don't want to have to ask my mind to
memorize the meanings of so many symbols. While '&' and '*' may be
mnemonics to C programmers, they are not mnemonics to non-C
programmers and non-programmers in general. %, @, |, ~ have no
mnemonic value for me. When reading the spec, I found this daunting.
I'd rather see a core set of symbols and everything else be English
directives.

Oh. Well, Perl did pretty well in spite of taking this "line noise"
to extreme. Again, it is a matter of "referring to an entry in the
shared database" in order to make communication more efficient. In
plain English: "people will get used to it".

Things aren't that bad: our "non-intuitive" characters are @, %, *, &
and |. Maybe ! as well (the [ ... ] are intuitive for marking a
"special section"). How about a "character glossary" to prevent
newbie shock? Better yet, a one-page "YAML reference card"? Suppose
you had this:

YAML special characters
"    " (four spaces)
    for indentation as in list (@) and map (%).
:
    for entry, as in a map entry:
        key : value
    and a list entry:
        : value
@
    for list, as in:
        list: @
            : Entry 1
            : Entry 2
%
    for map, as in:
        map: %
            key1 : value1
            key2 : value2
|
    for block, as in:
        code: |
            main() {
                printf("Hello, world!\n");
            }
-
    for removing the last LF of a block, as in:
        line noise: |-
            !@#\'"$% (no trailing LF)
" and '
    for quoted string, as in:
        "Hello, world!\n" : 'The key says "Hello, world!\n"'
\
    for escaping in quoted strings, as in:
        "The key was written as \"Hello, world!\\n\""
----
    for separating top level maps, as in log files:
        date: 2001-07-01 21:12:34
        request: GET
        protocol: HTTP/1.0
        url: /index.html
        ----
        date: 2001-07-01 21:13:45
        request: GET
        protocol: HTTP/1.0
        url: /toc.html
        ----
!
    for serialization,
=
    for default value, as in:
        delivery: %
            !: date
            %: iso
            =: 2001-07-01
[ and ]
    for shorthand, as in:
        delivery [!date %iso] 2001-07-01

Would that have helped?

Have fun,

    Oren Ben-Kiki
|
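The "----" entry of the reference card can be illustrated with a minimal Python sketch. It assumes, for illustration only, that each record is a flat list of "key: value" lines and that a separator is a line holding nothing but four dashes; the function name split_top_level_maps is hypothetical and not part of any YAML implementation.

```python
def split_top_level_maps(text):
    """Split a log-style document into flat maps at '----' separator lines.

    Assumption: every non-blank, non-separator line is 'key: value',
    and only the first ':' on a line separates the key from the value.
    """
    records, current = [], {}
    for line in text.splitlines():
        stripped = line.strip()
        if stripped == "----":
            # A separator closes the record collected so far.
            if current:
                records.append(current)
            current = {}
        elif stripped:
            key, _, value = stripped.partition(":")
            current[key.strip()] = value.strip()
    if current:
        # Keep a final record that is not followed by a separator.
        records.append(current)
    return records


LOG = """\
date: 2001-07-01 21:12:34
request: GET
protocol: HTTP/1.0
url: /index.html
----
date: 2001-07-01 21:13:45
request: GET
protocol: HTTP/1.0
url: /toc.html
----
"""

records = split_top_level_maps(LOG)
```

Splitting on the first colon only is what keeps a timestamp such as 21:12:34 intact inside the value.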
From: Joe L. <jo...@bu...> - 2001-08-01 12:17:25
|
Wow, this reference card really helps. You can almost learn YAML
from it.

I realize that using single character identifiers is part of what
makes YAML data so readable. If we gotta keep 'em, can we at least
avoid overloading them? Is % the only one that is overloaded? (Why,
after thoroughly studying the spec, don't I have the confidence to
answer this question?)

~Joe

At 08:17 PM 7/31/2001 +0200, Oren Ben-Kiki wrote:
>>YAML special characters
>>"    " (four spaces)
>>    for indentation as in list (@) and map (%).
>>:
>>    for entry, as in a map entry:
>>        key : value
>>    and a list entry:
>>        : value
>>@
>>    for list, as in:
>>        list: @
>>            : Entry 1
>>            : Entry 2
>>%
>>    for map, as in:
>>        map: %
>>            key1 : value1
>>            key2 : value2
>>|
>>    for block, as in:
>>        code: |
>>            main() {
>>                printf("Hello, world!\n");
>>            }
>>-
>>    for removing the last LF of a block, as in:
>>        line noise: |-
>>            !@#\'"$% (no trailing LF)
>>" and '
>>    for quoted string, as in:
>>        "Hello, world!\n" : 'The key says "Hello, world!\n"'
>>\
>>    for escaping in quoted strings, as in:
>>        "The key was written as \"Hello, world!\\n\""
>>----
>>    for separating top level maps, as in log files:
>>        date: 2001-07-01 21:12:34
>>        request: GET
>>        protocol: HTTP/1.0
>>        url: /index.html
>>        ----
>>        date: 2001-07-01 21:13:45
>>        request: GET
>>        protocol: HTTP/1.0
>>        url: /toc.html
>>        ----
>>
>>!
>>    for serialization,
>>=
>>    for default value, as in:
>>        delivery: %
>>            !: date
>>            %: iso
>>            =: 2001-07-01
>>[ and ]
>>    for shorthand, as in:
>>        delivery [!date %iso] 2001-07-01
>>
>>Would that have helped?
>>
>>Have fun,
>>
>>    Oren Ben-Kiki
|
From: Oren Ben-K. <or...@ri...> - 2001-08-01 09:02:18
|
Clark C . Evans [mailto:cc...@cl...] wrote:
> Hmm. Perhaps Oren is on the right track with the [..] mechanism
> and using Color metaphor to a greater extent. In particular,
> if we look at this way... we have the following colors:
>
> ! Class
> # Comment
> & Anchor

That makes the position of the reference node rather interesting...
It would probably have to be of a "reference" class:

anchor: [&1] some data
reference: [!ref] *1

Think how compatible that would be with YLink:

stylesheet: [!url] http://.....

Neat!

This means that our YAML parsers are multi-layered from the start.
There's the basic parsing into map/list/scalar - you can *always*
deal with anything in YAML at this level - and then there are
"layers" you can choose to use, such as converting references,
de-serialization, including external documents, validation, ...

And our very first implementations will already include at least one
such layer - converting references - to keep us honest and ensure
that layering functionality "works".

Very neat.

> Yep. Anchor isn't a node type, it is a color. With the colored
> glasses... why didn't I see this before? Hmm. Perhaps I'm wrong
> to backtrack from the coloring mechanism?

Perhaps :-)

> *sigh*
>
> Ok. I have "work" to do. And my brain hurts.

Let's table this for a while...

Have fun,

    Oren Ben-Kiki
|
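The "converting references" layer described above might look roughly like the following Python sketch. It rests on assumptions that are not settled anywhere in this thread: that the shorthand has already been expanded into plain maps with the reserved keys '&' (anchor), '!' (class) and '=' (value), and that a reference is a map of class "ref" whose value is '*' followed by the anchor name. The function names are hypothetical.

```python
def collect_anchors(node, anchors=None):
    """Walk plain dict/list/scalar data and index every map carrying '&'."""
    anchors = {} if anchors is None else anchors
    if isinstance(node, dict):
        if "&" in node:
            anchors[node["&"]] = node
        for value in node.values():
            collect_anchors(value, anchors)
    elif isinstance(node, list):
        for value in node:
            collect_anchors(value, anchors)
    return anchors


def resolve_references(doc):
    """Return a copy of doc with anchor and reference maps replaced by values.

    Limitation of this sketch: cyclic graphs are not handled, and a
    reference to an unknown anchor raises KeyError.
    """
    anchors = collect_anchors(doc)
    resolved = {}  # anchor name -> resolved value, shared by all users

    def target(name):
        if name not in resolved:
            resolved[name] = walk(anchors[name]["="])
        return resolved[name]

    def walk(node):
        if isinstance(node, dict):
            if node.get("!") == "ref":
                return target(node["="].lstrip("*"))
            if "&" in node:
                return target(node["&"])
            return {key: walk(value) for key, value in node.items()}
        if isinstance(node, list):
            return [walk(value) for value in node]
        return node

    return walk(doc)


# The two entries from the message, written as already-expanded maps.
doc = {
    "anchor": {"&": "1", "=": "some data"},
    "reference": {"!": "ref", "=": "*1"},
}
native = resolve_references(doc)
```

Because the layer only reads and writes ordinary maps and lists, a tool that does not care about references can skip it and still process the document.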
From: Joe L. <jo...@bu...> - 2001-08-01 15:19:20
|
At 12:03 PM 8/1/2001 +0200, Oren Ben-Kiki wrote:
>That makes the position of the reference node rather interesting... It would
>probably have to be of a "reference" class:
>
> anchor: [&1] some data
> reference: [!ref] *1

That IS a very neat trick! But I'm a bit uncomfortable with the idea
that both the parser and the application-specific deserializer will
need access to this same value. Of course, in this case the !ref
could be stripped. My reaction regards access to this type slot.

I'm still trying to get a grip on why YAML-Core should provide
special places for application-level type information. I can only
think of one argument that works for me: The purpose of a YAML file
is to express data, even if it happens to be data serialized from
programming language objects; information the serializer needs should
be segregated from the data to make it easy to tell data from
non-data. But are we also satisfying a functional requirement?

>Let's table this for a while...

Oh, but the thoughts are coming now. Gotta get them out or lose them.
If you can hold up, go for it. I won't expect a response anytime
soon.

~Joe
|
From: Oren Ben-K. <or...@ri...> - 2001-08-02 08:59:16
|
Joe Lapp [mailto:jo...@bu...] wrote:
> I'm still trying to get a grip on why YAML-Core
> should provide special places for application-level
> type information. I can only think of one argument
> that works for me: The purpose of a YAML file is to
> express data, even if it happens to be data serialized
> from programming language objects; information the
> serializer needs should be segregated from the data
> to make it easy to tell data from non-data.

That doesn't work for me. To me, (de)serializer isn't
special. There are many "layers" of functionality one
could apply to a document (on parsing or on processing).

What does work for me is: Keep YAML human readable. The
only purpose of the shorthand mechanism is to be a
human-readable shorthand for what we foresee to be a
common case of having a certain type of maps. Nothing
more.

For this reason to make sense it requires two decisions.

1. Should we use a map, or should we use an "attributed
graph" model, as Clark proposed?

2. If we are using a map, is it OK to make it implicit?
(Again, Clark feels uneasy about this.)

I feel very strongly about (1). Yes, we should use just
a simple map/list/scalar model, instead of an attributed
graph model. My reasons are:

- It avoids the need for a DOM. The ability to slurp a
YAML document directly into Perl/Python/Java/JavaScript
etc., use only native APIs, and then spit it out unchanged,
is simply priceless. An "attributed graph" destroys this
ability.

- It provides for a unified, single way to handle every
YAML document using a trivial (map/list/scalar) API,
regardless of the functionality which is optionally
layered on top of that. This allows simple utilities
to ignore the complexities of the layered processing.

Example:

The YAML parser itself would be simpler since it would
know nothing of types, references, etc. These would be
provided by a separate layer. Don't use it - don't even
*write* it - if you don't want to use it. This may be
relevant to embedded devices etc. (e.g., expanding
references may require excessive memory consumption on
such a device).

A YAML diff program should be completely oblivious to
types, references etc. It can work at the most basic
map/list/scalar level. It can be written in straight
Perl, etc.

A YAML verification program would easily be able to
handle broken references, unknown types etc. since
it would NOT expand them on reading. It would just
treat them as normal maps, and do whatever verification
is required without having to somehow hack the YAML
parser.

A YAML-T processor would be able to refer to the type,
reference etc. information using the normal YAML-Path
operations, on both input and output.

So, to summarize, I *strongly* believe that we should
base our information model on simple, unattributed,
map, list and scalar data types.

Accepting (1), (2) becomes a matter of taste. One of
YAML's goals is to be human-readable. This means that:

delivery: %
    ! : date
    = : 2001-07-31

Is unacceptable (at least, Brian and I seem to share
this view). The question becomes, what should we use
instead? If

delivery: [!date] 2001-07-31

Looks "like a scalar" that's intentional; the scalar
syntax is very readable. Granted, it may be too much
like a scalar. But many other variants are possible.
How about one of:

delivery: %(!date &17) 2001-07-31

As a compromise? It costs just one extra character
to the current syntax - a good balance between
making the map explicit and maintaining a readable
format.

I think this answers Clark's misgivings:

> [The current proposal]
> has two problems:
>
> 1. It involves an implicit map, which
> I don't think is obvious. Looking
> at the first item I'd assume it is
> a scalar... not a map.

Would the %(...) syntax solve that?

> 2. It's a bit pathological, but what about
> a user-defined mapping which has
> two keys... ! and = ?

What about it? It is a perfectly legal alternative
way to write the exact same map, just like:

key: "\
    v\
    a\
    l\
    u\
    e\
    \
    \"

Is a perfectly legal way to write the exact same pair
as:

key: value

Of course, in both cases this is a rather ugly way to
write the same information...

If the problem is that we have reserved the use of some
single-character keys for our purposes (so that an
application trying to use them "normally" would have
problems), yes, that's a concern. I'm not certain
how serious it is in practice. Every language makes
some things reserved...

We can limit the set of shorthand keys in some other
way. For example,

delivery: %(!date) 2001-07-31

Could be a shorthand to:

delivery: %
    __!__ : date
    __=__ : 2001-07-31

(This is similar to the python way of reserving keys).
Or we could find some other creative way to minimize the
pattern of reserved keys...

However as long as we accept (1) above, we *must* have some
set of reserved keys. Hmmm... Is this a good time to raise
the namespace issue for keys? :-)

Other issues:

- I love Clark's idea to make the top level production be
one of a map or a list. Best of both worlds indeed. Way to
go, Clark!

- As for using ^ instead of % for 'transfer encoding', I like
Clark's notion of using | for process chain instead:

picture: %(!image/bmp|gzip|base64 &17) ...

This removes the need for ^ or %. It also means that the
value of a shorthand key may contain any character except
for white space and ( ). Seems reasonable...

Have fun,

    Oren Ben-Kiki
|
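The equivalence between the proposed %(...) shorthand and an explicit map can be sketched in Python as below. This is only an illustration of the proposal as worded in the message: each shorthand entry is taken to be one key character followed by its value, entries are separated by white space, values may not contain white space or parentheses, and the trailing text becomes the '=' entry. The function name expand_shorthand is hypothetical.

```python
import re

# '%(' entries ')' followed by the value; entries may not contain ( or ).
SHORTHAND = re.compile(r"^%\(([^()]*)\)\s*(.*)$", re.DOTALL)


def expand_shorthand(text):
    """Expand '%(!date &17) 2001-07-31' into a plain map.

    Text that does not use the shorthand is returned unchanged,
    i.e. it is treated as an ordinary scalar.
    """
    match = SHORTHAND.match(text)
    if match is None:
        return text
    result = {}
    for token in match.group(1).split():
        key, value = token[0], token[1:]
        if key == "=" or key in result:
            raise ValueError("repeated or reserved shorthand key: %r" % key)
        result[key] = value
    # The text after the closing parenthesis is the '=' (value) entry.
    result["="] = match.group(2)
    return result


delivery = expand_shorthand("%(!date &17) 2001-07-31")
picture = expand_shorthand("%(!image/bmp|gzip|base64 &17) ...")
```

A tool working at the map/list/scalar level would only ever see the expanded form, which is the point of treating the shorthand as notation rather than as a separate node type.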
From: Joe L. <jo...@bu...> - 2001-08-02 11:46:32
|
(Note that I changed the subject line, and to ensure continuity,
included the full text of Oren's email in my response.)

At 11:59 AM 8/2/2001 +0200, Oren Ben-Kiki wrote:
>Joe Lapp [mailto:jo...@bu...] wrote:
> > I'm still trying to get a grip on why YAML-Core
> > should provide special places for application-level
> > type information. I can only think of one argument
> > that works for me: The purpose of a YAML file is to
> > express data, even if it happens to be data serialized
> > from programming language objects; information the
> > serializer needs should be segregated from the data
> > to make it easy to tell data from non-data.
>
>That doesn't work for me. To me, (de)serializer isn't
>special. There are many "layers" of functionality one
>could apply to a document (on parsing or on processing).
>
>What does work for me is: Keep YAML human readable. The
>only purpose of the shorthand mechanism is to be a
>human-readable shorthand for what we foresee to be a
>common case of having a certain type of maps. Nothing
>more.
>
>For this reason to make sense it requires two decisions.
>
>1. Should we use a map, or should we use an "attributed
>graph" model, as Clark proposed?

In the end, no matter what information model we attach to the YAML
syntax, I'm pretty sure every node of the YAML object graph will need
at least a type/class attribute. I think we can borrow a lesson
learned from XML.

A specific application may implicitly know the type structure of a
YAML document, but generic tools will not. A generic tool may need a
way to know the type of any given node. Since only the specific
application has this information, the application must hand this
information to the tool. It can hand the information either via the
object tree or via a separate metadata structure. For the general
case, that metadata structure would have to mirror the YAML tree.
Unless the YAML tree itself contains the types, the tool will likely
have to perform identical operations on both the YAML tree and the
metadata structure. A utility that tests two graphs for equality is
an example of such a tool.

In XML, the situation is actually quite a bit worse than this, and
XML-Schema is mostly to blame. It's very hard to implement schema
validation without allowing the validator to attach metadata to each
node. I very much hope that we could write a YAML-Schema that needed
no more than a class name for each node.

The upshot is that YAML node APIs will probably end up with
attributes anyway. We just have to name these attributes in the
information model and not allow applications to define their own
attributes.

>2. If we are using a map, is it OK to make it implicit?
>(Again, Clark feels uneasy about this.)

Clark explained your proposal to me. I didn't quite get it.
Apparently shorthand notation is indeed a shorthand notation, one for
maps. I assume it's supposed to be equivalent to a map (or
supplementary to a map). I guess that means shorthand notation can
only appear on maps.

If this is what you mean, then I too am quite uneasy with it. You
would have to add conflict resolution rules to YAML, and I think in
the end users will suffer a little more confusion. Besides, I hate
having two different ways to do the same thing.

>I feel very strongly about (1). Yes, we should use just
>a simple map/list/scalar model, instead of an attributed
>graph model. My reasons are:
>
>- It avoids the need for a DOM. The ability to slurp a
>YAML document directly into Perl/Python/Java/JavaScript
>etc., use only native APIs, and then spit it out unchanged,
>is simply priceless. An "attributed graph" destroys this
>ability.

I agree 100%. What if we limit ourselves to just a class name
attribute? If the only attribute were class name, would we destroy
this ability? If it's not a class name, then it becomes a map pair.
Even then, wouldn't the deserializer consume this information?
Wouldn't it consume the format string, for example?

I do agree, though, that if we have much more than a class name
attribute, we'll end up with a DOM that doesn't map well into
anything. The only other two attributes we have are format and
default (right?). I'm uncomfortable with both of them, even if they
are clever. I don't think either is worth its cost in complexity.

>- It provides for a unified, single way to handle every
>YAML document using a trivial (map/list/scalar) API,
>regardless of the functionality which is optionally
>layered on top of that. This allows simple utilities
>to ignore the complexities of the layered processing.

Your proposal definitely does this, and I think we have to accomplish
this, but I'd rather do it by pre-defining the attributes.

>Example:
>
>The YAML parser itself would be simpler since it would
>know nothing of types, references, etc. These would be
>provided by a separate layer. Don't use it - don't even
>*write* it - if you don't want to use it. This may be
>relevant to embedded devices etc. (e.g., expanding
>references may require excessive memory consumption on
>such a device).
>
>A YAML diff program should be completely oblivious to
>types, references etc. It can work at the most basic
>map/list/scalar level. It can be written in straight
>Perl, etc.
>
>A YAML verification program would easily be able to
>handle broken references, unknown types etc. since
>it would NOT expand them on reading. It would just
>treat them as normal maps, and do whatever verification
>is required without having to somehow hack the YAML
>parser.
>
>A YAML-T processor would be able to refer to the type,
>reference etc. information using the normal YAML-Path
>operations, on both input and output.
>
>So, to summarize, I *strongly* believe that we should
>base our information model on simple, unattributed,
>map, list and scalar data types.
>
>Accepting (1), (2) becomes a matter of taste. One of
>YAML's goals is to be human-readable. This means that:
>
>delivery: %
>    ! : date
>    = : 2001-07-31
>
>Is unacceptable (at least, Brian and I seem to share
>this view). The question becomes, what should we use
>instead? If
>
>delivery: [!date] 2001-07-31
>
>Looks "like a scalar" that's intentional; the scalar
>syntax is very readable. Granted, it may be too much
>like a scalar. But many other variants are possible.
>How about one of:
>
>delivery: %(!date &17) 2001-07-31
>
>As a compromise? It costs just one extra character
>to the current syntax - a good balance between
>making the map explicit and maintaining a readable
>format.

I think one of YAML's strongest advantages is that the data looks
like what it is. As clever and clean as this proposal is, I think it
weakens that advantage.

>I think this answers Clark's misgivings:
> > [The current proposal]
> > has two problems:
> >
> > 1. It involves an implicit map, which
> > I don't think is obvious. Looking
> > at the first item I'd assume it is
> > a scalar... not a map.
>
>Would the %(...) syntax solve that?
>
> > 2. It's a bit pathological, but what about
> > a user-defined mapping which has
> > two keys... ! and = ?
>
>What about it? It is a perfectly legal alternative
>way to write the exact same map, just like:
>
>key: "\
>    v\
>    a\
>    l\
>    u\
>    e\
>    \
>    \"
>
>Is a perfectly legal way to write the exact same pair
>as:
>
>key: value
>
>Of course, in both cases this is a rather ugly way to
>write the same information...
>
>If the problem is that we have reserved the use of some
>single-character keys for our purposes (so that an
>application trying to use them "normally" would have
>problems), yes, that's a concern. I'm not certain
>how serious it is in practice. Every language makes
>some things reserved...
>
>We can limit the set of shorthand keys in some other
>way. For example,
>
>delivery: %(!date) 2001-07-31
>
>Could be a shorthand to:
>
>delivery: %
>    __!__ : date
>    __=__ : 2001-07-31

Eeeks!!

>(This is similar to the python way of reserving keys).
>Or we could find some other creative way to minimize the
>pattern of reserved keys...
>
>However as long as we accept (1) above, we *must* have some
>set of reserved keys. Hmmm... Is this a good time to raise
>the namespace issue for keys? :-)
>
>Other issues:
>
>- I love Clark's idea to make the top level production be
>one of a map or a list. Best of both worlds indeed. Way to
>go, Clark!

Maybe I missed this. Is this the one that requires lookahead? How do
you tell which you have? I think lengthy lookaheads are dangerous.

>- As for using ^ instead of % for 'transfer encoding', I like
>Clark's notion of using | for process chain instead:
>
>picture: %(!image/bmp|gzip|base64 &17) ...
>
>This removes the need for ^ or %. It also means that the
>value of a shorthand key may contain any character except
>for white space and ( ). Seems reasonable...

I'm not sure why we're inclined to support encodings. Because of its
mandatory indentation, I don't think YAML is a good format for binary
data. I guess there are text encodings too, and you might want to
designate that.

>Have fun,
>
>    Oren Ben-Kiki

I think we're discussing the information model, hence my next post...

~Joe
|
From: Joe L. <jo...@bu...> - 2001-07-31 18:06:52
|
At 06:59 PM 7/31/2001 +0200, Oren Ben-Kiki wrote:
> > (B1) Proliferation of cryptic identifiers. [...]
>is cryptic. That seems to be life...

Yeah. <sigh>

>I don't know whether I want YAML to be extensible. Supposing we
>settle the serialization issue somehow, I think that anything
>else should be layered on top of YAML-CORE, not "extended" in
>the core itself.

That's what I mean, methinks. The core would provide an extension
mechanism. Anything that must go in the core goes in the core using
the extension mechanism. Anything that could go outside the core goes
outside the core. All extensions are standardized. I'm just saying
that if we think of layering serialization on top of the core, we'd
be putting less serialization in the core.

>Python allows a more flexible whitespace based indentation scheme,
>and so did YAML at first. The reason we gave it up was that it
>allowed a drastic simplification of the syntax for all types of
>scalars, while increasing their expressive power.

Agreed. A difficult but probably worthwhile sacrifice.

>Can you be more specific about the difficulties you faced? I
>(naively) expected that any modern editor would have a user-
>tunable automatic indentation, and that this should be enough
>for convenient YAML entry. Wrong?

I just sat down to comment some of the code I've written and found
myself doing the same sort of text manipulation that I was doing in
the YAML. The only difference was that in YAML my text tended to be
much more greatly indented, thus emphasizing the problem I already
have with comments.

There seem to be two main bugaboos: (1) Once I've put text on a line,
it takes a bit more work to delete the preceding whitespace (one
backspace no longer does it), and I seem to re-justify often. (2)
Once entries are separated by more than a few lines, I have trouble
lining up the columns. Seems that I should only have to count my sets
of four spaces or backspaces, but I seem to screw this up often
enough to make it time-consuming.

So you're right, YAML doesn't ask anything of me that I'm not already
doing. It doesn't create problems that I don't already have. But it
does seem to exacerbate them.

>- Extensibility: the current shorthand syntax does allow some
>extensibility. Clark is making a case we should completely drop
>extensibility (I don't know if he's looking at it this way).
>We need to think about what we mean by "extensibility", anyway.
>Is it adding different syntax forms (e.g., more scalar formats)?
>Adding something like namespaces? XLink?

How about creating an extensible mechanism for annotating nodes

>OK, you got your wish. YAML allows comments just in certain places:
>attached to maps. Simply use '#' as a key and type your comment in.
>Problem solved? :-)

Feeling much better, thank you!

~Joe
|
From: Clark C . E. <cc...@cl...> - 2001-07-31 18:19:20
|
On Tue, Jul 31, 2001 at 02:13:03PM -0400, Joe Lapp wrote:
| There seem to be two main bugaboos: (1) Once I've put text on a
| line, it takes a bit more work to delete the preceding whitespace (one
| backspace no longer does it), and I seem to re-justify often. (2)
| Once entries are separated by more than a few lines, I have trouble
| lining up the columns. Seems that I should only have to count my sets
| of four spaces or backspaces, but I seem to screw this up often enough
| to make it time-consuming.

Yep. However, typing in XML so that it is readable is
just as much of a challenge...

| >- Extensibility: the current shorthand syntax does allow some
| >extensibility. Clark is making a case we should completely drop
| >extensibility (I don't know if he's looking at it this way).
| >We need to think about what we mean by "extensibility", anyway.
| >Is it adding different syntax forms (e.g., more scalar formats)?
| >Adding something like namespaces? XLink?
|
| How about creating an extensible mechanism for annotating nodes

Hmm. Perhaps Oren is on the right track with the [..] mechanism
and using Color metaphor to a greater extent. In particular,
if we look at this way... we have the following colors:

! Class
# Comment
& Anchor

Yep. Anchor isn't a node type, it is a color. With the colored
glasses... why didn't I see this before? Hmm. Perhaps I'm wrong
to backtrack from the coloring mechanism?

*sigh*

Ok. I have "work" to do. And my brain hurts.

Clark
|
From: Joe L. <jo...@bu...> - 2001-08-01 15:00:42
|
At 02:24 PM 7/31/2001 -0400, Clark C . Evans wrote:
>Hmm. Perhaps Oren is on the right track with the [..] mechanism
>and using Color metaphor to a greater extent. In particular,
>if we look at this this way... we have the following colors:
>
> ! Class
> # Comment
> & Anchor
>
>Yep. Anchor isn't a node type, it is a color. With the colored
>glasses... why didn't I see this before? Hmm. Perhaps I'm wrong
>to backtrack from the coloring mechanism?

I also like Oren's proposal, but the difference in syntax isn't that significant. I think it's just a matter of whether a thing goes in the brackets or not. However, it could greatly simplify the relevant portions of the BNF. The productions would just do this:

  attribute ::= '[' S? (akey ':' avalue S?)+ ']' {C1}
  akey ::= [&*!%]
  avalue ::= <whatever>

Constraint {C1} says that no akey may occur more than once. Additional constraints may indicate which akeys are valid for which nodes, and which combinations of akeys are valid. This is not something that the productions should reflect, as it makes them hard to read.

Providing the mechanism does give us an official place to put extensions, even if every new extension introduces yet another cryptic identifier. Perhaps some things could be expressed with English keywords?

>Ok. I have "work" to do. And my brain hurts.

I'm sorry, I know my posts are making it even harder on you.

~joe |
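For anyone who wants to play with the production above, here is a rough Python sketch of a parser for it, including the {C1} check. This is only an illustration, not anything from the spec: the function name parse_attribute is made up, S is taken to mean whitespace, and since avalue is left as <whatever> in the post, the sketch assumes it is a non-empty run of characters other than whitespace and ']'.

```python
import re

# Hypothetical sketch of the proposed production:
#   attribute ::= '[' S? (akey ':' avalue S?)+ ']'   {C1}
#   akey      ::= [&*!%]
# ASSUMPTION: avalue is <whatever> in the post; here it is taken to be a
# non-empty run of characters other than whitespace and ']'.
_PAIR = re.compile(r"([&*!%]):([^\s\]]+)\s*")


def parse_attribute(text):
    """Parse one bracketed attribute list into a dict of akey -> avalue."""
    if not (text.startswith("[") and text.endswith("]")):
        raise ValueError("attribute must be enclosed in [ ]")
    body = text[1:-1].lstrip()  # skip the optional leading S
    attrs = {}
    pos = 0
    while pos < len(body):
        match = _PAIR.match(body, pos)
        if match is None:
            raise ValueError(f"bad attribute syntax at offset {pos}: {body[pos:]!r}")
        key, value = match.groups()
        if key in attrs:  # constraint {C1}: no akey may occur more than once
            raise ValueError(f"akey {key!r} occurs more than once")
        attrs[key] = value
        pos = match.end()
    if not attrs:  # the '+' in the production requires at least one pair
        raise ValueError("at least one akey:avalue pair is required")
    return attrs
```

The per-node constraints (which akeys are valid where, and in which combinations) are deliberately left out of the parser, in keeping with the point that the productions should not reflect them; they would be a separate check on the returned dict.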
From: Clark C . E. <cc...@cl...> - 2001-08-01 17:52:46
|
On Wed, Aug 01, 2001 at 11:07:41AM -0400, Joe Lapp wrote:
| attribute ::= '[' S? (akey ':' avalue S?)+ ']' {C1}
| akey ::= [&*!%]
| avalue ::= <whatever>
|
| Constraint {C1} says that no akey may occur more than once.
| Additional constraints may indicate which akeys are valid for which
| nodes, and which combinations of akeys are valid. This is not
| something that the productions should reflect, as it makes them hard
| to read.

Speaking of constraints. The current SPEC does not dive into the various constraints. We should update the spec in this regard to add any constraints that are missing...

Best,

Clark |
From: Clark C . E. <cc...@cl...> - 2001-08-01 18:14:39
|
On the topic of [...]

On Wed, Aug 01, 2001 at 11:07:41AM -0400, Joe Lapp wrote:
| Providing the mechanism does give us an official place to put
| extensions, even if every new extension introduces yet another
| cryptic identifier. Perhaps some things could be expressed with
| English keywords?

First, the production reserves a lot of characters that can be used for extensions... this mechanism need not be the only way to extend in the future. Although... I'm not certain how much more extension we will require.

Second, I like cryptic identifiers for YAML keywords. I think Perl is good in this regard. We could use English, but are "a" and "href" any less cryptic?

| >
| > anchor: [&1] some data
| > reference: [!ref] 1
|
| That IS a very neat trick!

This also makes the reference identifier informational, which, by the way, is a current request that I need for my current application... I'd like the reference to be the primary key for a row in a table.

| But I'm a bit uncomfortable with the idea that both the parser and
| the application-specific deserializer will need access to this same
| value. Of course, in this case the !ref could be stripped. My
| reaction regards access to this type slot.

At the API level, I was thinking that a reference node just "inherits" from a map, list, or scalar node and forwards all map, list, or scalar requests on to the referenced node. Just a thought...

| I'm still trying to get a grip on why YAML-Core should provide
| special places for application-level type information. I can only
| think of one argument that works for me: The purpose of a YAML file is
| to express data, even if it happens to be data serialized from
| programming language objects; information the serializer needs
| should be segregated from the data to make it easy to tell data from
| non-data.

Thus, we would be changing our information model from a graph to an attributed graph. Only that we strictly limit the attributes to a finite set with very specific meaning.
For starters we have ...

  & Name (aka Anchor)
  ! Type (aka Class)
  ^ Transfer Encoding

By the way, SOAP only has a limited set of attributes, "xmlns", "name" and "type". I think they bungled the separation between type and transfer encoding (which MIME got right...)

... I like adding these as attributes to every node rather than having the magical coloring system.

Best,

Clark |
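To make the attributed-graph idea and the forwarding reference node a bit more concrete, here is a rough Python sketch. It is only one possible reading of the message: the class names Node and ReferenceNode, the get method, and the anchors dictionary are all invented for illustration, and the reference uses plain delegation rather than literal inheritance from a map, list, or scalar class.

```python
# Hypothetical sketch, not an actual YAML API. All names are invented.
# The attribute set is the finite one listed in the message above.
ALLOWED_ATTRIBUTES = {"&": "name", "!": "type", "^": "transfer encoding"}


class Node:
    """A map, list, or scalar value plus a strictly limited set of attributes."""

    def __init__(self, value, attributes=None):
        attributes = dict(attributes or {})
        unknown = set(attributes) - set(ALLOWED_ATTRIBUTES)
        if unknown:
            raise ValueError(f"unsupported attributes: {sorted(unknown)}")
        self.value = value
        self.attributes = attributes

    def get(self, key):
        """Map or list lookup; scalar nodes are read through .value instead."""
        return self.value[key]


class ReferenceNode:
    """Stands in for the node anchored under `name` and forwards requests to it."""

    def __init__(self, name, anchors):
        self.name = name
        self._anchors = anchors  # shared dict: anchor name -> Node

    def _target(self):
        return self._anchors[self.name]

    @property
    def value(self):
        return self._target().value

    def get(self, key):
        return self._target().get(key)


# Usage, mirroring "anchor: [&1] some data" and "reference: [!ref] 1".
anchors = {}
anchored = Node("some data", {"&": "1"})
anchors[anchored.attributes["&"]] = anchored
reference = ReferenceNode("1", anchors)
print(reference.value)
```

Because the reference resolves its target on every request, the anchor name stays available as plain information on the reference itself, which is roughly what using it as a primary key would need.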