From: Andrew K. <ku...@sf...> - 2002-05-11 03:54:36
> From: Rolf Veen <rol...@he...>
> Subject: Re: [Yaml-core] a few questions...
>
> Andrew Kurn wrote:
>
> > Uh . . . I mean, yes, as you say, it's an "attribute", which
> > again is a restriction on tree, an important one from the
> > man-machine point of view. Yes, the general idea of labeled
> > graph contains all the YAML sentences (using the terminology
> > of formal languages), but it's too general and contains a lot
> > of other stuff besides.
>
> Besides the fact that this is not YAML anymore, what are the
> concrete problems with this approach ?

I seem to be missing it. Here's what I think we're talking about:

You say "Let's simplify the data model for YAML. The grammar is
too complicated. We could get along with vertex-labeled di-graphs."

I say, "That's much too simple. It allows all kinds of freaks,
along with the graphs we want (models of arrays and hashes)."

I agree generally with the 3 caballeros: When writing a spec, you
should be restrictive at the beginning. Then you can relax it as
you go, after due consideration for good reason.

Does this make sense?

> > I might propose, as you might, specific simplifications
> > to the current grammar, which is pretty big, never fear,
> > but pushing it all the way back to labeled graph is just
> > too, too far.
>
> My problem is that I like the YAML idea, but I do have a
> node structure below :-) (One that preserves order).
>
> Obviously you can argue that this is not the purpose of YAML.

Oh, no, I wouldn't. I think the idea of YAML is to be able to
dump *anything*. That's where I feel on shaky ground. Until the
phil doc comes out, we'll have to rely on the 3 C's to tell us
whether my idea of YAML is right.

So, OK. You want to preserve order. So, when you dump your data,
the daughters of each node are dumped as a !seq, each of which is
a (doesn't matter: seq or map) of 2 elts: label/value and list of
daughters.

This must be fairly close to your internal representation, anyway.

> Regards.
> Rolf.

What do you think?

Andrew
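The encoding Andrew describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Node` class and the function names `to_yaml_model` and `from_yaml_model` are hypothetical, not part of any YAML library, and the 2-element form is taken to be a sequence (the message allows either a seq or a map).

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical ordered tree node: a label plus an ordered list of daughters."""
    label: str
    daughters: list = field(default_factory=list)


def to_yaml_model(node):
    """Encode a node as a 2-element sequence: [label, [encoded daughters...]]."""
    return [node.label, [to_yaml_model(d) for d in node.daughters]]


def from_yaml_model(data):
    """Rebuild a Node from the 2-element sequence produced by to_yaml_model."""
    label, daughters = data
    return Node(label, [from_yaml_model(d) for d in daughters])


# A small tree whose daughter order must survive the round trip.
tree = Node("root", [Node("a", [Node("a1")]), Node("b")])
encoded = to_yaml_model(tree)
```

Because only sequences and scalars are used, daughter order is carried by position, and the nested lists could be handed to any YAML dumper as plain sequences.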
From: Rolf V. <rol...@he...> - 2002-05-13 08:08:37
Andrew Kurn wrote:

> So, OK. You want to preserve order. So, when you dump
> your data, the daughters of each node are dumped as
> a !seq, each of which is a (doesn't matter: seq or map)
> of 2 elts: label/value and list of daughters.
>
> This must be fairly close to your internal representation,
> anyway.

Sure. YAML can represent the node structure I have, in one
way or another. That's not the point. The point is that YAML
is based on 2 basic native building blocks: maps and lists
(besides scalars). This duality is reflected in the syntax.

What happens is that sometimes I'm feeling a compulsive need :-),
after being triggered by some discussion here (for example,
the syntax evolution case), to think out loud about a syntax
based on only 1 building block: node.

> What do you think?

I think I'd better go and do some coding ;-)

Rolf.
From: Clark C . E. <cc...@cl...> - 2002-05-13 14:57:00
On Mon, May 13, 2002 at 10:07:23AM +0200, Rolf Veen wrote:
| Sure. YAML can represent the node structure I have, in one
| way or another. That's not the point. The point is that YAML
| is based on 2 basic native building blocks: maps and lists
| (besides scalars). This duality is reflected in the syntax.

Yes. I can argue why these two forms are special.

First, they are both formulations of the mathematical function.
The map construct allows for the expression of N->N functions,
and the list is a very common restriction of functions, where
the domain is limited to positive integers. Indeed, mathematics
has a special name for such functions: it calls them sequences.
Sequences are the cornerstone of understanding the derivative,
integral, and other important mathematical constructs. So, I
assert from the "theory" end of things this is goodness.

From the practical end of things, over the last 30 years the
array/list (sequence) has become a dominant data structure, so
much so that I don't know of one language that doesn't natively
support it. Further, the hashtable (or structure or record) has
become increasingly relevant since the early 80's as people got
sick and tired of using alternating or parallel arrays to
represent arbitrary functions. The "C++ object" is a map using
keys that are resolved at compile time, where the hashtable is a
map with keys resolved at run time. These two data structures did
not become dominant by pure chance... they happen to model the
world very well, especially when used in combination with each
other. So, from the "practical" world, I assert that this
distinction is goodness.

Certainly you could implement mappings with sequences using the
alternating key/value pattern or implement sequences as mappings
using integer keys. However, mathematics has made this distinction
in high school algebra courses; and practical computer languages
have adopted the mapping and the sequence as core constructs.
Thus, I put it to you that both of these, in combination, while
not strictly necessary, make the system far more usable.

| What happens is that sometimes I'm feeling a compulsive need :-),
| after being triggered by some discussion here (for example,
| the syntax evolution case), to think out loud about a syntax
| based on only 1 building block: node.

Oftentimes when you try to cover all user requirements with a
single container, you end up replacing the two simple constructs
of "map" and "list" with a more complicated structure called a
"named list", which is in essence the XML model.

The problem with the named list is that it is not adequate. In
cases where you want to use it to emulate maps, it implies that
the order of the map keys is significant (when it isn't). In cases
where you want to use it as a list, it implies you must have a
name (when you don't). The result is that you are forced to add
"extra" information in your model which doesn't exist in the real
world...

    order-list:
      order-line:
        product: Basketball
        quantity: 3
      order-line:
        product: Sneakers
        quantity: 2

In this case, order-line is not needed, and the order among
product/quantity is not needed. Thus, by simplifying the model
things got "more" complicated, not less. This is due to a lack of
expressive power with a unified node. Further, the above structure
can't be easily loaded into native data structures; you'd need a
DOM or some other construct.

;) Clark
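Clark's order-list example can be made concrete with a small Python sketch. It is illustrative only: the pair-based encoding of a "named list" and the names `named_order_list`, `native_order_list` and `to_native` are assumptions made here, not anything defined by YAML or XML tooling.

```python
# Named-list model: every child is a (name, value) pair, and pairs are ordered.
named_order_list = [
    ("order-line", [("product", "Basketball"), ("quantity", 3)]),
    ("order-line", [("product", "Sneakers"), ("quantity", 2)]),
]

# Map-plus-list model: a list of maps, with no invented names or key order.
native_order_list = [
    {"product": "Basketball", "quantity": 3},
    {"product": "Sneakers", "quantity": 2},
]


def to_native(named):
    """Convert the named list above using outside knowledge of its shape:
    the outer level is really a list (names dropped) and the inner level
    is really a map (pair order dropped)."""
    return [dict(fields) for _name, fields in named]


# Key order is not part of a map's identity...
same_map = (
    {"product": "Basketball", "quantity": 3}
    == {"quantity": 3, "product": "Basketball"}
)

# ...but pair order is part of a named list's identity.
same_named = (
    [("product", "Basketball"), ("quantity", 3)]
    == [("quantity", 3), ("product", "Sneakers")]
)
```

Note that `to_native` only works because its author already knows which level is a list and which is a map. The named list itself does not say, which is the missing expressive power the message points to.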