From: Oren Ben-K. <or...@ri...> - 2001-11-11 10:10:32
Brian Ingerson [mailto:in...@tt...] wrote:
> On 10/11/01 20:47 -0500, Clark C . Evans wrote:
> > I want to get back to our original information model.
> >
> > The information model of YAML is a typed *graph*,
> > where each node can be one of three kinds,
>
> This is the kind of stuff I like you and Oren to argue about. I'm just a
> plain old JAPH. If it works for Perl in the end, I'm happy. If not, then we
> have a problem. :-)

At any rate, I do think Clark is getting a bit carried away. Yes, everything
in the world can be seen as a colored graph or RDF or a table or whatever. I
don't think we should get into that right now. If there's enough demand we'll
add a "table" type later using +, fine.

As for his concern about anchors and types messing up the information model.
I beg to differ.

Type: Perl/Python/Java/SmallTalk/Lisp/JavaScript/C++ all have at least
*some* notion of run time type information. C doesn't, and indeed when
loading a YAML file into C you *must* encode the type info somehow. It isn't
optional.

Anchor: Every language I know of, except perhaps /bin/sh, has a notion of a
transient unique object id. That's typically the address of an object in
memory. The anchor property is the same thing. Now, taking /bin/sh as an
example of a language lacking this feature, you'd have to encode anchors
somehow. Again it isn't optional.

Encoding is covered in the "unsupported types" section. I don't see it as a
problem.

So, I think that all we need is some wording clarifying the above. The last
paragraph in the information model section tries to do it; I guess it should
be reworked.

Now, that shouldn't stop us from finding new "grand unified theories" for
data modeling, and even applying them to YAML where it makes sense. Just
leave them out of the spec itself. Once you start going in this direction
there's no stopping until you show how the whole world is based on one simple
construct, which is theoretically pleasing but is too low-level to be of any
actual use.

For example, take Clark's latest "labeled graph" notion. It is a great model
and I'll be more than happy to see it mentioned in any presentation about
YAML. It would help people doing formal work with YAML, such as
verification/validation, applying algorithms to it, etc. But when all is said
and done, would JAPH load a YAML file into some generic "labeled graph"
structure? No way. He'll use hash tables and arrays.

Let's not confuse people by placing the blue-sky theory in the spec, and bask
in the warmth of our knowledge that this theory is available for those who
seek it :-)

Have fun,

Oren Ben-Kiki
From: Clark C . E. <cc...@cl...> - 2001-11-11 15:27:08
On Sun, Nov 11, 2001 at 12:11:22PM +0200, Oren Ben-Kiki wrote:
|
| As for his concern about anchors and types messing up the
| information model. I beg to differ.
|
| Type: Perl/Python/Java/SmallTalk/Lisp/JavaScript/C++ all have at least
| *some* notion of run time type information. C doesn't, and indeed when
| loading a YAML file into C you *must* encode the type info somehow.

I'm not concerned about types. I'm concerned about
the anchors. Do we have a tree w/ anchors or a graph?

| Anchor: Every language I know of, except perhaps /bin/sh, has a notion of a
| transient unique object id. That's typically the address of an object in
| memory. The anchor property is the same thing. Now, taking /bin/sh as an
| example of a language lacking this feature, you'd have to encode anchors
| somehow. Again it isn't optional.

This is the question: Are anchors accessible after
a YAML object is loaded into memory?

I'm not being flippant; there is a difference between
a graph and a tree w/ anchors. In a graph, the only
mechanism which has access to the anchors is the "alias"
mechanism. In this way, the anchors are confined to the
serialization mechanism. In a tree w/ anchors model,
the anchors have much higher visibility and you probably
need to maintain the node's parent property. Big difference.

| Let's not confuse people by placing the blue-sky theory
| in the spec, and bask in the warmth of our knowledge that
| this theory is available for those who seek it :-)

Please. This is theory, but it's not "blue-sky". It has
real-life consequences for our semantics. We have a choice
here. Originally we had a graph... now in the documentation
we have an anchored tree.

Clark
From: Brian I. <in...@tt...> - 2001-11-11 21:46:45
On 11/11/01 10:39 -0500, Clark C . Evans wrote:
> On Sun, Nov 11, 2001 at 12:11:22PM +0200, Oren Ben-Kiki wrote:
> |
> | As for his concern about anchors and types messing up the
> | information model. I beg to differ.
> |
> | Type: Perl/Python/Java/SmallTalk/Lisp/JavaScript/C++ all have at least
> | *some* notion of run time type information. C doesn't, and indeed when
> | loading a YAML file into C you *must* encode the type info somehow.
>
> I'm not concerned about types. I'm concerned about
> the anchors. Do we have a tree w/ anchors or a graph?

I wouldn't know a directed graph or an anchored tree if one bit me in the
ass.

I think I agree with Oren that almost all languages support anchors, and you
could use a map wrapper to round-trip through those that don't.

  foo: &001 bar

can be

  foo:
   &: 001
   =: bar

Please explain your concerns to me in JAPH terms. What does it all mean to
the lowly Perl hacker?

Cheers, Brian
From: Clark C . E. <cc...@cl...> - 2001-11-11 22:38:06
On Sun, Nov 11, 2001 at 01:46:40PM -0800, Brian Ingerson wrote:
| On 11/11/01 10:39 -0500, Clark C . Evans wrote:
| > On Sun, Nov 11, 2001 at 12:11:22PM +0200, Oren Ben-Kiki wrote:
| > |
| > | As for his concern about anchors and types messing up the
| > | information model. I beg to differ.
| > |
| > | Type: Perl/Python/Java/SmallTalk/Lisp/JavaScript/C++ all have at least
| > | *some* notion of run time type information. C doesn't, and indeed when
| > | loading a YAML file into C you *must* encode the type info somehow.
| >
| > I'm not concerned about types. I'm concerned about
| > the anchors. Do we have a tree w/ anchors or a graph?
|
| I think I agree with Oren that almost all languages support
| anchors, and you could use a map wrapper to round-trip through
| those that don't.

We have *two* information models; one is our "native" or
"random access" model, and the other is our "serialization" or
"sequential access" model. The August specification made
a distinction between the two, and I think we need to move
back to this distinction.

Random Access:

 - Our model is a "graph"; in other words, a node may have
   multiple "incoming arrows" or "parents".

 - Alias nodes and anchors are not part of this model;
   you can't ask for the node's anchor, and there is
   no such thing as an alias node.

 - Kinds of nodes are: scalar, map, sequence.

 - Each node has a "class".

Sequential Access:

 - Our model is a "tree". In other words, each node has
   exactly one parent.

 - To record random access links, we use a special kind
   of node (alias) and we attach a special chunk of data
   to each node (anchor).

 - Kinds of nodes are: scalar, map, sequence, and alias.

 - Each node has a "class" and an "anchor".

To convert from random to sequential access, the first time
a node is encountered, it is serialized and given an anchor.
Subsequent occurrences of the node are replaced with an alias
having the same anchor value as the sequential copy of the node.

To convert from sequential to random access, each time an
alias node is encountered, it is replaced with the node
having an anchor of the same number.

What I'd like is for the specification to once again split
these two models... because they *are* very different. This
is my primary issue. In particular, "anchor" is *not* an
implicit type. It is very special and must be treated as
such. It is not reported in the random access model.

...

Now, that said, I was playing with a small alteration of the
information model such that our "branch" nodes were map nodes.
This is kinda neat. It lets us distinguish between
"branch" and "scalar" nodes, instead of "sequence", "map" and
"scalar" nodes. What's neat is that sequence can be made
into a type... a special case where the keys of the map are
integers.

|   foo: &001 bar
|
| can be
|
|   foo:
|    &: 001
|    =: bar

I don't think we should speculate how a language which
doesn't support aliases (such as C) would do this. I think
that it's up to the C implementation; or, the C implementation
may only deal with YAML via the sequential model.

Best,

Clark
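The two conversions Clark describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not anything from the spec or an existing YAML library: the `Node` class and the `serialize` and `load` functions are hypothetical names, the random access model is taken to be plain Python lists, dicts and scalars, and only lists and dicts receive anchors.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Node:
    """Sequential-model node (hypothetical): a tree node with an optional anchor."""
    kind: str                     # 'scalar', 'seq', 'map' or 'alias'
    value: Any = None             # scalar value, list of Nodes, or list of (key, Node)
    anchor: Optional[int] = None  # for 'alias' nodes: the anchor being referred to


def serialize(obj, anchors=None):
    """Random access (graph) -> sequential (tree with anchors and aliases)."""
    if anchors is None:
        anchors = {}
    if isinstance(obj, (list, dict)):
        if id(obj) in anchors:
            # Seen before: emit an alias carrying the same anchor value.
            return Node("alias", anchor=anchors[id(obj)])
        # First encounter: give the node an anchor, then serialize it.
        anchor = anchors[id(obj)] = len(anchors) + 1
        if isinstance(obj, list):
            return Node("seq", [serialize(x, anchors) for x in obj], anchor)
        return Node("map", [(k, serialize(v, anchors)) for k, v in obj.items()], anchor)
    return Node("scalar", obj)


def load(node, by_anchor=None):
    """Sequential (tree) -> random access (graph); aliases and anchors disappear."""
    if by_anchor is None:
        by_anchor = {}
    if node.kind == "alias":
        return by_anchor[node.anchor]
    if node.kind == "scalar":
        return node.value
    result = [] if node.kind == "seq" else {}
    if node.anchor is not None:
        by_anchor[node.anchor] = result  # register first so cycles resolve
    if node.kind == "seq":
        for child in node.value:
            result.append(load(child, by_anchor))
    else:
        for key, child in node.value:
            result[key] = load(child, by_anchor)
    return result


a = ["val"]
d = [a, a, a]
tree = serialize(d)
back = load(tree)
```

After the round trip, `back` holds three entries that are one and the same list, and nothing in `back` exposes an anchor or an alias. That is the point of keeping the two models separate.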
From: Clark C . E. <cc...@cl...> - 2001-11-11 22:50:11
Also, the production should be amended to not allow
an anchor attribute on the same line as an alias node.

Since...

  $a = ['val']
  $b = $a
  $c = $b
  $d = [$a,$b,$c]

is the same as...

  $a = ['val']
  $b = $a
  $c = $a
  $d = [$a,$b,$c]

We know that...

  - &001
   - val
  - &002 *001
  - *002

is the same as...

  - &001
   - val
  - *001
  - *001

Thus... we can soundly forbid anchor attributes
on an alias node.

Clark

On Sun, Nov 11, 2001 at 05:50:01PM -0500, Clark C . Evans wrote:
| [...]
| _______________________________________________
| Yaml-core mailing list
| Yam...@li...
| https://lists.sourceforge.net/lists/listinfo/yaml-core

--
Clark C. Evans                   Axista, Inc.
http:\\axista.com                800.926.5525
Collaborative Software for Project Management

Patriotism means protecting core values during difficult times,
not pasting a flag on your SUV and repealing the Bill of Rights.
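The argument that an alias of an alias adds nothing can be checked in any language with object identity. Here is a minimal Python analogue of the Perl snippet; the variable names follow the message, and `d1` and `d2` are names made up for this sketch.

```python
a = ["val"]
b = a
c = b            # "alias of an alias"
d1 = [a, b, c]

c = a            # alias of the original node
d2 = [a, b, c]

# In both cases every slot is the very same list object.
print(all(item is a for item in d1))
print(all(item is a for item in d2))
```

Both lines print `True`: once loaded, there is no way to tell which name an alias was taken from, so an anchor on an alias node would carry no information.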
From: Clark C . E. <cc...@cl...> - 2001-11-11 23:08:47
Brian just asked the question on the list if alias nodes
can have types. The answer is no. An alias node is a
figment of the serialization model only. When loaded into
the random access model it is *replaced* with the node that
has the same anchor.

Thus, we need to update the productions to forbid a
!class on the *alias node.

Clark
From: Brian I. <in...@tt...> - 2001-11-11 23:18:53
On 11/11/01 18:02 -0500, Clark C . Evans wrote:
> Also, the production should be amended to not allow
> an anchor attribute on the same line as an alias node.
>
> Since...
>
>   $a = ['val']
>   $b = $a
>   $c = $b
>   $d = [$a,$b,$c]
>
> is the same as...
>
>   $a = ['val']
>   $b = $a
>   $c = $a
>   $d = [$a,$b,$c]
>
> We know that...
>
>   - &001
>    - val
>   - &002 *001
>   - *002
>
> is the same as...
>
>   - &001
>    - val
>   - *001
>   - *001
>
> Thus... we can soundly forbid anchor attributes
> on an alias node.

Well I don't really care. I don't think an emitter would do it in the first
place, and any human writing up YAML with aliases is probably psychotic.

Cheers, Brian
From: Brian I. <in...@tt...> - 2001-11-12 00:34:35
On 11/11/01 15:18 -0800, Brian Ingerson wrote:
> Well I don't really care. I don't think an emitter would do it in the first
> place, and any human writing up YAML with aliases is probably psychotic.

BTW, now that we have coined and defined 'anchors' and 'aliases', can we
make the proposed 'org.yaml.ptr', be 'org.yaml.ref'? This would be less
confusing to Perlers. (Although the point could be made that they still need
to grok 'map' and 'seq')

  print YAML->emit([ \\\42 ]);

  ---
  - !ref
   =: !ref
    =: !ref
     =: 42

  # an array where each slot points to the next and
  # the last slot is 42
  $a->[$_] = \ $a->[$_+1] for 0..2;
  $a->[3] = 42;
  print YAML->emit($a);

  ---
  - !ref &001          (a ref to a ref to a ref to 42)
   =: !ref &002
    =: !ref &003
     =: 42
  - *001               (a ref to a ref to 42)
  - *002               (a ref to 42)
  - *003               (42)

  $a = \\\\\\\\42;
  $b = $$$a;
  $c = $$$b;
  print YAML->emit( [ $a, $b, $c ] );

  ---
  - !ref
   =: !ref
    =: !ref &001
     =: !ref
      =: !ref &002
       =: !ref
        =: !ref
         =: !ref
          =: 42
  - *001
  - *002

Well I'm glad we're using single space indent :)

  # A self reference
  $a->[0] = \ $a->[0];
  print YAML->emit($a);

  ---
  - !ref &001
   = *001

I finally grok it well. This is actually very nice. Thanks Oren.
From: Clark C . E. <cc...@cl...> - 2001-11-12 02:17:21
On Sun, Nov 11, 2001 at 04:34:32PM -0800, Brian Ingerson wrote:
| BTW, now that we have coined and defined 'anchors' and 'aliases', can we
| make the proposed 'org.yaml.ptr', be 'org.yaml.ref'? This would be less
| confusing to Perlers.

Fine with me. Alias and Reference it is. Alias is a special
*kind* of node found only in the serialization model (and
used to represent a graph), and Reference is an explicit
*class* of map found in both models. Reference will have
direct implementations in Perl and C++ (and C) -- but will
be represented as a special YAML class (map) in Python and
Java. In any case, references will round-trip just wonderfully.
Yippee!

| Well I'm glad we're using single space indent :)

Every day I get happier with single spaces. In fact,
I'm starting to write some of my python code that way now... neat.

|   # A self reference
|   $a->[0] = \ $a->[0];
|   print YAML->emit($a);
|
|   ---
|   - !ref &001
|    = *001
|
| I finally grok it well. This is actually very nice. Thanks Oren.

I agree. Oren did a great job. And I'm happy that we've
finally agreed that there is a difference between aliases
and references. Way cool.

Best,

Clark
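Clark's remark that a Reference would be "a special YAML class (map)" in Python can be made concrete with a rough sketch. Everything here is an assumption made for illustration: the `Ref` class and the `emit_item` helper are hypothetical, the output layout simply imitates the `!ref` / `=:` examples earlier in the thread with one-space indentation, and anchors are not handled, so a self-reference would loop forever.

```python
class Ref:
    """Hypothetical stand-in for a Perl reference: a box holding one target."""

    def __init__(self, target):
        self.target = target


def emit_item(value):
    """Return the lines for one sequence item, following a chain of Ref objects."""
    lines = []
    prefix = "- "
    depth = 0
    while isinstance(value, Ref):
        lines.append(" " * depth + prefix + "!ref")
        depth += 1
        prefix = "=: "
        value = value.target
    lines.append(" " * depth + prefix + str(value))
    return lines


# Python analogue of Perl's \\\42: a ref to a ref to a ref to 42.
print("\n".join(["---"] + emit_item(Ref(Ref(Ref(42))))))
```

The distinction drawn in the thread still holds in this sketch: a `Ref` is an ordinary object that exists after loading, whereas an alias is only a serialization device and would never appear in the loaded data.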
From: Alan J. <ja...@po...> - 2001-11-30 17:49:02
On Sun, 11 Nov 2001, Brian Ingerson wrote:
> Any human writing up YAML with aliases is probably psychotic.

Nolo contendere... but are you trying to say that aliases aren't intended
for human use? I was mocking up a YAML replacement for the NetSaint config
file, and something like

  host1:
   - &HTTP
    name: http
    port: 80
    notify: webadmin
   - &HTTPS
    name: https
    port: 443
    notify: webadmin

  host2:
   - *HTTP

  host3:
   - *HTTPS

seemed useful to me. Though less useful when alias names are restricted
to be digits only, as opposed to word characters, like I'm doing above...
Is there a reason for this? Might it be changed, or was it discussed and
rejected long ago? (I'm not trying to start trouble, really...)

Also, is the whitespace in the above example correct? I wasn't certain
when reading the spec.

Thanks,
Alan
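What a loader would hand back for a config like this can be written out by hand in Python. This is a sketch of the intended result under the graph model discussed earlier in the thread, not the output of any particular YAML library; the `oncall` value is made up for the demonstration.

```python
# Hand-built equivalent of the loaded config: each alias is the same
# object as its anchored node, not a copy.
http = {"name": "http", "port": 80, "notify": "webadmin"}
https = {"name": "https", "port": 443, "notify": "webadmin"}

config = {
    "host1": [http, https],
    "host2": [http],    # *HTTP
    "host3": [https],   # *HTTPS
}

# Changing the shared service through host1 is visible through host2 too.
config["host1"][0]["notify"] = "oncall"   # hypothetical value
print(config["host2"][0]["notify"])       # prints: oncall
```

This sharing is what makes named anchors attractive in a hand-written file: the service definition is written once and reused by name.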
From: Brian I. <in...@tt...> - 2001-11-30 18:05:43
On 30/11/01 12:49 -0500, Alan Jaffray wrote:
> On Sun, 11 Nov 2001, Brian Ingerson wrote:
> > Any human writing up YAML with aliases is probably psychotic.
>
> Nolo contendere... but are you trying to say that aliases aren't intended
> for human use? I was mocking up a YAML replacement for the NetSaint config
> file, and something like
>
>   host1:
>    - &HTTP
>     name: http
>     port: 80
>     notify: webadmin
>    - &HTTPS
>     name: https
>     port: 443
>     notify: webadmin
>
>   host2:
>    - *HTTP
>
>   host3:
>    - *HTTPS
>
> seemed useful to me. Though less useful when alias names are restricted
> to be digits only, as opposed to word characters, like I'm doing above...
> Is there a reason for this? Might it be changed, or was it discussed and
> rejected long ago? (I'm not trying to start trouble, really...)

I'm not opposed to it. Guys?

>
> Also, is the whitespace in the above example correct? I wasn't certain
> when reading the spec.

Perfect. :)

Cheers, Brian
From: Clark C . E. <cc...@cl...> - 2001-11-30 18:58:11
| > seemed useful to me. Though less useful when alias names are restricted
| > to be digits only, as opposed to word characters, like I'm doing above...
| > Is there a reason for this? Might it be changed, or was it discussed and
| > rejected long ago? (I'm not trying to start trouble, really...)
|
| I'm not opposed to it. Guys?

This is ok. Way way back we kept it to numbers since
it wouldn't round trip (it would be written back out
as numbers). Back then we didn't have many styles
and comments... so it was the only thing that didn't
round-trip nicely. Now that we have comments and
other things that don't round trip, I don't see the harm.

| > Also, is the whitespace in the above example correct?
| > I wasn't certain when reading the spec.

You did wonderfully. Note: Since we've limited in-line
scalars to one line (which hasn't made the spec yet),
the following two items are now identical:

  ---
  - one
  - two
  ---
  - one

  - two
  ---

Right? Or are the above going to be illegal? I think
it should just introduce insignificant whitespace.

Best,

Clark
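The round-trip concern, that named anchors "would be written back out as numbers", can be illustrated with a toy text rewrite. This is not a YAML parser and not how any emitter works: the `renumber` function and its regular expression are assumptions for this sketch, and the pattern would also rewrite a `&word` or `*word` that happens to start a plain scalar.

```python
import re

# An anchor (&name) or alias (*name) at the start of a whitespace-delimited token.
ANCHOR_OR_ALIAS = re.compile(r"(?<!\S)([&*])(\w+)")


def renumber(text):
    """Replace anchor names with 001, 002, ... in order of first appearance."""
    numbers = {}

    def swap(match):
        sigil, name = match.groups()
        if name not in numbers:
            numbers[name] = "%03d" % (len(numbers) + 1)
        return sigil + numbers[name]

    return ANCHOR_OR_ALIAS.sub(swap, text)


source = "host1:\n - &HTTP\n  name: http\n - &HTTPS\n  name: https\nhost2:\n - *HTTP\n"
print(renumber(source))
```

The structure survives while the names `HTTP` and `HTTPS` are lost, which is the same kind of loss as comments and styles not surviving a round trip.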
From: Clark C . E. <cc...@cl...> - 2001-11-30 20:23:14
|   host1:
|    - &HTTP
|     name: http
|     port: 80
|     notify: webadmin
|    - &HTTPS
|     name: https
|     port: 443
|     notify: webadmin
|
|   host2:
|    - *HTTP
|
|   host3:
|    - *HTTPS

Ok. We just had a talk and the above syntax should
be an error because it leads to whitespace significance
problems in the next-line case. If you want to separate
your maps, you'd have to use comments.

  host1:
   - &HTTP
    name: http
    port: 80
    notify: webadmin
   - &HTTPS
    name: https
    port: 443
    notify: webadmin
  #
  host2:
   - *HTTP
  #
  host3:
   - *HTTPS

Minor change really....