Work at SourceForge, help us to make it a better place! We have an immediate need for a Support Technician in our San Francisco or Denver office.

## [Yaml-core] formalism in YAML

 [Yaml-core] formalism in YAML From: Clark C. Evans - 2003-10-21 06:10:10 ```Sean, let me respond to your inquiry with a few definitions: graph: A collection of nodes and edges. (FOLDOC) node: A point or vertex in a graph. (FOLDOC) directed graph: A graph with one-way edges. (FOLDOC) connected graph: A graph such that there is a path between any pair of nodes via zero or more other nodes. (FOLDOC) colored graph: A graph together with a colouring function which assigns to each node in the graph a colour. (NIST DADS) labeled graph: A graph which has labels associated with each edge or each vertex. (NIST DADS) function: A rule of correspondence betweeen two sets such that there is a unique element in the second set assigned to each element in the first set. (Websters) (A more boring definition from FOLDOC is below) rooted graph: A directed graph, with a special 'root' node such that every node is connected, at least indirectly, through a series of edges starting at the root. (my definition) In YAML we use these terms with exactly these meanings! I think of a YAML graph as the 'rooted' variety, that is a directed graph which is connected. Further, I picture a YAML graph has having coloured nodes (the data type) and each edge being labeled. A YAML graph has three sorts of nodes (vertexes): (a) scalar values, where the node has only 'incoming' edges visualized as arrows pointing to it; (b) sequences (arrays); where the 'values' of the sequence are pointed to by outgoing edges labeled using subsequent integers (c) mappings (hashtables); where the 'values' of the mapping are pointed to by outgoing edges labeled using, a set of unique 'keys'. The only 'tricky' thing about this visualization is that the keys are not just limited to strings, but can in fact be other nodes, including sequences and hashtables. This is a slightly mind-bending trick; but it is needed to model complex keys which are being found with increasing regularity as dictionary keys. Now to address your questions... On Mon, Oct 20, 2003 at 07:24:44PM -0700, Sean O'Dell wrote: | Graphs are also "related" values; such as the result set of a function, | generally not from an arbitrary data set such as in YAML (aside from | specific specifications). I think you are using graph in the following sense: 1. A diagram that exhibits a relationship, often functional, between two sets of numbers as a set of points having coordinates determined by the relationship. Also called plot. 2. A pictorial device, such as a pie chart or bar graph, used to illustrate quantitative relationships. Also called chart. (American Heritage Dictionary) While this is a good definition, this is not the definition typically used in 'computer science' or 'abstract math' speak. The definition of 'graph' in this sense is much more fluid and fundamental. While a graph (especially directed graphs) are most often functional in computer systems, they typically do not have integers or real numbers for vertexes. | specifications). The purpose of a graph is to show relationship. You would | draw up a graph to illustrate how the data in a series of YAML nodes are | related, but the series itself is not a graph. Ahh, but the point of using a graph *is* to show relationship between objects -- that is what computer science is all about. Two objects are connected when they have something 'in common' For example: - first: clark last: evans - first: oren last: ben-kiki - first: brian last: ingerson This specific 'series' is illustrating a relation between three people; and the subordinate mappings, draw a linkage from a shared anchor 'first' or 'last' to each one of these people. It is very much a visual picture. Very much a graph. In fact, we even have a way to specify a 'path' in the graph: /1/first -> 'oren' In YPath, a path named with 'edge labels' can be used to uniquely locate any node in a YAML graph from its root. This is both simple and powerful. If you need more convincing, read up a bit on Petri-Nets, as these are used heavily in manufacturing to solve all sorts of scheduling constraints on the production line. It converts what is essentially an 'algebraic' problem (non-numerical) into a picture that humans can peek at, poke, and sometimes even solve. | A YAML element is a value and its associated name, which may be qualified | with a type identifier and a namespace. A YAML element may be a simple | string value, an array of elements or an associative hash of elements. YAML | elements are grouped into sets and may be arranged according to a particular | specification or arbitrarily. A YAML document is a file or other storage | unit which contains a set of elements embodied as a YAML stream. A YAML | stream is a series of Unicode characters which represent the YAML elements, | plus delimiting characters such as spaces and newlines character. Well, two problems: 1) You use element instead of node (or vertex). Node is just right for YAML (see above). 2) Your definition of element is just plain wrong for YAML (although it does match how XML uses the word) -- you seem to be trying to attach the 'label' and the 'value' of a node together (mixing edges and vertexes). Besides not accurately modeling native data structures, it actually prevents some very useful graphs (or you can still represent things, but it takes more elements). The rest of your paragaph seems like an alternative way of writing what I've done, rather than rewriting (which I've done again and again and again and again based on lots of people's feedback), I could actually use low-level help with verifying that it reads well and makes sense. So, for example, if you could pick out particular sentances and explain why they don't make sense or how they could be imprroved, this woudl help greatly. Or, alternatively, if you could suggest a better "outline" this could also be helpful. Thank you so much for thinking about this stuff. It is always nice to have people following along and poking us so we make sure to cross all of the eyes and dots to the Ts. Best, Clark PS. Following is the more formal, boring definition of function, to which both our sequence (array) and mapping (hash) satisfy: If D and C are sets (the domain and codomain) then a function f from D to C, normally written "f : D -> C" is a subset of D x C such that: 1. For each d in D there exists some c in C such that (d,c) is an element of f. I.e. the function is defined for every element of D. 2. For each d in D, c1 and c2 in C, if both (d,c1) and (d,c2) are elements of f then c1 = c2. I.e. the function is uniquely defined for every element of D. ```