Peter,

I heard there was an XSLT to convert XML into YAML.

Could you confirm this and point me to it?

I am also interested in your YPath project, especially since node numbering starting at either [0] or [1] depending on IE5 support has always bugged me.

Thanks,
Jake



On 8/9/2011 8:10 AM, Peter Murphy wrote:
All,

I've been thinking long and hard about a YPath specification. This was
inspired by "Trans" schema language, and also my ideas for using YPath
expressions based on tags. The consensus was that YPath had not ever
been specced out. So I thought: why not make up one?

At this stage, I'm more interested in the requirements than the
syntax. I would like a YPath that does what it should do - address
parts of a YAML document - and nothing more. While a lot of ideas are
(and will be) lifted from XPath 1.0, there are some mistakes that
should not be repeated - a lot of its bloat consists of stuff added
for compatibility with XSLT. (And XPath 2.0 is even bigger, I
believe.)

In future, I'll put it in HTML and place it somewhere on my site. This
may be a bit long, and for that I apologize. But what I would like is
feedback on this from yourselves. Does this sound the best way to go?
I haven't even got into the nitty gritty of location steps (which is a
nice idea from XPath I would like to grab). It's principles I'm
interested in.

Best regards,
Peter

Introduction to Spec follows:

1	Introduction

YPath is a language designed to address parts of a YAML document. The
name of “YPath” is inspired by XPath: a URL-like path notation
designed for finding information inside an XML document.
Like XPath, YPath operates on the abstract structure of documents,
rather than their surface syntax. In particular, YPath acts on the
representation graph of a YAML document, where data is represented as
a rooted, connected, directed graph of nodes. Nodes can represent
scalar objects such as strings or integers; nodes can also represent
collections such as sequences or mappings, which in turn reference
other nodes.
YPath expressions specify patterns for matching nodes. An expression
takes an initial node as an argument and returns the set of nodes
(possibly including the initial node itself) that match the pattern.
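As a rough illustration of the paragraph above (and not a proposal for
YPath syntax), the Python sketch below models a representation graph
and walks down through mapping keys from an initial node. The names
`Node`, `scalar` and `select`, and the shortened tags, are made up for
this example.

```python
from dataclasses import dataclass

@dataclass(eq=False)  # identity-based equality: a node only equals itself
class Node:
    tag: str              # shortened tag such as "!!str" or "!!map"
    kind: str             # "scalar", "sequence" or "mapping"
    value: object = None  # scalar: str; sequence: list of Node;
                          # mapping: list of (key Node, value Node) pairs

def scalar(text):
    return Node("!!str", "scalar", text)

def select(start, key_names):
    """Return the node set reached from `start` by following scalar keys in order."""
    current = [start]
    for name in key_names:
        following = []
        for node in current:
            if node.kind != "mapping":
                continue
            for key, value in node.value:
                if key.kind == "scalar" and key.value == name:
                    following.append(value)
        current = following
    return current

# Graph for the document {name: {first: Peter}}
first = scalar("Peter")
inner = Node("!!map", "mapping", [(scalar("first"), first)])
root = Node("!!map", "mapping", [(scalar("name"), inner)])

result = select(root, ["name", "first"])
print([node.value for node in result])
```

An empty list of key names returns the initial node itself, which is
the "possibly including itself" case.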
1.1	Goals

There are several goals for the YPath specification, and they are
listed here in no particular order.
1.1.1	YPath should be able to address any part of a YAML document

All parts of a YAML document are represented in its representation
graph, and YPath should be able to access any node inside it.
Moreover, it should be possible to write YPath expressions that access
exactly one node inside it chosen by the user.
One difference between YAML and most other data formats is that it can
represent composite keys inside mappings. YPath should be able to
access the node for such keys, and it should be able to access the
node for its matching value inside the mapping. Finally, YPath should
be able to access any node within composite keys.
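To make the three cases above concrete, here is a small Python sketch
with invented sample data. A mapping is modelled as a list of (key,
value) pairs, since a composite key such as a list cannot be a Python
dict key. The function names are hypothetical.

```python
# Each entry is a (key, value) pair; both keys here are composite (sequences).
games = [
    (["Detroit Tigers", "Chicago Cubs"], ["2001-07-23"]),
    (["New York Yankees", "Atlanta Braves"], ["2001-07-02", "2001-08-12"]),
]

def key_nodes(mapping):
    """Access the key nodes themselves."""
    return [key for key, _ in mapping]

def value_nodes(mapping, wanted_key):
    """Access the value matching a given (possibly composite) key."""
    return [value for key, value in mapping if key == wanted_key]

def inside_keys(mapping, index):
    """Access one item inside every composite key."""
    return [
        key[index]
        for key in key_nodes(mapping)
        if isinstance(key, list) and index < len(key)
    ]

print(key_nodes(games))
print(value_nodes(games, ["Detroit Tigers", "Chicago Cubs"]))
print(inside_keys(games, 1))
```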
1.1.2	YPath should be as simple as possible

Too many specifications become bloated as multiple parties add their
own desired features to them. In contrast, YPath is a language for
accessing parts of a YAML document, and only parts of a YAML document.
Features can be added, but only if they aid users in the primary goal:
addressing any part of a YAML document.
1.1.3	YPath should be intuitive to write and read

YPath should be based on concepts that are familiar to most users, and
should use the most appropriate syntax for these concepts. For
example, the forward slash character “/” is used for indicating a
descent of one layer into the data structure – the same purpose it
serves in XPath.
Note: the syntax of YPath is up for grabs, but this use of forward
slash seems to be a no-brainer!
1.1.4	Data is more important than its expression

The same data may be expressed several different ways in a YAML
document. For example, a string can, in principle, be written in five
different scalar styles: plain, single-quoted, double-quoted, literal
or folded. YPath users would be more interested in extracting the data
into a format they control than in worrying about the indentation
levels used to represent it. For that reason, YPath should ignore
presentation and concern itself with the data itself. The
representation graph is already part of the YAML specification, so
YPath might as well piggyback on it.
1.1.5	YPath should not be restrictive

YPath expressions should be based on the full set of data in the
representation graph. For example, all nodes have a tag, so
expressions should be able to target particular nodes based on that
tag. In addition, regular expressions should be a part of the
language; simple equality checking would not be flexible enough for
many users.
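A minimal sketch of what tag-based and regular-expression matching
could look like, using Python's standard `re` module. Nodes are
reduced to (tag, content) pairs, and the function name `matching` and
the sample data are assumptions for illustration only.

```python
import re

# Each node is a (tag, content) pair; the sample data is invented.
nodes = [
    ("tag:yaml.org,2002:str", "alpha"),
    ("tag:yaml.org,2002:int", "42"),
    ("tag:yaml.org,2002:str", "beta-7"),
]

def matching(nodes, tag=None, pattern=None):
    """Keep nodes whose tag equals `tag` and whose content matches `pattern`."""
    regex = re.compile(pattern) if pattern is not None else None
    kept = []
    for node_tag, content in nodes:
        if tag is not None and node_tag != tag:
            continue
        if regex is not None and regex.search(content) is None:
            continue
        kept.append((node_tag, content))
    return kept

# String nodes whose content contains a digit.
strings_with_digit = matching(nodes, tag="tag:yaml.org,2002:str", pattern=r"\d")
print(strings_with_digit)
```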
1.1.6	YPath results are also YAML documents

Sounds crazy, but think about it. If YPath returns node-sets, then one
can add a node representing the node-set, with references to its
content nodes. Ergo: the YPath result is also a representation graph.
This allows one to “Construct” it into a native structure.
Alternatively, by “Serializing” and “Presenting”, one can make another
YAML document. See 3.1 of the YAML spec.
(Yes, this is not really specification language, but this is a draft,
and it’s late.)
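One way to picture this, as a sketch only: wrap the matched nodes in a
new sequence node that references them, so the result has a root of
its own. The dict layout and the name `as_result_graph` are
assumptions, not anything defined by the YAML spec.

```python
def as_result_graph(node_set):
    """Wrap matched nodes in a new sequence node that references them."""
    return {
        "tag": "tag:yaml.org,2002:seq",
        "kind": "sequence",
        "value": list(node_set),  # references to the original nodes, not copies
    }

hello = {"tag": "tag:yaml.org,2002:str", "kind": "scalar", "value": "hello"}
world = {"tag": "tag:yaml.org,2002:str", "kind": "scalar", "value": "world"}

result = as_result_graph([hello, world])
print(result["kind"], len(result["value"]))
```

Because the new root only references the matched nodes, they keep
their identity, and the result can be constructed or serialized like
any other graph.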

1.1.7	YAML streams are not YAML documents, but YPath should support them anyway

A YAML stream may consist of several YAML documents, and YAML
processors act on streams. However, a YPath expression acts on the
representation graph of a single document. The YPath syntax should allow
users to address particular documents in a stream.
1.2	Prior Art

The only information necessary to understand this specification is the
YAML specification (1.0, 1.1, or 1.2) and Unicode.
YPath also uses C-style escaping and regular expressions.
JSON is a subset of YAML 1.2, and thus YPath should be able to target it.
1.3	Relation to XPath

There is no direct relationship between XPath and YPath. However, many
readers will be familiar with the XPath standard, and thus would be
interested in a comparison with YPath.
XPath acts on XML, which is a document-centric text format. The main
ingredients of an XML document are elements, attributes and text, plus
possibly processing instructions. XPath provides a mechanism for
expressions to address these parts, and models the XML document as a
tree of nodes, including element nodes, attribute nodes and text
nodes. In contrast, YAML is a data-centric text format, where the
three primitives are mappings, sequences and scalars. YPath is
designed to address these primitives, and models the document as a
representation graph. Nevertheless, the philosophy for both XPath
and YPath is roughly the same: use expressions that reference the
abstract, logical structure of a file, rather than the actual file
positions of data within it.
YPath uses other concepts previously described in XPath. In
particular, the “axis-node test-predicate” model for XPath location
steps is reused in YPath.
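For readers who have not met the location-step model, here is a hedged
Python sketch of its three parts: an axis proposes candidate nodes, a
node test filters them by kind, and a predicate filters them by
content. Plain dicts and lists stand in for mappings and sequences,
and every name below is hypothetical.

```python
def child_axis(node):
    """Hypothetical 'child' axis: values of a mapping or items of a sequence."""
    if isinstance(node, dict):
        return list(node.values())
    if isinstance(node, list):
        return list(node)
    return []

def step(context, axis, node_test, predicate=lambda node: True):
    """Apply one location step to every node in `context`."""
    return [
        candidate
        for node in context
        for candidate in axis(node)
        if node_test(candidate) and predicate(candidate)
    ]

doc = {"people": [{"name": "Ann", "age": 31}, {"name": "Bob", "age": 17}]}

def is_mapping(node):
    return isinstance(node, dict)

def is_sequence(node):
    return isinstance(node, list)

# Step 1: children of the root that are sequences.
people = step([doc], child_axis, is_sequence)
# Step 2: children of those that are mappings with age >= 18.
adults = step(people, child_axis, is_mapping, lambda node: node["age"] >= 18)
print(adults)
```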
However, there are significant differences between XPath and YPath.
1.	XPath is the result of an effort to provide a common syntax and
semantics for functionality shared between XSL Transformations and
XPointer, two other specifications associated with XML. In contrast,
YPath has no dependency or interaction with any other specification,
apart from YAML.
2.	XPath (version 1.0) expressions can be evaluated to yield four
types of objects:  node sets, Booleans, numbers and strings. This
information can then be used within XSLT. In contrast, YPath
expressions only yield node sets. For similar reasons, YPath lacks the
arithmetic operators and functions present in XPath.
3.	An XML file contains exactly one XML document, and XPath acts on
that document. In contrast, a YAML stream may contain many YAML
documents.
4.	XPath classifies the root node of a document separately from the
document element. In contrast, there is no specific classification for
document roots in YAML. The root of a YAML document can be a mapping,
a sequence or a scalar, and is classified as one of these types.
5.	XML elements and attributes have names, and XML elements may be
assigned one or more namespaces (which are also represented as nodes
in the XPath data model). In contrast, YAML nodes do not have names,
and thus are not assigned to a namespace. YAML tag values may have
namespaces, but this data is contained in the node for the tag, rather
than treated independently.
6.	XPath expressions can be used to reference XML comments. YAML files
have comments too, but these are not part of the representation graph
and are ignored by YPath expressions.
7.	The nodes in an XML document have a clear hierarchical relationship
between them. For example, one element can contain another or the
reverse, but it is impossible for two elements to contain each other
at the same time. In contrast, it is possible for YAML nodes to
contain themselves through the use of anchors and aliases, and the
same node may be referenced from several collections. “Child” and
“descendant” relationships can still be followed, but “parent” and
“ancestor” relationships are no longer unique. For this reason, YPath
does not support parent or ancestor axes.
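Point 7 can be sketched in Python, with plain lists and dicts standing
in for YAML nodes. Walking downwards only needs a "seen" set to cope
with cycles, while asking for "the parent" of an aliased node has more
than one answer. The helper names are assumptions for illustration.

```python
def kids(node):
    """Child nodes of a plain Python stand-in for a YAML node (mapping keys are ignored)."""
    if isinstance(node, dict):
        return list(node.values())
    if isinstance(node, list):
        return list(node)
    return []

def descendants(start):
    """All nodes reachable from `start`, each visited once even if the graph has cycles."""
    seen, found, stack = set(), [], [start]
    while stack:
        current = stack.pop()
        for child in kids(current):
            if id(child) not in seen:  # nodes are compared by identity
                seen.add(id(child))
                found.append(child)
                stack.append(child)
    return found

def parents(root, target):
    """Every node at or under `root` that directly contains `target`."""
    candidates = [root] + [node for node in descendants(root) if node is not root]
    return [node for node in candidates if any(child is target for child in kids(node))]

loop = ["x"]
loop.append(loop)  # like "&a [x, *a]": the sequence contains itself

shared = ["s"]
root = {"first": shared, "second": [shared]}  # one node referenced from two places

print(len(descendants(loop)))      # the traversal terminates despite the cycle
print(len(parents(root, shared)))  # the shared node has more than one parent
```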
1.4	Implementation

A limited implementation can be found here:
http://pyyaml.org/browser/trunk/TestingSuite/ypath.yml?rev=71