Re: [Exist-open] Xpath, XMLDB, and eXist


> mmm, finally what you suggest:
>
> notices/notice[.//sencence &= 'house']/author
>
> is the same as what I suggested:
>
> /notices/notice//sencence[. &= 'house']/../../author
>
> and it doesn't return the same number of authors as sentences, so we
> don't know which author corresponds to each sentence.
>
> I think the solution is what you have said, a nested query. You get all
> notices that contain sentences with the word 'house', and for each notice,
> you query for the sentences. Then you can combine the results.
>
> The problem with this solution is that if you have an additional level in
> the tree, for example, the first level tag is newspaper, and this newspaper
> has a publication year, and you want to show this year together with the
> sentences, you have to make three loops:
>
> 1) Get the newspapers that have sentences that contain the word 'house'.
> 2) For each newspaper, get the notices that have sentences that contain the
> word 'house'.
> 3) For each notice, get the sentences that contain the word 'house'.
>
> And finally, we can combine the results.
>
> But I don't see any other solution.
>

I think what's behind all this is a rather important fact. XPath isn't
intended as a self-sufficient query language. It's a way of describing the
location of any specified part of any XML document, or, a little more
precisely, of specifying any set of such locations that can be described by
a single complete XPath expression.  As such it is a vital tool in any
mechanism for querying (or manipulating) XML, but to do more than meet that
basic function it has to either be extended or supplemented by other tools.

As I see it, eXist works both by extension and supplementation of XPath. The
XPath extensions Wolfgang has so far implemented, however, are confined (and
I would say correctly so, in keeping with key principles of XML) to remedying
what many regard as a serious and unjustifiable shortcoming of XPath in
the current spec: standard XPath is great for navigating document structure,
but pretty weak for matching text content or attribute values. Wolfgang's
extensions (which, he would be the first to acknowledge, were influenced by
others, Howard Katz in particular) address those weaknesses in a way that
strikes me as wholly continuous with the existing spec and certainly worthy
of serious consideration by the W3C for incorporation into future revisions.
However, "all" they do is give the standard XPath syntax more power to
specify its targets more precisely in terms of textual content. They don't
attempt to extend or modify the approach XPath takes to getting to those
targets.

In particular, they don't alter the inability of an XPath engine to
"backtrack" and start over if an expression fails near its right-hand end.
This is the result of (W3C) design decisions, made for complex technical
reasons that stem mainly from the emphasis in XML specification circles on
ease of implementation. So that's where what I call
"supplementing" XPath comes in. eXist offers nested queries for precisely
this purpose, to prune and/or merge nodesets via repeated applications of
separate XPath expressions, and in association with Cocoon it can use XSLT
pipelining, DOM methods and/or SAX filter chains to achieve the same ends.
In my uses of eXist, where for reasons I won't go into here I need to
minimize dependence on Java, but where my queries are structurally quite
similar to those Mario wants to do, I use eXist for the "grunt work" of
retrieval, then prune/merge its result sets by using the same sort of
XSLT/SAX/DOM tools, but in C/C++ libraries called from Perl.
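
To make the retrieve-then-merge idea concrete, here is a minimal Python sketch; it is not eXist code. It assumes a small made-up document that reuses the element names from the quoted expressions (notices, notice, author, sencence). The `body` wrapper, the sample text, and the helper names `has_word` and `pair_authors_with_sentences` are hypothetical, and `has_word` is only a rough stand-in for the `&=` keyword match. The parsed tree plays the role of the result set a server would hand back.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample data; only the element names come from the thread.
SAMPLE = """<notices>
  <notice>
    <author>Ann</author>
    <body>
      <sencence>The house is old.</sencence>
      <sencence>It has a garden.</sencence>
      <sencence>Nobody lives in the house.</sencence>
    </body>
  </notice>
  <notice>
    <author>Bob</author>
    <body><sencence>We sold the boat.</sencence></body>
  </notice>
  <notice>
    <author>Cy</author>
    <body><sencence>A house was built.</sencence></body>
  </notice>
</notices>"""


def has_word(text, word):
    """Rough stand-in for a keyword match: whole word, case-insensitive."""
    pattern = r"\b" + re.escape(word) + r"\b"
    return re.search(pattern, text or "", re.IGNORECASE) is not None


def pair_authors_with_sentences(root, word):
    """Outer pass over notices, inner pass over each notice's sentences."""
    pairs = []
    for notice in root.iter("notice"):
        author = notice.findtext("author")
        for sentence in notice.iter("sencence"):
            if has_word(sentence.text, word):
                pairs.append((author, sentence.text))
    return pairs


root = ET.fromstring(SAMPLE)

# Two flat result sets, as two independent path expressions would give:
# one author per matching notice, and one entry per matching sentence.
authors = [n.findtext("author") for n in root.iter("notice")
           if any(has_word(s.text, "house") for s in n.iter("sencence"))]
sentences = [s.text for s in root.iter("sencence") if has_word(s.text, "house")]
print(len(authors), "authors vs", len(sentences), "sentences")

# The merge step restores the author/sentence association.
pairs = pair_authors_with_sentences(root, "house")
for author, sentence in pairs:
    print(author, "|", sentence)
```

The two flat lists differ in length, so they cannot simply be zipped together; the association only survives if the sentence lookup is done per notice.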

I don't regard the need to do this as in any sense a limitation of eXist.
It is, arguably, a limitation of XPath as currently specified, but then the
implementation implications of building a significant backtracking capacity
into XPath would be pretty severe, and I'm not convinced they are justified.
Even when implemented and debugged, an XPath engine capable of retrieving
Mario's desired result via a single expression might end up gobbling far
more resources than the apparently more laborious triple iteration currently
required.
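
For reference, the triple iteration from the quoted message can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `library` root element, the `year` attribute holding the publication year, the sample text, and the helper names `mentions`, `matching_sentences` and `triple_iteration` are all hypothetical, and the word test is a crude substitute for a real keyword index.

```python
import xml.etree.ElementTree as ET

# Hypothetical three-level document: newspaper > notice > sencence.
DOC = """<library>
  <newspaper year="2001">
    <notice>
      <author>Ann</author>
      <sencence>The house is old.</sencence>
      <sencence>It rains.</sencence>
    </notice>
    <notice>
      <author>Bob</author>
      <sencence>We sold the boat.</sencence>
    </notice>
  </newspaper>
  <newspaper year="2002">
    <notice>
      <author>Cy</author>
      <sencence>No news today.</sencence>
    </notice>
  </newspaper>
  <newspaper year="2003">
    <notice>
      <author>Dee</author>
      <sencence>A house was built.</sencence>
      <sencence>The house sold fast.</sencence>
    </notice>
  </newspaper>
</library>"""


def mentions(element, word):
    """Crude word test: lower-case the text, drop full stops, split on spaces."""
    words = (element.text or "").lower().replace(".", " ").split()
    return word in words


def matching_sentences(node, word):
    """All sencence descendants of node whose text contains the word."""
    return [s for s in node.iter("sencence") if mentions(s, word)]


def triple_iteration(root, word):
    rows = []
    # 1) newspapers that have at least one matching sentence
    for paper in root.findall("newspaper"):
        if not matching_sentences(paper, word):
            continue
        # 2) within each such newspaper, notices with a matching sentence
        for notice in paper.findall("notice"):
            hits = matching_sentences(notice, word)
            if not hits:
                continue
            # 3) within each such notice, the matching sentences themselves
            for sentence in hits:
                rows.append((paper.get("year"),
                             notice.findtext("author"),
                             sentence.text))
    return rows


rows = triple_iteration(ET.fromstring(DOC), "house")
for row in rows:
    print(row)
```

Each level prunes before descending, so the 2002 newspaper and Bob's notice are never examined at sentence level in the inner loops, and every output row carries the year and author it belongs to.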

In short, eXist, in the best traditions of application design and Open
Source, uses existing (sorry!) tools to create a new tool that allows us to
do desirable but previously impossible things. But it is itself just a tool
among others, and not every tool has or should have aspirations to become a
Swiss Army Knife. Indeed many other once trim and lean tools have become
bloated and obese as a result of such ambitions. I would like to see
Wolfgang expand the implementation of XPath to incorporate more of the spec,
especially the ability to use all axes, but I for one wouldn't be too keen to
encourage him to take extension any further. In my view, it's enough to give
us ready integration between eXist and other tools also at our disposal.

Michael Beddow
