Re: [Exist-open] Utilizing Saxon for XSLT 2.0


> I'm looking over eXist's implementation of XQuery and wondering whether
> Saxon isn't something to take advantage of when it comes to its further
> implementation. Why not use Saxon's XPath/XQuery API to support eXist
> instead of maintaining a custom implementation? Saxon provides a clear
> "service provider" API for "mounting" its functionality onto different
> data models; there already exist implementations for DOM, JDOM, XOM etc.
> Wouldn't it be possible to just provide an implementation that uses the
> internal model of eXist?

The design of eXist's XQuery engine is actually very much inspired by
Saxon, and there are many small details where we have indirectly
re-used Saxon code (collations, date/time handling, many functions
...). I always consult Saxon before writing my own stuff and I
actually re-used another Saxon module just today (see
org.exist.xquery.util.RegexTranslator).

I thus think I know the source code of Saxon quite well. It would
certainly be possible to plug eXist's data model into the API provided
by Saxon. However, the question is what you could expect to gain from
such an integration. After all, there are huge conceptual differences
between the two: Saxon is a stream-based processor, which assumes
that the node tree is entirely in memory or stored in a way
that supports fast streaming access along the main XPath axes. When
you look at the Saxon source, you will find that most of the query
processing is indeed implemented in terms of iterators. This makes
perfect sense for in-memory processing, where the natural way to
implement XPath axis navigation is to traverse the node tree kept in
memory.
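
To illustrate what "implemented in terms of iterators" means, here is a
minimal sketch of a lazy descendant-axis iterator over an in-memory tree.
MemNode and AxisIteratorSketch are hypothetical names made up for this
example; this is not Saxon's actual API or code, only the general idea.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical in-memory node; not Saxon's NodeInfo or any real class.
final class MemNode {
    final String name;
    final List<MemNode> children = new ArrayList<>();

    MemNode(String name) { this.name = name; }

    // Appends a child and returns this node so calls can be chained.
    MemNode add(MemNode child) { children.add(child); return this; }
}

final class AxisIteratorSketch {

    // Lazily walks the descendant axis in document order using an explicit stack.
    static Iterator<MemNode> descendants(MemNode root) {
        final Deque<MemNode> stack = new ArrayDeque<>();
        pushReversed(stack, root.children);
        return new Iterator<MemNode>() {
            @Override public boolean hasNext() { return !stack.isEmpty(); }

            @Override public MemNode next() {
                if (stack.isEmpty()) throw new NoSuchElementException();
                MemNode n = stack.pop();
                pushReversed(stack, n.children); // visit children before later siblings
                return n;
            }
        };
    }

    // Pushes nodes in reverse so that the first child is popped first.
    private static void pushReversed(Deque<MemNode> stack, List<MemNode> nodes) {
        for (int i = nodes.size() - 1; i >= 0; i--) stack.push(nodes.get(i));
    }

    // Rough equivalent of count(//name): filter the descendant stream by name.
    static int countNamed(MemNode root, String name) {
        int count = 0;
        for (Iterator<MemNode> it = descendants(root); it.hasNext(); ) {
            if (it.next().name.equals(name)) count++;
        }
        return count;
    }

    public static void main(String[] args) {
        MemNode book = new MemNode("book")
            .add(new MemNode("chapter").add(new MemNode("title")).add(new MemNode("para")))
            .add(new MemNode("chapter").add(new MemNode("title")));
        System.out.println(countNamed(book, "title")); // prints 2
    }
}
```

Every call to next() touches one node. That is cheap when the tree is in
memory, but each such step could mean a page read if the nodes live on disk.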

Now, tree traversals are exactly the thing eXist tries to avoid
wherever possible. I'm not saying this is good or bad. But accessing a
node on persistent storage is much more expensive than accessing a
node in memory. Unlike a stream-based processor, XML databases
thus depend mainly on intelligent index structures for performance.
Like a relational database, they are lost without indexes.

eXist usually tries to process an XPath expression based only on the
available indexes, using structural joins between node sets instead of
tree traversals (again similar to an RDBMS). Only if no index is
available will eXist fall back to a streaming traversal of the stored
node tree.
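
As a rough illustration of a structural join, the sketch below evaluates
something like //chapter//title from two node sets alone, without walking
the tree. It assumes each node is labelled with a (start, end) interval
such that a descendant's interval lies strictly inside its ancestor's.
That labelling, and the names Region and StructuralJoinSketch, are
assumptions made for this example; they are not eXist's actual node
identifier scheme or classes.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical node label: the node spans positions start..end in the document.
final class Region {
    final int start;
    final int end;

    Region(int start, int end) { this.start = start; this.end = end; }
}

final class StructuralJoinSketch {

    // Returns every candidate descendant that lies inside some candidate ancestor.
    // Both inputs must be sorted by start, as an index lookup would typically
    // deliver them. Each list is scanned exactly once; no tree is traversed.
    static List<Region> descendantsWithAncestor(List<Region> ancestors,
                                                List<Region> descendants) {
        List<Region> result = new ArrayList<>();
        Deque<Region> open = new ArrayDeque<>(); // ancestors still "open", innermost on top
        int a = 0;
        for (Region d : descendants) {
            // Open every candidate ancestor that starts before d.
            while (a < ancestors.size() && ancestors.get(a).start < d.start) {
                Region cand = ancestors.get(a++);
                while (!open.isEmpty() && open.peek().end < cand.start) open.pop();
                open.push(cand);
            }
            // Close ancestors that ended before d starts.
            while (!open.isEmpty() && open.peek().end < d.start) open.pop();
            // Intervals nest properly, so any ancestor still open contains d.
            if (!open.isEmpty()) result.add(d);
        }
        return result;
    }

    public static void main(String[] args) {
        // book(1,14): chapter(2,7){title(3,4), para(5,6)}, appendix(8,11){title(9,10)}, title(12,13)
        List<Region> chapters = Arrays.asList(new Region(2, 7));
        List<Region> titles = Arrays.asList(new Region(3, 4), new Region(9, 10), new Region(12, 13));
        for (Region r : descendantsWithAncestor(chapters, titles)) {
            System.out.println(r.start + ".." + r.end); // prints 3..4
        }
    }
}
```

The two input lists stand in for what an element-name index might return.
The join only compares labels, which is why it never has to load the
nodes in between.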

I'm not saying it's impossible, but I can hardly see a way to effectively
combine this index-based processing with Saxon's design. However,
there's one area where I have been seriously thinking about
integrating Saxon: querying temporary XML fragments constructed in an
XQuery. eXist keeps those fragments in a structure identical to
Saxon's "tiny tree" data model. So why not use Saxon to process
queries on in-memory fragments?

Anyway, if someone wants to further explore the possibilities provided
by Saxon, I'm open to it.

> 2.) It provides a possible "bridge" directly to JAXP 1.3 for eXist,
> possibly allowing users to interact with the eXist contents simply as if
> it were just a JAXP implementation. Tools which utilize JAXP's
> XPath and the future XQJ API could then interact "transparently" with
> eXist as if it were just an implementation of these APIs.

I agree that having a bridge to JAXP 1.3 and XQJ should certainly be
on the road map for eXist, independently of this discussion.

Wolfgang
