From: Wolfgang M. <wol...@gm...> - 2005-05-30 19:08:04
> I'm looking over eXist's implementation of XQuery and wondering if this
> isn't something to take advantage of when it comes to its further
> implementation. Why not use Saxon's XPath/XQuery API to support eXist
> instead of maintaining a custom implementation? Saxon provides a clear
> "service provider" API for "mounting" its functionality onto different
> data models; there already exist implementations for DOM, JDOM, XOM etc.
> Wouldn't it be possible to just provide an implementation that uses the
> internal model of eXist?

The design of eXist's XQuery engine is actually very much inspired by
Saxon, and there are many small details where we have indirectly re-used
Saxon code (collations, date/time handling, many functions ...). I always
consult Saxon before writing my own stuff, and I actually re-used another
Saxon module just today (see org.exist.xquery.util.RegexTranslator). I
thus think I know the source code of Saxon quite well.

It would certainly be possible to plug eXist's data model into the API
provided by Saxon. However, the question is what you could expect to gain
from such an integration. After all, there are huge conceptual
differences between the two:

Saxon is a stream-based processor, and this somehow assumes that the node
tree is entirely in memory or stored in a way that supports fast
streaming access along the main XPath axes. When you look at the Saxon
source, you will find that most of the query processing is indeed
implemented in terms of iterators. This makes perfect sense for in-memory
processing, where the natural way to implement XPath axis navigation is
to traverse the node tree kept in memory.

Now, tree traversals are exactly the thing eXist tries to avoid wherever
possible. I'm not saying this is good or bad, but accessing a node on
persistent storage is much more expensive than accessing a node in
memory. In contrast to a stream-based processor, XML databases thus
mainly depend on intelligent index structures for performance.
Like a relational DB, they are lost without indexes. eXist usually tries
to process an XPath expression based only on the available indexes, using
structural joins between node sets instead of tree traversals (again
similar to an RDBMS). Only if no index is available will eXist fall back
to a streaming traversal of the stored node tree. I'm not saying it's
impossible, but I can hardly see a way to effectively combine this
index-based processing with Saxon's design.

However, there's one area where I have been seriously thinking about
integrating Saxon: querying temporary XML fragments constructed in an
XQuery. eXist keeps those fragments in a structure identical to Saxon's
"tiny tree" data model. So why not use Saxon to process queries on
in-memory fragments?

Anyway, if someone wants to further explore the possibilities provided by
Saxon, I'm open to it.

> 2.) It provides a possible "bridge" directly to JAXP 1.3 for eXist,
> possibly allowing users to interact with the eXist contents simply as if
> it were just a JAXP implementation. Making tools which utilize JAXP's
> XPath and future XQJ API be capable of interacting "transparently" with
> eXist as if it were just an implementation of these APIs.

I agree that having a bridge to JAXP 1.3 and XQJ should certainly be on
the road map for eXist, independent of this discussion.

Wolfgang
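
To make the contrast between tree traversal and structural joins more
concrete, here is a minimal Java sketch of an index-style join for a
query such as //section//title. It is not eXist's actual code. The Node
class, the (start, end) interval labels and the descendantsOf method are
all assumptions made for illustration, and a real index would use its
own node numbering scheme. The only point is that the two node sets are
merged by comparing labels, without ever walking the stored tree.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

/** Illustration only; none of these names come from eXist or Saxon. */
final class StructuralJoinSketch {

    /** Hypothetical index entry: a node labelled with its document-order interval. */
    static final class Node {
        final int start; // position of the opening tag
        final int end;   // position of the closing tag

        Node(int start, int end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return "(" + start + "," + end + ")";
        }
    }

    /**
     * Returns the candidates that lie inside at least one of the given ancestors.
     * Both lists must be sorted by start and come from the same document,
     * so that any two intervals are either nested or disjoint.
     */
    static List<Node> descendantsOf(List<Node> ancestors, List<Node> candidates) {
        List<Node> result = new ArrayList<>();
        Deque<Node> open = new ArrayDeque<>(); // ancestors that may still contain d
        int next = 0;
        for (Node d : candidates) {
            // Open every ancestor that starts before the current candidate.
            while (next < ancestors.size() && ancestors.get(next).start < d.start) {
                Node a = ancestors.get(next++);
                while (!open.isEmpty() && open.peek().end < a.start) {
                    open.pop(); // this ancestor closed before a started
                }
                open.push(a);
            }
            // Drop ancestors that closed before the candidate started.
            while (!open.isEmpty() && open.peek().end < d.start) {
                open.pop();
            }
            if (!open.isEmpty()) {
                result.add(d); // the innermost open ancestor contains d
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Two <section> elements and three <title> elements, as index lookups
        // by element name might return them (assumed data).
        List<Node> sections = Arrays.asList(new Node(2, 5), new Node(6, 9));
        List<Node> titles = Arrays.asList(new Node(3, 4), new Node(7, 8), new Node(10, 11));
        // The title at (10,11) is not inside any section, so it is dropped.
        System.out.println(descendantsOf(sections, titles));
    }
}
```

Each list is read once, front to back, and no child or sibling pointers
are followed, which is the property that matters when the nodes live on
disk rather than in memory.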