>Most of the static analysis that it requires is already done by Saxon (at least in Saxon-SA), though not all.

>What do you mean by this statement? Do you mean it should have the entire list of elements (required for the XQuery from the input XML documents)?

No, I mean that it's not dissimilar to the analysis that is done by Saxon-SA to detect invalid path expressions. For example, if you write

declare variable $doc as document-node(schema-element(purchase-order)) external;
for $x in $doc/purchase-order/a/b[c]
where $x/x/y/z eq 2
return $x/i/j/k

then Saxon is going to give you an error message if the schema doesn't allow a purchase-order/a/b/i/j/k element, or if the type of purchase-order/a/b/x/y/z is non-numeric. Of course the analysis isn't complete, and it does rely on the types of function parameters (but not local variables such as $x above) being declared. This analysis can tell you that elements a, b, c, x, y, z, i, j, and k are referenced. What it can't do without further analysis is tell you that elements p, q, and r *aren't* referenced (and can therefore be pruned from the source tree). To do that, you need to look at whether the query contains other paths, for example "//*" or preceding-sibling::x, which the current type analysis ignores.
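
To make the distinction concrete, here is a rough Python sketch of the two questions involved: which element names a set of paths mentions, and whether any path uses a step that could reach elements it doesn't name. It is not how Saxon does it. The function names, the string-based matching, and the list of "unbounded" steps are all assumptions made for illustration; a real implementation would work on the compiled expression tree and the schema, and this toy version would miscount function names or keywords inside predicates.

```python
import re

# Steps that can reach elements not named in the path itself.
# Illustrative assumption only, not Saxon's actual rule set.
UNBOUNDED_STEPS = ("//", "..", "parent::", "ancestor::",
                   "preceding-sibling::", "following-sibling::",
                   "preceding::", "following::")

def referenced_names(path):
    """Return the element names mentioned in a simple relative path."""
    # Strip axis prefixes such as "preceding-sibling::" so that only
    # the name tests remain.
    without_axes = re.sub(r"[A-Za-z-]+::", "", path)
    return set(re.findall(r"[A-Za-z_][\w.-]*", without_axes))

def defeats_pruning(path):
    """True if the path uses a step that may touch unnamed elements."""
    return any(step in path for step in UNBOUNDED_STEPS)

def analyse(paths):
    """Return (names referenced, whether unnamed elements may be pruned)."""
    names = set()
    for p in paths:
        names |= referenced_names(p)
    safe_to_prune = not any(defeats_pruning(p) for p in paths)
    return names, safe_to_prune

if __name__ == "__main__":
    # The relative paths used in the query above.
    print(analyse(["a/b[c]", "x/y/z", "i/j/k"]))
    # Adding "//*" means p, q and r can no longer be assumed unused.
    print(analyse(["a/b[c]", "x/y/z", "i/j/k", "//*"]))
```

The first half (collecting names) corresponds to what the existing type analysis can already report; the second half (checking for steps like "//*") is the extra work needed before anything can safely be pruned.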

Ideally the analysis done for document projection should be on a per-document basis. If there are multiple documents, you want to do the analysis separately for each one. For example, you want to distinguish paths that are used on a source document from paths that are used on a temporary tree constructed within the query. That would require a lot more dataflow analysis than Saxon currently attempts.

>What are your plans to implement Document Projection in your future versions of Saxon?

I don't do plans. That's how I keep the project responsive. Perhaps I'll start with a manually-driven approach where the user tells Saxon which paths in the source document to preserve, and then try to automate it in a second phase. The manual approach will almost certainly be needed for most XSLT applications, since it's very hard to do static analysis of apply-templates, even with a schema.
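
As a rough idea of what the manually-driven approach might look like, here is a Python sketch using the standard xml.etree.ElementTree module. It is not a Saxon API: the project function, the absolute-path notation for the paths to preserve, and the rule that a preserved element keeps its whole subtree are all assumptions chosen for illustration. It also ignores namespaces and works on an in-memory tree, whereas real projection would ideally filter the document as it is being built.

```python
import xml.etree.ElementTree as ET

def project(elem, keep_paths, prefix=""):
    """Prune elem in place, keeping only the paths the user asked for.

    keep_paths is a set of absolute paths such as "/purchase-order/a/b/c".
    An element on a kept path keeps its whole subtree; its ancestors are
    kept as a skeleton; everything else is removed.
    """
    path = f"{prefix}/{elem.tag}"
    for child in list(elem):  # copy the child list, since we remove while iterating
        child_path = f"{path}/{child.tag}"
        if child_path in keep_paths:
            continue  # explicitly preserved: keep the entire subtree
        if any(k.startswith(child_path + "/") for k in keep_paths):
            project(child, keep_paths, path)  # ancestor of a kept path
        else:
            elem.remove(child)  # not needed by any kept path

if __name__ == "__main__":
    source = ("<purchase-order><a><b><c/><i><j><k>1</k></j></i><p/></b>"
              "<q/></a><r/></purchase-order>")
    root = ET.fromstring(source)
    project(root, {"/purchase-order/a/b/c", "/purchase-order/a/b/i/j/k"})
    print([e.tag for e in root.iter()])
```

An automated second phase would then amount to computing the set of preserved paths from the query rather than asking the user for it.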

Michael Kay

http://www.saxonica.com/