I recently ran into an interesting problem that had a big impact on the performance of an XPath expression. I have a ~300 MB XML data set that is queried via a user-definable parameter, which is passed to a variable (let's call it $filter). $filter holds the name of the element being searched for, so to find it within the data set I use the following XPath:

let $filter-nodes := $xml-doc//*[name() = $filter]

When I tested this expression, it took up to two minutes to return a set of elements. As a test, I ran this query instead for comparison:

let $filter-node := $xml-doc//filterNode

The query now took only half a second to complete, but now I have a new problem: the XPath expression is explicitly hardcoded to find only that one node. That means that if I want to retain the functionality of having the user-defined parameter, I have to do this:

switch ($filter)
  case "filterNode1" return $xml-doc//filterNode1
  case "filterNode2" return $xml-doc//filterNode2
  case "filterNode3" return $xml-doc//filterNode3
  default return ()

Needless to say, this is cumbersome, as there are roughly 20 different cases I need to test for, but at least the script completes in less than a second. That is more than a hundredfold improvement over using the wildcard with a name() predicate.
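
For anyone who wants to reproduce the comparison, a rough timing harness might look like the sketch below. The document path and the element name are placeholders, and it assumes eXist's util:system-dateTime() function is available under the default util prefix; swap the $filter-nodes line for the hardcoded path to time the other variant.

```xquery
xquery version "1.0";

(: Placeholder inputs: this path and element name are assumptions,
   not the real data set. :)
let $xml-doc := doc("/db/data/dataset.xml")
let $filter := "filterNode"

(: Record the wall-clock time before and after the lookup. :)
let $start := util:system-dateTime()
let $filter-nodes := $xml-doc//*[name() = $filter]
let $end := util:system-dateTime()

return
  (: Subtracting two xs:dateTime values yields an xs:dayTimeDuration. :)
  <timing elapsed="{$end - $start}" count="{count($filter-nodes)}"/>
```

The count attribute is included so that both variants can be checked to return the same number of elements before their timings are compared.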

My question is, is there a better way to run this query without having to unroll it like this? Here's my eXist version information:

Running with Java 1.6.0_33 [Apple Inc. (Java HotSpot(TM) 64-Bit Server VM) in /System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Home] 
[eXist Version : 2.0-tech-preview] 
[eXist Build : 20120625] 
[SVN Revision : 0000] 
[Operating System : Mac OS X 10.7.4 x86_64] 

David Finton