Hi list,

I recently stumbled upon a surprising optimization that I want to share.

The application I am working on processes several types of XML documents. These document types share a similar structure, but have different root element names. They look a bit like this:

<XXXX>        <!-- Type-specific root element name -->
  <Header/>   <!-- Generic header -->
  <Body/>     <!-- Type-specific body -->
</XXXX>

Because not all root element names are known beforehand, sometimes the following construct is used:

for $doc in collection("/db/data")
return $doc/element()/Header/DocumentId/string() (: Or something more interesting than string() :)

This is pretty slow, even with an index on DocumentId and query rewriting enabled: with 5000 documents, eXist (1.2.5dev-r8333) takes about 20 seconds to return a result. In this case, the element() path step can be replaced by element()[1], since a document node has at most one child element. This improves performance dramatically: query time drops to 1 second.
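
Spelled out, the element()[1] variant of the query above would look like this (a sketch that reuses the same collection path and element names; nothing else is changed):

(: element()[1] selects the first child element of the document node, which is its only one :)
for $doc in collection("/db/data")
return $doc/element()[1]/Header/DocumentId/string()

The two forms are interchangeable only because each $doc is a document node. On an ordinary element with several child elements, element()[1] would select just the first child and the results would differ.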

I hope this helps someone who is trying to optimize their XQueries. It would be nice, though, if eXist performed this optimization automatically.

With kind regards,
Pieter Deelen