Finally managed to achieve the desired behavior. In contrast to yesterday’s

code snippet, the solution involves

 

-       invoking saxon:stream

-       isolating result processing into a separate path expression

 

i.e.

 

      XQueryExecutable xqueryExecutable = xqueryCompiler.compile(

            "for $x in saxon:stream(doc('" + input + "')/*/*) " +

            "return string($x)");

      XQueryEvaluator query = xqueryExecutable.load();

      return query.iterator();

 

The time to the initial results is now independent of the document size:

 

      t(1): 0 msec

      t(64K): 718 msec

      t(492K): 5648 msec
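
 

For reference, a minimal sketch of how figures like these could be collected. It assumes that t(n) means the elapsed wall-clock time until the n-th result has been pulled, with the clock started before the iterator is created so that setup cost is included. PullTimer and timeCheckpoints are hypothetical names, not part of Saxon, and the supplier in main is a stand-in for the code that returns query.iterator().

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;
import java.util.stream.IntStream;

// Hypothetical helper, not part of Saxon: measures time to the n-th pulled result.
final class PullTimer {

    /**
     * Creates the iterator via the supplier, pulls results from it, and records
     * the elapsed milliseconds each time a checkpoint count has been reached.
     * Checkpoints beyond the end of the iterator are not recorded.
     */
    public static <T> Map<Long, Long> timeCheckpoints(Supplier<Iterator<T>> source,
                                                      long... checkpoints) {
        long[] sorted = checkpoints.clone();
        Arrays.sort(sorted);

        Map<Long, Long> elapsedMsec = new LinkedHashMap<>();
        long start = System.nanoTime();          // clock starts before any setup
        Iterator<T> results = source.get();      // setup cost counts towards t(0)
        long pulled = 0;

        for (long checkpoint : sorted) {
            while (pulled < checkpoint && results.hasNext()) {
                results.next();
                pulled++;
            }
            if (pulled < checkpoint) {
                break;                           // iterator exhausted early
            }
            elapsedMsec.put(checkpoint, (System.nanoTime() - start) / 1_000_000L);
        }
        return elapsedMsec;
    }

    public static void main(String[] args) {
        // Stand-in iterator; in the real test this would return query.iterator().
        Map<Long, Long> timings = PullTimer.<String>timeCheckpoints(
                () -> IntStream.range(0, 100_000).mapToObj(i -> "item" + i).iterator(),
                0L, 1L, 65_536L);
        timings.forEach((n, msec) ->
                System.out.println("t(" + n + "): " + msec + " msec"));
    }
}
```

With a truly streaming iterator, t(0) and t(1) from such a harness should stay small however large the input is, while later checkpoints grow with the number of results pulled.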

 

Thanks for making this work.

Gunther

 

 

From: Rademacher, Gunther [mailto:Gunther.Rademacher@softwareag.com]
Sent: Tuesday, 18 February 2014 16:20
To: saxon-help@lists.sourceforge.net
Subject: Re: [saxon] Question on streaming via s9api

 

Just noticed that my previous post contains both a doc() function reference

and the setSource invocation, of which of course only one is necessary. In

fact I tried either, but could not get initial results in time independent

of the document size.

 

Sorry for the confusion.

 

 

_____________________________________________
From: Rademacher, Gunther
Sent: Tuesday, 18 February 2014 15:57
To: 'saxon-help@lists.sourceforge.net'
Subject: Question on streaming via s9api

 

 

I am trying to process a large XML document from Java, such that the

application can pull results in linear time, i.e. independent of the

document size.

 

My naive attempt was to use the iterator from XQueryEvaluator like this:

 

      XQueryExecutable xqueryExecutable = xqueryCompiler.compile("doc('" + input + "')/*/*/string()");

      XQueryEvaluator query = xqueryExecutable.load();

      query.setSource(new StreamSource(new FileInputStream(new File(new URI(input)))));

      return query.iterator();

 

However, a fair amount of time (depending on document size) is spent when

setSource is called, and most of the remaining time goes into fetching

the first result:

 

      t(0): 3885 msec

      t(1): 8628 msec

      t(65536): 8659 msec

      t(503524): 9814 msec

 

I also tried saxon:stream, and query.run with a SAXDestination, but without success.

 

Can this be made to work similarly to the above code fragment? Am I possibly

missing something obvious?

 

I am doing this on Saxon-EE 9.5.1.4J.

 

Thanks

Gunther

 

 
