(A) I think that Nux basically preprocesses the large file, splitting it into lots of small documents which it hands to Saxon one at a time. This is quite similar to the "streaming copy" facility that Saxon provides in XSLT, and which will be available in the next release for XQuery too.
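
To illustrate the general idea of splitting a flat file into small pieces, here is a minimal sketch using only the StAX API that ships with the JDK. It is not Nux's actual implementation or API: the class name FlatSplitter, the method split, and the assumption that every child of the root element is one independent record are all hypothetical choices made for the example. Each record is serialized to a small string and passed to a callback, which is where a query processor could be invoked on it.

```java
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.util.function.Consumer;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical helper: streams through a flat XML file and hands each
// child of the root element to a callback as a small serialized fragment.
class FlatSplitter {

    static void split(Reader input, Consumer<String> handler) throws XMLStreamException {
        XMLEventReader reader = XMLInputFactory.newInstance().createXMLEventReader(input);
        XMLOutputFactory outputFactory = XMLOutputFactory.newInstance();

        int depth = 0;                 // 1 = root element, 2 = one record
        StringWriter buffer = null;    // holds the record currently being copied
        XMLEventWriter writer = null;

        while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();

            if (event.isStartElement()) {
                depth++;
                if (depth == 2) {
                    // A new record starts: begin a fresh, small buffer.
                    buffer = new StringWriter();
                    writer = outputFactory.createXMLEventWriter(buffer);
                }
            }

            // Copy events only while we are inside a record.
            if (writer != null) {
                writer.add(event);
            }

            if (event.isEndElement()) {
                if (depth == 2) {
                    // The record is complete: hand it over and drop the buffer,
                    // so memory use is bounded by the largest single record.
                    writer.flush();
                    writer.close();
                    handler.accept(buffer.toString());
                    writer = null;
                    buffer = null;
                }
                depth--;
            }
        }
        reader.close();
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<trace><rec id=\"1\">a</rec><rec id=\"2\">b</rec></trace>";
        // In a real application the callback would run the query on each fragment.
        split(new StringReader(xml), fragment -> System.out.println(fragment));
    }
}
```

The point of the design is that only one record is held in memory at a time, which is why this style of processing can cope with files far larger than the heap, provided the query never needs to look across records.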
 
(B) With statements like "Discarding Nux or Saxon (but keeping xquery)" you seem to be a little confused about the components of the system. XQuery is just a language; you need something to process it, and that's being done by Saxon (optionally with a bit of help from Nux). And of course it's these components that are doing the work and therefore taking the time.
 
There may be ways of speeding your query up, but it's hard to tell without seeing it! One thing to experiment with is the -pull option. For many queries it seems to make little difference, but for some it can be quite significant.
 
Incidentally, running a 4-second query ten times is not enough to get the Java VM up to full speed, so don't assume that you can extrapolate from these figures.
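
As a rough sketch of what a fairer measurement might look like, the following runs the task a number of times untimed before starting the clock, so that JIT compilation has had a chance to happen. The class name WarmupTimer, the method meanMillis, and the run counts in the demo are assumptions made for illustration; the right number of warm-up runs depends on the workload and has to be found by experiment.

```java
import java.util.function.Supplier;

// Hypothetical timing helper: discards warm-up runs, then averages the rest.
class WarmupTimer {

    // Written to on every run so the JIT cannot discard the task's result.
    static volatile Object sink;

    static <T> double meanMillis(Supplier<T> task, int warmupRuns, int measuredRuns) {
        // Untimed runs: give the VM time to compile the hot code paths.
        for (int i = 0; i < warmupRuns; i++) {
            sink = task.get();
        }

        // Timed runs: only these contribute to the reported mean.
        long start = System.nanoTime();
        for (int i = 0; i < measuredRuns; i++) {
            sink = task.get();
        }
        long elapsedNanos = System.nanoTime() - start;

        return elapsedNanos / 1_000_000.0 / measuredRuns;
    }

    public static void main(String[] args) {
        // Stand-in workload; in practice this would execute the query.
        Supplier<String> task = () -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 10_000; i++) {
                sb.append(i);
            }
            return sb.toString();
        };
        double mean = meanMillis(task, 50, 20);
        System.out.println("mean ms per run after warm-up: " + mean);
    }
}
```

If the mean keeps falling as the warm-up count is increased, the VM has not yet reached a steady state and the figures should not be extrapolated.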
 
Michael Kay
http://www.saxonica.com/
 


From: saxon-help-bounces@lists.sourceforge.net [mailto:saxon-help-bounces@lists.sourceforge.net] On Behalf Of Ryad Ben-El-Kezadri
Sent: 19 March 2008 15:29
To: saxon-help@lists.sourceforge.net
Subject: Re: [saxon] Problem in parsing not so large XML files with Saxon/XQJ

A) About the file size

I agree that if I allocate more memory to the JVM,
Saxon will work with my 16MB file.

But 16MB is just a test before
using trace files of 1GB and more.

As my XML trace files are completely flat,
I was wondering why Nux + Saxon can process these files
and XQJ+ Saxon cannot.

Tell me if I'm wrong, but I have the feeling that
Saxon (when used with Nux) does not work directly over the whole file
but just over a part that has been extracted by a SAX/StAX parser.

B) About speed

Concerning the speed issue for processing the 16MB file, my mean measurements over 10 XQuery runs are:
a) A simple Xerces parser with equivalent XQuery code: 1 sec per query
b) Nux/Saxon with StAX with pure XQuery: 2.5 sec per query
c) Nux/Saxon with SAX with pure XQuery: 4 sec per query

I believe that the 1.5 sec overhead of b) compared to a) is not too bad (good, indeed),
but I hope to gain another 0.7 sec with optimization.

A quick profiling of the code shows that the time-consuming parts
are the wrappers (Nux/Saxon) over the StAX parser.

Is smart tuning of Saxon/Nux through get/set functions possible?
If not, is there a direction (even a slightly more complex one) to speed up my application:
Discarding Nux or Saxon (but keeping xquery) or using something else?

Thanks for your time
Ryad