(A) I think that Nux basically preprocesses the large file to split it up
into lots of small files which it hands to Saxon one at a time. This is
quite similar to the "streaming copy" facility that Saxon provides in XSLT,
and which will be available in the next release for XQuery too.
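
To make the splitting idea concrete, here is a minimal sketch using only the JDK's StAX API. It is not Nux's or Saxon's actual code: the class name RecordSplitter, the handler callback, and the assumption that every child of the root element is one independent record are all hypothetical, chosen to match a flat trace file.

```java
import java.io.Reader;
import java.io.StringWriter;
import java.util.function.Consumer;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical helper: streams through a flat document and hands each
// child of the root element to a callback as a small standalone fragment.
final class RecordSplitter {

    static void split(Reader in, Consumer<String> handler) throws XMLStreamException {
        XMLEventReader reader = XMLInputFactory.newInstance().createXMLEventReader(in);
        XMLOutputFactory outFactory = XMLOutputFactory.newInstance();
        int depth = 0;                 // 1 = root element, 2 = one record
        StringWriter buffer = null;    // holds the record currently being copied
        XMLEventWriter writer = null;

        while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();
            if (event.isStartElement()) {
                depth++;
                if (depth == 2) {      // a new record starts here
                    buffer = new StringWriter();
                    writer = outFactory.createXMLEventWriter(buffer);
                }
            }
            if (writer != null) {
                writer.add(event);     // copy events only while inside a record
            }
            if (event.isEndElement()) {
                if (depth == 2) {      // the record is complete: hand it over
                    writer.flush();
                    writer.close();
                    handler.accept(buffer.toString());
                    writer = null;
                    buffer = null;
                }
                depth--;
            }
        }
        reader.close();
    }
}
```

Only one record is held in memory at a time, so memory use depends on the size of the largest record rather than on the size of the file. In a real setup the handler would pass each fragment to the query processor instead of just receiving a string.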
(B) With statements like "Discarding Nux or Saxon (but keeping xquery)" you
seem to be a little bit confused about the components of the system. XQuery
is just a language; you need something to process it, and that's being done
by Saxon (optionally with a bit of help from Nux). And of course it's these
components that are doing the work and therefore taking the time.
There may be ways of speeding your query up, but it's hard to tell without
seeing it! One thing to experiment with is the -pull option. For many
queries that seems to make little difference, but for some it can be quite
Incidentally, running a 4-second query ten times is not enough to get the
Java VM up to full speed, so don't assume that you can extrapolate these
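
To illustrate the warm-up point, here is a minimal timing sketch that runs the task a number of times untimed before measuring. The class name WarmupTimer and the run counts are hypothetical; how many warm-up runs are enough depends on the workload and the JVM, so treat the numbers as something to experiment with rather than a recommendation.

```java
import java.util.function.Supplier;

// Hypothetical helper: times a task only after a number of untimed warm-up runs.
final class WarmupTimer {

    static <T> double meanMillis(Supplier<T> task, int warmupRuns, int measuredRuns) {
        if (measuredRuns <= 0) {
            throw new IllegalArgumentException("measuredRuns must be positive");
        }
        Object sink = null;            // keep results reachable so the work is not skipped
        for (int i = 0; i < warmupRuns; i++) {
            sink = task.get();         // untimed: gives the JIT compiler a chance to kick in
        }
        long start = System.nanoTime();
        for (int i = 0; i < measuredRuns; i++) {
            sink = task.get();         // timed runs
        }
        long elapsedNanos = System.nanoTime() - start;
        if (sink == null) {
            // nothing to do; the check only exists so that 'sink' is actually used
        }
        return elapsedNanos / 1_000_000.0 / measuredRuns;
    }
}
```

Usage would look like `WarmupTimer.meanMillis(() -> runQuery(), 50, 10)`, where `runQuery()` stands for whatever code executes the query once.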
[mailto:saxon-help-bounces@...] On Behalf Of Ryad
Sent: 19 March 2008 15:29
Subject: Re: [saxon] Problem in parsing not so large XML files with
A) About the file size
I agree that if I allocate more memory to the JVM,
Saxon will work with my 16MB file.
But 16MB is just a test before
moving on to trace files of 1GB and more.
As my XML trace files are completely flat,
I was wondering why Nux + Saxon can process these files
and XQJ + Saxon cannot.
Tell me if I'm wrong, but I have the feeling that
Saxon (when used with Nux) does not work directly on the whole file,
but only on a part that has been extracted by a SAX/StAX parser.
Concerning the speed issue for processing the 16MB file, my mean measurements
over 10 XQuery runs are:
a) A simple Xerces parser with code equivalent to the XQuery: 1 sec per query
b) Nux/Saxon with StAX and pure XQuery: 2.5 sec per query
c) Nux/Saxon with SAX and pure XQuery: 4 sec per query
I believe that the 1.5 sec overhead of b) compared to a) is not too bad (good
but I hope to win another 0.7 sec with optimization.
A quick profiling of the code shows that the time-consuming parts
are the wrappers (Nux/Saxon) around the StAX parser.
Is smart tuning of Saxon/Nux through get/set functions possible?
If not, is there a direction (even a little bit more complex) to speed up my
Discarding Nux or Saxon (but keeping xquery) or using something else?
Thanks for your time