Cache and reuse XQuery results in Java

ailli
2010-05-05
2012-10-08
  • ailli
    2010-05-05

    I have read a lot of threads about this topic, but I still haven't found a
    suitable solution.

    My setup:
    -Saxon 9.2
    -Java XQJ

    What I want to do is the following:
    1.) Parse multiple XML files into memory while applying some filter logic and
    cache the results in memory for performance reasons.
    2.) Merge these caches from 1.) into one cache, again for performance reasons.
    3.) Execute multiple queries on the merged cache from 2.)
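
    A rough sketch of the three steps above, using only the DOM and XPath classes
    that ship with the JDK as a stand-in rather than XQJ or Saxon. The class and
    method names, the "cache" root element, and the sample data are all made up
    for illustration; the filter here is just an XPath expression.

    ```java
    import java.io.StringReader;
    import java.util.Arrays;
    import java.util.List;
    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.xpath.XPathConstants;
    import javax.xml.xpath.XPathFactory;
    import org.w3c.dom.Document;
    import org.w3c.dom.Element;
    import org.w3c.dom.NodeList;
    import org.xml.sax.InputSource;

    // Hypothetical helper: not part of XQJ, Saxon, or the original setup.
    class MergedCache {

        // Evaluates an XPath expression and returns the matching nodes.
        static NodeList query(Document doc, String xpath) throws Exception {
            return (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(xpath, doc, XPathConstants.NODESET);
        }

        // Steps 1 and 2: parse each source, keep only the nodes selected by
        // filterXPath, and copy them under a single <cache> root element.
        static Document buildMergedCache(List<String> xmlSources, String filterXPath)
                throws Exception {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document merged = builder.newDocument();
            Element root = merged.createElement("cache");
            merged.appendChild(root);
            for (String xml : xmlSources) {
                Document doc = builder.parse(new InputSource(new StringReader(xml)));
                NodeList hits = query(doc, filterXPath);
                for (int i = 0; i < hits.getLength(); i++) {
                    // importNode makes a deep copy owned by the merged document.
                    root.appendChild(merged.importNode(hits.item(i), true));
                }
            }
            return merged;
        }

        public static void main(String[] args) throws Exception {
            List<String> sources = Arrays.asList(
                "<items><item id='1' keep='yes'>a</item><item id='2' keep='no'>b</item></items>",
                "<items><item id='3' keep='yes'>c</item></items>");
            Document cache = buildMergedCache(sources, "/items/item[@keep='yes']");

            // Step 3: run several queries against the same merged tree.
            System.out.println(query(cache, "/cache/item").getLength());
            System.out.println(query(cache, "/cache/item[@id='3']").item(0).getTextContent());
        }
    }
    ```

    The merged Document is an ordinary in-memory tree, so querying it does not
    use it up. Note that JDK XPath objects are documented as not thread-safe, and
    DOM trees make no thread-safety promise either, so a web application would
    need its own synchronization around a shared cache.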

    For now I have found a solution, which has only one major drawback. The
    problem is that I store an XQResultSequence or XMLStreamReader object
    internally in step 2.), and so far I haven't found a way to reuse or
    deep-copy these objects after every query. From what I have learnt so far,
    it is not possible to deep-copy the tree structure.
    So I repeat step 2.) for every request to get a fresh
    XQResultSequence/XMLStreamReader for every query. This is rather unfortunate
    and inefficient.

    Is there a way to either
    -rewind/reuse an XQResultSequence/XMLStreamReader returned by Saxon, or
    -cast an XQResultSequence/XMLStreamReader into a type that is not consumed after first use (i.e. Document), or
    -do something else that I haven't thought of yet?

    Basically: What would be the right way to get an in memory representation of
    the results of an XQuery statement in Java?
    Thank you very much; this has kept me busy for days.
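
    To make the "consumed after first use" problem concrete, here is a minimal
    sketch that uses only JDK classes, not XQJ or Saxon. It shows that a plain
    XMLStreamReader can be walked only once, and that draining a fresh reader
    into a DOM Document through an identity transform gives a tree that can be
    queried repeatedly. The class name, helper names, and sample XML are
    assumptions made for this example.

    ```java
    import java.io.StringReader;
    import javax.xml.stream.XMLInputFactory;
    import javax.xml.stream.XMLStreamConstants;
    import javax.xml.stream.XMLStreamReader;
    import javax.xml.transform.TransformerFactory;
    import javax.xml.transform.dom.DOMResult;
    import javax.xml.transform.stax.StAXSource;
    import javax.xml.xpath.XPathConstants;
    import javax.xml.xpath.XPathFactory;
    import org.w3c.dom.Document;
    import org.w3c.dom.NodeList;

    // Hypothetical demo class; none of these helpers exist in XQJ or Saxon.
    class MaterializeStream {

        static XMLStreamReader open(String xml) throws Exception {
            return XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
        }

        // Walks the reader to its end and counts the start tags it passes.
        static int countElements(XMLStreamReader reader) throws Exception {
            int n = 0;
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    n++;
                }
            }
            return n;
        }

        // Copies the stream into a DOM tree with an identity transform.
        // The reader is used up by this call; the returned tree is not.
        static Document toDocument(XMLStreamReader reader) throws Exception {
            DOMResult result = new DOMResult();
            TransformerFactory.newInstance().newTransformer()
                    .transform(new StAXSource(reader), result);
            return (Document) result.getNode();
        }

        static int countMatches(Document doc, String xpath) throws Exception {
            NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(xpath, doc, XPathConstants.NODESET);
            return hits.getLength();
        }

        public static void main(String[] args) throws Exception {
            String xml = "<r><a/><a/><b/></r>";

            XMLStreamReader once = open(xml);
            System.out.println(countElements(once)); // first pass sees the elements
            System.out.println(countElements(once)); // second pass finds nothing left

            Document doc = toDocument(open(xml));
            System.out.println(countMatches(doc, "/r/a")); // first query
            System.out.println(countMatches(doc, "/r/b")); // second query, same tree
            System.out.println(countMatches(doc, "/r/a")); // repeating a query still works
        }
    }
    ```

    The trade-off is memory: the whole result is held as a tree instead of being
    streamed, which is usually the point of a cache anyway.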

     
  • Michael Kay
    2010-05-05

    Well, my advice would be to use the s9api interface rather than XQJ. I
    designed the s9api interface in large measure because of my frustration that
    this kind of thing was too difficult in XQJ. The stupid thing is that Saxon is
    going out of its way to enforce silly restrictions in interfaces like
    XQResultSequence, where the restrictions make both your life as a user and
    mine as an implementor more difficult.

    I suspect you won't find an adequate solution without departing from pure XQJ
    interfaces. You can probably do it within an XQJ framework (get the XQItem
    from an XQResultSequence, cast it to SaxonXQItem, extract the underlying
    net.sf.saxon.om.Item, etc), but once you depart from pure XQJ code, you lose
    all the benefits you might have gained from portability, so you might as well
    use s9api interfaces to start with.

     
  • ailli
    2010-05-05

    Actually, I was afraid you'd answer that. But since it is an in-house
    application that is not widely distributed, I'll write an XQuery helper using
    the s9api. This will let me get rid of XQJ while still having a single point
    of change in my code if the processor ever changes.

    Thank you for your quick reply, everything in the scope of Saxon seems to be
    really fast. ;)

     
  • ailli
    2010-05-05

    While working on the migration from XQJ to s9api I found that you have the
    handy function compileModule().
    May I ask what performance increase I might expect from compiling my XQuery
    modules compared to importing them (by providing the location) in my query
    statement? Is it worth buying the Saxon-PE version because of that feature?

    One thing I am not 100% certain about is what type to use for caching XML
    documents and results from my queries in my Java application. Is it
    appropriate to use Document as shown in one of the samples or is there a more
    efficient object type that should be used? It seems that Document involves a
    lot of overhead when binding into a new query.

    Thank you so much for your information.

     
  • Michael Kay
    2010-05-05

    compileModule() is only useful to you if the same module is used (imported) by
    more than one query; it saves the work of recompiling the module each time it
    is imported, and it allows the memory occupied by the module to be shared.

    If you're using s9api, you should use the DocumentBuilder to build documents
    in memory; the resulting type is an XdmNode. Similarly, a query returns a
    sequence of XdmItem objects, which can be cast to XdmNode if they are nodes.

     
  • ailli
    2010-05-05

    Thanks a lot!
    Using the native Saxon API, the response times of my web application decreased
    by a factor of about 10, and I have not even started optimizing yet. Pretty
    cool!

    Thank you so much for your help.