I managed to get it working with s9api on Java 1.6.0. It was superbly fast, which was great, but I do need to get it working on 1.4.2 at the moment. To that end I tried to change my code to use the sxpath classes. This seemed like it might work, but then I couldn't figure out how to get the XPath to return a nodeset; I could only get it to give me a string.
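
For comparison, here is a minimal sketch of how the standard JAXP XPath API (javax.xml.xpath) returns a node set rather than a string. It assumes Java 5 or later, so it does not address the 1.4.2 constraint directly, and it says nothing about Saxon's sxpath classes. The class name, helper names, and sample XML are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathNodeSetSketch {

    // Parse an XML string into a DOM Document (stand-in for the real input file).
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            return dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse input", e);
        }
    }

    // Passing XPathConstants.NODESET requests a NodeList;
    // the two-argument evaluate() overload returns a String instead.
    static NodeList selectNodes(Document doc, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("bad XPath: " + expression, e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<forms><form id='a'/><form id='b'/></forms>");
        NodeList forms = selectNodes(doc, "/forms/form");
        System.out.println(forms.getLength() + " nodes selected");
    }
}
```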

I then tried changing back to my original code using jaxen, but trying not to load the document twice. Here's the gist of the original code again:
       
    System.setProperty("javax.xml.transform.TransformerFactory",
                       "net.sf.saxon.TransformerFactoryImpl");
    System.setProperty("javax.xml.parsers.DocumentBuilderFactory",
                       "net.sf.saxon.dom.DocumentBuilderFactoryImpl");

    Source xslSource = new StreamSource(oXSLFile);
    Source xmlSource = new StreamSource(oXMLFile);

    TransformerFactory oTransFactory = TransformerFactory.newInstance();
    Transformer oTransformer = oTransFactory.newTransformer(xslSource);

    Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(oXMLFile);

    for (however many times)
    {
        org.jaxen.dom.XPath oPath = new org.jaxen.dom.XPath(sElementPath);
        oPath.setNamespaceContext(
            new ETFTransform.TransformNamespaceResolver(oTransform.getODMNamespaceURI()));
        oTransformer.setParameter(VISTransform.SELECTED_ELEMENTS, oPath.selectNodes(doc));
        oTransformer.transform(xmlSource, new StreamResult(new FileOutputStream(oOutput)));
    }


So I have a tree for "doc" - is this a TinyTree? If not, how do I get it to use TinyTree? I tried using doc as the input to transform() to avoid reparsing the input but it won't let me pass a Document, it needs a Source. Can I get it to use the tree I already have somehow?
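
On the narrow point of transform() needing a Source: in plain JAXP, javax.xml.transform.dom.DOMSource wraps an existing Document as a Source. The following is only a sketch of that mechanism using JDK classes. The class name, stylesheet, and sample XML are hypothetical, and it does not settle whether the tree behind the Document is a TinyTree, which depends on the DocumentBuilderFactory in use.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class DomSourceSketch {

    // Hypothetical stylesheet: just reports how many <form> elements it sees.
    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/'><xsl:value-of select='count(//form)'/></xsl:template>"
      + "</xsl:stylesheet>";

    // Parse the input once, then run `times` transformations over the same tree.
    static String[] transformRepeatedly(String xml, int times) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            Templates templates = TransformerFactory.newInstance()
                    .newTemplates(new StreamSource(new StringReader(XSL)));

            String[] results = new String[times];
            for (int i = 0; i < times; i++) {
                StringWriter out = new StringWriter();
                // DOMSource adapts the already-built Document to the Source
                // interface, so transform() does not re-read the input.
                templates.newTransformer()
                         .transform(new DOMSource(doc), new StreamResult(out));
                results[i] = out.toString();
            }
            return results;
        } catch (Exception e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        for (String r : transformRepeatedly("<forms><form/><form/></forms>", 3)) {
            System.out.println(r);
        }
    }
}
```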

I don't know why I'm finding this stuff so hard, but it's very confusing trying to work out what's going on and which classes are actually being used.

Cheers,
Kevin



From: "Michael Kay" <mike@saxonica.com>
To: "'Mailing list for the SAXON XSLT and XQuery processor'" <saxon-help@lists.sourceforge.net>
Date: 05/12/2008 08:47
Subject: Re: [saxon] Passing NodeSet parameter from Java





Transformer trans = factory.newTransformer(source);
 
is just a shorthand for
 
Templates templ = factory.newTemplates(source);
Transformer trans = templ.newTransformer();
 
I usually use the longer form, but it makes no difference unless you want to use the stylesheet more than once.
 
You're parsing and building the XML document here twice, which is very wasteful; it can't be a good idea to load two separate XPath engines either. Unfortunately the JAXP interfaces for doing this kind of thing aren't especially well coordinated, and unless you really need to keep yourself processor-independent, I would do it with s9api instead. It would then be:
 
Processor proc = new Processor(false);   // false = not schema-aware
DocumentBuilder builder = proc.newDocumentBuilder();
XdmNode doc = builder.build(xmlSource);   // build the source tree once
 
XsltCompiler compiler = proc.newXsltCompiler();
XsltExecutable exec = compiler.compile(xslSource);   // compile the stylesheet once
 
XPathCompiler xpath = proc.newXPathCompiler();
 
for (...each instance...) {
    XPathSelector selector = xpath.compile(sElementPath).load();
    selector.setContextItem(doc);
    XdmValue nodes = selector.evaluate();
 
    XsltTransformer tr = exec.load();     // new transformer per run
    tr.setInitialContextNode(doc);
    tr.setParameter(parameterName, nodes);
    tr.setDestination(....)
    tr.transform();
}
 
That's from memory, I might have got some of the names wrong, but it gives you the general flavour.
 
Michael Kay
http://www.saxonica.com/
 


From: Kevin Burges [mailto:KevinBurges@formedix.com]
Sent: 05 December 2008 01:55
To: Mailing list for the SAXON XSLT and XQuery processor
Subject: Re: [saxon] Passing NodeSet parameter from Java



I'm so glad! Yes I was reusing the same transformer. Here's the gist of what I was doing:


    Source xslSource = new StreamSource(oXSLFile);
    Source xmlSource = new StreamSource(oXMLFile);

    TransformerFactory oTransFactory = TransformerFactory.newInstance();
    Transformer oTransformer = oTransFactory.newTransformer(xslSource);

    Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(oXMLFile);

    for (however many times)
    {
        org.jaxen.dom.XPath oPath = new org.jaxen.dom.XPath(sElementPath);
        oPath.setNamespaceContext(
            new ETFTransform.TransformNamespaceResolver(oTransform.getODMNamespaceURI()));
        oTransformer.setParameter(VISTransform.SELECTED_ELEMENTS, oPath.selectNodes(doc));
        oTransformer.transform(xmlSource, new StreamResult(new FileOutputStream(oOutput)));
    }


So I'm loading the input file into a Document once, and for each transform running XPath expressions on it to get parameters which are passed to transform(). Perhaps I'd be better off passing the XPath directly into the stylesheet and getting the nodeset in there - would that avoid some overhead? This is someone else's old code so I'm not exactly sure why it's written that way. Also, I know I can change it to use Saxon for the XPath - would this speed things up or use less memory?


As you can see, it's not using a Templates object, at least not directly, it's just creating a Transformer with the xslt and reusing that Transformer. Do I only need to use Templates if I am creating a new Transformer each time? With the code above would it be better to reuse the Transformer and call clearDocumentPool() each time, or create a new Transformer?


For the moment I've added in clearDocumentPool() but I can't test it till the morning.


Thanks again for the help, it's invaluable. I've got a lot of experience with using xslt, but not so much with calling it within a java app.


Cheers,
Kevin


From: "Michael Kay" <mike@saxonica.com>
To: "'Mailing list for the SAXON XSLT and XQuery processor'" <saxon-help@lists.sourceforge.net>
Date: 04/12/2008 18:27
Subject: Re: [saxon] Passing NodeSet parameter from Java






You are probably reusing the same JAXP Transformer, which tends to be recommended practice for Xalan but not for Saxon. For Saxon it's best to create a new Transformer for each transformation (but reuse the Templates object, which holds the compiled stylesheet). Alternatively, call ((Controller)transformer).clearDocumentPool() before each transformation.

 

The rule with Saxon is: if you want successive transformations to use the same loaded documents (for example, lookup documents) then use the same Transformer; if you don't, then either create a new Transformer or clear out its document pool by hand.
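
The "new Transformer per transformation, shared Templates" pattern can be sketched with plain JAXP as below. This is an illustration only: the class name, the stylesheet, and the "selected" parameter are made up, and the sketch runs against whatever TransformerFactory the JDK provides rather than Saxon specifically.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class TemplatesReuseSketch {

    // Hypothetical stylesheet: echoes the parameter and counts matching forms.
    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:param name='selected'/>"
      + "<xsl:template match='/'>"
      + "<xsl:value-of select='$selected'/>:"
      + "<xsl:value-of select='count(//form[@id = $selected])'/>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Compile the stylesheet once; the Templates object is the reusable part.
    static Templates compile(String xsl) {
        try {
            return TransformerFactory.newInstance()
                    .newTemplates(new StreamSource(new StringReader(xsl)));
        } catch (Exception e) {
            throw new IllegalStateException("could not compile stylesheet", e);
        }
    }

    // Each run gets a fresh Transformer, so no per-run state (parameters,
    // cached documents) carries over into the next transformation.
    static String run(Templates templates, String xml, String selected) {
        try {
            Transformer transformer = templates.newTransformer();
            transformer.setParameter("selected", selected);
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        Templates templates = compile(XSL);
        String xml = "<forms><form id='a'/><form id='b'/></forms>";
        System.out.println(run(templates, xml, "a"));
        System.out.println(run(templates, xml, "z"));
    }
}
```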

 

Michael Kay

http://www.saxonica.com/


From: Kevin Burges [mailto:KevinBurges@formedix.com]
Sent: 04 December 2008 18:03
To: Mailing list for the SAXON XSLT and XQuery processor
Subject: Re: [saxon] Passing NodeSet parameter from Java



I'll try that then, thanks. Do I just create and set up the Configuration, then create my input document using Configuration.buildDocument()?


The main reason I am trying to switch to Saxon is for memory reasons - I'm dealing with large documents and am hitting the limit of what I can do with Xalan. With a 45mb input file it was not able to complete the series of 493 transformations I need (out of memory error after about 8 transformations). If I reduce the input file to about 38mb it completes ok. I wanted to switch to Saxon so the process will work with larger files.


After I switched to Saxon I noticed that doing individual transformations seemed to be running much faster, however I found a big problem - the series of 493 transformations broke after only 4 transformations (again, out of memory).

I cut down the test to a smaller input file (2mb, 10 transformations) and ran it with both Xalan and Saxon. Both completed, and Saxon was faster as you would expect. I'm forcing a garbage collection between each transformation, and I got some debug output from that. What it appears to show is that when using Xalan, the memory use does not go up from transform to transform - the GC clears out everything. But with Saxon the memory is rising by around 5mb each time. If you scale this up to the much larger file, and the 493 transformations, you can see there would be a problem there.
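
For readers who want to reproduce this kind of measurement, here is a rough sketch of a GC-and-report checkpoint built on java.lang.Runtime. It is an assumption about how such figures could be produced, not the code that generated the output below. The class and method names are hypothetical, and System.gc() is only a request that the JVM is free to ignore.

```java
public class MemoryProbe {

    private static final long MB = 1024L * 1024L;

    // Heap currently in use, in megabytes (total allocated heap minus free space).
    static long usedMb() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / MB;
    }

    // Request a garbage collection and report how much was freed and
    // how much is still in use, labelled with the current checkpoint.
    static String checkpoint(String label) {
        long before = usedMb();
        System.gc(); // a hint only; collection is not guaranteed
        long after = usedMb();
        return "(" + label + ")\n"
             + "Mb Freed: " + (before - after) + "\n"
             + "Mb Used:  " + after;
    }

    public static void main(String[] args) {
        System.out.println(checkpoint("after creating input, before transform"));
        byte[] garbage = new byte[8 * 1024 * 1024]; // allocate something to collect
        garbage = null;
        System.out.println(checkpoint("after one transform"));
    }
}
```

A steadily rising "Mb Used" figure across checkpoints, as opposed to one that fluctuates around a baseline, is the pattern that suggests something is being retained between transformations.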


Is there something I can do to stop the memory continually rising?
(I've included the debug output below)


Cheers,
Kevin


Xalan - Xmx1150 10Forms                        MEMORY FREED OK

(EVFPublicationManager.applyTransform() - after creating input,before transform)
Mb Freed: n/a
Mb Used:  56
(EVFPublicationManager.applyTransform() - after 1 of multipart transform)
Mb Freed: 7
Mb Used:  68
(EVFPublicationManager.applyTransform() - after 1 of multipart transform)
Mb Freed: 26
Mb Used:  52
(EVFPublicationManager.applyTransform() - after 1 of multipart transform)
Mb Freed: 12
Mb Used:  53
(EVFPublicationManager.applyTransform() - after 1 of multipart transform)
Mb Freed: 14
Mb Used:  49
(EVFPublicationManager.applyTransform() - after 1 of multipart transform)
Mb Freed: 11
Mb Used:  50
(EVFPublicationManager.applyTransform() - after 1 of multipart transform)
Mb Freed: 3
Mb Used:  59
(EVFPublicationManager.applyTransform() - after 1 of multipart transform)
Mb Freed: 20
Mb Used:  49
(EVFPublicationManager.applyTransform() - after 1 of multipart transform)
Mb Freed: 10
Mb Used:  50
(EVFPublicationManager.applyTransform() - after 1 of multipart transform)
Mb Freed: 10
Mb Used:  51
(EVFPublicationManager.applyTransform() - after 1 of multipart transform)
Mb Freed: 3
Mb Used:  57




Saxon - Xmx1150 10Forms                        MEMORY NOT FREED

(EVFPublicationManager.applyTransform() - after creating input,before transform)
Mb Freed: n/a
Mb Used:  55
(EVFPublicationManager.applyTransform()- after one part of multi-transform)
Mb Freed: 28
Mb Used:  35
(EVFPublicationManager.applyTransform()- after one part of multi-transform)
Mb Freed: 4
Mb Used:  43
(EVFPublicationManager.applyTransform()- after one part of multi-transform)
Mb Freed: 2
Mb Used:  48
(EVFPublicationManager.applyTransform()- after one part of multi-transform)
Mb Freed: 5
Mb Used:  53
(EVFPublicationManager.applyTransform()- after one part of multi-transform)
Mb Freed: 6
Mb Used:  53
(EVFPublicationManager.applyTransform()- after one part of multi-transform)
Mb Freed: 3
Mb Used:  59
(EVFPublicationManager.applyTransform()- after one part of multi-transform)
Mb Freed: 5
Mb Used:  64
(EVFPublicationManager.applyTransform()- after one part of multi-transform)
Mb Freed: 6
Mb Used:  67
(EVFPublicationManager.applyTransform()- after one part of multi-transform)
Mb Freed: 7
Mb Used:  73
(EVFPublicationManager.applyTransform()- after one part of multi-transform)
Mb Freed: 3
Mb Used:  78


From: "Michael Kay" <mike@saxonica.com>
To: "'Mailing list for the SAXON XSLT and XQuery processor'" <saxon-help@lists.sourceforge.net>
Date: 04/12/2008 14:17
Subject: Re: [saxon] Passing NodeSet parameter from Java







In principle it should still be possible to use a level 2 DOM if you call Configuration.setDOMLevel(2). However, I don't think this has been tested in the most recent releases, so I can't guarantee it still works.


Michael Kay

http://www.saxonica.com/



saxon-help mailing list archived at
http://saxon.markmail.org/
saxon-help@lists.sourceforge.net

https://lists.sourceforge.net/lists/listinfo/saxon-help