Thanks for the detail. It appears that you're not accumulating documents in memory, which is the problem I assumed you had, but rather instances of a ThreadLocal cache. That's a new problem I haven't seen before.

It's not clear why the error messages are correlated with use of xsl:perform-sort. Internally, the IntHashMap is used as a cache of Converter objects used to perform cast operations. It's probably just coincidence that type conversions in your application are happening primarily in the course of sorting.

It's not immediately clear what I should be doing to clean up when the web application stops. This page is relevant:

http://wiki.apache.org/tomcat/MemoryLeakProtection

but it talks about cleanup actions that Tomcat takes, not about what the application writer should do. I'll do some more research.
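
For reference, the mechanism behind Tomcat's warning can be shown in plain Java. The sketch below is an illustration only: it is not Saxon's code and it is not a fix. The names ThreadLocalCacheDemo, CACHE and sizeAfter are made up for the example, and a single pooled worker thread stands in for a Tomcat request thread. It shows that a value placed in a ThreadLocal stays attached to the thread after the task has finished, until remove() is called on that same thread.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class ThreadLocalCacheDemo {
    // Hypothetical per-thread cache, standing in for the IntHashMap named in the log.
    static final ThreadLocal<Map<Integer, String>> CACHE =
            ThreadLocal.withInitial(HashMap::new);

    /** Runs the task on the pool's worker thread and returns the cache size that thread sees afterwards. */
    static int sizeAfter(ExecutorService pool, Runnable task) throws Exception {
        return pool.submit(() -> {
            task.run();
            return CACHE.get().size();
        }).get();
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            // First "request": populate the per-thread cache.
            int first = sizeAfter(pool, () -> CACHE.get().put(1, "converter"));
            // Second "request" on the same pooled thread: the entry is still there.
            int second = sizeAfter(pool, () -> { });
            // Explicit cleanup on the owning thread: the next get() starts empty.
            int third = sizeAfter(pool, CACHE::remove);
            System.out.println(first + " " + second + " " + third);
        } finally {
            pool.shutdown();
        }
    }
}
```

Run as written, main prints "1 1 0". Note that remove() only affects the calling thread, which is part of why it is not obvious what an application can do at shutdown about values held by the container's threads.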

As for your application running slower, the most obvious explanation is that it now uses a DOM to represent the tree instead of Saxon's native TinyTree, which is much faster. I haven't seen the code you had before, so I can't tell whether you were previously using a TinyTree. Certainly, you should avoid using DOM with Saxon unless you have a very good reason. (Apart from performance, there are also thread-safety issues: the TinyTree is thread-safe while DOM isn't.) Use the s9api DocumentBuilder to build the TinyTree as an XdmNode.

Michael Kay
Saxonica

On 03/02/2012 06:51, Gerry Kaplan wrote:

Hi Michael,

 

I tried what you said below, but the problem persists (and the overall process seems to be running slower now too).

 

First, I wrote a URIResolver that properly caches a DOMSource for the requested XML document. This appears to work perfectly, and returns a DOMSource.

Then, I refactored my class as follows:

 

In @PostConstruct method:

  ClassPathResource xslt = new ClassPathResource("my.xsl");
  StreamSource xslCalc6 = new StreamSource(xslt.getInputStream());
  Processor proc = new Processor(false);
  docBldr = proc.newDocumentBuilder();

  XsltCompiler comp = proc.newXsltCompiler();
  exp = comp.compile(xslCalc6);

 

In the processing method of the class (which returns a Document back to the caller):

try {
  DocumentBuilderFactory dfactory = DocumentBuilderFactory.newInstance();
  dfactory.setNamespaceAware(true);
  Document dom;

  try {
    dom = dfactory.newDocumentBuilder().newDocument();
  } catch (ParserConfigurationException e) {
    throw new SaxonApiException(e);
  }
  XdmNode src = docBldr.build(source);
  DOMDestination resultTree = new DOMDestination(dom);
  XsltTransformer trans = exp.load();
  trans.setInitialContextNode(src);
  trans.setDestination(resultTree);
  trans.setURIResolver(uriResolver);
  trans.transform();
  return dom;
} catch …

 

The “my.xsl” file contains TWO “xsl:perform-sort” statements.

 

The class seems to execute properly (although everything seems slower than before when I used JAXP and kept a handle to the Transformer), HOWEVER, if I modify something in Eclipse which causes it to reload the application context, I get the following messages to the console:

 

Feb 3, 2012 1:31:44 AM org.apache.catalina.loader.WebappClassLoader checkThreadLocalMapForLeaks

SEVERE: The web application [/digitalguides] created a ThreadLocal with key of type [java.lang.ThreadLocal] (value [java.lang.ThreadLocal@694d694d]) and a value of type [net.sf.saxon.expr.sort.IntHashMap] (value [net.sf.saxon.expr.sort.IntHashMap@28532853]) but failed to remove it when the web application was stopped. Threads are going to be renewed over time to try and avoid a probable memory leak.

Feb 3, 2012 1:31:44 AM org.apache.catalina.loader.WebappClassLoader checkThreadLocalMapForLeaks

SEVERE: The web application [/digitalguides] created a ThreadLocal with key of type [java.lang.ThreadLocal] (value [java.lang.ThreadLocal@43ca43ca]) and a value of type [net.sf.saxon.expr.sort.IntHashMap] (value [net.sf.saxon.expr.sort.IntHashMap@4c044c04]) but failed to remove it when the web application was stopped. Threads are going to be renewed over time to try and avoid a probable memory leak.

 

The interesting part is that there are always two of these messages for each call to the method, which corresponds to the two “perform-sort” executions. If the method runs twice, the message prints four times, and so on. So I am no longer holding onto the transformer, yet memory leaks are still being reported. Do you have any ideas about this? I’m worried because this is a core method and will be called repeatedly.

 

Thanks,

Gerry

 

From: Michael Kay [mailto:mike@saxonica.com]
Sent: Wednesday, February 01, 2012 3:58 AM
To: saxon-help@lists.sourceforge.net
Subject: Re: [saxon] Possible memory leak with Saxon 9HE

 

On 31/01/2012 22:45, Gerry Kaplan wrote:

I am building an application that relies heavily on XSL transformations. In order to avoid repetitive XSL loading/compiling and document fetching (within the XSL), I have created a class that encapsulates the Transformer object, creating it at instantiation and holding it until the application shuts down. Although the XSL itself is not particularly big, it fetches a large XML document (using document() function) which takes time. Therefore, caching is essential.

 

 

Don't reuse the Transformer unless you want to reuse the resources it has acquired. Cache the Templates object (the compiled stylesheet), and create a new Transformer for each transformation.

You can reuse the Transformer (serially, in the same thread) if you really want to, and it can be useful to do so if there are static documents such as lookup data used in every transformation. But if you don't want to hold on to the documents used by one transformation in subsequent transformations, then either (a) create a new Transformer (recommended), or (b) call its reset() and clearDocumentPool() methods before reusing it.
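
The recommended pattern above (cache the compiled stylesheet, create a new Transformer per transformation) can be sketched with the standard JAXP interfaces. This is a minimal sketch, not the poster's code: the class name CachedStylesheet and its methods are hypothetical, and it uses in-memory strings only to stay self-contained. It runs with whichever TransformerFactory is on the classpath.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical wrapper: compiles the stylesheet once, transforms many times.
final class CachedStylesheet {
    // Templates is the compiled stylesheet; it is safe to share and keep for the
    // lifetime of the application.
    private final Templates templates;

    CachedStylesheet(String xslt) throws TransformerException {
        templates = TransformerFactory.newInstance()
                .newTemplates(new StreamSource(new StringReader(xslt)));
    }

    /** Runs one transformation with a fresh Transformer, so nothing is retained between calls. */
    String transform(String xml) throws TransformerException {
        Transformer transformer = templates.newTransformer();
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                              new StreamResult(out));
        return out.toString();
    }
}
```

The Transformer goes out of scope at the end of each call, so any documents it loaded become eligible for garbage collection, while the compilation cost is paid only once in the constructor.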

Creating a new Transformer for each transformation does not prevent you retaining source documents in memory and using them repeatedly: write a URIResolver that manages the document pool, deciding which documents to keep and which to discard. For this kind of thing it's much easier to use s9api interfaces rather than JAXP (which doesn't recognize any kind of in-memory document other than a DOM).
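
A URIResolver that manages its own document pool might look roughly like the sketch below. It is written against plain JAXP so that it runs without Saxon, which means the pooled documents are DOM trees, since that is the only in-memory form JAXP recognizes; with s9api the same idea would apply to a different in-memory representation. The class name PoolingResolver and the discard and parseCount methods are hypothetical, and the "keep everything until told otherwise" policy is just a placeholder for whatever retention rule the application needs. Sharing one DOM tree between threads is not safe in general, so treat this as single-threaded illustration.

```java
import java.net.URI;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Source;
import javax.xml.transform.TransformerException;
import javax.xml.transform.URIResolver;
import javax.xml.transform.dom.DOMSource;
import org.w3c.dom.Document;

// Hypothetical resolver that owns the pool of parsed documents.
final class PoolingResolver implements URIResolver {
    private final Map<String, Document> pool = new ConcurrentHashMap<>();
    private final AtomicInteger parseCount = new AtomicInteger();

    @Override
    public Source resolve(String href, String base) throws TransformerException {
        String uri;
        try {
            // Resolve a relative href against the base URI of the stylesheet, if any.
            uri = (base == null || base.isEmpty())
                    ? href : URI.create(base).resolve(href).toString();
        } catch (IllegalArgumentException e) {
            throw new TransformerException("Bad URI: " + href, e);
        }
        Document doc = pool.get(uri);
        if (doc == null) {
            doc = parse(uri);
            Document earlier = pool.putIfAbsent(uri, doc);
            if (earlier != null) {
                doc = earlier; // another caller pooled it first
            }
        }
        return new DOMSource(doc, uri);
    }

    /** Drops one document from the pool; the next request parses it again. */
    void discard(String uri) {
        pool.remove(uri);
    }

    /** Number of times a document was actually parsed (for demonstration). */
    int parseCount() {
        return parseCount.get();
    }

    private Document parse(String uri) throws TransformerException {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            parseCount.incrementAndGet();
            return factory.newDocumentBuilder().parse(uri);
        } catch (Exception e) {
            throw new TransformerException("Cannot load " + uri, e);
        }
    }
}
```

An instance would be created once, kept alongside the cached Templates, and passed to each new Transformer via setURIResolver, so the pool outlives the individual transformations.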

Unfortunately the JAXP documentation gives very poor advice as far as Saxon is concerned: it advises reusing a Transformer in order to reuse resources, but reusing resources means retaining resources, and retaining resources that don't need to be retained is what many Java users characterize as a "memory leak".

Michael Kay
Saxonica



