saxon performance

Help
Vilo Repan
2011-09-14
2012-10-08
  • Vilo Repan

    Vilo Repan - 2011-09-14

    I use Saxon together with the XSOM parser. Yesterday I was profiling my
    application and found a performance issue in Saxon. The method
    net.sf.saxon.dom.DOMObjectModel.getDocumentBuilder(Result) always creates a
    new DocumentBuilderFactory and a new DocumentBuilder (docBuilder =
    dfactory.newDocumentBuilder();) on every call.

    This takes 40-50% of the time needed to parse an XSD schema. I tried moving it
    to a static block and creating a static final DocumentBuilder member, like this:

        private static final DocumentBuilder docBuilder;

        static {
            try {
                DocumentBuilderFactory dfactory = DocumentBuilderFactory.newInstance();
                docBuilder = dfactory.newDocumentBuilder();
            } catch (ParserConfigurationException e) {
                throw new RuntimeException("Couldn't create document builder, reason: " + e.getMessage(), e);
            }
        }

    Then, in getDocumentBuilder(Result), I only call:

        Document out = docBuilder.newDocument();
    

    With that change, XSD schema parsing became about 40-50% faster. I use Saxon as
    a Maven dependency, currently in a privately patched version. It would be great
    if this fix, properly tested, could go into the next scheduled release.

    Could somebody have a look at this?

    Thanks.

     
  • Michael Kay

    Michael Kay - 2011-09-14

    I don't know the internals of XSOM, and I'm not sure how much control you have
    over the way Saxon is used. Saxon with DOM is not a good combination; there
    are many inefficiencies. It's much better (by a factor of 10) to use Saxon's
    internal tree model. If you do want Saxon's output as a DOMResult, it's much
    better to initialise the DOMResult with a Document node and allow Saxon to
    build the tree underneath this DOMResult; if you don't do this, Saxon has to
    find an implementation of DOM, and it uses the JAXP DocumentBuilderFactory
    mechanism to do this (it doesn't include its own DOM implementation).
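
    The suggestion about initialising the DOMResult with a Document node can be
    sketched roughly as follows. This is only an illustration using standard
    JAXP calls; the class name DomResultSketch and the method transformToDom are
    hypothetical, and an identity transformer stands in for a real stylesheet.
    Whether the caller (for example XSOM) lets you supply the DOMResult this way
    is an assumption.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;

// Hypothetical helper, not part of Saxon or XSOM.
final class DomResultSketch {

    static Document transformToDom(String xml) throws Exception {
        // The caller creates the Document up front, so the transformer does not
        // have to locate a DOM implementation for an empty DOMResult.
        Document target = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .newDocument();
        DOMResult result = new DOMResult(target);

        // Identity transformer from whichever JAXP TransformerFactory is
        // configured; a compiled stylesheet would be used in practice.
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.transform(new StreamSource(new StringReader(xml)), result);

        // The result tree has been built underneath the supplied Document.
        return target;
    }
}
```

    In a real application the Document (or the factory used to create it) would
    itself be created once and reused, rather than looked up on every call.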

    Although this path is always going to be very inefficient, you are right to
    point out that if people do multiple transformations, we should only need to
    do the search for a DocumentBuilderFactory once, and I will make this change.
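
    One way to do the factory search only once is sketched below. It is not the
    change that was made in Saxon; the class name CachedDomFactory is
    hypothetical. Because JAXP does not guarantee that a DocumentBuilderFactory
    or DocumentBuilder is thread-safe, the sketch keeps one builder per thread
    instead of sharing a single static DocumentBuilder.

```java
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;

// Hypothetical helper, not part of Saxon.
final class CachedDomFactory {

    // The service lookup behind newInstance() happens once, at class load time.
    private static final DocumentBuilderFactory FACTORY =
            DocumentBuilderFactory.newInstance();

    // One DocumentBuilder per thread, created lazily on first use.
    private static final ThreadLocal<DocumentBuilder> BUILDER =
            ThreadLocal.withInitial(() -> {
                try {
                    // Guard the shared factory, which may not be thread-safe.
                    synchronized (FACTORY) {
                        return FACTORY.newDocumentBuilder();
                    }
                } catch (ParserConfigurationException e) {
                    throw new IllegalStateException(
                            "Couldn't create document builder", e);
                }
            });

    // Cheap call: reuses the current thread's builder.
    static Document newDocument() {
        return BUILDER.get().newDocument();
    }
}
```

    Each call to CachedDomFactory.newDocument() then returns a fresh, empty
    Document without repeating the factory lookup.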

     
  • Vilo Repan

    Vilo Repan - 2011-09-14

    AFAIK I don't have any control over XSOM or Saxon, but I'll check anyway.
    Thanks for making that change.
    When can I expect the change to be available in the Maven repository?

    Thanks.

     
  • Michael Kay

    Michael Kay - 2011-09-14

    Saxon is not distributed via Maven. We produce periodic maintenance releases
    published here on SourceForge. I'm not planning to put this change into a
    maintenance release - the general policy is to treat performance improvements
    as enhancements rather than bug fixes and only issue them in the next major
    release (i.e. 9.4) - which will be ready when it's ready.

     
  • Vilo Repan

    Vilo Repan - 2011-09-14

    Thanks for the quick responses.