#189 Comment before document element breaks transform

Attila Szegedi

I have a trivial document (saved as "id.xml"): a comment followed by an empty root element, <x/>.


And a trivial XSLT file (saved as "ad.xslt") that's just an identity transform:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:template match="@*|*|processing-instruction()|comment()" priority="-2">
<xsl:copy>
<xsl:apply-templates select="*|@*|text()|processing-instruction()|comment()"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>

The following code, which uses only the built-in javax.xml.parsers.* classes and no dom4j, works and prints a non-null element:

import java.io.File;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Source;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;

public class XsltTest {
    public static void main(String[] args) throws Exception {
        // Parse id.xml with the JDK's own DOM parser.
        final Source docSource = new DOMSource(DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(new File("id.xml")));
        final DOMResult result = new DOMResult();
        // Apply the identity transform and collect the output as a DOM tree.
        TransformerFactory.newInstance().newTransformer(new StreamSource(
            new File("ad.xslt"))).transform(docSource, result);
        final Document doc = (Document)result.getNode();
        // Print the root element of the transformed document.
        System.out.println(doc.getDocumentElement());
    }
}

It prints "[x: null]", as expected.
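For anyone who wants to reproduce the baseline without the two files, here is a minimal self-contained sketch of the same JDK-only run. The inline document and stylesheet are assumptions standing in for id.xml and ad.xslt (the comment text in particular is made up), and the class name InlineIdentityTest is hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class InlineIdentityTest {
    // Assumed stand-in for id.xml: a comment, then an empty root element.
    static final String XML = "<!-- some comment --><x/>";

    // Assumed stand-in for ad.xslt: a conventional identity template.
    static final String XSLT =
        "<xsl:stylesheet xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\" version=\"1.0\">"
      + "<xsl:template match=\"@*|*|processing-instruction()|comment()\" priority=\"-2\">"
      + "<xsl:copy>"
      + "<xsl:apply-templates select=\"*|@*|text()|processing-instruction()|comment()\"/>"
      + "</xsl:copy>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Parses the given XML with the JDK DOM parser and runs it through XSLT.
    static Document transform(String xml) throws Exception {
        Document input = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        DOMResult result = new DOMResult();
        TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSLT)))
            .transform(new DOMSource(input), result);
        return (Document) result.getNode();
    }

    public static void main(String[] args) throws Exception {
        // Prints the root element of the transformed document.
        System.out.println(transform(XML).getDocumentElement());
    }
}
```

Only the JDK parser and transformer are involved here, so this covers the working case only.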

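Now, replacing the "final Source docSource" declaration with one that reads the file into a dom4j Document (using DOMDocumentFactory, and importing org.dom4j.io.SAXReader and org.dom4j.dom.DOMDocumentFactory):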

    final SAXReader reader = new SAXReader();
    reader.setDocumentFactory(new DOMDocumentFactory());
    final Source docSource = new DOMSource((Document)reader.read(new File("id.xml")));

will cause the program to print "null" - the transformed document now has no root element!
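One way to narrow down where the two Document implementations differ might be to list the top-level children of the source document in two ways: through getChildNodes(), and by walking getFirstChild()/getNextSibling(). The sketch below is a diagnostic aid only, not a diagnosis. TopLevelDump and its method names are hypothetical, and whether dom4j's DOM document disagrees between the two traversals is an assumption to check, not something established here. The main method uses the JDK parser and an assumed inline document.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class TopLevelDump {
    // Names of the document's children as reported by getChildNodes().
    static List<String> viaChildNodes(Document doc) {
        List<String> names = new ArrayList<>();
        NodeList children = doc.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            names.add(children.item(i).getNodeName());
        }
        return names;
    }

    // Names of the document's children found by following the sibling chain.
    static List<String> viaSiblings(Document doc) {
        List<String> names = new ArrayList<>();
        for (Node n = doc.getFirstChild(); n != null; n = n.getNextSibling()) {
            names.add(n.getNodeName());
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        // Assumed stand-in for id.xml; the comment text is made up.
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader("<!-- some comment --><x/>")));
        System.out.println("getChildNodes(): " + viaChildNodes(doc));
        System.out.println("sibling walk:    " + viaSiblings(doc));
    }
}
```

To compare, pass the document returned by reader.read(...), cast to org.w3c.dom.Document as in the snippet above, to both methods. If the two lists differ for the dom4j document, that would point at its sibling navigation.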

Removing the comment from before <x/> fixes the problem, so it's the comment that makes it fail.

This is with dom4j 1.6.1.


  • Attila Szegedi

    For now, I have worked around it with a private build of dom4j. NOTE: I didn't find and fix the bug; I only worked around it. I took the SAXReader.read(InputSource) method and changed it so that it removes any top-level comments from the document before returning it. I would still prefer a proper fix.
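    The patched SAXReader.read(InputSource) itself is not shown here. As a rough illustration of the same idea, the sketch below strips top-level comments through the standard org.w3c.dom interface instead of inside dom4j. The class and method names (TopLevelComments, strip) are hypothetical, and the code has only been reasoned through against the JDK parser; whether removeChild behaves the same way on a dom4j DOM document is an assumption.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class TopLevelComments {
    // Removes comments that are direct children of the document node.
    // Comments nested inside elements are left alone.
    static void strip(Document doc) {
        // Collect first, then remove, so removal does not disturb the loop.
        List<Node> comments = new ArrayList<>();
        NodeList children = doc.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.COMMENT_NODE) {
                comments.add(child);
            }
        }
        for (Node comment : comments) {
            doc.removeChild(comment);
        }
    }

    public static void main(String[] args) throws Exception {
        // Assumed sample input; the comment text is made up.
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader("<!-- some comment --><x/>")));
        strip(doc);
        System.out.println("top-level nodes left: " + doc.getChildNodes().getLength());
    }
}
```

    Like the private build, this hides the problem by discarding the comments instead of fixing the underlying bug.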