#1815 [regression] Node.removeChild() can throw "XMLDocument cannot be cast to HTMLDocument"

Latest SVN
closed
None
1
2016-08-25
2016-08-25
No

This regression was introduced in HtmlUnit 2.19. Calling Node.removeChild() can result in the following exception being thrown:

com.gargoylesoftware.htmlunit.ScriptException: Exception invoking removeChild
    at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$HtmlUnitContextAction.run(JavaScriptEngine.java:883)
    at net.sourceforge.htmlunit.corejs.javascript.Context.call(Context.java:628)
    at net.sourceforge.htmlunit.corejs.javascript.ContextFactory.call(ContextFactory.java:513)
    at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.callFunction(JavaScriptEngine.java:815)
    at com.gargoylesoftware.htmlunit.javascript.host.xml.XMLHttpRequest.executeListenerEvent(XMLHttpRequest.java:324)
    at com.gargoylesoftware.htmlunit.javascript.host.xml.XMLHttpRequest.setState(XMLHttpRequest.java:239)
    at com.gargoylesoftware.htmlunit.javascript.host.xml.XMLHttpRequest.doSend(XMLHttpRequest.java:789)
    at com.gargoylesoftware.htmlunit.javascript.host.xml.XMLHttpRequest.access$0(XMLHttpRequest.java:690)
    at com.gargoylesoftware.htmlunit.javascript.host.xml.XMLHttpRequest$1.run(XMLHttpRequest.java:644)
   <<snip>>
Caused by: java.lang.RuntimeException: Exception invoking removeChild
   <<snip>>
Caused by: java.lang.ClassCastException: com.gargoylesoftware.htmlunit.javascript.host.xml.XMLDocument cannot be cast to com.gargoylesoftware.htmlunit.javascript.host.html.HTMLDocument
    at com.gargoylesoftware.htmlunit.html.HtmlElement.detach(HtmlElement.java:1343)
    at com.gargoylesoftware.htmlunit.html.DomNode.remove(DomNode.java:1154)
    at com.gargoylesoftware.htmlunit.javascript.host.dom.Node.removeChild(Node.java:386)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   <<snip>>

This happens in HtmlElement.detach() (HtmlElement.java:1343 in the stack trace above) when the parent is an XMLDocument and the child is an instance of an HtmlElement subclass. That combination arises when an XHTML document is retrieved through XMLHttpRequest, which is the only time XML nodes are parsed as HTML elements.

The bug is likely due to the assumption that HtmlElement.getPage() always returns an HTMLDocument. This does not appear to be the case for XHTML, and checks are probably needed.
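
To illustrate the kind of check meant here, below is a minimal, self-contained sketch. It does not use HtmlUnit at all: ScriptDocument, HtmlScriptDocument, XmlScriptDocument and DetachSketch are hypothetical stand-ins invented for this example, and the actual fix in HtmlUnit may look quite different. It only shows why an unchecked downcast fails for an XML owner and how an instanceof guard avoids that.

```java
// Hypothetical stand-ins for illustration only; these are NOT HtmlUnit classes.
interface ScriptDocument { }

final class HtmlScriptDocument implements ScriptDocument {
    private int removedElements;

    // HTML-only bookkeeping that a detach step might want to trigger.
    void elementRemoved() { removedElements++; }

    int removedElements() { return removedElements; }
}

final class XmlScriptDocument implements ScriptDocument { }

final class DetachSketch {

    /** Unchecked cast: throws ClassCastException for any owner that is not an HTML document. */
    static void detachUnchecked(ScriptDocument owner) {
        ((HtmlScriptDocument) owner).elementRemoved();
    }

    /** Guarded variant: runs the HTML-only step only for an HTML owner and reports whether it ran. */
    static boolean detachChecked(ScriptDocument owner) {
        if (owner instanceof HtmlScriptDocument) {
            ((HtmlScriptDocument) owner).elementRemoved();
            return true;
        }
        return false; // XML owner: nothing HTML-specific to update
    }

    public static void main(String[] args) {
        ScriptDocument xml = new XmlScriptDocument();
        try {
            detachUnchecked(xml);
        } catch (ClassCastException e) {
            System.out.println("unchecked cast failed: " + e.getClass().getSimpleName());
        }
        System.out.println("checked, XML owner: " + detachChecked(xml));
        System.out.println("checked, HTML owner: " + detachChecked(new HtmlScriptDocument()));
    }
}
```

With a guard of this shape, removing a child whose owner is an XML document would simply skip the HTML-specific step instead of throwing.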

Steps to reproduce

Here are test files I used:

test1.html

<!DOCTYPE html>
<html>
<head>
<script type="text/javascript">
function test() {
    var xhr = new XMLHttpRequest();
    xhr.onload = function (e) {
        console.log("onload: " + e);
        var xml = xhr.responseXML;
        console.log("firstChild: " + xml.firstChild)
        console.log("firstChild.nodeName: " + xml.firstChild.nodeName)
        console.log("firstChild.parentNode: " + xml.firstChild.parentNode)
        xml.removeChild(xml.firstChild)
        console.log("firstChild: " + xml.firstChild)
    }
    xhr.onerror = function (e) {
        console.log("onerror: " + e);
    }
    xhr.open("GET", "test1.xml");
    console.log("sending")
    xhr.send();
}
</script>
</head>
<body>
<input type="button" onclick="test()" value="test"/>
</body>
</html>

test1.xml

<?xml version="1.0"?>
<html xmlns="http://www.w3.org/1999/xhtml">
</html>

Test.java

        try (WebClient wc = new WebClient()) {
            List<String> alerts = new ArrayList<>();
            wc.setAlertHandler(new CollectingAlertHandler(alerts));

            HtmlPage p = wc.getPage("http://.../test1.html");
            p.executeJavaScript("console.log = function (s) { alert(s) }");

            p.<HtmlButtonInput>getFirstByXPath("//input[@type='button' and @value='test']").click();
            wc.waitForBackgroundJavaScriptStartingBefore(1);

            final String[] expected = {
                    "sending",
                    "onload: [object ProgressEvent]",
                    "firstChild: [object HTMLHtmlElement]",
                    "firstChild.nodeName: html",
                    "firstChild.parentNode: [object XMLDocument]",
                    "firstChild: null",
            };
            assertEquals(StringUtils.join(expected, "\n"), StringUtils.join(alerts, "\n"));
        }

Some more details

  • I needed to upload test1.html and test1.xml to a web server to get XMLHttpRequest to work. YMMV.

  • IE/Chrome/Firefox outputs:

sending
onload: [object ProgressEvent]
firstChild: [object HTMLHtmlElement]
firstChild.nodeName: html
firstChild.parentNode: [object XMLDocument]
firstChild: null
  • HtmlUnit 2.20's output is exactly the same, except that the last line is missing because of the exception.

  • If I remove the xmlns="http://www.w3.org/1999/xhtml" part from test1.xml (i.e. change it to plain XML), the problem does not occur and the output changes to the following in IE/Chrome/Firefox/HtmlUnit:

sending
onload: [object ProgressEvent]
firstChild: [object Element]
firstChild.nodeName: html
firstChild.parentNode: [object XMLDocument]
firstChild: null
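
The only difference between the two variants of test1.xml is the namespace of the root element. As a rough sketch of that distinction, the snippet below parses both variants with the JDK's own DOM parser and reports the root element's namespace URI. NamespaceSketch and its methods are names made up for this example, and this is not how HtmlUnit itself decides which element class to create; it only makes the namespace difference visible.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class NamespaceSketch {
    static final String XHTML_NS = "http://www.w3.org/1999/xhtml";

    /** Parses the XML text and returns the root element's namespace URI (null if it has none). */
    static String rootNamespace(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // namespace awareness is off by default
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getNamespaceURI();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    /** True if the root element is in the XHTML namespace. */
    static boolean isXhtmlRoot(String xml) {
        return XHTML_NS.equals(rootNamespace(xml));
    }

    public static void main(String[] args) {
        // The two variants of test1.xml from the report.
        String xhtml = "<?xml version=\"1.0\"?>\n"
                + "<html xmlns=\"http://www.w3.org/1999/xhtml\">\n</html>";
        String plain = "<?xml version=\"1.0\"?>\n<html>\n</html>";
        System.out.println("with xmlns: " + isXhtmlRoot(xhtml));
        System.out.println("without xmlns: " + isXhtmlRoot(plain));
    }
}
```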

Discussion

  • Atsushi Nakagawa

    Wtf, I can't edit the bug to fix up the formatting error after test1.xml.. Did you guys get rid of the edit button?

     

    Last edit: Atsushi Nakagawa 2016-08-25
    • Ahmed Ashour

      Ahmed Ashour - 2016-08-25

      The Edit is there (at least for me)

       
  • Ahmed Ashour

    Ahmed Ashour - 2016-08-25
    • Description has changed:

    Diff:

    --- old
    +++ new
    @@ -69,6 +69,8 @@
     </html>
     ~~~
    
    +
    +
     ~~~
            try (WebClient wc = new WebClient()) {
                List<String> alerts = new ArrayList<>();
    
     
    • Atsushi Nakagawa

      I think you need to add something like Test.java in between to make them separate.

       
  • Ahmed Ashour

    Ahmed Ashour - 2016-08-25
    • status: open --> accepted
    • assigned_to: Ahmed Ashour
     
  • Ahmed Ashour

    Ahmed Ashour - 2016-08-25
    • status: accepted --> closed
     
  • Ahmed Ashour

    Ahmed Ashour - 2016-08-25

    Thanks for reporting, fixed in SVN.

    You can get the latest build from https://ci.canoo.com/teamcity/viewLog.html?buildTypeId=HtmlUnit_FastBuild&buildId=lastSuccessful&tab=artifacts (once green)

     
    • Atsushi Nakagawa

      Wow that was quick! Thanks as always.

       
