Hi,
the problem occurs when using XPath.
I can reproduce the error below with the following snippet in HtmlUnit 2.9:
    final WebClient client = new WebClient(BrowserVersion.FIREFOX_3_6);
    final HtmlPage page = client
            .getPage("http://www.amazon.co.uk/Bestsellers-Car-Motorbike/zgbs/automotive/ref=zg_mg_tab/");
    final List<?> l = page.getByXPath("/descendant-or-self::node()");
    for (final Object object : l) {
        final DomNode node = (DomNode) object;
        final List<?> ol = node.getByXPath("child::ol");
        if (ol.size() > 0)
            System.out.println(ol);
    }
Exception in thread "main" java.lang.RuntimeException: Could not retrieve XPath >child::ol< on
    at com.gargoylesoftware.htmlunit.html.xpath.XPathUtils.getByXPath(XPathUtils.java:94)
    at com.gargoylesoftware.htmlunit.html.DomNode.getByXPath(DomNode.java:1365)
    at main(MainHTMLUnit.java:75)
Caused by: java.lang.RuntimeException: Could not resolve the node to a handle
    at org.apache.xml.dtm.ref.DTMManagerDefault.getDTMHandleFromNode(DTMManagerDefault.java:576)
    at org.apache.xpath.XPathContext.getDTMHandleFromNode(XPathContext.java:184)
    at com.gargoylesoftware.htmlunit.html.xpath.XPathUtils.evaluateXPath(XPathUtils.java:129)
    at com.gargoylesoftware.htmlunit.html.xpath.XPathUtils.getByXPath(XPathUtils.java:72)
    ... 2 more
Just a hint:
This can happen when the DOM changes during XPath evaluation. It frequently occurs in jQuery tests that use XPath instead of getHtmlElementById.
This is "normal", since the DOM is being modified while you iterate over it with the XPath results.
To avoid the problem, it seems to be enough here to call something like:
client.waitForBackgroundJavaScriptStartingBefore(2000);
before calling getByXPath.
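
To illustrate why waiting helps, here is a minimal sketch of the same idea using only the JDK's built-in DOM and XPath classes rather than HtmlUnit. A background task stands in for the page's JavaScript, and Future.get with a timeout is only loosely analogous to waitForBackgroundJavaScriptStartingBefore(2000). The class and method names (WaitBeforeXPath, newPage, backgroundScript, countAfterWaiting) are made up for this example.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class WaitBeforeXPath {

    /** Builds a tiny <html><body/></html> document (hypothetical stand-in for a loaded page). */
    public static Document newPage() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element html = doc.createElement("html");
        html.appendChild(doc.createElement("body"));
        doc.appendChild(html);
        return doc;
    }

    /** Stand-in for a background script: appends `count` <ol> elements to <body>. */
    public static Runnable backgroundScript(Document doc, int count) {
        return () -> {
            Node body = doc.getDocumentElement().getFirstChild();
            for (int i = 0; i < count; i++) {
                body.appendChild(doc.createElement("ol"));
            }
        };
    }

    /** Waits for the background job to finish, and only then evaluates the XPath expression. */
    public static int countAfterWaiting(Document doc, Future<?> job, String expr, long timeoutMillis)
            throws Exception {
        // After get() returns, the job no longer touches the DOM, so the query sees a stable tree.
        job.get(timeoutMillis, TimeUnit.MILLISECONDS);
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList hits = (NodeList) xpath.evaluate(expr, doc, XPathConstants.NODESET);
        return hits.getLength();
    }

    public static void main(String[] args) throws Exception {
        Document doc = newPage();
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Future<?> job = pool.submit(backgroundScript(doc, 3));
            System.out.println(countAfterWaiting(doc, job, "//ol", 2000)); // prints 3
        } finally {
            pool.shutdown();
        }
    }
}
```

Evaluating the expression without the wait would race against the task that is still appending nodes, which is roughly the situation the stack trace above points to.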
More generally, if you want to "stop the world" while navigating the page, you may want to look at GAEJavaScriptExecutor, which allows "background" JS code to run in the main thread.