Test code:
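import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

WebClient client = new WebClient();
// Keep loading the page even if one of its scripts throws an error.
client.getOptions().setThrowExceptionOnScriptError(false);
HtmlPage p = client.getPage("http://news.sznews.com/content/2016-09/21/content_13889421.htm");
logger.info(p.asText());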
This worked correctly in HtmlUnit 2.14.
I did a quick analysis. The DOM tree of the page contains a text node with that HTML content as its text. This might be the result of an HTML parser problem or, more likely, some JavaScript problem.
If you are able to point to the problematic code, I will try to fix it. But the page is far too complex, and I'm not able to read/understand all these characters :-).
Sorry, our time is limited; you have to help us a bit with this problem.
OK, I will try to dig deeper. Thanks.
I reproduced the problem with this minimal HTML:
It's the self-closed iframe node that causes the problem.
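As an illustration of why a self-closed iframe can end up holding markup as text: HTML5-style parsers ignore the trailing slash on non-void elements such as iframe and read the element's content as raw text up to the next </iframe> (or the end of the input). The following is only a rough sketch of that rule using plain string scanning, not HtmlUnit's parser; the class name, the helper methods, and the sample markup are hypothetical, since the minimal HTML from this thread is not reproduced here.

```java
class IframeRawTextSketch {

    // Case-insensitive indexOf; returns -1 if needle does not occur at or after 'from'.
    static int indexOfIgnoreCase(String s, String needle, int from) {
        for (int i = Math.max(from, 0); i + needle.length() <= s.length(); i++) {
            if (s.regionMatches(true, i, needle, 0, needle.length())) {
                return i;
            }
        }
        return -1;
    }

    // Returns the raw text that would end up inside the first <iframe>,
    // or null if there is no complete iframe start tag.
    // Simplification: assumes no '>' inside attribute values.
    static String iframeRawText(String html) {
        int start = indexOfIgnoreCase(html, "<iframe", 0);
        if (start < 0) {
            return null;
        }
        int tagEnd = html.indexOf('>', start);
        if (tagEnd < 0) {
            return null;
        }
        // The "/>" is not treated as closing the element: content runs
        // until an explicit </iframe>, or to the end of the input.
        int close = indexOfIgnoreCase(html, "</iframe", tagEnd + 1);
        int end = close < 0 ? html.length() : close;
        return html.substring(tagEnd + 1, end);
    }

    public static void main(String[] args) {
        // Hypothetical sample markup, not the page from this report.
        String selfClosed = "<html><body>aaaaa<iframe src=\"about:blank\"/><p>bbbbb</p></body></html>";
        String explicit = "<html><body>aaaaa<iframe src=\"about:blank\"></iframe><p>bbbbb</p></body></html>";
        System.out.println("[" + iframeRawText(selfClosed) + "]"); // prints "[<p>bbbbb</p></body></html>]"
        System.out.println("[" + iframeRawText(explicit) + "]");   // prints "[]"
    }
}
```

A browser does not render this raw text, which would be consistent with seeing only 'aaaaa' and an empty iframe on screen, while a text dump that walks every text node in the DOM could still pick it up.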
I have added a simple test case for this, and it looks like the browsers fail in the same way. Please verify.
I don't understand what you mean by 'browsers are failing in the same way'. I opened the test page in Firefox, and it shows the text 'aaaaa' and an empty iframe. In HtmlUnit, the asText() method outputs HTML code:
Try this with your real browser; I hope this helps.
And please have a look at commit 13004.
I tried your test case too, and I get the same result. My point is that with a real browser I don't see any HTML code. With asText() in HtmlUnit, I see the HTML code:
Ah OK, now I get your point (hopefully). I will have a look.
I think this is fixed now. Sorry for the long journey until I got your point.
Thanks. I confirm it's fixed. Sorry I didn't make my point clear enough.