JavaScript code wrapped in an XML CDATA block within a <script> tag on a web page causes HtmlUnit to throw the following exception:
net.sourceforge.htmlunit.corejs.javascript.EvaluatorException: illegally formed XML syntax
This can be replicated with the following Java code:
(new net.sourceforge.htmlunit.corejs.javascript.Parser()).parse("<![CDATA[obj1.obj2.func1();]]>", "", 1);
This occurs on web pages that use Oracle ADF (I believe this is the same problem reported in Bug #1991). These web pages are accepted by Firefox and other major browsers.
Tracing through the code, I can see that the TokenStream.getNextXMLToken() method successfully scans over the CDATA block, but then rather than processing the contents of the block, it just does:
parser.addError("msg.XML.bad.form");
I'm not sure why it does this, or the right way to fix it. With a little guidance, I may be able to provide a fix. Or if someone can give me one, it would be great. Otherwise, I'm being forced to use Selenium with Firefox, which is introducing other problems.
BTW, I don't know why SourceForge trashed the formatting. I didn't enter the text all globbed together like that. But there doesn't seem to be a way to edit it.
As always, it took some time to work on this. From my point of view, this has to be handled by the parser. I have made a fix, because the neko parser can already handle this.
Please have a look at Twitter (https://twitter.com/HtmlUnit). I will announce it there when a new snapshot is available.
Same problem with style declarations.
Should be fixed now. I will announce via Twitter (https://twitter.com/HtmlUnit) when a new snapshot is available.
Please check if your problem is gone.
Your case
(new net.sourceforge.htmlunit.corejs.javascript.Parser()).parse("<![CDATA[obj1.obj2.func1();]]>", "", 1);
will still fail, because the CDATA processing is done by the (X)HTML parser.
Reopen this if it is still not working.
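For anyone who still needs to hand a CDATA-wrapped script directly to the JavaScript parser, here is a minimal sketch of unwrapping it first. This is not the fix that went into HtmlUnit (that one is in the HTML parsing layer); CdataStripper and stripCdata are hypothetical names made up for this example, and it only handles a single wrapper around the whole script.

```java
// Hypothetical helper, not part of HtmlUnit: removes one enclosing
// CDATA wrapper from a script body before it is passed to a JS parser.
final class CdataStripper {
    private static final String OPEN = "<![CDATA[";
    private static final String CLOSE = "]]>";

    private CdataStripper() {
    }

    /**
     * Returns the content between "<![CDATA[" and "]]>" if the trimmed
     * input is fully wrapped in a CDATA section; otherwise returns the
     * input unchanged.
     */
    static String stripCdata(String script) {
        if (script == null) {
            return null;
        }
        String trimmed = script.trim();
        boolean wrapped = trimmed.startsWith(OPEN)
                && trimmed.endsWith(CLOSE)
                && trimmed.length() >= OPEN.length() + CLOSE.length();
        if (!wrapped) {
            return script;
        }
        return trimmed.substring(OPEN.length(), trimmed.length() - CLOSE.length());
    }

    public static void main(String[] args) {
        // prints: obj1.obj2.func1();
        System.out.println(stripCdata("<![CDATA[obj1.obj2.func1();]]>"));
    }
}
```

The unwrapped string can then be given to the parser in place of the original one. Scripts without a CDATA wrapper pass through untouched.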