When useCdata is true, any existing CDATA block should be removed. Otherwise the output ends up with two nested CDATA sections, which is not valid XML and fails with XML parsers.
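To make the failure concrete, here is a minimal sketch that uses only the JDK's built-in XML parser, not HtmlCleaner itself. The class and method names (`NestedCdataCheck`, `isWellFormed`, `naiveWrap`) are hypothetical and exist only for this illustration. It wraps script content in a CDATA section without checking whether the content already carries its own CDATA markers, then asks the parser whether the result is well-formed.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class, not part of HtmlCleaner.
final class NestedCdataCheck {

    // Returns true if the given string parses as well-formed XML.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the parser rejected the document
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Wraps script content in a CDATA section without inspecting the content.
    static String naiveWrap(String scriptContent) {
        return "<script><![CDATA[" + scriptContent + "]]></script>";
    }
}
```

With content such as "// <![CDATA[\nvar x = 1;\n// ]]>\n", the inner "]]>" closes the outer section early and the trailing "]]>" is left in plain character data, which the parser rejects.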
In addition, note that for the following example three ContentToken instances are generated (not one), which means that all HtmlCleaner serializers fail to generate valid content:
"<script type=\"text/javascript\">\n"
+ "// <![CDATA[\n"
+ "function escapeForXML(origtext) {\n"
+ " return origtext.replace(/\\&/g,'&'+'amp;').replace(/</g,'&'+'lt;')\n"
+ " .replace(/>/g,'&'+'gt;').replace(/\'/g,'&'+'apos;').replace(/\"/g,'&'+'quot;');"
+ "}\n"
+ "// ]]>\n"
+ "</script>");
The problem is that code such as the following is not correct, since several ContentToken instances are generated:
else if (item instanceof ContentToken) {
    String nodeName = element.getNodeName();
    ContentToken contentToken = (ContentToken) item;
    String content = contentToken.getContent();
    boolean specialCase = props.isUseCdataForScriptAndStyle() &&
            ("script".equalsIgnoreCase(nodeName) || "style".equalsIgnoreCase(nodeName));
    if (escapeXml && !specialCase) {
        content = Utils.escapeXml(content, props, true);
    }
    element.appendChild( specialCase ? document.createCDATASection(content) : document.createTextNode(content) );
}
Namely, the CDATA section in the example is split into several ContentToken instances, so each one gets its own CDATA section.
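One possible direction for a fix, as a rough sketch only: collect the content of all tokens belonging to the element first, strip any CDATA markers already present, and only then create a single CDATA section. The code below uses the standard org.w3c.dom API. Plain strings stand in for the values returned by ContentToken.getContent(), the way the tokens are split is illustrative, and the names `ScriptCdataSketch`, `joinContent`, `stripCdataMarkers`, `buildScript` and `newDocument` are hypothetical, not HtmlCleaner API.

```java
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical sketch, not the actual HtmlCleaner serializer code.
final class ScriptCdataSketch {

    // Creates an empty DOM document with the JDK's default builder.
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Joins the text of all content tokens of one element into one string.
    static String joinContent(List<String> tokenContents) {
        StringBuilder sb = new StringBuilder();
        for (String content : tokenContents) {
            sb.append(content);
        }
        return sb.toString();
    }

    // Drops CDATA markers that were already present in the source.
    // The "//" comment guards in front of them are left in place.
    static String stripCdataMarkers(String content) {
        return content.replace("<![CDATA[", "").replace("]]>", "");
    }

    // Builds a <script> element holding exactly one CDATA section.
    static Element buildScript(Document doc, List<String> tokenContents) {
        Element script = doc.createElement("script");
        String content = stripCdataMarkers(joinContent(tokenContents));
        script.appendChild(doc.createCDATASection(content));
        return script;
    }
}
```

The same idea would apply to style elements. Whether stripping the markers or escaping them is preferable depends on whether the original script text has to be preserved byte for byte.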
Added HTML-based serializers to HtmlCleaner, which leave script and style blocks as in the original. In XML serialization, inner CDATA blocks are escaped to make well-formed XML.
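How exactly HtmlCleaner escapes the inner blocks is not shown in this thread. As an illustration only, one common way to keep a CDATA section well-formed when its content already contains "]]>" is to split that terminator across two adjacent CDATA sections. The sketch below uses only the JDK's XML parser to check the round trip; `CdataEscapeSketch`, `wrapInCdata` and `roundTrip` are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical sketch of one escaping technique, not HtmlCleaner's own code.
final class CdataEscapeSketch {

    // Wraps content in CDATA, splitting every "]]>" so it cannot end the section early.
    static String wrapInCdata(String content) {
        return "<![CDATA[" + content.replace("]]>", "]]]]><![CDATA[>") + "]]>";
    }

    // Serializes the content into a <script> element, parses it back,
    // and returns the text the parser sees.
    static String roundTrip(String content) {
        try {
            String xml = "<script>" + wrapInCdata(content) + "</script>";
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With this approach the parsed text is identical to the original script content, including its own "// <![CDATA[" and "// ]]>" lines, so nothing has to be removed from the source.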
Thanks a lot for reviewing and fixing issues, Vladimir. At xwiki (xwiki.org) we were worried that this project was dead and we were considering switching to something else, so it's really nice to see it isn't the case!
Any idea when 2.2 will be out?