From: SourceForge.net <no...@so...> - 2011-05-18 00:18:07
The following forum message was posted by at http://sourceforge.net/projects/jtidy/forums/forum/41437/topic/4534838:

Our product lets users submit HTML that we display. The HTML can include JavaScript and, of course, both the HTML and the JavaScript are often malformed. We use JTidy to clean it up.

I recently upgraded to r938 and noticed that JavaScript with strings containing unescaped HTML tags is now handled incorrectly. r918, tidy.exe, and Firefox HTML Tidy all flag these tags as warnings ("Warning: <' + '/' + letter not allowed here"). r938 reports 1 error but does not list it specifically; it prints only the generic message "This document has errors that must be fixed before using HTML Tidy to generate a tidied up version" and generates no output.

Sample code that demonstrates the problem:

[code]
import org.w3c.tidy.Tidy;

import java.io.Reader;
import java.io.StringReader;

public class JTidyBug {
    // Works as expected when using JTidy r918 (four warnings)
    // Does not work at all when using JTidy r938 (three warnings, one error, no output)
    public static void main(String[] args) {
        String html = "<script>\n" +
                "\tvar foo = \"<form>Test</form>\";\n" +
                "</script>";

        Tidy tidy = new Tidy();
        tidy.setXHTML(true);

        Reader reader = new StringReader(html);
        tidy.parse(reader, System.out);
    }
}
[/code]

Thanks,
Adam
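A possible stopgap, sketched here without having been tested against r938: the warning text ("<' + '/' + letter not allowed here") points at the "</" sequence inside the script, so one option is to rewrite "</" as "<\/" inside script bodies before the HTML reaches JTidy. JavaScript treats "\/" in a string literal as a plain "/", so the string value is unchanged. The class name `ScriptTagEscaper` and method `escapeClosingTags` are hypothetical helpers, not part of JTidy, and whether this avoids the r938 error is an assumption. It is also a naive textual rewrite that does not parse the JavaScript.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical pre-processing helper; not part of JTidy.
final class ScriptTagEscaper {
    // Captures the opening <script ...> tag, the script body, and the closing tag.
    private static final Pattern SCRIPT = Pattern.compile(
            "(<script\\b[^>]*>)(.*?)(</script\\s*>)",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Rewrites "</" as "<\/" inside each script body; markup outside scripts is untouched.
    static String escapeClosingTags(String html) {
        Matcher m = SCRIPT.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String body = m.group(2).replace("</", "<\\/");
            // quoteReplacement keeps the inserted backslash from being read as an escape.
            m.appendReplacement(out,
                    Matcher.quoteReplacement(m.group(1) + body + m.group(3)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String html = "<script>\n" +
                "\tvar foo = \"<form>Test</form>\";\n" +
                "</script>";
        // The result would then be handed to Tidy.parse(...) as in the sample above.
        System.out.println(escapeClosingTags(html));
    }
}
```

This only touches the closing-tag sequence. If r938 also objects to the opening `<form>` inside the string, the rewrite will not help, and the underlying change between r918 and r938 would still need to be addressed in JTidy.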