From: SourceForge.net <no...@so...> - 2008-10-09 13:27:46
Bugs item #2094508, was opened at 2008-09-05 08:46
Message generated for change (Settings changed) made by bodewig
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=377768&aid=2094508&group_id=23187

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: Java 1.1
Status: Open
>Resolution: Wont Fix
>Priority: 1
Private: No
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Nobody/Anonymous (nobody)
Summary: HTMLDocumentBuilder and Java 1.6

Initial Comment:
HTMLDocumentBuilder fails to parse the following (one-line) HTML correctly if Java 1.6 is used. Only one of the input elements finds its way into the document. It works correctly with Java 1.5. If you add a space between the </script> and <input> tags, it also works with Java 1.6.

    String html = "<html><head><title>Test</title></head><form action=\"\"><script language=\"javascript\">alert('test');</script><input type=\"text\"/><input type=\"button\"/></form></html>";
    TolerantSaxDocumentBuilder tolerantSaxDocumentBuilder = new TolerantSaxDocumentBuilder(XMLUnit.newTestParser());
    HTMLDocumentBuilder htmlDocumentBuilder = new HTMLDocumentBuilder(tolerantSaxDocumentBuilder);
    Document document = htmlDocumentBuilder.parse(html);

----------------------------------------------------------------------

>Comment By: Stefan Bodewig (bodewig)
Date: 2008-10-09 15:27

Message:
I can reproduce this, but it is most likely due to changes in javax.swing.text.html.parser.*, something that I'm personally not familiar with and honestly don't want to become familiar with either. I'd recommend you use a library other than XMLUnit to turn HTML into proper XML, since those other libraries do a better job at it anyway.
I can't recommend a specific library, but options I've seen used include JTidy <http://jtidy.sourceforge.net/>, TagSoup <http://ccil.org/~cowan/XML/tagsoup/> and NekoHTML <http://nekohtml.sourceforge.net/>. Unless anybody feels like contributing a patch (including tests), this won't get fixed.

----------------------------------------------------------------------
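
The workaround from the initial comment (a space between </script> and the tag that follows it) could be applied to the input before it is handed to the parser. The following is a minimal sketch under that assumption. `ScriptTagSpacer` and `padAfterScript` are hypothetical names, not part of XMLUnit, and the sketch has not been checked against the Java 1.6 parser behaviour reported above.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of XMLUnit: inserts a single space after
// every closing </script> tag that is immediately followed by another tag.
final class ScriptTagSpacer {

    // Matches "</script>" (any case, optional whitespace before ">")
    // only when the very next character opens another tag.
    private static final Pattern CLOSE_SCRIPT_THEN_TAG =
            Pattern.compile("(</script\\s*>)(?=<)", Pattern.CASE_INSENSITIVE);

    private ScriptTagSpacer() {
        // static utility, no instances
    }

    // Returns the HTML with a space added after each matching </script>;
    // input that already has whitespace there is returned unchanged.
    static String padAfterScript(String html) {
        return CLOSE_SCRIPT_THEN_TAG.matcher(html).replaceAll("$1 ");
    }
}
```

With the snippet from the initial comment, this would be used as `htmlDocumentBuilder.parse(ScriptTagSpacer.padAfterScript(html))`. It only papers over the one case reported here; the libraries listed above remain the more robust route.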