-
Tidy inserts a tag when it reaches the unexpected . This causes it to discard existing tags and their attributes.
Here's an example of a page it will incorrectly parse. The output will have dropped the id="companyAccountsTable" attribute.
jschmoe
...
2009-11-19 22:47:47 UTC in HTML Tidy
-
Sorry, it's a one-liner, so I didn't think it was necessary. I'll post the code next time. This is very similar to, but slightly different from, the bug I posted earlier.
new Tidy().parseDOM(new ByteArrayInputStream("jschmoe".getBytes()), System.out);
2009-11-06 06:12:22 UTC in JTidy
-
Consider the page below. JTidy inserts a tag when it reaches the unexpected . This causes it to discard the existing and the resulting page has lost the id attribute. A better behavior would be to insert a new end tag instead of discarding the existing start tag.
...
2009-11-06 00:58:30 UTC in JTidy
-
Sorry, I guess I wasn't thinking very clearly. Thanks for the help and quick responses.
2009-11-04 23:43:15 UTC in JTidy
-
Sorry about the bad formatting. The preview box lied to me. Here's another try:
public static String nodeToString(Node node) {
try {
Source source = new DOMSource(node);
StringWriter stringWriter = new StringWriter();
Result result = new StreamResult(stringWriter);
Transformer transformer = TransformerFactory.newInstance().newTransformer();
2009-11-02 16:21:20 UTC in JTidy
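The nodeToString excerpt above is cut off after the Transformer is created. As a rough sketch only, here is one way such a method could be finished using the standard javax.xml.transform API. Everything after the newTransformer() call is an assumption and not the poster's actual code: the OMIT_XML_DECLARATION setting, the exception handling, the class name NodeToStringSketch, and the small DOM built in main (which stands in for the Document returned by parseDOM) are all illustrative.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Result;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

public class NodeToStringSketch {

    // Serializes a DOM node to a String with an identity Transformer.
    public static String nodeToString(Node node) {
        try {
            Source source = new DOMSource(node);
            StringWriter stringWriter = new StringWriter();
            Result result = new StreamResult(stringWriter);
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            // Assumption: the XML declaration is left out to keep the output short.
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            transformer.transform(source, result);
            return stringWriter.toString();
        } catch (TransformerException e) {
            // Assumption: the truncated original does not show its error handling.
            throw new IllegalStateException("Could not serialize node", e);
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical stand-in for a parsed document: one element with an id attribute.
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element input = doc.createElement("input");
        input.setAttribute("id", "companyAccountsTable");
        doc.appendChild(input);
        System.out.println(nodeToString(doc));
    }
}
```

Printing the returned Document this way shows what is actually in the DOM, which can then be compared with whatever the parser writes to its output stream.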
-
Hmm. Odd that we would get different results. Perhaps what gets printed to the output stream is different from what gets put in the document that is returned? I was printing the returned document with the following method:
<pre>
public static String nodeToString(Node node) {
try {
Source source = new DOMSource(node);
StringWriter stringWriter = new...
2009-11-02 16:19:05 UTC in JTidy
-
Right, I only meant that the String provided was a sample input document to reproduce the issue. If you print out the parsed doc then you'll see that it's removed the end tag. Just run the parser on the input:
.
2009-11-02 07:10:52 UTC in JTidy
-
I'm going to go ahead and close this. I'll assume this is working correctly and the problem was that I didn't realize these two options were both set and incompatible.
2009-11-02 07:08:03 UTC in JTidy
-
The following code removes the end tag for the input element:
String html = "";
Tidy tidy = new Tidy();
tidy.setXHTML(true);
DOMDocumentImpl doc = (DOMDocumentImpl) tidy.parseDOM(new ByteArrayInputStream(html.getBytes()), null);
2009-11-02 06:32:23 UTC in JTidy
-
Tidy.setDropEmptyParas does not cause empty paragraphs to be dropped. I used the following as a test file:
jtidy test page
jschmoe
...
2009-10-31 06:06:03 UTC in JTidy