From: SourceForge.net <no...@so...> - 2012-06-08 09:47:00
|
Bugs item #3532726, was opened at 2012-06-07 03:44
Message generated for change (Comment added) made by kriegaex
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=113153&aid=3532726&group_id=13153

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Alexander Kriegisch (kriegaex)
Assigned to: Nobody/Anonymous (nobody)
Summary: Non-breaking space in HEAD rejected

Initial Comment:
In some web pages I see buggy HTML code like this:

<script type="text/javascript" src="common/js/prototype.js"></script> <script src="common/js/scriptaculous.js?load=effects,builder"></script> <script type="text/javascript" src="common/js/lightbox.js"></script><link rel="prev" href="apps_06_004.html">

The result is a warning (plain text not allowed in HEAD elements), and in addition the SCRIPT tags are ignored, i.e. removed from the output. This makes the filtered page unusable because I need those scripts.

How to fix: Make JTidy more tolerant by just ignoring non-breaking spaces in HEAD sections or treating them like regular whitespace, parsing the rest of the line correctly.

----------------------------------------------------------------------

>Comment By: Alexander Kriegisch (kriegaex)
Date: 2012-06-08 02:47

Message:
The uploaded test case is basically the same as for bug #3532720, but whereas there I stripped the bogus text nodes from HEAD, here I left them in the file so we have a real-world test case. The uploaded files are part of a download from Galileo press (Galileo Openbook about iPhone development). It is freely available for download, so no worries there. I ran JTidy to clean up the HTML code and bumped into the problem that HEAD parsing stopped after the first text node was found, so all following SCRIPT tags are missing from JTidy's output.

My patch fixes this problem.

----------------------------------------------------------------------

Comment By: Alexander Kriegisch (kriegaex)
Date: 2012-06-08 02:37

Message:
The attached patch fixes the problem for me. Instead of aborting the parsing of the HEAD section whenever a text node is found, text tokens are now just ignored and parsing continues. This has the additional advantage that not only is legal HTML after the problematic text node parsed correctly, but for multiple occurrences of illegal text nodes within HEAD a warning is printed for each location, which helps debug bogus HEAD sections. Maybe there is a better and cleaner way to do this, but as I said, it works for me.

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=113153&aid=3532726&group_id=13153 |
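The approach described in the comments above (skip a stray text token in HEAD, warn about it, and keep parsing) might be sketched roughly as follows. This is a toy model and not the attached patch or JTidy's parser: the class `HeadTextSketch`, its methods, and the string-based token representation are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: this is not JTidy's parser, and all names here are hypothetical.
final class HeadTextSketch {

    // In this sketch a token counts as text when it does not start with '<'.
    static boolean isText(String token) {
        return !token.startsWith("<");
    }

    // Returns the tags kept from a HEAD section; one warning is added per stray text token.
    static List<String> parseHead(List<String> tokens, List<String> warnings) {
        List<String> kept = new ArrayList<>();
        for (int i = 0; i < tokens.size(); i++) {
            String token = tokens.get(i);
            if (isText(token)) {
                // Behaviour described in the report: parsing of HEAD stopped here.
                // Behaviour sketched here: warn about this occurrence and keep going.
                warnings.add("plain text not allowed in HEAD (token " + i + ")");
                continue;
            }
            kept.add(token);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> warnings = new ArrayList<>();
        List<String> head = Arrays.asList(
            "<script src=\"a.js\">", "</script>", "\u00a0",
            "<script src=\"b.js\">", "</script>", "\u00a0",
            "<link rel=\"prev\">");
        System.out.println(parseHead(head, warnings));
        System.out.println(warnings.size() + " warning(s)");
    }
}
```

Because the loop continues instead of returning, every illegal text node produces its own warning, which matches the debugging benefit mentioned in the comment.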
From: SourceForge.net <no...@so...> - 2012-06-07 21:37:16
|
Bugs item #3532720, was opened at 2012-06-07 03:35
Message generated for change (Comment added) made by kriegaex
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=113153&aid=3532720&group_id=13153

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: Tidy functionality
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Alexander Kriegisch (kriegaex)
Assigned to: Nobody/Anonymous (nobody)
Summary: BR within PRE rendered with additional linefeeds

Initial Comment:
Sometimes in the wild there is HTML code with PRE sections containing BR tags instead of linefeeds, like so:

<pre>first line<br>second line<br>third line</pre>

JTidy's pretty-printer renders it like this ("break-before-br" is false):

<pre>
first line<br>
second line<br>
third line
</pre>

The result is that a browser renders two linefeeds where just one should exist, causing ugly empty lines in the output. The problem gets worse if for some reason I use multiple passes of JTidy, because each pass adds more linefeeds.

How to fix: never add newlines after BR tags inside PRE sections.

----------------------------------------------------------------------

>Comment By: Alexander Kriegisch (kriegaex)
Date: 2012-06-07 14:37

Message:
The uploaded test case is part of a download from Galileo press (Galileo Openbook about iPhone development). It is freely available for download, so no worries there. I ran JTidy to clean up the HTML code and bumped into the problem with unwanted blank lines in the PRE sections. Just search the HTML file for <pre class="prettyprint"> and compare the corresponding sections in the original HTML and in the result generated by JTidy-r938. It looks wrong without my patch and correct with my patch.

----------------------------------------------------------------------

Comment By: Alexander Kriegisch (kriegaex)
Date: 2012-06-07 14:23

Message:
I just uploaded my humble attempt to fix the problem. Maybe it is a bit hacky and not the optimal solution; sorry, this was my first look into your code and I have not been programming for a while. But at least locally it solves my problem, and it might be beneficial for other users, too.

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=113153&aid=3532720&group_id=13153 |
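The rule proposed in the report (never add a newline after BR inside PRE) could look something like the sketch below. It is a toy printer over pre-split tokens, not JTidy's pretty-printer and not the uploaded patch; `PreBrSketch` and its exact-match handling of `<pre>`, `</pre>` and `<br>` are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Toy pretty-printer: hypothetical names, exact-match tags only, not JTidy code.
final class PreBrSketch {

    // Prints tokens back to back, adding a linefeed after <br> only outside <pre>.
    static String print(List<String> tokens) {
        StringBuilder out = new StringBuilder();
        int preDepth = 0; // greater than zero while inside a PRE section
        for (String token : tokens) {
            out.append(token);
            if (token.equalsIgnoreCase("<pre>")) {
                preDepth++;
            } else if (token.equalsIgnoreCase("</pre>")) {
                preDepth = Math.max(0, preDepth - 1);
            } else if (token.equalsIgnoreCase("<br>") && preDepth == 0) {
                // Outside PRE an extra linefeed is cosmetic; inside PRE it would be rendered.
                out.append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<String> pre = Arrays.asList(
            "<pre>", "first line", "<br>", "second line", "<br>", "third line", "</pre>");
        System.out.println(print(pre));
    }
}
```

Since nothing is added inside PRE, running the printer repeatedly over its own output leaves PRE sections unchanged, which addresses the multiple-pass concern raised in the report.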
From: SourceForge.net <no...@so...> - 2012-06-07 18:22:03
|
Bugs item #3532831, was opened at 2012-06-07 08:16
Message generated for change (Comment added) made by kriegaex
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=113153&aid=3532831&group_id=13153

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: None
Group: None
>Status: Deleted
Resolution: None
Priority: 5
Private: No
Submitted By: Alexander Kriegisch (kriegaex)
Assigned to: Nobody/Anonymous (nobody)
Summary: Non-breaking space in HEAD rejected

Initial Comment:
In some web pages I see buggy HTML code like this:

<script type="text/javascript" src="common/js/prototype.js"></script> <script src="common/js/scriptaculous.js?load=effects,builder"></script> <script type="text/javascript" src="common/js/lightbox.js"></script><link rel="prev" href="apps_06_004.html">

The result is a warning (plain text not allowed in HEAD elements), and in addition the SCRIPT tags are ignored, i.e. removed from the output. This makes the filtered page unusable because I need those scripts.

How to fix: Make JTidy more tolerant by just ignoring non-breaking spaces in HEAD sections or treating them like regular whitespace, parsing the rest of the line correctly.

----------------------------------------------------------------------

>Comment By: Alexander Kriegisch (kriegaex)
Date: 2012-06-07 11:22

Message:
Sorry, somehow a page reload created this bug twice. The older version is the right one.

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=113153&aid=3532831&group_id=13153 |
From: SourceForge.net <no...@so...> - 2012-05-04 20:40:07
|
Bugs item #3406215, was opened at 2011-09-08 05:48
Message generated for change (Comment added) made by martinkurz
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=113153&aid=3406215&group_id=13153

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: Tidy functionality
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Rajesh Kumar (rajeshkumarp)
Assigned to: Nobody/Anonymous (nobody)
Summary: replacing unexpected <h2> by </h2>. But it is valid h2 tag.

Initial Comment:
JTidy replaces <h2> with </h2> when nested <h2> tags are used. Here is my HTML code:

<h2 >Test Content <h2 >Test Content</h2> </h2>

Here I get this warning:

Warning: replacing unexpected <h2> by </h2>

Please help me to fix this issue.

----------------------------------------------------------------------

Comment By: Martin (martinkurz)
Date: 2012-05-04 13:40

Message:
According to http://www.w3.org/TR/html4/struct/global.html#h-7.5.5, only inline elements are allowed inside headings, so an h2 isn't allowed inside another h2 (or any other heading). JTidy therefore treats this as an error and tries to fix it.

----------------------------------------------------------------------

Comment By: Adam A. Koch (aakoch)
Date: 2012-05-04 12:41

Message:
I'm not sure nesting h2 elements is allowed. That's probably why it is failing.

----------------------------------------------------------------------

Comment By: Rajesh Kumar (rajeshkumarp)
Date: 2011-09-08 05:53

Message:
Note: I am using jtidy-r938

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=113153&aid=3406215&group_id=13153 |
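The content-model argument in the thread above (headings may contain only inline elements, and a heading is not an inline element) can be illustrated with a small check. This is a sketch under stated assumptions, not JTidy's dictionary: `HeadingModelSketch` is a hypothetical class, and its inline set is deliberately a small partial sample rather than the full HTML 4.01 list.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper; JTidy keeps its own tag dictionary, which is not reproduced here.
final class HeadingModelSketch {

    private static final Set<String> HEADINGS =
        new HashSet<>(Arrays.asList("h1", "h2", "h3", "h4", "h5", "h6"));

    // Partial sample of HTML 4.01 inline elements, enough for the example only.
    private static final Set<String> SOME_INLINE =
        new HashSet<>(Arrays.asList("a", "br", "code", "em", "img", "span", "strong"));

    static boolean isHeading(String tag) {
        return HEADINGS.contains(tag.toLowerCase(Locale.ROOT));
    }

    // True if the tag is in the inline sample; headings are block-level and never qualify.
    static boolean allowedInHeading(String childTag) {
        String name = childTag.toLowerCase(Locale.ROOT);
        return !isHeading(name) && SOME_INLINE.contains(name);
    }

    public static void main(String[] args) {
        System.out.println("em inside h2 allowed: " + allowedInHeading("em"));
        System.out.println("h2 inside h2 allowed: " + allowedInHeading("h2"));
    }
}
```

Under this model the nested `<h2>` from the report is rejected as a child of the outer heading, which is consistent with the warning being a repair of invalid markup rather than a bug.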
From: SourceForge.net <no...@so...> - 2011-11-02 16:51:25
|
Bugs item #3432258, was opened at 2011-11-02 12:56
Message generated for change (Comment added) made by helsom
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=113153&aid=3432258&group_id=13153

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: Tidy functionality
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Hel (helsom)
Assigned to: Nobody/Anonymous (nobody)
Summary: Unwrapped inline content means invalid XHTML is generated

Initial Comment:
Using jtidy.parseDOM with setXHTML(true) and setEncloseBlockText(true) does not cause inline content to be properly wrapped, and hence W3C validation fails.

Example HTML 1 (generates valid XHTML)
"Text <em>Inline content</em>" -> "<p>Text <em>Inline content</em></p>"

Example HTML 2 (generates invalid XHTML)
"<em>Inline content</em>" -> "<em>Inline content</em>"

There is code within src/main/java/org/w3c/tidy/ParserImpl.java that performs this wrapping, but it has been commented out due to bug report 1403105: java.lang.StackOverflowError in Tidy.parseDOM().

Uncommenting this block of code seems to produce correctly wrapped XHTML in most situations, but unfortunately the stack overflow error still happens if the HTML mentioned in report 1403105 is supplied.

Is there any way this can be reinstated without causing the stack overflow?

----------------------------------------------------------------------

>Comment By: Hel (helsom)
Date: 2011-11-02 16:51

Message:
Adding code very similar to the TEXT_NODE encloseBodyText processing (about line 799 within ParserImpl.java) for an inline element (at about line 934) seems to result in inline content within the body being properly wrapped, though it hasn't had extensive testing and there may be a better way.

That is,

if (node.type == Node.START_TAG || node.type == Node.START_END_TAG)
{
    if ((node.tag.model & Dict.CM_INLINE) != 0)
    {
        if (lexer.configuration.encloseBodyText)
        {
            Node para;
            lexer.ungetToken();
            para = lexer.inferredTag("p");
            body.insertNodeAtEnd(para);
            parseTag(lexer, para, mode);
            mode = Lexer.MIXED_CONTENT;
            continue;
        }
    }
    ...

----------------------------------------------------------------------

Comment By: Hel (helsom)
Date: 2011-11-02 15:33

Message:
Even with the mentioned code reinstated, this only resolves wrapping of inline content within a blockquote, for example, and not at the top level within the body element.

For example:
"<em>ssss</em> <blockquote><em>Inline content</em></blockquote>"
generates xhtml:
"<em>ssss</em> <blockquote> <p><em>Inline content</em></p> </blockquote>"

Note that the initial <em> does not get wrapped in a p element. If I place some text in front of it, however, it does get wrapped.

For example:
"xxxx <em>ssss</em> <blockquote><em>Inline content</em></blockquote>"
generates xhtml:
"<p>xxxx <em>ssss</em></p> <blockquote> <p><em>Inline content</em></p> </blockquote>"

----------------------------------------------------------------------

Comment By: Hel (helsom)
Date: 2011-11-02 14:08

Message:
Update: This is not an XHTML-specific problem. Incorrectly wrapped content also fails HTML 4.01 Strict validation. It seems a shame to lose this important functionality because of what seems to be quite an obscure bug (1403105).

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=113153&aid=3432258&group_id=13153 |
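The effect the thread is after (inline content directly under the body gets an inferred `<p>` wrapper, whether or not text precedes it) can be modelled in isolation as below. This is a simplified sketch, not the ParserImpl.java change quoted above: `EncloseInlineSketch`, its `Node` class and the `block` flag are hypothetical stand-ins for JTidy's node and content-model types.

```java
import java.util.Arrays;
import java.util.List;

// Simplified model: hypothetical types, flat list of body children, no recursion.
final class EncloseInlineSketch {

    static final class Node {
        final String markup;  // serialized child, e.g. "<em>x</em>" or plain text
        final boolean block;  // true for block-level children such as blockquote

        Node(String markup, boolean block) {
            this.markup = markup;
            this.block = block;
        }
    }

    // Wraps each run of consecutive non-block children in a single <p> element.
    static String encloseInline(List<Node> bodyChildren) {
        StringBuilder out = new StringBuilder();
        boolean paraOpen = false;
        for (Node child : bodyChildren) {
            if (child.block && paraOpen) {
                out.append("</p>"); // a block child ends the inferred paragraph
                paraOpen = false;
            } else if (!child.block && !paraOpen) {
                out.append("<p>");  // text or inline child starts an inferred paragraph
                paraOpen = true;
            }
            out.append(child.markup);
        }
        if (paraOpen) {
            out.append("</p>");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<Node> body = Arrays.asList(
            new Node("<em>ssss</em>", false),
            new Node("<blockquote><p><em>Inline content</em></p></blockquote>", true));
        System.out.println(encloseInline(body));
    }
}
```

The point of the sketch is that text nodes and inline elements take the same branch, so a leading `<em>` is wrapped just like leading text. It says nothing about the recursion that led to the stack overflow in report 1403105.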
From: SourceForge.net <no...@so...> - 2011-10-06 17:19:21
|
Bugs item #3419740, was opened at 2011-10-06 11:19
Message generated for change (Tracker Item Submitted) made by kd7ike
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=113153&aid=3419740&group_id=13153

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Kim Ebert (kd7ike)
Assigned to: Nobody/Anonymous (nobody)
Summary: StringIndexOutOfBoundsException when lexing a web page

Initial Comment:
Possible fix for lexer bug.

=== modified file 'jtidy/src/main/java/org/w3c/tidy/Lexer.java'
--- jtidy/src/main/java/org/w3c/tidy/Lexer.java 2010-05-06 23:18:10 +0000
+++ jtidy/src/main/java/org/w3c/tidy/Lexer.java 2010-11-02 02:18:59 +0000
@@ -1821,7 +1821,12 @@
                 if (TidyUtils.isLetter((char) c)) {
                     continue;
                 }
-                matches = container.element.equalsIgnoreCase(TidyUtils.getString(lexbuf, start,
+                /* Fix for bug #991 */
+                if ((start + container.element.length()) > lexsize)
+                    matches = false;
+                /* End Fix */
+                else
+                    matches = container.element.equalsIgnoreCase(TidyUtils.getString(lexbuf, start,
                         container.element.length()));
                 if (matches) {
                     nested++;

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=113153&aid=3419740&group_id=13153
|
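The guard in the patch above can be shown in isolation. This is a minimal sketch, assuming a byte buffer whose first `size` bytes are valid; the class BoundedMatch and the method matchesAt are hypothetical names for illustration and are not part of JTidy.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-alone illustration; none of these names are JTidy's.
final class BoundedMatch {

    /**
     * Returns true if the bytes of buf beginning at start, within the first
     * size valid bytes, spell name (ignoring case).
     */
    static boolean matchesAt(byte[] buf, int size, int start, String name) {
        // Same idea as the patch: refuse to read past the valid region.
        if (start < 0 || start + name.length() > size) {
            return false;
        }
        // ISO-8859-1 maps one byte to one char, so lengths line up.
        String candidate =
            new String(buf, start, name.length(), StandardCharsets.ISO_8859_1);
        return name.equalsIgnoreCase(candidate);
    }

    public static void main(String[] args) {
        byte[] buf = "</SCRIPT".getBytes(StandardCharsets.ISO_8859_1);
        // Enough bytes remain after offset 2, so the comparison runs.
        System.out.println(matchesAt(buf, buf.length, 2, "script"));
        // Only 5 bytes are treated as valid, so the check short-circuits.
        System.out.println(matchesAt(buf, 5, 2, "script"));
    }
}
```

Checking the length first means the comparison is never attempted on a range that extends beyond the filled part of the buffer, which is the situation the exception in the summary points to.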
From: SourceForge.net <no...@so...> - 2011-09-09 05:11:17
|
Bugs item #3406215, was opened at 2011-09-08 18:18
Message generated for change (Settings changed) made by rajeshkumarp
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=113153&aid=3406215&group_id=13153

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: Tidy functionality
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Rajesh Kumar (rajeshkumarp)
Assigned to: Nobody/Anonymous (nobody)
>Summary: replacing unexpected <h2> by </h2>. But it is valid h2 tag.

Initial Comment:
JTidy replaces <h2> with </h2> when nested <h2> tags are used. Here is my HTML code:

<h2 >Test Content
<h2 >Test Content</h2>
</h2>

Here I get the following warning:

Warning: replacing unexpected <h2> by </h2>

Please help me to fix this issue.

----------------------------------------------------------------------

Comment By: Rajesh Kumar (rajeshkumarp)
Date: 2011-09-08 18:23

Message:
Note: I am using jtidy-r938.

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=113153&aid=3406215&group_id=13153
|
From: SourceForge.net <no...@so...> - 2011-09-08 18:12:39
|
Bugs item #3349161, was opened at 2011-07-01 15:56 Message generated for change (Comment added) made by furman82 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=113153&aid=3349161&group_id=13153 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: Tidy functionality Group: None Status: Open Resolution: None Priority: 5 Private: No Submitted By: Aaron Herstein (aarongh2012) Assigned to: Nobody/Anonymous (nobody) Summary: problem parsing CDATA Initial Comment: When parsing this page: http://www.nytimes.com/2011/04/14/world/asia/14quake.html?_r=2, a StringIndexOutOfBoundsException is being thrown with this stack trace: java.lang.StringIndexOutOfBoundsException: String index out of range: 16385 at java.lang.String.checkBounds(Unknown Source) at java.lang.String.<init>(Unknown Source) at org.w3c.tidy.TidyUtils.getString(TidyUtils.java:658) at org.w3c.tidy.Lexer.getCDATA(Lexer.java:1835) at org.w3c.tidy.ParserImpl$ParseScript.parse(ParserImpl.java:667) at org.w3c.tidy.ParserImpl.parseTag(ParserImpl.java:203) at org.w3c.tidy.ParserImpl$ParseBlock.parse(ParserImpl.java:2464) at org.w3c.tidy.ParserImpl.parseTag(ParserImpl.java:203) at org.w3c.tidy.ParserImpl$ParseBlock.parse(ParserImpl.java:2464) at org.w3c.tidy.ParserImpl.parseTag(ParserImpl.java:203) at org.w3c.tidy.ParserImpl$ParseBlock.parse(ParserImpl.java:2464) at org.w3c.tidy.ParserImpl.parseTag(ParserImpl.java:203) at org.w3c.tidy.ParserImpl$ParseBlock.parse(ParserImpl.java:2464) at org.w3c.tidy.ParserImpl.parseTag(ParserImpl.java:203) at org.w3c.tidy.ParserImpl$ParseBlock.parse(ParserImpl.java:2464) at org.w3c.tidy.ParserImpl.parseTag(ParserImpl.java:203) at org.w3c.tidy.ParserImpl$ParseBlock.parse(ParserImpl.java:2464) at org.w3c.tidy.ParserImpl.parseTag(ParserImpl.java:203) at org.w3c.tidy.ParserImpl$ParseBlock.parse(ParserImpl.java:2464) at 
org.w3c.tidy.ParserImpl.parseTag(ParserImpl.java:203) at org.w3c.tidy.ParserImpl$ParseBlock.parse(ParserImpl.java:2464) at org.w3c.tidy.ParserImpl.parseTag(ParserImpl.java:203) at org.w3c.tidy.ParserImpl$ParseBody.parse(ParserImpl.java:971) at org.w3c.tidy.ParserImpl.parseTag(ParserImpl.java:203) at org.w3c.tidy.ParserImpl$ParseHTML.parse(ParserImpl.java:483) at org.w3c.tidy.ParserImpl.parseDocument(ParserImpl.java:3401) at org.w3c.tidy.Tidy.parse(Tidy.java:435) at org.w3c.tidy.Tidy.parse(Tidy.java:658)

----------------------------------------------------------------------

Comment By: Matt Furman (furman82)
Date: 2011-09-08 14:12

Message:
I also ran into this issue and "fixed" it locally. It appears to be a flaw in addByte within Lexer.java. The function assumes that the buffer is only ever examined one byte at a time; however, in the CDATA function, the call to TidyUtils.getString passes in a length that is greater than 1. I overloaded the appropriate functions so that callers can pass in the size the buffer needs to grow by.

    public void addByte(int c) {
        addByte(c, 1);
    }

    /**
     * Adds a byte to lexer buffer.
     * @param c byte to add
     * @param size number of bytes the buffer must have room for
     */
    public void addByte(int c, int size) {
        if (this.lexsize + size >= this.lexlength) {
            // keep doubling until the requested size fits
            while (this.lexsize + size >= this.lexlength) {
                if (this.lexlength == 0) {
                    this.lexlength = 8192;
                } else {
                    this.lexlength = this.lexlength * 2;
                }
            }

            byte[] temp = this.lexbuf;
            this.lexbuf = new byte[this.lexlength];
            if (temp != null) {
                System.arraycopy(temp, 0, this.lexbuf, 0, temp.length);
                updateNodeTextArrays(temp, this.lexbuf);
            }
        }

        this.lexbuf[this.lexsize++] = (byte) c;
        this.lexbuf[this.lexsize] = (byte) '\0'; // debug
    }

Once I changed the necessary associated functions, it seemed to do the trick.

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=113153&aid=3349161&group_id=13153
|
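The growth strategy described in the comment above (double the buffer until a requested number of extra bytes fits, rather than assuming one byte at a time) can be sketched on its own. GrowableBuffer and its methods are hypothetical names used only for illustration; this is not JTidy's Lexer, and it omits the updateNodeTextArrays step that the real class needs.

```java
import java.util.Arrays;

// Hypothetical illustration of the growth strategy; not JTidy's Lexer.
final class GrowableBuffer {
    private byte[] buf = new byte[0];
    private int size = 0;

    /** Doubles the backing array until size + extra + 1 bytes fit. */
    void ensureRoom(int extra) {
        int capacity = buf.length;
        while (size + extra >= capacity) {
            capacity = (capacity == 0) ? 8192 : capacity * 2;
        }
        if (capacity != buf.length) {
            buf = Arrays.copyOf(buf, capacity); // keeps old bytes, pads with zeros
        }
    }

    /** Appends one byte and keeps a zero terminator after it. */
    void addByte(int c) {
        ensureRoom(1);
        buf[size++] = (byte) c;
        buf[size] = 0;
    }

    int size() {
        return size;
    }

    int capacity() {
        return buf.length;
    }
}
```

A caller that is about to read or write a multi-byte range would call ensureRoom with the full length first, so the array is already large enough before any index into it is computed.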
From: SourceForge.net <no...@so...> - 2011-08-12 07:07:00
|
Bugs item #3390317, was opened at 2011-08-12 00:43
Message generated for change (Comment added) made by
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=113153&aid=3390317&group_id=13153

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: None
Group: None
Status: Open
Resolution: None
Priority: 7
Private: No
Submitted By: Francis Crimmins ()
Assigned to: Nobody/Anonymous (nobody)
Summary: JTidy goes into infinite loop on specific input document

Initial Comment:
JTidy goes into infinite loop on specific input document:

http://www.takeovers.govt.nz/enforcement/decisions/2004/meeting-wrightson.php

When we call tidy.parse() the stack trace ends in many calls to Node.checkNodeIntegrity() and the CPU is pegged at 100%.

We're using the latest version of JTidy (r938). I've attached a copy of the input document which triggers the behaviour.

Hopefully it's not too difficult to fix :)

Many thanks,

- Francis.

----------------------------------------------------------------------

Comment By: https://www.google.com/accounts ()
Date: 2011-08-12 07:07

Message:
Sorry - there was a typo in the version I gave to Francis there... Make that...

<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head></head>
<body>
<em>
<dl>
<p>
<dd>
</dd>
</p>
</dl>
</em>
</body>
</html>

(i.e. the extra opening html tag is not required, and the close html can be present)

From a quick investigation, the problem seems to be that the parser is producing a cycle of br tags (with A followed by B and B followed by A) below the dd tag, e.g.

[Node type=RootNode,element=null,content=
[Node type=StartTag,element=html,content=
[Node type=StartTag,element=head,content=
[Node type=StartTag,element=title,content=null]],
[Node type=StartTag,element=body,content=
[Node type=TextNode,element=null,text="",content=null],
[Node type=StartTag,element=dl,content=
[Node type=StartTag,element=dd,content=
[Node type=StartTag,element=br,content=null],
[Node type=StartTag,element=br,content=null],
[Node type=StartTag,element=br,content=null],
[Node type=StartTag,element=br,content=null],
[Node type=StartTag,element=br,content=null],
[Node type=StartTag,element=br,content=null],
[Node type=StartTag,element=br,content=null],
[Node type=StartTag,element=br,content=null],
...

Though not a proper fix, this patch will detect the cycle and throw a RuntimeException (and will also limit the loop in toString to help see what's happening as above).

Index: src/main/java/org/w3c/tidy/Node.java
===================================================================
--- src/main/java/org/w3c/tidy/Node.java (revision 1261)
+++ src/main/java/org/w3c/tidy/Node.java (working copy)
@@ -1311,7 +1311,11 @@

         for (child = this.content; child != null; child = child.next)
         {
-            if (child.parent != this || !child.checkNodeIntegrity())
+            if (this.next != null && this.next.next == this) {
+                throw new RuntimeException("Cycle detected - aborting");
+            }
+
+            if (child.parent != this || !child.checkNodeIntegrity())
             {
                 return false;
             }
@@ -1347,8 +1351,15 @@
         String s = "";
         Node n = this;

+        int loopLimit = 1024;
         while (n != null)
         {
+            if (loopLimit < 0) {
+                s += "...TRUNCATED...";
+                n = null;
+                break;
+            }
+            loopLimit--;
             s += "[Node type=";
             s += NODETYPE_STRING[n.type];
             s += ",element=";

----------------------------------------------------------------------

Comment By: Francis Crimmins ()
Date: 2011-08-12 06:07

Message:
And here's a more minimal document which exhibits the problem:

<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<html>
<head></head>
<body>
<em>
<dl>
<p>
<dd>
</dd>
</p>
</dl>
</em>
</body>

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=113153&aid=3390317&group_id=13153
|
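The patch in the comment above checks only for a two-node cycle (A followed by B, B followed by A). For comparison, here is a sketch of a more general check using Floyd's tortoise-and-hare algorithm, which finds a loop of any length in a chain of next pointers. This is a different technique from the one in the patch, and SiblingNode is a hypothetical minimal class, not JTidy's Node.

```java
// Hypothetical minimal node; JTidy's Node has many more fields.
final class SiblingNode {
    final String name;
    SiblingNode next;

    SiblingNode(String name) {
        this.name = name;
    }

    /**
     * Floyd's tortoise-and-hare: returns true if following next pointers
     * from head eventually revisits a node.
     */
    static boolean hasCycle(SiblingNode head) {
        SiblingNode slow = head;
        SiblingNode fast = head;
        while (fast != null && fast.next != null) {
            slow = slow.next;      // advances one step
            fast = fast.next.next; // advances two steps
            if (slow == fast) {
                return true;       // the pointers can only meet inside a loop
            }
        }
        return false;              // fast reached the end of the chain
    }
}
```

The check uses constant extra memory and stops after a number of steps proportional to the chain length, so it could run before a traversal such as the toString loop instead of relying on a fixed iteration limit.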
From: SourceForge.net <no...@so...> - 2011-07-11 01:18:19
|
Bugs item #3349163, was opened at 2011-07-02 04:02
Message generated for change (Comment added) made by aditsu
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=113153&aid=3349163&group_id=13153

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: DOM Support
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Aaron Herstein (aarongh2012)
Assigned to: Nobody/Anonymous (nobody)
Summary: problem parsing tbody

Initial Comment:
Tidy does not parse the tbody element in HTML tables, as in this example:

<table border="1">
<tbody>
<tr>
<td>January</td>
<td>$100</td>
</tr>
<tr>
<td>February</td>
<td>$80</td>
</tr>
</tbody>
</table>

Nothing is done with the tbody element.

----------------------------------------------------------------------

>Comment By: Adrian Sandor (aditsu)
Date: 2011-07-11 09:18

Message:
Ugh, there was SO much spam. I removed those comments and blocked anonymous posts. Sorry if that negatively affects any non-spammer.

----------------------------------------------------------------------

Comment By: Martin (martinkurz)
Date: 2011-07-10 21:18

Message:
Just tested with JTidy's java-5 branch:

String source = "<table border=\"1\"><tbody><tr><td>January</td><td>$100</td></tr><tr><td>February</td><td>$80</td></tr></tbody></table>";
Tidy tidy = new Tidy();
Writer stringWriter = new StringWriter();
tidy.parse(new ByteArrayInputStream(source.getBytes("UTF-8")), stringWriter);
System.out.println(stringWriter.toString());

and got the following result:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN">
<html>
<head>
<meta content=
"HTML Tidy for Java (vers. 2009-08-01), see jtidy.sourceforge.net" name="generator">
<title></title>
</head>
<body>
<table border="1">
<tbody>
<tr>
<td>January</td>
<td>$100</td>
</tr>
<tr>
<td>February</td>
<td>$80</td>
</tr>
</tbody>
</table>
</body>
</html>

That's what I would expect to get. Could you provide a test case?

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=113153&aid=3349163&group_id=13153
|