From: 周鹏 <zho...@gm...> - 2010-07-14 08:15:23
|
Hi! I'm using JTidy to convert this page: http://sports.yahoo.com/mlb/news;_ylt=AgnhXKUYhDvpSQ1TCuC_5SE5nYcB?slug=ap-obit-steinbrenner
Here is my code:
Tidy tidy = new Tidy();
tidy.setXHTML(true);
InputStream is = new FileInputStream("1.html"); // 1.html is the page above
OutputStream os = new FileOutputStream("result.xml");
tidy.parseDOM(is, os);
......
This doesn't work correctly. Here is the log:
InputStream: Doctype given is "-//W3C//DTD HTML 4.01//EN"
InputStream: Document content looks like HTML 4.01 Transitional
330 warnings, 19 errors were found!
This document has errors that must be fixed before using HTML Tidy to generate a tidied up version.
Can anyone help me? Sorry for my poor English! |
From: SourceForge.net <no...@so...> - 2010-07-12 16:05:35
|
The following forum message was posted by weberjn at http://sourceforge.net/projects/jtidy/forums/forum/41436/topic/3767679: Hi, can you make JTidy output a content-type meta tag like: [code]<meta http-equiv="content-type" content="text/html; charset=utf-8" />[/code] ? Or would I have to manually add an element to the head element? Thanks, Juergen |
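If no configuration option turns out to cover this, adding the element by hand through the DOM is one way to do it. The following is a minimal sketch, not JTidy's own mechanism: the class and method names (`ContentTypeMeta`, `ensureContentType`) are made up for illustration, and the demo parses with the JDK's built-in parser rather than JTidy, on the assumption that the `Document` returned by `tidy.parseDOM` supports the same basic `createElement` and `insertBefore` calls.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ContentTypeMeta {

    // Inserts <meta http-equiv="content-type" ...> as the first child of <head>,
    // unless <head> is missing or already holds such a meta element.
    // Returns true only if an element was added.
    static boolean ensureContentType(Document doc, String charset) {
        NodeList heads = doc.getElementsByTagName("head");
        if (heads.getLength() == 0) {
            return false;
        }
        Element head = (Element) heads.item(0);
        NodeList metas = head.getElementsByTagName("meta");
        for (int i = 0; i < metas.getLength(); i++) {
            Element m = (Element) metas.item(i);
            if ("content-type".equalsIgnoreCase(m.getAttribute("http-equiv"))) {
                return false;
            }
        }
        Element meta = doc.createElement("meta");
        meta.setAttribute("http-equiv", "content-type");
        meta.setAttribute("content", "text/html; charset=" + charset);
        // insertBefore with a null reference node appends, so an empty <head> works too.
        head.insertBefore(meta, head.getFirstChild());
        return true;
    }

    // Demo helper (assumption: stands in for tidy.parseDOM in this sketch).
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<html><head><title>t</title></head><body/></html>");
        System.out.println(ensureContentType(doc, "utf-8"));
    }
}
```

With JTidy, the document would then be written out with `tidy.pprint(doc, out)`; the charset named in the tag should match the output encoding actually used.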
From: SourceForge.net <no...@so...> - 2010-07-02 12:21:08
|
The following forum message was posted by asheara at http://sourceforge.net/projects/jtidy/forums/forum/41437/topic/3758644: Hi All, I'm trying to extract a div element with its content from an HTML file. Then I want to write a new HTML file whose only content will be the extracted div. Everything's OK until the point where I call tidy.pprint(w3cDoc, bos); After this pprint call I inspect bos and it is empty. I have never worked with JTidy, and it's so hard to find examples or tutorials. Any ideas? Thank you. This is the complete code block:
[code]
Document doc = tidy.parseDOM(new FileInputStream("myFile.html"), null);
DOMReader reader = new DOMReader();
org.dom4j.Document dom4jDoc = reader.read(doc);
String node = "//div[@id='contenedor']";
Node myNode = dom4jDoc.selectSingleNode(node);
myNode.setDocument(null);
myNode.setParent(null);
//Create new Document
org.dom4j.Document newHTML = DocumentHelper.createDocument();
newHTML.add(myNode);
DOMWriter writer = new DOMWriter();
try {
    Document w3cDoc = writer.write(newHTML);
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    tidy.pprint(w3cDoc, bos);
[/code] |
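One possible explanation, which I have not checked against the JTidy version used here, is that `pprint` only knows how to print documents built by JTidy itself, so a `Document` produced by dom4j's `DOMWriter` would be silently skipped. A way around that is to serialize the extracted node with the JDK instead of `pprint`. The sketch below uses only JDK classes; the class name `ExtractDiv` and its `extract` method are illustrative, and it assumes the input is already well-formed XHTML (for example, JTidy output written with `setXHTML(true)`).

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class ExtractDiv {

    // Returns the serialized markup of the first node matching the XPath
    // expression, or null if nothing matches.
    static String extract(String xhtml, String xpath) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document source = builder.parse(new InputSource(new StringReader(xhtml)));

            Node hit = (Node) XPathFactory.newInstance().newXPath()
                    .evaluate(xpath, source, XPathConstants.NODE);
            if (hit == null) {
                return null;
            }

            // Deep-copy the subtree into a fresh document of the same implementation.
            Document target = builder.newDocument();
            target.appendChild(target.importNode(hit, true));

            // Serialize with the JDK's identity transformer.
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(target), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String page = "<html><body><div id=\"contenedor\"><p>hola</p></div>"
                + "<div id=\"x\"/></body></html>";
        System.out.println(extract(page, "//div[@id='contenedor']"));
    }
}
```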
From: SourceForge.net <no...@so...> - 2010-06-07 17:28:35
|
The following forum message was posted by at http://sourceforge.net/projects/jtidy/forums/forum/41437/topic/3683558: Sure. One thing, I'm also finding that we get line breaks after <b> and <i> in <pre> tags... is this part of the bug? |
From: SourceForge.net <no...@so...> - 2010-05-16 20:29:32
|
The following forum message was posted by Anonymous at http://sourceforge.net/projects/jtidy/forums/forum/41437/topic/3711387: I'm processing badly formatted HTML pages with JTidy. I am only interested in fixing a specific set of tags, for example <img> and <table>. Is there any way to tell JTidy to focus on only those tags? |
From: SourceForge.net <no...@so...> - 2010-05-14 06:58:48
|
The following forum message was posted by viswavaranasi at http://sourceforge.net/projects/jtidy/forums/forum/41437/topic/3708791: Hi, can JTidy be used to delete unwanted HTML tags from an HTML file? Is it possible with JTidy to replace some of the HTML tags with new HTML tags? For example: in an HTML file, replace all <h1> with <h2>. Please point me to some examples or tutorials on JTidy. Thanks, viswa |
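Tag replacement of this kind can be done on the parsed DOM rather than by JTidy's cleanup itself. The following is a sketch using only standard `org.w3c.dom` calls; the class `RenameTags` and its `rename` method are invented for the example, and the demo parses with the JDK parser on the assumption that a document from `tidy.parseDOM` supports the same basic tree operations.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class RenameTags {

    // Replaces every element named `from` with a new element named `to`,
    // keeping its attributes and children. Returns the number of replacements.
    static int rename(Document doc, String from, String to) {
        NodeList live = doc.getElementsByTagName(from);
        // The NodeList is live, so take a snapshot before changing the tree.
        List<Element> targets = new ArrayList<>();
        for (int i = 0; i < live.getLength(); i++) {
            targets.add((Element) live.item(i));
        }
        for (Element old : targets) {
            Element fresh = doc.createElement(to);
            NamedNodeMap attrs = old.getAttributes();
            for (int i = 0; i < attrs.getLength(); i++) {
                Node a = attrs.item(i);
                fresh.setAttribute(a.getNodeName(), a.getNodeValue());
            }
            // appendChild moves a node, so this drains the old element.
            while (old.getFirstChild() != null) {
                fresh.appendChild(old.getFirstChild());
            }
            old.getParentNode().replaceChild(fresh, old);
        }
        return targets.size();
    }

    // Demo helper (assumption: stands in for tidy.parseDOM in this sketch).
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<body><h1 class=\"a\">A</h1><p/><h1>B</h1></body>");
        System.out.println(rename(doc, "h1", "h2"));
    }
}
```

Deleting an unwanted element follows the same pattern: collect the matches first, then call `removeChild` on each element's parent.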
From: Misha K. <mis...@gm...> - 2010-05-12 10:03:27
|
Dear All: Thank you for a great product! I am using TagSoup+XOM as described at: http://nicklothian.com/blog/2006/09/11/using-xpath-on-real-world-html-documents/ It seems to work well except for the following namespace problem: http://www.supermind.org/blog/613/dom4j-xpath-tagsoup-namespaces-sweet Can I use JTidy for XPath? Any code samples? How does it compare to TagSoup/HTMLParser/Jericho etc.? Thank you, Misha |
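Since `tidy.parseDOM` returns an `org.w3c.dom.Document`, the JDK's `javax.xml.xpath` API is one candidate for querying it. The sketch below only demonstrates the JDK side: the class `XPathOnDom` and its `select` method are illustrative names, and the demo document comes from the JDK's (non-namespace-aware by default) parser, so whether JTidy's DOM implementation is complete enough for the JDK XPath engine is an assumption that would need testing.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathOnDom {

    // Evaluates an XPath expression against the document and returns the
    // node value of each match (meaningful for attribute and text nodes).
    static List<String> select(Document doc, String expr) {
        try {
            NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expr, doc, XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < hits.getLength(); i++) {
                values.add(hits.item(i).getNodeValue());
            }
            return values;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Demo helper (assumption: stands in for tidy.parseDOM in this sketch).
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<html><body><a href=\"/one\">1</a>"
                + "<p><a href=\"/two\">2</a></p></body></html>");
        System.out.println(select(doc, "//a/@href"));
    }
}
```

Because the elements in this demo carry no namespace, unprefixed names such as `//a` match directly, which sidesteps the prefix mapping described in the linked TagSoup post.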
From: SourceForge.net <no...@so...> - 2010-04-24 08:32:42
|
The following forum message was posted by aditsu at http://sourceforge.net/projects/jtidy/forums/forum/41437/topic/3683463: JTidy implements getTextContent in DOMNodeImpl (only), this way:
[code]
/**
 * @todo DOM level 3 getTextContent() Not implemented. Returns null.
 * @see org.w3c.dom.Node#getTextContent()
 */
public String getTextContent() throws DOMException
{
    return null;
}
[/code]
I think it's quite obvious. Can you file a bug report? |
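Until `getTextContent` is implemented, the text can be collected by walking the tree with older DOM calls only. This is a rough sketch rather than a replacement for the DOM Level 3 method: `TextContentFallback` and `textOf` are made-up names, the demo input is an XHTML version of the test case from this thread parsed with the JDK parser, and it assumes JTidy's nodes implement `getNodeType`, `getNodeValue`, `getFirstChild` and `getNextSibling`.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class TextContentFallback {

    // Concatenates the text below a node using DOM Level 1 navigation only.
    // Comments and processing instructions are skipped.
    static String textOf(Node node) {
        StringBuilder sb = new StringBuilder();
        collect(node, sb);
        return sb.toString();
    }

    private static void collect(Node node, StringBuilder sb) {
        short type = node.getNodeType();
        if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
            sb.append(node.getNodeValue());
            return;
        }
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            collect(c, sb);
        }
    }

    // Demo helper (assumption: stands in for tidy.parseDOM in this sketch).
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<html><body><p>text<b>b<i>i<u>u</u>i</i>b<br/>b</b>"
                + "text</p></body></html>");
        Node body = doc.getElementsByTagName("body").item(0);
        System.out.println(textOf(body)); // prints "textbiuibbtext"
    }
}
```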
From: SourceForge.net <no...@so...> - 2010-04-24 07:49:20
|
The following forum message was posted by aditsu at http://sourceforge.net/projects/jtidy/forums/forum/41437/topic/3683558: I think I fixed this in the CodeUpdateAndJava5 branch, but trunk still has this bug. I'd have to backport it. Wanna file a bug report with a test case? |
From: SourceForge.net <no...@so...> - 2010-04-21 13:29:31
|
The following forum message was posted by Anonymous at http://sourceforge.net/projects/jtidy/forums/forum/41437/topic/3683558: It looks like this was supposed to have been fixed in htmltidy a while back: http://osdir.com/ml/web.html-tidy.tracker/2006-04/msg00015.html Is this working as designed? One should expect linebreaks after <BR>s in <PRE> tags? |
From: SourceForge.net <no...@so...> - 2010-04-21 11:03:50
|
The following forum message was posted by Anonymous at http://sourceforge.net/projects/jtidy/forums/forum/41437/topic/3683463: Right, apparently I messed up the BBCode. This is the Java code:
[code]
// Load test.html.
String file = "test.html";
InputStream in = new FileInputStream(file);
OutputStream out = null;

// Parse test.html into a DOM tree.
Tidy tidy = new Tidy();
Document doc = tidy.parseDOM(in, out);

// Print <body>'s text content.
org.w3c.dom.Node body = doc.getElementsByTagName("body").item(0);
Element bodyElement = (Element) body;
String bodyTextContent = bodyElement.getTextContent();
System.out.print("<body> TextContent:\n" + bodyTextContent);
[/code] |
From: SourceForge.net <no...@so...> - 2010-04-21 11:01:32
|
The following forum message was posted by Anonymous at http://sourceforge.net/projects/jtidy/forums/forum/41437/topic/3683463: Every time I call getTextContent() on an org.w3c.dom.Node object, it returns null. When I checked the documentation, it said getTextContent only returns null when the Node object is of type DOCUMENT_NODE, DOCUMENT_TYPE_NODE, or NOTATION_NODE. This is odd because it returns null on virtually every DOM node. To illustrate my issue, I've written a small test case. I've used the following HTML code:
[code]
<!DOCTYPE html>
<html>
<head>
<title>jwz</title>
</head>
<body>
<p>text<b>b<i>i<u>u</u>i</i>b<br>b</b>text</p>
</body>
</html>
[/code]
Using the following Java code I've tried to get the textContent of the <body> element:
[code]
// Load test.html.
InputStream in = new FileInputStream("test.html");
OutputStream out = null;

// Parse test.html into a DOM tree.
Tidy tidy = new Tidy();
Document doc = tidy.parseDOM(in, out);

// Print <body>'s text content.
org.w3c.dom.Node body = doc.getElementsByTagName("body").item(0);
Element bodyElement = (Element) body;
String bodyTextContent = bodyElement.getTextContent();
System.out.print("<body> TextContent:\n" + bodyTextContent);
[/code]
However, the result is:
[code]
<body> TextContent:
null
[/code]
Did I do something wrong here? Or is this not supposed to happen? Thanks in advance! |
From: Kevin B. <kb...@gm...> - 2010-04-18 19:24:12
|
It looks like this was supposed to have been fixed in htmltidy: http://osdir.com/ml/web.html-tidy.tracker/2006-04/msg00015.html Is this working as designed? One should expect linebreaks after <BR>s in <PRE> tags? |
From: SourceForge.net <no...@so...> - 2010-04-12 16:30:13
|
The following forum message was posted by verhagent at http://sourceforge.net/projects/jtidy/forums/forum/41437/topic/3673061: Hi Adrian, Thanks for the quick reply! Yes, it would be a great help for me, and also for others who use the JTidy API from within a Maven-based project. OK, I'll have to investigate this part a bit more myself. I'll let you know what you (or we) need to do to get it done. Thanks in advance! Tjeerd |
From: SourceForge.net <no...@so...> - 2010-04-12 13:24:31
|
The following forum message was posted by aditsu at http://sourceforge.net/projects/jtidy/forums/forum/41437/topic/3673061: Hi, I'm the current JTidy maintainer. I joined the project last year, after noticing that it was almost abandoned. That feature request is much older. I don't use Maven at all, I don't know what needs to be done and I'd rather not bother doing it. But if it is useful to you and you know how to release it to whatever repository you need, then just go ahead. Let me know what you need and I will assist you. Adrian |
From: SourceForge.net <no...@so...> - 2010-04-12 11:24:54
|
The following forum message was posted by rajorshi at http://sourceforge.net/projects/jtidy/forums/forum/41437/topic/3658377: Hello aditsu, I've attached a sample HTML file which shows this problem to the bug report I've filed at: https://sourceforge.net/tracker/?func=detail&aid=2985849&group_id=13153&atid=113153 Can you please let me know if this is a valid bug or if I'm doing something wrong? Thanks, Raj |
From: SourceForge.net <no...@so...> - 2010-04-12 10:47:45
|
The following forum message was posted by verhagent at http://sourceforge.net/projects/jtidy/forums/forum/41437/topic/3673061: What are the ideas / plans to make JTidy available through the central Maven repository? This has been asked before, see Feature Request: Release new version in Maven Central Repo https://sourceforge.net/tracker/?func=detail&aid=1780883&group_id=13153&atid=363153 My project http://docbook-utils.sourceforge.net/maven-tidy-plugin_1.0/docbook/article-project-overview.html which just creates a Maven plug-in for the JTidy API, depends on it. I'm planning to release the plug-in soon through the Sonatype Forge repository. But for that I would also need to release the JTidy API myself, as it is currently not centrally available. I'm looking forward to seeing the JTidy API become available through the Maven repository. If any help is needed, please let me know; I'm prepared to help the JTidy team along. Regards, Tjeerd |
From: SourceForge.net <no...@so...> - 2010-04-12 08:15:29
|
The following forum message was posted by rajorshi at http://sourceforge.net/projects/jtidy/forums/forum/41437/topic/3658377: OK, so here is the problem. For example, consider the following input:
<p class="MsoNormal" style="text-autospace:none;"><font color="black"><span style="color:black;">???</span></font><b><font color="#7f0055"><span style="color:#7f0055;font-weight:bold;">private</span></font></b><font color="black"><span style="color:black;"> String parseDescription</span></font><font>
The output is:
<p class="MsoNormal" style="text-autospace:none;"><font color= "black"><span style="color:black;"> </span></font> <b><font color="#7F0055"><span style= "color:#7f0055;font-weight:bold;">private</span></font></b><font color="black"><span style="color:black;">String parseDescription</span></font></p>
So, "private String parseDescription" becomes "privateString parseDescription" |
From: SourceForge.net <no...@so...> - 2010-04-03 18:24:11
|
The following forum message was posted by aditsu at http://sourceforge.net/projects/jtidy/forums/forum/41437/topic/3658377: The code looks OK, and it also doesn't have to be so complicated. If you have the HTML in a string, you can do:
tidy.parse(new StringReader(s), System.out);
If you have it in a file, you can do:
tidy.parse(new FileInputStream(f), System.out); |
From: SourceForge.net <no...@so...> - 2010-04-03 16:30:40
|
The following forum message was posted by rajorshi at http://sourceforge.net/projects/jtidy/forums/forum/41437/topic/3658377: Hello aditsu, thanks for your reply. Yes, the one I gave was just an example. I am using jtidy-r938.jar. I am seeing this with a complex HTML snippet (I didn't want to paste the large snippet here), so this is just a demo. Could you please tell me if the code I am using is correct for balancing/cleaning an arbitrary snippet of HTML? If so, I'll try to reproduce the problem with a simpler HTML snippet. Thanks, Raj |
From: SourceForge.net <no...@so...> - 2010-04-03 14:54:14
|
The following forum message was posted by aditsu at http://sourceforge.net/projects/jtidy/forums/forum/41437/topic/3658377: The output I get is:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta name="generator" content=
"HTML Tidy for Java (vers. 2009-12-01), see jtidy.sourceforge.net">
<title></title>
</head>
<body>
<span style="color:red;">hello</span> <span style=
"color:blue">world</span>
</body>
</html>
Notice the space between the two span tags. What JTidy version are you using? |
From: SourceForge.net <no...@so...> - 2010-04-03 14:25:01
|
The following forum message was posted by rajorshi at http://sourceforge.net/projects/jtidy/forums/forum/41437/topic/3658377: Hello, I had posted this on the jtidy-user mailing list but got no response :( , so I'm trying here... I'm trying to use JTidy to format/clean up some HTML contained in a Java String. What I see is that spaces are often lost. For instance, suppose the markup is
<span style="color:red;">hello</span><span style="color:blue"> world</span>
The space (not nbsp, but it's rendered by browsers and mail clients nevertheless) is lost, and it transforms into:
<span style="color:red;">hello</span><span style="color:blue">world</span>
And hence it shows up in a browser as "helloworld" instead of "hello world". The following is my code. Am I doing something obviously wrong here? Code:
InputStream is = new ByteArrayInputStream(rawHtml.getBytes("utf-8"));
Tidy tidy = new Tidy();
tidy.setInputEncoding("utf-8");
ByteArrayOutputStream baos = new ByteArrayOutputStream();
tidy.parseDOM(is, baos);
String pure = baos.toString("utf-8");
Thanks in advance! Raj |
From: Adrian S. <ad...@ya...> - 2010-04-02 08:11:58
|
Hi everybody, I'm the current JTidy maintainer. I'm writing to let you know that while I'm subscribed to this mailing list, I find it much easier and more convenient to use the forums, especially when I'm busy. I therefore recommend using the help forum at https://sourceforge.net/projects/jtidy/forums/forum/41437 , or if you think you found a bug, you're welcome to report it in the bug tracker at https://sourceforge.net/tracker/?group_id=13153&atid=113153 You can still write to this mailing list, but I may take longer to reply or even forget. Thanks, Adrian |
From: Rajorshi B. <raj...@in...> - 2010-03-31 16:03:40
|
Hello there, I'm trying to use JTidy to format/clean up some HTML contained in a Java String. What I see is that spaces are often lost. For instance, suppose the markup is
<span style="color:red;">hello</span><span style="color:blue"> world</span>
The space (not nbsp, but it's rendered by browsers and mail clients nevertheless) is lost, and it transforms into:
<span style="color:red;">hello</span><span style="color:blue">world</span>
And hence it shows up in a browser as "helloworld" instead of "hello world". The following is my code. Am I doing something obviously wrong here? Code:
InputStream is = new ByteArrayInputStream(rawHtml.getBytes("utf-8"));
Tidy tidy = new Tidy();
tidy.setInputEncoding("utf-8");
ByteArrayOutputStream baos = new ByteArrayOutputStream();
tidy.parseDOM(is, baos);
String pure = baos.toString("utf-8");
Thanks in advance! Raj |