Vincent Massol

What's happening?

  • new line is not recognized as a valid children content

    TagInfo.allowItems() doesn't allow a new line (wrapped in a ContentToken) as a valid child element. For example, if you have this input: "\n\n" you'll get after cleaning: "\n\n".

    2009-05-05 12:06:21 UTC in HtmlCleaner

  • CDATA block handling is not correct

    Hi Vladimir, I've found several issues with CDATA handling, which I have reported as: * https://sourceforge.net/tracker/?func=detail&aid=2691888&group_id=183053&atid=903696 * https://sourceforge.net/tracker/?func=detail&aid=2761963&group_id=183053&atid=903696 I'd like to know your opinion on this and whether there's a chance a fix will be incorporated in the...

    2009-04-14 17:22:41 UTC in HtmlCleaner

  • Comment: CDATA blocks are not recognized

    It'll also fail if useCdata is set to true and there are any "

    2009-04-14 17:17:09 UTC in HtmlCleaner

  • Comment: CDATA blocks are not recognized

    See also https://sourceforge.net/tracker/?func=detail&aid=2761963&group_id=183053&atid=903696 One problem is in HtmlTokenizer, in content() which breaks if the "

    2009-04-14 17:14:51 UTC in HtmlCleaner

  • Comment: CDATA blocks should be stripped in scripts/style elements

    In addition, note that for the following example 3 ContentTokens are generated (not one), which means that all htmlcleaner serializers fail to generate valid content: "\n" + "//

    2009-04-14 13:28:12 UTC in HtmlCleaner

  • Maven pom.xml for JFreechart seems invalid

    Hi, The Maven pom.xml file published for jfreechart on the central Maven repository doesn't look correct. The junit dependency should be marked with the test scope or be made optional. This was wrong in 1.0.0-RC1 but was fixed in 1.0.1 (http://repo2.maven.org/maven2/jfree/jfreechart/1.0.1/jfreechart-1.0.1.pom) and it's now wrong again in 1.0.11...

    2009-03-21 09:03:33 UTC in JFreeChart

  • Followup: RE: HTML to XHTML transformation/cleaning

    Hi Martin, Thanks for your answer. I have indeed surveyed the landscape (the open source one only) and the 2 solutions I have found are JTidy and SF's HTML Cleaner. As you say JTidy is old (last release is from 2001). I had checked its SVN activity and found it very low (last change was 2 years ago). I've just rechecked now and indeed someone has been working a bit on it about 2 months ago...

    2009-03-21 08:46:54 UTC in Jericho HTML Parser

  • HTML to XHTML transformation/cleaning

    Hi there, Just found Jericho and it looks great. The javadoc is awesome, well done. I'm looking for a tool to transform HTML into XHTML and I was wondering if Jericho has such support or I would have to use Jericho as a low level tool to build that cleaner/transformer. Right now I'm using SF's HTML Cleaner (http://htmlcleaner.sourceforge.net/) but it's quite buggy and is lacking things...

    2009-03-20 21:49:11 UTC in Jericho HTML Parser

  • CDATA blocks are not recognized

    HTML Cleaner should recognize CDATA blocks and thus it shouldn't escape any character inside them.

    2009-03-18 14:24:17 UTC in HtmlCleaner

  • Comment: DomSerializer ignores the doctype

    Proposed patch: instead of calling newDocument() in DOMSerialize, use getDOMImplementation(), as in: DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance(); DocumentBuilder builder = factory.newDocumentBuilder(); DOMImplementation impl = builder.getDOMImplementation(); DocumentType svgDOCTYPE = impl.createDocumentType("svg", "-//W3C//DTD SVG 1.0//EN",

    2009-03-16 15:48:50 UTC in HtmlCleaner
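
The approach in the DomSerializer comment above is cut off in the feed, so here is a minimal, self-contained sketch of the same idea using only the standard JAXP/DOM API. The class name `DoctypeDocumentSketch`, the helper `createWithDoctype`, and the XHTML 1.0 Strict identifiers are illustrative assumptions, not part of the original patch or of HtmlCleaner.

```java
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.DOMImplementation;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentType;

// Hypothetical helper class; the name is not taken from HtmlCleaner.
public class DoctypeDocumentSketch {

    // Builds an empty DOM document that carries a doctype.
    // DocumentBuilder.newDocument() cannot attach one, which is why the
    // comment above suggests going through DOMImplementation instead.
    static Document createWithDoctype(String rootName, String publicId, String systemId)
            throws ParserConfigurationException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        DOMImplementation impl = builder.getDOMImplementation();

        // The doctype name should match the root element's qualified name.
        DocumentType doctype = impl.createDocumentType(rootName, publicId, systemId);

        // A null namespace URI gives a plain, non-namespaced root element.
        return impl.createDocument(null, rootName, doctype);
    }

    public static void main(String[] args) throws Exception {
        // Identifiers chosen for illustration only (XHTML 1.0 Strict).
        Document doc = createWithDoctype(
                "html",
                "-//W3C//DTD XHTML 1.0 Strict//EN",
                "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd");
        System.out.println(doc.getDoctype().getPublicId());
    }
}
```

A serializer could then append its cleaned nodes under `doc.getDocumentElement()` rather than creating the root element itself; whether that fits DomSerializer's actual structure is not something this feed shows.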

About Me

  • 2000-03-25 (10 years ago)
  • 22169
  • vmassol (My Site)
  • Vincent Massol
