User Activity

  • Posted a comment on ticket #225 on HtmlCleaner

    It's worth noting that the issue does not occur when props.setOmitUnknownTags(true); is set.

  • Created ticket #225 on HtmlCleaner

    StringIndexOutOfBoundsException while sanitizeXmlIdentifier

  • Posted a comment on ticket #175 on HtmlCleaner

    makes sense, thanks

  • Posted a comment on ticket #175 on HtmlCleaner

    I agree; defaulting to no prefix is probably clearer, and if you like, you could just expose a prefix property for people to use (in this case it would be "" by default). A side question: I couldn't find a way to set Document.stricterrorchecking=false other than creating my own custom DomSerializer, where I can access the document before it is built. Is there a property that I'm missing?

  • Posted a comment on ticket #175 on HtmlCleaner

    I think this is it! It covers all the options. thanks!

  • Posted a comment on ticket #175 on HtmlCleaner

    I see, that seems fine. One last thing: I'd still leave the possibility to configure it so that nothing is touched. While this will produce invalid attribute names, setting Document.stricterrorchecking=false makes it possible to get a Document with untouched names. This is my current use case (getting a Document from HTML5 pages and querying it via XPath with Saxon).
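
    To make the strict error checking point concrete, here is a minimal sketch that uses only the JDK's built-in DOM and no HtmlCleaner classes. The class and method names are made up for illustration. Whether an invalid name is actually accepted depends on the DOM implementation; the one bundled with the JDK appears to skip the name check once strict error checking is off, but treat that as an assumption.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class, not part of HtmlCleaner.
class LenientDocumentDemo {

    // Builds an empty Document with strict error checking switched off,
    // then stores an attribute whose name ("1") is not a valid XML name.
    static String attributeWithUntouchedName() {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No DOM implementation available", e);
        }

        // Standard DOM Level 3 switch; with the default (true) the call
        // to setAttribute below is expected to throw a DOMException.
        doc.setStrictErrorChecking(false);

        Element p = doc.createElement("p");
        doc.appendChild(p);
        p.setAttribute("1", "1"); // invalid XML name, kept as-is

        return p.getAttribute("1");
    }

    public static void main(String[] args) {
        System.out.println(attributeWithUntouchedName());
    }
}
```

    Note that a Document built this way may not serialize to well-formed XML, so this only helps when the tree is queried in memory.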

  • Posted a comment on ticket #175 on HtmlCleaner

    From your last example, I wonder whether you can always clean invalid names the way your example turns ban;ana into banana. What about <p 1="1">? Thinking it through, maybe the most flexible option would be to let the user pass a Function<String, String> to the serializer constructor that transforms an attribute name into whatever the user wants. By default, if not overridden by the user, this function does what we said above. Something like: if (!isXMLValid(attrName)) attrName = invalidAttrNameFunction.sanitize(attrName);...
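
    The idea above could be sketched roughly as follows. Everything here is hypothetical: the class, the default function, and the simplified ASCII-only validity check are my own illustration and not HtmlCleaner's actual API or behavior.

```java
import java.util.function.Function;

// Hypothetical sketch of a pluggable attribute-name sanitizer.
class AttributeNameSanitizer {

    // Simplified check (assumption): ASCII letters or '_' first, then
    // letters, digits, '_', '.' or '-'. The real XML Name production
    // allows more characters (for example ':' and many non-ASCII ones).
    static boolean isXmlValid(String name) {
        return name.matches("[A-Za-z_][A-Za-z0-9_.-]*");
    }

    // Hypothetical default: drop characters outside the simplified set,
    // then prefix with '_' if the result still cannot start a name.
    static final Function<String, String> DEFAULT = name -> {
        String cleaned = name.replaceAll("[^A-Za-z0-9_.-]", "");
        return isXmlValid(cleaned) ? cleaned : "_" + cleaned;
    };

    // Valid names pass through untouched; invalid ones are handed to
    // whatever function the caller supplied.
    static String sanitize(String attrName, Function<String, String> fn) {
        return isXmlValid(attrName) ? attrName : fn.apply(attrName);
    }

    public static void main(String[] args) {
        System.out.println(sanitize("ban;ana", DEFAULT));
        System.out.println(sanitize("1", DEFAULT));
        // Passing the identity function leaves names untouched.
        System.out.println(sanitize("1", Function.identity()));
    }
}
```

    Passing Function.identity() covers the "nothing is touched" case, so a single hook would handle both the default cleanup and the untouched-names use case.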

  • Posted a comment on ticket #175 on HtmlCleaner

    and I agree that prefixInvalidAttributeNames can be true by default


Personal Data

Username:
legrass
Joined:
2008-11-28 16:16:33

Projects

  • No projects to display.
