User Activity

  • Posted a comment on discussion Help on HtmlCleaner

    Hi Scott, Thanks very much for the speedy investigation! Remi

  • Posted a comment on discussion Help on HtmlCleaner

    Hi, I've found an HTML page which results in "INVALID_CHARACTER_ERR: An invalid or illegal XML character is specified." when I try to use HtmlCleaner to create a DOM out of it. I've attached said webpage. Drilling down in the DomSerializer, there is a call to setAttribute with the attrName "dispariție.", and I imagine that accent might be causing the issue. Should it not be getting sanitised before that? Let me know if this is an invalid input or not. Thanks

  • Posted a comment on discussion Help on Jericho HTML Parser

    Hi Martin, Thanks a lot for patching this. I now get expected behaviour! Kind regards, Remi

  • Modified a comment on discussion Help on Jericho HTML Parser

    Hi, I've noticed that the Jericho Renderer doesn't include Button elements in its toString(). This is presumably because button is mapped to a RemoveElementHandler in Renderer. I would be interested to hear the rationale behind this, but more importantly, is there a way to override this behaviour on my end? You can reproduce with something as simple as <html><body><button>My Button</button></body></html>, which will result in an empty string. Many thanks

  • Posted a comment on discussion Help on Jericho HTML Parser

    Hi, I've noticed that the Jericho Renderer doesn't include Button elements in its toString(). This is presumably because button is mapped to a RemoveElementHandler in Renderer. I would be interested to hear the rationale behind this, but more importantly, is there a way to override this behaviour on my end? You can reproduce with something as simple as <button>My Button</button>, which will result in an empty string. Many thanks

  • Created ticket #227 on HtmlCleaner

    Running out of memory cleaning HTML

  • Posted a comment on ticket #92 on Jericho HTML Parser

    Thanks for clearing this up, Martin!

  • Created ticket #92 on Jericho HTML Parser

    Query parameter names in hyperlinks being incorrectly decoded


Personal Data

Username:
remirosenthal
Joined:
2020-07-06 08:52:53

Projects

  • No projects to display.
