I've noticed that the Jericho Renderer doesn't include Button elements in its toString(). This is presumably because button is mapped to a RemoveElementHandler in Renderer.
I would be interested to hear the rationale behind this, but more importantly, is there a way to override this behaviour on my end?
You can reproduce this with something as simple as:
<html><body><button>My Button</button></body></html>
which results in an empty string.
Many thanks
Last edit: Remi Rosenthal 2022-06-29
Hi Remi,
I didn't document anywhere why I made the decision to remove the content of button elements. In general I was copying the behaviour of how some email clients create pure text versions of HTML emails. Maybe I just thought they should be removed because all other form elements (INPUT, TEXTAREA etc) are removed. Or maybe I just didn't think much about it!
I've modified the Renderer class in version 3.5 to include the content of BUTTON elements.
Until version 3.5 is officially released, the development version is available here:
http://jericho.htmlparser.net/temp/jericho-html-3.5-dev.zip
The development version is always pretty much as stable as an official release. It has been a long time since an official release because the new WebBot functionality hasn't been documented yet, and I simply don't have time to work on it.
Cheers
Martin
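For anyone who cannot move to the 3.5 development build, one possible workaround (not something proposed in this thread, just a rough sketch) is to rewrite BUTTON tags into SPAN tags before handing the markup to the Renderer, on the assumption that the Renderer keeps the text of SPAN elements. The ButtonToSpan class and its rewrite method below are hypothetical names, not part of Jericho, and the regex is deliberately naive: it will also touch matching text inside comments or scripts.

```java
import java.util.regex.Pattern;

// Hypothetical helper, NOT part of Jericho: turns BUTTON tags into SPAN tags
// so that a renderer which drops BUTTON elements still sees their text.
final class ButtonToSpan {
    // Matches "<button" or "</button" (any case) only when followed by
    // whitespace, "/" or ">", so names such as <buttonbar> are left alone.
    private static final Pattern BUTTON_TAG =
            Pattern.compile("<(/?)button(?=[\\s/>])", Pattern.CASE_INSENSITIVE);

    static String rewrite(String html) {
        // $1 re-inserts the optional "/" of a closing tag; attributes are kept as-is.
        return BUTTON_TAG.matcher(html).replaceAll("<$1span");
    }

    public static void main(String[] args) {
        String html = "<html><body><button>My Button</button></body></html>";
        System.out.println(rewrite(html));
        // Assumption: the rewritten markup would then be rendered with Jericho, e.g.
        //   new net.htmlparser.jericho.Source(rewrite(html)).getRenderer().toString();
    }
}
```

Upgrading to a build that includes the fix remains the simpler route; this only avoids the element removal on older versions.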
Hi Martin,
Thanks a lot for patching this. I now get the expected behaviour!
Kind regards,
Remi