Hello, we use CyberNeko as the parser for HTML mails on our WebTop 5 collaboration platform, always with the balance-tags feature on (turning it off causes many problems).
We found that HTML containing svg tags breaks the HTML flow: the parser produces modified output that no longer behaves like the original.
For example:
<a href="..."><svg ....>....</svg></a>
is parsed into this incorrect sequence:
<a href="..."></a><svg ....>....</svg>
so links wrapping svg elements no longer work.
We have verified that the callbacks to start/end tags arrive in the wrong sequence.
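To make the callback order concrete, here is a minimal sketch of a SAX handler that records start/end events and checks whether every svg element opens while a link is still open. The class name `CallbackTrace` and its helper methods are hypothetical, not part of NekoHTML. To stay self-contained, the sketch feeds a well-formed snippet through the JDK's own XML parser, which yields the expected nesting; names are lower-cased before comparison because an HTML parser may report them in a different case.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: records SAX start/end callbacks in arrival order.
class CallbackTrace extends DefaultHandler {
    final List<String> events = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        events.add("start:" + qName.toLowerCase(Locale.ROOT));
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        events.add("end:" + qName.toLowerCase(Locale.ROOT));
    }

    // True when at least one svg start event was seen and each one
    // arrived while an <a> element was still open.
    static boolean svgStaysInsideLink(List<String> events) {
        int openLinks = 0;
        boolean sawSvg = false;
        for (String e : events) {
            if (e.equals("start:a")) {
                openLinks++;
            } else if (e.equals("end:a")) {
                openLinks--;
            } else if (e.equals("start:svg")) {
                sawSvg = true;
                if (openLinks == 0) {
                    return false; // svg opened after the link was already closed
                }
            }
        }
        return sawSvg;
    }

    // Parses a well-formed snippet with the JDK XML parser (not Neko) and returns the trace.
    static List<String> trace(String xml) throws Exception {
        CallbackTrace handler = new CallbackTrace();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.events;
    }

    public static void main(String[] args) throws Exception {
        List<String> events = trace("<a href=\"#\"><svg><circle/></svg></a>");
        System.out.println(events);
        System.out.println("svg inside link: " + svgStaysInsideLink(events));
    }
}
```

The same handler should be attachable to Neko's `org.cyberneko.html.parsers.SAXParser` through `setContentHandler`, assuming the standard SAX `XMLReader` interface it exposes. With the behaviour described above, the trace would show the link's end event before the svg start event, and `svgStaysInsideLink` would return false.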
Is there a chance this will be fixed? Or should we fork the project and try a solution of our own?
If you provide a patch fixing the problem, it can be quickly applied.
Actually, I did not try the latest 1.9.9, only 1.9.6.x, as it is the highest version available in the Maven repos. Why isn't 1.9.9 available?
I would recommend using the latest available version (1.9.21). You can also try HtmlUnit's fork of Neko; perhaps it already contains a fix for your issue: https://github.com/HtmlUnit/htmlunit-neko
Yes, thanks, moving to the HtmlUnit fork of Neko solved the problem automagically!
For others reading this thread, the latest available version of Neko is currently 1.9.22: https://search.maven.org/artifact/net.sourceforge.nekohtml/nekohtml/1.9.22/jar - The 1.9.6.x version Sonicle refers to above is published under an older set of artifact coordinates: https://search.maven.org/artifact/nekohtml/nekohtml/1.9.6.2/jar - If you are using that old version, you should switch to net.sourceforge.nekohtml:nekohtml:1.9.22 or later.