Hello,
Are there any plans to release 3.5 to Maven? I've seen the development version mentioned on here a few times.
Hi Andrew. Version 3.5 hasn't been officially released yet because the newest feature, a web crawler API, has not been fully documented yet.
The project is not dead, and minor improvements continue to make their way into the DEV version, but other time commitments have prevented the completion of the documentation and an official release for years.
The 3.5-dev version (http://jericho.htmlparser.net/temp/jericho-html-3.5-dev.zip) is always a release candidate and can be used as a reliable substitute for the last official 3.4 release.
Hi Martin,
Thanks for responding so quickly. Since my last message, I've been trying out 3.5-dev as I was hoping to take advantage of the memory consumption improvements, but have come across a behaviour difference for the Renderer between 3.4 and 3.5.
For example,
<p>Hello</p><p><br></p><p>There</p>
used to output
Hello\r\n\r\nThere
But now in 3.5-dev it outputs
Hello\r\n\r\n\r\nThere
Is this an expected behaviour change? I have attached a screenshot of the Renderer configured
Hi Andrew,
The release.txt file does mention "minor changes to Renderer behaviour" for version 3.5. The new behaviour is more consistent with browser behaviour so it is most likely an intended change.
Cheers
Martin
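For anyone who needs the 3.4-style output while running 3.5-dev, one possible workaround is to post-process the rendered text. This is only a sketch and not something suggested in this thread or provided by Jericho: the class name RenderedTextNormalizer and the method collapseBlankLines are hypothetical, and it assumes the Renderer is emitting \r\n line breaks as in the example above. It collapses any run of three or more line breaks down to two, so it would also flatten runs of blank lines that were intentional.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Jericho: post-processes text that has
// already been produced by the Renderer.
final class RenderedTextNormalizer {

    // Three or more consecutive CRLF sequences, i.e. two or more blank lines.
    // Assumes the rendered text uses \r\n as its newline.
    private static final Pattern EXTRA_BREAKS = Pattern.compile("(?:\\r\\n){3,}");

    private RenderedTextNormalizer() {
    }

    // Collapses every run of 3+ CRLFs into exactly two (a single blank line).
    static String collapseBlankLines(String rendered) {
        return EXTRA_BREAKS.matcher(rendered).replaceAll("\r\n\r\n");
    }
}
```

Passing the 3.5-dev output from the example, Hello\r\n\r\n\r\nThere, through collapseBlankLines gives Hello\r\n\r\nThere, and text that already has at most one blank line between paragraphs is returned unchanged.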