User Activity

  • Posted a comment on ticket #91 on Jericho HTML Parser

    Ok thanks, I understand and agree with your point. I'll give some thought to how to handle this scenario for customers who have this issue, and decide whether to fix it silently for them by customising the Renderer class or leave it as is, since that is what an unstyled representation of their page would look like. Thanks for your help, Danny

  • Posted a comment on ticket #91 on Jericho HTML Parser

    Hi Martin, The customer who reported this issue gave this web page as an example: https://leadsforward.com/generating-the-best-solar-leads-before-the-end-of-2019/ The Renderer class joins the words "Do" and "Lead", which are on different lines in the visible page. I've attached a screenshot that shows the visible page and the source code below. Thanks Danny

  • Posted a comment on ticket #91 on Jericho HTML Parser

    Hi Martin, Thanks for your response. The renderer.setHyperlinkContentDelimiters(null," ") method would work for us. However, it doesn't seem to work for me. We have Jericho-html-3.5-dev-3. The following code, which includes the setHyperlinkContentDelimiters call above, still concatenates the words "some" and "text": public void testIssue() { final String html = "\n" + " \n" + " sometext\n" + " \n" + ""; final Source source = new Source(html); final Renderer renderer = new Renderer(source); renderer.setHyperlinkContentDelimiters(null,...

  • Created ticket #91 on Jericho HTML Parser

    Renderer class joins words in consecutive anchor tags text

  • Posted a comment on ticket #90 on Jericho HTML Parser

    Thanks for the fix. Danny

  • Created ticket #90 on Jericho HTML Parser

    Renderer class picks out content from within a script tag

  • Posted a comment on ticket #219 on HtmlCleaner

    Hi Scott, It seems that SourceForge is removing some of my comments if I refer to HTML tags. What I wanted to say is that I believe the svg tags are causing the issue in the above example. Thanks Danny

  • Posted a comment on ticket #219 on HtmlCleaner

    Hi Scott, Thanks for your response. Yes, this problem occurred in real HTML. The following HTML will reproduce it: <html> <head></head> <body> <svg></svg> <a href="https://www.moneysavingexpert.com/car-insurance/"><br> <h3>Compare cheap car insurance quotes online - MSE</h3> <div><cite>www.moneysavingexpert.com › car-insurance</cite></div> </a> </body> </html> It seems that the tags cause the issue, as without them the H3 end tag is in the correct place. Thanks Danny

Personal Data

Username:
screamingdan
Joined:
2020-02-17 11:40:44

Projects

  • No projects to display.
