User Ratings

5 stars: 18
4 stars: 0
3 stars: 0
2 stars: 0
1 star: 0

ease: 5 / 5
features: 5 / 5
design: 5 / 5
support: 5 / 5

User Reviews

  • Best HTML parser there is

  • Very useful and reliable library to parse HTML!

  • I needed to transform many JSP pages from a real open source project downloaded from the Internet. Initially, I tried jsoup because it looked like it had an easier, more intuitive, higher-level API, but it became a headache, with lots of errors when compiling the transformed project. Then I tried Jericho, and it worked incredibly well the first time. Awesome library to parse and transform JSP pages!!

  • Impressive project - more powerful than most commercial solutions. Incredibly powerful and flexible. Saved me countless hours.

    1 user found this review helpful.
  • We needed to validate and compare different dynamically generated web pages inside Selenium2 application tests. Jericho HTML Parser provided all the required high-level methods for HTML content analysis and evaluation (especially the getDebugInfo() method for quickly locating problem code). Very good API documentation and a set of examples allowed us to quickly finish all the necessary application tests. Excellent work - thanks.

  • I looked at other HTML parsers (JTidy, jsoup, etc.) and found that this was the only one that would meet my needs and was extremely easy to implement. I need to be able to have custom tags embedded in the HTML content, and these can exist under other standard HTML tags that may not normally allow them. Other parsers would either throw these custom tags out or place them outside of the tag that I had them under (e.g. a table tag as parent), which is not what I wanted. Using Jericho, I am able to parse this content, leaving the original HTML structure intact, and then dynamically replace these custom tags with valid HTML tags that are generated by the server application process, using the attributes defined on these custom tags for the server-side processing. Originally, I was using regular expressions to do this work, but then I was limited to having the custom tag attributes defined in the expression, and their order could not change. With an HTML parser, the content developer can now use other tag attributes that I may not have been checking for, and they will get passed on to the rendered element(s), and the order of the attributes no longer matters. After trying a few other HTML parsers, I began writing my own basic HTML tag parser that would do detection and replacement of specified tags, but I quickly discovered that this would take more time than I wanted to spend. So I looked at other HTML parsers and found Jericho. After looking at the API docs and trying some simple test cases, I found that this was exactly what I was looking for: it leaves the original structure intact and just does replacement using the index position of the original content, which is exactly what my regular expression matching was doing and what I needed to simulate. Great work! This parser is the best for my purpose, and saved me a good deal of development time.

  • no trouble to install and run, works nicely.

  • works perfectly.

  • Excellent work.

  • Good.

  • Excellent library. Nicely converts HTML to an ASCII-only text representation. Wish it was released under the Apache License.

  • Great and well-documented library.

  • Nice API and easy to use!

  • gr8 library - perfect for web robots!

  • A great niche product. Easy to use, and doesn't restrict you to valid XHTML. Well done.

  • JTidy is also not bad, but currently I am using this one.

  • Great software! Compared to the alternatives (e.g. JTidy, HtmlCleaner) it is predictable and reliable. Another big plus is that it accepts any number of HTML elements even without a root element (HTML fragments). We use it in production!

  • Been using Jericho for my web search project, has been a great help.