Jericho HTML Parser is a simple but powerful Java library for parsing, analysing, and manipulating HTML documents.
Version 1.4 introduces classes for dealing with character entity references and numeric character references, relaxes rules for parsing attributes, and includes some minor documentation improvements.
- Added CharacterEntityReference and NumericCharacterReference classes
- Added CharOutputSegment class
- Attribute parsing now allows whitespace around the '=' sign
- Added convenience method Element.getAttributes()
- Some documentation improvements
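
To illustrate the relaxed attribute rule listed above, here is a minimal standalone sketch of a parser that tolerates whitespace on either side of the '=' sign. It uses only the Java standard library and is not Jericho's implementation or API: the class name LenientAttributeSketch, the method parseAttributes, and the regular expression are all hypothetical and exist only for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; this is NOT part of the Jericho library.
class LenientAttributeSketch {

    // An attribute name, optional whitespace, '=', optional whitespace, then a
    // double-quoted, single-quoted, or unquoted value (groups 2, 3, and 4).
    private static final Pattern ATTRIBUTE = Pattern.compile(
        "([A-Za-z_:][-A-Za-z0-9_:.]*)\\s*=\\s*(?:\"([^\"]*)\"|'([^']*)'|([^\\s\"'>]+))");

    // Returns the name/value pairs found in a start tag, in source order.
    // Names are lower-cased; attributes without a value are ignored by this sketch.
    static Map<String, String> parseAttributes(String startTag) {
        Map<String, String> attributes = new LinkedHashMap<>();
        Matcher m = ATTRIBUTE.matcher(startTag);
        while (m.find()) {
            String value = m.group(2) != null ? m.group(2)
                         : m.group(3) != null ? m.group(3)
                         : m.group(4);
            attributes.put(m.group(1).toLowerCase(), value);
        }
        return attributes;
    }
}
```

Calling LenientAttributeSketch.parseAttributes on a tag such as `<a href = "index.html" title='Home' id= main>` yields the three attributes href, title, and id, regardless of the spacing around each '='. A real parser has to handle many more cases (valueless attributes, malformed quoting, server tags), so treat this only as a picture of the rule, not of how the library implements it.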