The lexer is the low-level, linear 'node fetcher': it hands back nodes one after another in document order. The parser sits on top of it, tries to balance end tags, and gathers interior tags into the children collection of the outer tag.
The duplicate nodes, StringNode and RemarkNode, will disappear in a future refactoring. The duplicates were only necessary to break the lexer out as a separate jar: the Visitor (incorrectly) knows about specific tags, so using org.htmlparser.StringNode in the Lexer would have dragged in all the specific tag types through its Visitor pattern. See
RFE #874000 Remove specialized tag signatures from NodeVisitor: http://sourceforge.net/tracker/index.php?func=detail&aid=874000&group_id=24399&atid=381402
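
To make the lexer/parser split concrete, here is a rough toy sketch in Java. It is not the org.htmlparser code or API: the class `LexerVsParserSketch`, its `Node` type, and the `lex` and `parse` methods are all made up for illustration, and the tokenizing is deliberately naive (no remarks, no void tags like `<br>`, no attributes containing `>`). It only shows the idea that the lexer produces a flat run of nodes while the parser matches end tags and nests the interior nodes as children.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy illustration only: hypothetical names, NOT the org.htmlparser API.
class LexerVsParserSketch {

    /** A node: either a piece of text or a tag that may own children. */
    static final class Node {
        final String text;                              // raw text, e.g. "<b>" or "hi"
        final List<Node> children = new ArrayList<>();  // filled in by parse() only

        Node(String text) { this.text = text; }

        boolean isTag()    { return text.startsWith("<"); }
        boolean isEndTag() { return text.startsWith("</"); }

        /** Tag name without brackets or attributes, lower-cased. */
        String name() {
            String s = text.replaceAll("^</?|/?>$", "").trim();
            int sp = s.indexOf(' ');
            return (sp < 0 ? s : s.substring(0, sp)).toLowerCase();
        }
    }

    /** Lexer: walks the input once and returns nodes in order, with no nesting. */
    static List<Node> lex(String html) {
        List<Node> flat = new ArrayList<>();
        int i = 0;
        while (i < html.length()) {
            int end;
            if (html.charAt(i) == '<') {
                end = html.indexOf('>', i);             // tag runs up to the next '>'
                end = (end < 0) ? html.length() : end + 1;
            } else {
                end = html.indexOf('<', i);             // text runs up to the next '<'
                if (end < 0) end = html.length();
            }
            flat.add(new Node(html.substring(i, end)));
            i = end;
        }
        return flat;
    }

    /** Parser: balances end tags and collects interior nodes as children. */
    static Node parse(List<Node> flat) {
        Node root = new Node("<#document>");            // synthetic container
        Deque<Node> open = new ArrayDeque<>();
        open.push(root);
        for (Node n : flat) {
            if (n.isEndTag()) {
                // Close the nearest matching open tag; a stray end tag is ignored.
                boolean matches = open.stream()
                        .anyMatch(o -> o != root && o.name().equals(n.name()));
                if (matches) {
                    while (!open.peek().name().equals(n.name())) open.pop();
                    open.pop();
                }
            } else {
                open.peek().children.add(n);            // becomes a child of the open tag
                if (n.isTag()) open.push(n);            // and may take children itself
            }
        }
        return root;
    }

    public static void main(String[] args) {
        List<Node> flat = lex("<div><b>hi</b>there</div>");
        System.out.println("lexer nodes: " + flat.size());
        Node root = parse(flat);
        System.out.println("top-level nodes after parsing: " + root.children.size());
    }
}
```

With the input in `main`, the lexer side is just a list of tags and text in the order they occur, end tags included. After parsing, the end tags are consumed and the `<div>` node holds the `<b>` node and the trailing text as its children.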
What is the difference between a lexer and a parser?
For example, there is a StringNode in the lexer as well as in the parser.
regards
Suchak Jani