Re: [Htmlparser-developer] lexer integration

Joshua,

I think the duplication exists because the nodes in the lexer.nodes 
package don't use the NodeVisitor pattern, while the nodes in the 
htmlparser package do. The lexer is shipped as a separate jar, so it 
needs nodes that don't drag in the composite node classes, which happens 
if the NodeVisitor signature is included. The duplication could be 
factored out if we get rid of visitLinkTag, visitImageTag and 
visitTitleTag from that interface. These may best be handled by direct 
examination of the node name in the various visitor classes.
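
As a rough sketch of what I mean by examining the node name (all of the 
names below, SimpleTag, MapTag, SlimVisitor and LinkCollector, are 
hypothetical stand-ins for illustration, not the actual htmlparser 
classes), a visitor could receive every tag through one generic method 
and pick out links itself:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical minimal tag abstraction: a name plus attributes.
interface SimpleTag {
    String getName();
    String getAttribute(String key);
}

// Hypothetical concrete tag backed by a map, used only for illustration.
class MapTag implements SimpleTag {
    private final String name;
    private final Map<String, String> attributes;

    MapTag(String name, Map<String, String> attributes) {
        this.name = name;
        this.attributes = attributes;
    }

    public String getName() { return name; }
    public String getAttribute(String key) { return attributes.get(key); }
}

// Visitor interface with no per-tag methods, so its signature
// does not reference any composite tag classes.
interface SlimVisitor {
    void visitTag(SimpleTag tag);
}

// A visitor that recognises link tags by name instead of relying on
// a dedicated visitLinkTag callback.
class LinkCollector implements SlimVisitor {
    private final List<String> links = new ArrayList<>();

    public void visitTag(SimpleTag tag) {
        if ("A".equalsIgnoreCase(tag.getName())) {
            String href = tag.getAttribute("href");
            if (href != null) {
                links.add(href);
            }
        }
    }

    List<String> getLinks() { return links; }
}
```

With something along these lines the visitor interface would only 
mention the generic tag type, so the lexer jar would not need the 
composite tag classes on its classpath.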

The composite tag recursion happens in the scanTagNode method, which 
does need a lexer. The create calls themselves don't recurse, so they 
can take just a Page, as you say.
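
A minimal sketch of that signature change, assuming simplified 
stand-in classes (SketchPage, SketchLexer, SketchStringNode and 
SketchNodeFactory are hypothetical names, not the real htmlparser 
types or signatures):

```java
// Hypothetical page: holds the source text that nodes refer back to.
class SketchPage {
    private final String text;

    SketchPage(String text) { this.text = text; }

    String getText(int start, int end) { return text.substring(start, end); }
}

// Hypothetical lexer: owns a page and would also hold scanning state.
class SketchLexer {
    private final SketchPage page;

    SketchLexer(SketchPage page) { this.page = page; }

    SketchPage getPage() { return page; }
}

// Hypothetical string node: only needs the page and its own offsets.
class SketchStringNode {
    private final SketchPage page;
    private final int start;
    private final int end;

    SketchStringNode(SketchPage page, int start, int end) {
        this.page = page;
        this.start = start;
        this.end = end;
    }

    String getText() { return page.getText(start, end); }
}

class SketchNodeFactory {
    // Takes the page directly. A version taking the lexer would only
    // ever call lexer.getPage() here, so the lexer adds nothing.
    SketchStringNode createStringNode(SketchPage page, int start, int end) {
        return new SketchStringNode(page, start, end);
    }
}
```

A caller that still holds a lexer would simply pass lexer.getPage() to 
the factory, while the recursive scanning code keeps the lexer for 
itself.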

Derrick

Joshua Kerievsky wrote:

> Derrick,
>
> Is it me, or are there duplicates of StringNode, RemarkNode, etc. 
> between the org.htmlparser package and the org.htmlparser.lexer.nodes 
> package? 
> I also noticed that the NodeFactory's creation methods take the lexer 
> as an argument, yet *all* of those methods and the methods they call 
> rely only on lexer.getPage(). Have you considered simply passing in a 
> page instance rather than a lexer instance? That would work well for 
> some further refactoring I have in mind. 
> --jk