Re: [Htmlparser-developer] toPlainTextString() feedback requested
From: Somik R. <so...@ya...> - 2002-12-27 07:53:19
Hi Dhaval, Sam,
> I agree with Sam when he says "don't fix it when it's not broken".
I quite disagree with both of you on this philosophy. Looking at the code with
Joshua has made me realize it's actually ghastly. A serious code cleanup is
needed; there is just too much duplication! Most of the scanner code is
unreadable, while the tag constructors are a nightmare. I don't think I will
be at peace till a couple of rounds of refactoring have been completed.
I do not think the panic on the visitor is warranted. Like I said before,
the current access methods will continue to be present. However, having a
visitor will make life simpler, since so much code now uses the same loop
over and over again. We can replace code like:
    Vector links = new Vector();
    for (Enumeration e = parser.elements(); e.hasMoreElements();) {
        HTMLNode node = (HTMLNode)e.nextElement();
        if (node instanceof HTMLLinkTag) {
            links.add(node);
        }
    }

with:

    HTMLLinkVisitor linkVisitor = new HTMLLinkVisitor();
    collectNodesWith(linkVisitor);
    Vector links = linkVisitor.getResult();

This looks so much more readable and simple. Of course, you could still use
the old way; it's just that you would have the option of making life easier.
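To make the visitor idea concrete, here is a rough, self-contained sketch in modern Java. None of these names are the real htmlparser classes: DemoNode, DemoLinkTag, DemoTextNode, DemoVisitor, DemoLinkCollector and the static collectNodesWith() helper are hypothetical stand-ins, and the actual visitor interface may well look different. The point is only that the loop and the instanceof check move out of the calling code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only: these are stand-in types, not the htmlparser API.
class VisitorSketch {
    public static void main(String[] args) {
        DemoLinkCollector collector = new DemoLinkCollector();
        collectNodesWith(sampleNodes(), collector);
        System.out.println(collector.getResult().size() + " link(s) collected");
    }

    // A small hand-built node list standing in for parser output.
    static List<DemoNode> sampleNodes() {
        List<DemoNode> nodes = new ArrayList<>();
        nodes.add(new DemoTextNode("Hello "));
        nodes.add(new DemoLinkTag("http://example.com/a"));
        nodes.add(new DemoTextNode(" and "));
        nodes.add(new DemoLinkTag("http://example.com/b"));
        return nodes;
    }

    // The one shared loop: every node dispatches itself to the visitor.
    static void collectNodesWith(List<DemoNode> nodes, DemoVisitor visitor) {
        for (DemoNode node : nodes) {
            node.accept(visitor);
        }
    }
}

interface DemoVisitor {
    void visitLink(DemoLinkTag link);
    void visitText(DemoTextNode text);
}

interface DemoNode {
    void accept(DemoVisitor visitor);
}

class DemoLinkTag implements DemoNode {
    final String href;
    DemoLinkTag(String href) { this.href = href; }
    public void accept(DemoVisitor visitor) { visitor.visitLink(this); }
}

class DemoTextNode implements DemoNode {
    final String text;
    DemoTextNode(String text) { this.text = text; }
    public void accept(DemoVisitor visitor) { visitor.visitText(this); }
}

// Collects link tags and ignores everything else; no instanceof needed.
class DemoLinkCollector implements DemoVisitor {
    private final List<DemoLinkTag> result = new ArrayList<>();
    public void visitLink(DemoLinkTag link) { result.add(link); }
    public void visitText(DemoTextNode text) { /* not interested in text */ }
    List<DemoLinkTag> getResult() { return result; }
}
```

A different collector (say, one gathering only text) would reuse the same collectNodesWith() loop unchanged, which is where the duplication goes away.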
In the latest round of refactorings, we've put in an HTMLCompositeTag, from
which the link, form and select tags inherit. The toPlainTextString() and
toHTML() methods are now in the parent class. All tests are passing (except
the charset tests).
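For anyone who wants a picture of what "methods in the parent class" buys us, here is a minimal sketch in modern Java. It is not the actual HTMLCompositeTag code: SketchNode, SketchText, SketchCompositeTag, SketchLinkTag and the startTag()/endTag() hooks are all hypothetical names, and the real class may be organised differently. It only shows how subclasses can supply their own tag text while the child-walking logic lives in one place.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only: stand-in types, not the real htmlparser classes.
class CompositeSketch {
    public static void main(String[] args) {
        SketchLinkTag link = sampleLink();
        System.out.println(link.toPlainTextString());
        System.out.println(link.toHTML());
    }

    static SketchLinkTag sampleLink() {
        SketchLinkTag link = new SketchLinkTag("http://example.com");
        link.addChild(new SketchText("Click "));
        link.addChild(new SketchText("here"));
        return link;
    }
}

interface SketchNode {
    String toPlainTextString();
    String toHTML();
}

class SketchText implements SketchNode {
    private final String text;
    SketchText(String text) { this.text = text; }
    public String toPlainTextString() { return text; }
    public String toHTML() { return text; }
}

// The parent owns the children and both rendering methods.
abstract class SketchCompositeTag implements SketchNode {
    private final List<SketchNode> children = new ArrayList<>();

    void addChild(SketchNode child) { children.add(child); }

    // Subclasses only say what their opening and closing tags look like.
    abstract String startTag();
    abstract String endTag();

    public String toPlainTextString() {
        StringBuilder sb = new StringBuilder();
        for (SketchNode child : children) {
            sb.append(child.toPlainTextString());
        }
        return sb.toString();
    }

    public String toHTML() {
        StringBuilder sb = new StringBuilder(startTag());
        for (SketchNode child : children) {
            sb.append(child.toHTML());
        }
        return sb.append(endTag()).toString();
    }
}

class SketchLinkTag extends SketchCompositeTag {
    private final String href;
    SketchLinkTag(String href) { this.href = href; }
    String startTag() { return "<a href=\"" + href + "\">"; }
    String endTag() { return "</a>"; }
}
```

A form or select tag in this sketch would be another small subclass with its own startTag() and endTag(), with no copy of the child loop.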
Regards,
Somik