Re: [Htmlparser-developer] toPlainTextString() feedback requested
From: Somik R. <so...@ya...> - 2002-12-27 07:53:19
Hi Dhaval, Sam,

> I agree with Sam when he says that "don't fix it when it's not broken".

I quite disagree with both of you on this philosophy. Looking at the code with Joshua has made me realize it's actually ghastly. A serious code cleanup is needed; there is just too much duplication! Most of the scanner code is unreadable, while the tag constructors are a nightmare. I don't think I will be at peace till a couple of rounds of refactoring have been completed.

I do not think the panic over the visitor is warranted. Like I said before, the current access methods will continue to be present. However, having a visitor will make life simpler, because there's so much code now that uses the same loop over and over again. We can replace code like:

    Vector links = new Vector();
    for (Enumeration e = parser.elements(); e.hasMoreElements();) {
        HTMLNode node = (HTMLNode)e.nextElement();
        if (node instanceof HTMLLinkTag) {
            links.add(node);
        }
    }

with:

    HTMLLinkVisitor linkVisitor = new HTMLLinkVisitor();
    collectNodesWith(linkVisitor);
    Vector links = linkVisitor.getResult();

This looks so much more readable and simple. Of course, you could still use the old way; it's just that you would have the option of making life easier.

In the latest round of refactorings, we've put in an HTMLCompositeTag, from which the link, form and select tags inherit. The toPlainTextString() and toHTML() methods are now in the parent class. All tests are passing (except the charset tests).

Regards,
Somik
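
P.S. For anyone who wants to see the shape of the visitor idea, here is a minimal, self-contained sketch. It is only an illustration under assumptions: Node, LinkNode, TextNode, NodeVisitor and LinkVisitor are hypothetical stand-ins, not the actual htmlparser classes, and collectNodesWith() here takes the node list explicitly because the real signature is not settled. It also uses generics and List rather than Vector and Enumeration, purely to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;

public class VisitorSketch {
    // Hypothetical stand-in for HTMLNode: every node accepts a visitor.
    public interface Node {
        void accept(NodeVisitor visitor);
    }

    // One callback per node type, so callers never need instanceof checks.
    public interface NodeVisitor {
        void visitLink(LinkNode link);
        void visitText(TextNode text);
    }

    // Hypothetical stand-in for HTMLLinkTag.
    public static final class LinkNode implements Node {
        public final String href;
        public LinkNode(String href) { this.href = href; }
        public void accept(NodeVisitor visitor) { visitor.visitLink(this); }
    }

    // Hypothetical stand-in for a plain text node.
    public static final class TextNode implements Node {
        public final String text;
        public TextNode(String text) { this.text = text; }
        public void accept(NodeVisitor visitor) { visitor.visitText(this); }
    }

    // Collects link nodes and ignores everything else.
    public static final class LinkVisitor implements NodeVisitor {
        private final List<LinkNode> result = new ArrayList<>();
        public void visitLink(LinkNode link) { result.add(link); }
        public void visitText(TextNode text) { /* not interested in text */ }
        public List<LinkNode> getResult() { return result; }
    }

    // The loop that used to be duplicated everywhere now lives in one place.
    public static void collectNodesWith(List<Node> nodes, NodeVisitor visitor) {
        for (Node node : nodes) {
            node.accept(visitor);
        }
    }

    // Small made-up document used by the demo below.
    public static List<Node> sampleNodes() {
        List<Node> nodes = new ArrayList<>();
        nodes.add(new TextNode("Hello"));
        nodes.add(new LinkNode("http://example.com/a"));
        nodes.add(new TextNode("world"));
        nodes.add(new LinkNode("http://example.com/b"));
        return nodes;
    }

    public static void main(String[] args) {
        LinkVisitor linkVisitor = new LinkVisitor();
        collectNodesWith(sampleNodes(), linkVisitor);
        for (LinkNode link : linkVisitor.getResult()) {
            System.out.println(link.href);
        }
    }
}
```

The point of the design is that adding a new kind of query (say, collecting all text) means writing another small visitor class, while the traversal loop stays untouched.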