Re: [Htmlparser-developer] lexer integration - added back visitEndTag

If you've been following the developer threads, Joshua and I are still 
thrashing out the details on how that would work ;-)
It will be extendable.

Couball, James wrote:

>Regarding your note about having TagFactory have signatures for all
>possible tags... how will TagFactory be extended to account for new,
>user defined tags?  Is it intended to be user extendable?
>
>Thanks for the great work!
>
>Sincerely,
>James.
>
>-----Original Message-----
>From: Derrick Oswald [mailto:Der...@Ro...] 
>Sent: Sunday, September 28, 2003 12:33 PM
>To: htm...@li...
>Subject: [Htmlparser-developer] lexer integration - added back
>visitEndTag
>
>Fixed up the broken visitor logic.
>Added some docos on NodeVisitor.
>
>TODO
>=====
>
>Serializable
>--------------
>The Parser needs to be made serializable again. This involves a 
>transient field down on the Source, I think, rather than having the 
>whole Lexer transient in the Parser.
>
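An inline note on the Serializable item: the idea of pushing the transient field down to the Source could look roughly like the sketch below. The classes SketchSource, SketchLexer and SketchParser are hypothetical stand-ins written for this note, not the actual htmlparser classes, and the readObject() hook that reopens the stream is an assumption about how the Source might recover.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Reader;
import java.io.Serializable;
import java.io.StringReader;

// Hypothetical stand-in for the Source: only the live stream is transient.
class SketchSource implements Serializable {
    private static final long serialVersionUID = 1L;
    private final String text;        // characters already read are serialized
    private transient Reader reader;  // the open stream is skipped on write

    SketchSource(String text) {
        this.text = text;
        this.reader = new StringReader(text);
    }

    String getText() { return text; }

    boolean hasReader() { return reader != null; }

    // Standard serialization hook: rebuild the transient field on read.
    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        reader = new StringReader(text);
    }
}

// Hypothetical stand-in for the Lexer: fully serializable, nothing transient.
class SketchLexer implements Serializable {
    private static final long serialVersionUID = 1L;
    private final SketchSource source;

    SketchLexer(SketchSource source) { this.source = source; }

    SketchSource getSource() { return source; }
}

// Hypothetical stand-in for the Parser: the lexer field is NOT transient.
class SketchParser implements Serializable {
    private static final long serialVersionUID = 1L;
    private final SketchLexer lexer;

    SketchParser(SketchLexer lexer) { this.lexer = lexer; }

    SketchLexer getLexer() { return lexer; }

    // Serialize to a byte array and read the copy back.
    static SketchParser roundTrip(SketchParser parser) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(parser);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (SketchParser) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With this layout the parser, lexer and buffered text all survive a round trip, and only the stream has to be reopened.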
>TagData
>-------
>This has been reworked to allow it to limp along under the new system, 
>but it should really be removed. I think the reason for it (reduce the 
>number of arguments to tag constructors) no longer applies, and a lot of
>the code could be easier to read if the Tag was more bean-like and had a
>zero args constructor with appropriate accessors.
>
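An inline note on the TagData item: a "bean-like" tag with a zero args constructor might look something like this. BeanTag and its property names are made up for illustration and are not the real Tag API; the generics are modern Java, used here for brevity.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical bean-style tag: no TagData, no long constructor argument list.
class BeanTag {
    private String name = "";
    private int startPosition;
    private int endPosition;
    // LinkedHashMap keeps attributes in the order they were set.
    private final Map<String, String> attributes = new LinkedHashMap<>();

    // Zero args constructor; everything else is set through accessors.
    public BeanTag() { }

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }

    public int getStartPosition() { return startPosition; }
    public void setStartPosition(int startPosition) { this.startPosition = startPosition; }

    public int getEndPosition() { return endPosition; }
    public void setEndPosition(int endPosition) { this.endPosition = endPosition; }

    public String getAttribute(String key) { return attributes.get(key); }
    public void setAttribute(String key, String value) { attributes.put(key, value); }

    // Render the tag with its attributes in insertion order.
    public String toHtml() {
        StringBuilder html = new StringBuilder("<").append(name);
        for (Map.Entry<String, String> entry : attributes.entrySet()) {
            html.append(' ').append(entry.getKey())
                .append("=\"").append(entry.getValue()).append('"');
        }
        return html.append('>').toString();
    }
}
```

A scanner would then create the tag empty and fill in only the properties it knows about, instead of packing them into a TagData first.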
>Helpers
>-------
>I desperately want to get rid of these 'helper' classes. They are just 
>obfuscating the code.
>
>Node Factory
>------------
>The factory concept needs to be extended with a TagFactory (extending 
>NodeFactory) that has the signatures for creating all the possible types
>of tags there are, and then this needs to be used by all the scanners to
>create their specific tags.
>
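An inline note on the Node Factory item: since the design is still being worked out, the following is only one possible shape for a TagFactory that users can extend, not what has been decided. Every name here (SketchNodeFactory, SketchTagFactory, RegistryTagFactory, register) is hypothetical, and the name-to-creator registry is an assumption swapped in for the "one signature per tag type" approach so that user-defined tags need no interface change.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical node types standing in for the real ones.
class SketchNode { }

class SketchTag extends SketchNode {
    private String name = "";
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
}

class SketchLinkTag extends SketchTag { }

// Hypothetical base factory for non-tag nodes.
interface SketchNodeFactory {
    SketchNode createStringNode(String text);
}

// Hypothetical tag factory extending the node factory.
interface SketchTagFactory extends SketchNodeFactory {
    SketchTag createTag(String name);
}

// Registry-backed implementation: users add creators for their own tags.
class RegistryTagFactory implements SketchTagFactory {
    private final Map<String, Supplier<SketchTag>> creators = new HashMap<>();

    // Register a creator for a tag name; names are matched case-insensitively.
    public void register(String name, Supplier<SketchTag> creator) {
        creators.put(name.toUpperCase(Locale.ROOT), creator);
    }

    @Override
    public SketchNode createStringNode(String text) {
        return new SketchNode();
    }

    // Use the registered creator if there is one, else fall back to a generic tag.
    @Override
    public SketchTag createTag(String name) {
        String key = name.toUpperCase(Locale.ROOT);
        Supplier<SketchTag> creator = creators.get(key);
        SketchTag tag = (creator != null) ? creator.get() : new SketchTag();
        tag.setName(key);
        return tag;
    }
}
```

A scanner would then ask the factory for its tag by name, and a user-defined tag only needs a register() call before parsing starts.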
>Scanners
>--------
>The scanners may not be working, hard to tell without the unit tests 
>running. I'm not sure that CompositeTagScanner is completely all right 
>yet. It probably needs to be reworked based on the lexer.
>
>Unit Tests
>----------
>As mentioned, many of the unit tests expect toHtml() to produce 
>capitalized and rearranged output. And parseAndAssertNodeCount() is 
>expected not to include so many whitespace nodes. These need to be 
>addressed.
>
>Documentation
>-------------
>As of now, it's more likely that the javadocs are lying to you than 
>providing any helpful advice. This needs to be reworked completely.
>
>As you can see there's lots of work to do, so anyone with a death wish 
>can jump in.  I'll be working my way from top to bottom of the TODO list
>and committing and notifying the developer list after each of them.  So 
>go ahead and do a take from CVS and jump in the middle with anything 
>that appeals. Keep the list posted and update your CVS tree often (or 
>subscribe to the htmlparser-cvs mailing list for interrupt driven 
>notification rather than polled notification).