[Htmlparser-developer] Incomplete HTML Tag Definitions
From: Martin H. <Mh...@de...> - 2005-08-25 23:01:33
First, I am grateful for all the work that has been done to produce this project.

Second, I noticed that the tag classes defined are not a complete representation of the HTML tags available. I may have missed a fundamental usage approach, but I noticed, for example, that there is no 'H2'-specific tag. The filters appear to find the opening tag as "H2", but it is not associated with the end tag in any way, nor is the text that appears between them easily extractable without resorting to serial processing of the list. So, am I missing something in the approach, or is the tag list incomplete?

Third, due perhaps to my ignorance about the overall 'zen' of the parser, I created a clone of the Span.java class, edited it to create an H2Tag.java class, and then registered it as a compound tag. It works beautifully. The same approach worked for many other tags. So, if I am not completely missing the point, should I contribute these back to increase the tag coverage?

Martin N. Hudson
devIS - Development InfoStructure
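To make the difference described above concrete, here is a minimal, self-contained sketch of the idea. It is a hypothetical model, not the HTMLParser API: CompoundTagSketch, Token, Kind, register and enclosedText are names invented for this illustration. It only shows why a tag name that has been registered as compound can hand back its enclosed text, while an unregistered one leaves the open tag, text and end tag as unrelated entries in a flat list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical model (NOT the HTMLParser classes): illustrates pairing an
// open tag with its end tag once the tag name is registered as compound.
final class CompoundTagSketch {

    enum Kind { OPEN, TEXT, CLOSE }

    // One entry in the flat, serial list a lexer might produce.
    static final class Token {
        final Kind kind;
        final String value; // tag name for OPEN/CLOSE, content for TEXT

        Token(Kind kind, String value) {
            this.kind = kind;
            this.value = value;
        }
    }

    // Tag names treated as compound; all others stay lone open tags.
    private final Set<String> compoundNames = new HashSet<>();

    void register(String tagName) {
        compoundNames.add(tagName.toUpperCase(Locale.ROOT));
    }

    // Returns the text enclosed by each <name>...</name> pair, or an empty
    // list when the name was never registered as compound.
    List<String> enclosedText(List<Token> tokens, String tagName) {
        String name = tagName.toUpperCase(Locale.ROOT);
        List<String> result = new ArrayList<>();
        if (!compoundNames.contains(name)) {
            return result; // open tag is never tied to its end tag
        }
        StringBuilder current = null; // non-null while inside a pair
        for (Token t : tokens) {
            boolean sameName = t.value.toUpperCase(Locale.ROOT).equals(name);
            if (t.kind == Kind.OPEN && sameName) {
                current = new StringBuilder();
            } else if (t.kind == Kind.CLOSE && sameName && current != null) {
                result.add(current.toString());
                current = null;
            } else if (t.kind == Kind.TEXT && current != null) {
                current.append(t.value);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Token> tokens = Arrays.asList(
            new Token(Kind.OPEN, "h2"),
            new Token(Kind.TEXT, "Overview"),
            new Token(Kind.CLOSE, "h2"),
            new Token(Kind.OPEN, "p"),
            new Token(Kind.TEXT, "Body"),
            new Token(Kind.CLOSE, "p"));

        CompoundTagSketch sketch = new CompoundTagSketch();
        System.out.println(sketch.enclosedText(tokens, "H2")); // before registering
        sketch.register("H2");
        System.out.println(sketch.enclosedText(tokens, "H2")); // after registering
    }
}
```

In the parser itself, the corresponding step would be the H2Tag class cloned from Span.java and registered as a compound tag, as described in the message; the sketch deliberately avoids guessing at the library's actual signatures.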