Re: [Htmlparser-developer] HTMLReader design needs to be modified (dev opinion solicited)

Hi Leslie,
I prefer the second, non-compat, approach on architectural grounds.  If
the creator of the reader knows the length of the data, which is the most
common case, then it (the creator) can do the mark and reset where
needed with absolute certainty.  On the other hand, if the creator of
the reader does not know the data length, then it is in every bit as
good a position to suggest a length as htmlparser is, and nothing can be
gained by delegating to htmlparser.
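
To make the known-length case concrete, here is a minimal sketch using only
java.io. It assumes the creator holds the whole page as a String; the class
and method names (KnownLengthRewind, drain, readTwice) are made up for
illustration and are not part of htmlparser.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical illustration only; none of these names exist in htmlparser.
class KnownLengthRewind {

    // Reads everything remaining in the reader into a String.
    static String drain(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = reader.read()) != -1) {
            sb.append((char) c);
        }
        return sb.toString();
    }

    // The creator knows the data length, so it can choose a read-ahead
    // limit that keeps the mark valid for a full pass over the data.
    static String[] readTwice(String html) throws IOException {
        BufferedReader reader = new BufferedReader(new StringReader(html));
        // One more than the length: per the BufferedReader docs, reset()
        // may fail once the number of characters read reaches the limit.
        reader.mark(html.length() + 1);
        String first = drain(reader);
        reader.reset();                 // back to the marked position
        String second = drain(reader);
        return new String[] { first, second };
    }

    public static void main(String[] args) throws IOException {
        String[] passes = readTwice("<html><body>hello</body></html>");
        System.out.println(passes[0].equals(passes[1]));
    }
}
```

The point of the sketch is only that the limit is chosen by whoever knows
the length, not by the parser.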

Just to add to the picture: the reset is done when a call is made to
the elements() method, because we wish to position the parser back at the
beginning of the stream. Now, it just might be that this is not
possible, in which case we'd throw an exception. For the user to handle
an exception and create a new parser object, or move the mark themselves
in the catch code, is an unnecessary complication, don't you think?
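
For reference, this is roughly the kind of catch code the user would end up
writing. It is a sketch against plain java.io rather than the htmlparser
classes, and the names (RewindOrRecreate, rewind, firstLineAfterRewind) are
assumptions made up for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

// Hypothetical illustration only; none of these names exist in htmlparser.
class RewindOrRecreate {

    // Tries reset(); if the mark is no longer valid, falls back to
    // building a fresh reader over the same source.
    static BufferedReader rewind(BufferedReader reader, String source) {
        try {
            reader.reset();
            return reader;
        } catch (IOException e) {
            // The mark was invalidated (or never set): start over.
            return new BufferedReader(new StringReader(source));
        }
    }

    // Reads the source to the end, rewinds by whichever route works,
    // and returns the first line of the second pass.
    static String firstLineAfterRewind(String source, int readAheadLimit)
            throws IOException {
        BufferedReader reader = new BufferedReader(new StringReader(source));
        reader.mark(readAheadLimit);
        while (reader.read() != -1) {
            // consume the whole stream
        }
        reader = rewind(reader, source);
        return reader.readLine();
    }

    public static void main(String[] args) throws IOException {
        // A limit of 1 is far too small for this input, so the fallback
        // path is the one likely to be taken here.
        System.out.println(firstLineAfterRewind("<html>\n<body>", 1));
    }
}
```

Either way the caller ends up at the start of the data, but only by carrying
this extra try/catch around.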

The whole idea of putting it there was to make it simpler to parse
through a given HTML page again and again using the same parser object.
But if that is leading to other complications, it might just be better
to take it out and expect that the parser object will need to be created
every time. Of course, if we can handle all of it in the parser, then I'd
think it's worth it, but a middle approach might just benefit neither
side.

What are your thoughts?

Regards,
Somik