Re: [Htmlparser-user] Efficient parsing - help needed

Hi Somik,

Thanks for the help. 

> You can use toHTML() to do this:
> HTMLNode node;
> for (HTMLEnumeration e = parser.elements(); e.hasMoreNodes();) {
>    node = e.nextHTMLNode();
>    writeToDisk(node.toHTML());
> }

I tried this, but toHTML() modifies the contents, and
in some cases the result is wrong. I have filed a bug
report for this:
http://sourceforge.net/tracker/index.php?func=detail&aid=663038&group_id=24399&atid=381399

I have one suggestion: add overloaded constructors to
HTMLParser with the following signatures:

public HTMLParser(java.lang.String resourceLocn,
                  HTMLParserFeedback feedback, Writer writer)

public HTMLParser(java.lang.String resourceLocn, Writer writer)

with corresponding overloaded constructors in
HTMLReader:

public HTMLReader(java.io.Reader in, int len, Writer writer)

public HTMLReader(java.io.Reader in, java.lang.String url,
                  Writer writer)

This will give users a way to save the response to
disk as it is received. Another option would be a
String file name argument, but the user may want to
specify the file encoding as well (as I do), so a
java.io.Writer is the better option.
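
To illustrate the encoding point, here is a minimal sketch of how a
caller could build a Writer with an explicit encoding before handing
it to the parser. The class and method names (EncodedWriterExample,
openWriter) are hypothetical and not part of HTMLParser; only
standard java.io classes are used. A ByteArrayOutputStream stands in
for the FileOutputStream that would be used to save to disk.

```java
import java.io.BufferedWriter;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;

// Hypothetical example class; not part of the HTMLParser library.
public class EncodedWriterExample {

    // Wraps any OutputStream in a Writer that encodes characters
    // using the named charset, e.g. "UTF-8" or "UTF-16BE".
    public static Writer openWriter(OutputStream out, String encoding)
            throws IOException {
        return new BufferedWriter(new OutputStreamWriter(out, encoding));
    }

    public static void main(String[] args) throws IOException {
        // In real use this would be new FileOutputStream("page.html").
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        Writer writer = openWriter(bytes, "UTF-16BE");
        writer.write("hi");
        writer.close(); // flushes the buffered characters to the stream
        // Two characters at two bytes each under UTF-16BE.
        System.out.println(bytes.size());
    }
}
```

A constructor that took only a file name would have to pick the
encoding itself, whereas a Writer leaves that choice to the caller.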

This should not take long to implement: you just need
to check whether a writer has been supplied and, each
time you read a line using the readLine() method in
HTMLReader, write that string to the writer followed
by a line terminator (println() if the writer is
wrapped in a PrintWriter) and call flush(). This has
the added advantage of preserving line breaks at
their original points.
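
The idea can be sketched roughly as follows. This is not HTMLReader's
actual code: TeeLineReader is a hypothetical stand-alone class that
wraps a plain java.io.Reader, and it assumes that readLine() behaves
like BufferedReader.readLine(), which returns each line without its
terminator.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;

// Hypothetical helper, not part of HTMLParser: copies every line it
// reads to an optional Writer, so the input is saved as it arrives.
public class TeeLineReader {
    private final BufferedReader in;
    private final Writer copy; // may be null when no copy is wanted

    public TeeLineReader(Reader in, Writer copy) {
        this.in = new BufferedReader(in);
        this.copy = copy;
    }

    // Returns the next line, or null at end of input. If a writer
    // was supplied, the line is written to it and flushed first.
    public String readLine() throws IOException {
        String line = in.readLine();
        if (line != null && copy != null) {
            copy.write(line);
            copy.write(System.lineSeparator());
            copy.flush();
        }
        return line;
    }

    public static void main(String[] args) throws IOException {
        StringWriter saved = new StringWriter();
        TeeLineReader reader = new TeeLineReader(
                new StringReader("<html>\n<body>hi</body>\n</html>"), saved);
        while (reader.readLine() != null) {
            // a parser would consume each line here
        }
        System.out.print(saved);
    }
}
```

Because readLine() drops the terminator, the copy gets the platform
line separator in place of the original one, and the last line gets
a separator even if the source did not end with one. The breaks
still fall at the same points as in the original.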

What do you think?

Also, when can we expect the next release?

Warm Regards,
Ash
