Re: [Htmlparser-user] Simple extraction?

Andy,

You might want to start with the StringBean (see bin/stringextractor 
example application), and extend it to handle the <p> tags specially.
I believe if you override visitTag (Tag tag) and check for <P> before 
calling super.visitTag (Tag tag), you can get the strings you want.
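
To make the idea concrete, here is a rough sketch of the logic only. It
does not use StringBean or any other htmlparser class: the ParagraphLines
class, its visitText/visitTag methods and the small regex tokenizer are all
made up for illustration, so it runs with nothing but the JDK. In a real
StringBean subclass, the <P> check would go in your visitTag override
instead. Note that it treats both <P> and </P> as the end of a line, and it
does not decode entities such as &amp;.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for a visitor; NOT part of the htmlparser API.
class ParagraphLines {
    // Matches any tag and captures its name, e.g. "<P>", "</b>", "<b class=x>".
    private static final Pattern TAG =
            Pattern.compile("<\\s*/?\\s*([A-Za-z][A-Za-z0-9]*)[^>]*>");

    private final List<String> lines = new ArrayList<>();
    private final StringBuilder current = new StringBuilder();

    // Called for each run of text between tags: accumulate it.
    void visitText(String text) {
        current.append(text);
    }

    // Called for each tag: a P tag ends the current line,
    // every other tag (such as <b>) is simply dropped.
    void visitTag(String name) {
        if (name.equalsIgnoreCase("p")) {
            flush();
        }
    }

    // Store the accumulated text as one line, with whitespace collapsed.
    private void flush() {
        String line = current.toString().replaceAll("\\s+", " ").trim();
        if (!line.isEmpty()) {
            lines.add(line);
        }
        current.setLength(0);
    }

    // Walk the input, feeding text and tags to the visitor methods in order.
    static List<String> extract(String html) {
        ParagraphLines visitor = new ParagraphLines();
        Matcher m = TAG.matcher(html);
        int pos = 0;
        while (m.find()) {
            visitor.visitText(html.substring(pos, m.start()));
            visitor.visitTag(m.group(1));
            pos = m.end();
        }
        visitor.visitText(html.substring(pos));
        visitor.flush(); // keep trailing text that has no closing <P>
        return visitor.lines;
    }
}
```

Calling ParagraphLines.extract(html) on the page contents returns one
String per <P>, with the bold tags removed.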

Derrick

Andy Wickson wrote:

> Hi,
> I am struggling with what I assume should be a simple operation.
> The html file I am parsing has each line ending with a <P> tag (not 
> </P> as you might expect).
> There is a varying number of bold tags in each line - I am interested 
> in the text of each line without the tags - one String for each <P> tag.
>
> Are there any decent examples anywhere apart from the ones on the 
> htmlparser site?
>
> Thanks,
> Andy
>
>