Andy,
You might want to start with StringBean (see the bin/stringextractor
example application) and extend it to handle the <P> tags specially.
I believe that if you override visitTag (Tag tag) and check for <P> before
calling super.visitTag (tag), treating each <P> as the end of one string,
you can get the strings you want.
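
If it helps to see the idea in isolation, here is a rough sketch that does
not use htmlparser at all. It swaps in plain java.util.regex for the
StringBean approach: each <P> is treated as a line terminator, and whatever
tags remain (the bold tags, for instance) are stripped. The class name
PTagLines and the method extract are made up for illustration, and the
sketch assumes simple markup: no '>' inside attribute values, no comments
or scripts.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of htmlparser: a regex-based stand-in for
// the "one String per <P>" idea, assuming simple, well-behaved markup.
public class PTagLines {
    // An opening <P> tag, any case, with optional attributes.
    // Does not match </P> or longer tag names such as <PRE>.
    private static final Pattern P_TAG = Pattern.compile("(?i)<p(\\s[^>]*)?>");
    // Any other tag, e.g. <B> or </B>.
    private static final Pattern ANY_TAG = Pattern.compile("<[^>]*>");

    /** Returns one tag-free string for each <P>-terminated line. */
    public static List<String> extract(String html) {
        List<String> lines = new ArrayList<>();
        for (String chunk : P_TAG.split(html)) {
            // Strip the remaining tags, then surrounding whitespace.
            String text = ANY_TAG.matcher(chunk).replaceAll("").trim();
            if (!text.isEmpty()) {
                lines.add(text);
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        String html = "First <B>bold</B> line<P>\nSecond <B>line</B><P>\n";
        for (String line : extract(html)) {
            System.out.println(line);
        }
    }
}
```

A real parser will cope with messier HTML than this does, so the StringBean
route is probably still the better long-term choice.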
Derrick
Andy Wickson wrote:
> Hi,
> I am struggling with what I assume should be a simple operation.
> The html file I am parsing has each line ending with a <P> tag (not
> </P> as you might expect).
> There is a varying number of bold tags in each line - I am interested
> in the text in each line without the tags - one String for each <P> tag.
>
> Are there any decent examples anywhere apart from the ones on the
> htmlparser site?
>
> Thanks,
> Andy
>
>