Re: [Htmlparser-user] Simple extraction?
From: Derrick O. <Der...@Ro...> - 2006-03-26 16:42:13
Andy,

You might want to start with the StringBean (see the bin/stringextractor example application) and extend it to handle the <P> tags specially. I believe if you override visitTag (Tag tag) and check for <P> before calling super.visitTag (Tag tag), you can get the strings you want.

Derrick

Andy Wickson wrote:
> Hi,
> I am struggling with what I assume should be a simple operation.
> The html file I am parsing has each line ending with a <P> tag (not
> </P> as you might expect).
> There is a random amount of bold tags in each line - I am interested
> in the text in each line without the tags - one String for each <P> tag.
>
> Are there any decent examples anywhere apart from the ones on the
> htmlparser site?
>
> Thanks,
> Andy
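
For reference, here is a rough sketch of the result being asked for: one String per <P> tag, with the bold tags stripped. It deliberately does not use htmlparser or StringBean; it is a plain-Java stand-in built on java.util.regex, and the class name ParagraphLines and method extract are made up for illustration. The split on <P> plays roughly the role that the <P> check in an overridden visitTag would play in the StringBean approach. Regex tag stripping is only reasonable for simple, regular input like the file described.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of htmlparser.
class ParagraphLines {
    // Matches an opening <p> tag in any case, with optional attributes.
    // It does not match other tags that start with "p", such as <pre>.
    private static final Pattern P_TAG = Pattern.compile("(?i)<p(\\s[^>]*)?>");
    // Matches any remaining tag, e.g. <b> or </B>.
    private static final Pattern ANY_TAG = Pattern.compile("<[^>]*>");

    // Returns the text that precedes each <P> tag, with all other tags removed.
    static List<String> extract(String html) {
        List<String> lines = new ArrayList<>();
        for (String chunk : P_TAG.split(html)) {
            String text = ANY_TAG.matcher(chunk).replaceAll("")
                                 .replaceAll("\\s+", " ")
                                 .trim();
            if (!text.isEmpty()) {
                lines.add(text); // skip chunks that held only whitespace
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // Made-up sample input in the shape described in the question.
        String html = "First <b>bold</b> line<P>\nSecond <B>line</B> here<P>\n";
        for (String line : extract(html)) {
            System.out.println(line);
        }
    }
}
```

Any text after the last <P> is returned as a final entry as well, which may or may not be what you want.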