Re: [Htmlparser-user] pre tag
From: Derrick O. <Der...@Ro...> - 2006-11-06 13:09:00
The StringBean is a NodeVisitor, so it can be applied to a NodeList to extract the text from a child list. I guess it's up to you to remove stuff you don't want.

Dave wrote:

> Hi Derrick,
>
> Thanks for the help.
>
> ... find the H3 node (with Description as the contents),
> ... get its parent
> ... and extract all text from the parent's children (after the Heading)
> ExtractTextFromChildrenOf (HasSibling (And (TagName(H3), HasChild (String(Description)))))
>
> I could not find the method ExtractTextFromChildrenOf(). Which class is it in?
> Does the text extracted include "Good Morning" or "Description"? I
> want the text after the heading (Description) only.
>
> Thanks!
> Dave
>
> Derrick Oswald <Der...@Ro...> wrote:
>
> > Dave,
> >
> > PRE has not been added as a tag because it very often is not closed by
> > the /PRE. You can create your own "PRE" tag class derived from
> > CompositeTag, and register it with a PrototypicalNodeFactory you give to
> > the parser.
> >
> > To answer your previous question about filters for:
> >
> >   Good Morning
> >   Description
> >   *Text to extract Line1*
> >   *Text to extract Line2*
> >   Good Morning
> >
> > ... find the H3 node (with Description as the contents),
> > ... get its parent
> > ... and extract all text from the parent's children (after the Heading)
> >
> > so it would be something like
> > ExtractTextFromChildrenOf (HasSibling (And (TagName(H3), HasChild (String(Description)))))
> > This is a lot easier to construct with the FilterBuilder application.
> >
> > ... or alternatively I had thought of making a 'TriggerFilter' that
> > would set a member flag when its subordinate filter went true, and
> > after that would always return true because the flag was set...
> > but then this member would need to be reset or you would need to build
> > the filter fresh for each parse.
> >
> > Derrick
> >
> > Dave wrote:
> >
> > > text1
> > >
> > > text2
> > >
> > > >parse http://web-site table
> > > show the whole table structure
> > > >parse http://web-site pre
> > > show the tag "pre" only, no text inside the pre tag.
> > >
> > > It seems that pre is not treated as the parent node of "text1".
> > >
> > > Is this a bug?
> > >
> > > Thanks!

_______________________________________________
Htmlparser-user mailing list
Htm...@li...
https://lists.sourceforge.net/lists/listinfo/htmlparser-user
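
The 'TriggerFilter' idea described in the thread can be sketched roughly as below. This is only an illustration of the flag-and-reset behaviour, not HTMLParser code: `Node`, `TriggerFilter`, `descriptionHeading`, `collect` and `sample` are all hypothetical stand-ins built on `java.util.function.Predicate`, and the sample data is made up. One deliberate difference from the description above is that this variant rejects the triggering node itself, on the assumption that the heading text should be left out of the result.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Sketch of the 'TriggerFilter' idea. Every type here is a hypothetical
// stand-in; none of them are HTMLParser classes.
public class TriggerFilterSketch {

    // Hypothetical node: a tag name (null for plain text) and its text content.
    public static final class Node {
        public final String tag;
        public final String text;

        public Node(String tag, String text) {
            this.tag = tag;
            this.text = text;
        }
    }

    // Rejects nodes until the subordinate test first passes, then accepts all later nodes.
    public static final class TriggerFilter implements Predicate<Node> {
        private final Predicate<Node> trigger;
        private boolean triggered = false;

        public TriggerFilter(Predicate<Node> trigger) {
            this.trigger = trigger;
        }

        @Override
        public boolean test(Node node) {
            if (triggered) {
                return true;
            }
            if (trigger.test(node)) {
                triggered = true; // assumption: the triggering node itself is rejected
            }
            return false;
        }

        // Clears the flag so the same instance can be used for another parse.
        public void reset() {
            triggered = false;
        }
    }

    // Subordinate test: an H3 node whose text is "Description".
    public static Predicate<Node> descriptionHeading() {
        return n -> "H3".equalsIgnoreCase(n.tag) && "Description".equals(n.text.trim());
    }

    // Returns the text of every node the filter accepts, in document order.
    public static List<String> collect(List<Node> nodes, Predicate<Node> filter) {
        List<String> out = new ArrayList<>();
        for (Node n : nodes) {
            if (filter.test(n)) {
                out.add(n.text);
            }
        }
        return out;
    }

    // Made-up sibling list loosely modelled on the example in the thread.
    public static List<Node> sample() {
        return Arrays.asList(
            new Node("P", "Good Morning"),
            new Node("H3", "Description"),
            new Node(null, "Text to extract Line1"),
            new Node(null, "Text to extract Line2"));
    }

    public static void main(String[] args) {
        TriggerFilter filter = new TriggerFilter(descriptionHeading());
        System.out.println(collect(sample(), filter));
        filter.reset(); // without this, a second pass would accept every node
        System.out.println(collect(sample(), filter));
    }
}
```

Because the flag is state held by the filter, the caveat from the thread applies here too: either call `reset()` between parses or build a fresh filter each time.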