Re: [Htmlparser-user] finding meta data
From: Derrick O. <Der...@Ro...> - 2006-07-30 12:12:21
Kavorka,

Maybe if you just want to remove the whole link, use something like:

    getParent ().getChildren ().remove (this);

in the doSemanticAction() override of your custom LinkTag class. That will remove the current link tag from the enclosing parent tag by altering the children list.

Derrick

kavorka wrote:

> Hi Oswald,
> Yes, I want to remove the text within <a></a>. I'll try to do what you
> said, but I'm a newbie Java coder and I didn't understand it clearly.
> I tried to override LinkTag so that it does not take the text within
> <a></a>; now myLinkTag doesn't find links. But how can I get the text
> other than what is inside <a></a>?
> If I ask too much, I'm sorry.
> Thanks a lot,
> Murat
>
> On 7/29/06, Derrick Oswald <Der...@ro...> wrote:
>
> > Murat,
> >
> > I'm not sure what you mean by 'pure' text.
> > The stringextractor program uses the StringBean under the hood.
> > It only collects text which would be presented in a browser - or at
> > least it's supposed to.
> > The stringextractor program has an option (-links) to output the links
> > within angle brackets. Make sure this is not used.
> > If you want to remove text within <a></a> pairs you will need to
> > override the default LinkTag to not do this and register it with the
> > PrototypicalNodeFactory.
> >
> > Derrick
> >
> > kavorka wrote:
> >
> > > Hi Oswald,
> > > I have another question. In HTMLParser, is it possible to extract
> > > only the text in the web page? The stringextractor program also
> > > extracts the link text in the page; I want to extract "pure" text.
> > > Can I do it?
> > > Thanks,
> > > Murat
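
For readers new to Java, the idea behind getParent ().getChildren ().remove (this) can be sketched without the library. The classes below (SketchNode, LinkRemovalSketch) are hypothetical stand-ins written for this illustration, not HTMLParser classes: each node knows its parent, and a link node is dropped by removing it from the parent's children list, so later text collection never sees the link text. The null check is an added assumption for nodes that have no parent. In HTMLParser itself the equivalent call would go in the doSemanticAction() override described in the reply; whether the parent is already set at that point is not verified here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a parsed node; NOT an HTMLParser class.
class SketchNode {
    final String name;          // tag name, or "#text" for a text node
    final String text;          // text content for text nodes, "" otherwise
    SketchNode parent;          // null for a top-level node
    final List<SketchNode> children = new ArrayList<>();

    SketchNode(String name, String text) {
        this.name = name;
        this.text = text;
    }

    // Attach a child and record this node as its parent.
    SketchNode add(SketchNode child) {
        child.parent = this;
        children.add(child);
        return this;
    }

    // Models getParent().getChildren().remove(this), guarded against a
    // missing parent. Returns true only if the node was actually removed.
    boolean removeFromParent() {
        if (parent == null) {
            return false;
        }
        boolean removed = parent.children.remove(this);
        if (removed) {
            parent = null;
        }
        return removed;
    }
}

class LinkRemovalSketch {
    // Remove every <a> node (and therefore its text) below the given node.
    static void stripLinks(SketchNode node) {
        // Iterate over a copy because removal alters the children list.
        for (SketchNode child : new ArrayList<>(node.children)) {
            if (child.name.equals("a")) {
                child.removeFromParent();
            } else {
                stripLinks(child);
            }
        }
    }

    // Concatenate the text of all text nodes in document order.
    static String collectText(SketchNode node) {
        StringBuilder sb = new StringBuilder();
        collect(node, sb);
        return sb.toString();
    }

    private static void collect(SketchNode node, StringBuilder sb) {
        sb.append(node.text);
        for (SketchNode child : node.children) {
            collect(child, sb);
        }
    }

    // Small tree modelling: <body><p>Intro. <a>click here</a>Outro.</p></body>
    static SketchNode sample() {
        SketchNode link = new SketchNode("a", "")
                .add(new SketchNode("#text", "click here"));
        SketchNode p = new SketchNode("p", "")
                .add(new SketchNode("#text", "Intro. "))
                .add(link)
                .add(new SketchNode("#text", "Outro."));
        return new SketchNode("body", "").add(p);
    }

    public static void main(String[] args) {
        SketchNode body = sample();
        stripLinks(body);
        System.out.println(collectText(body));
    }
}
```

The sketch removes links in a pass after the tree is built, which is a simplification; the reply above suggests doing the removal from inside the tag itself while parsing.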