Re: [Htmlparser-user] finding meta data
From: kavorka <the...@gm...> - 2006-07-29 13:07:11
Hi Oswald,

Yes, I want to remove the text within <a></a>. I'll try to do what you suggested, but I'm a newbie Java coder and I didn't fully understand it. I tried to override LinkTag so that it does not take the text within <a></a>, but now myLinkTag doesn't find links. How can I get the text other than what is inside <a></a>?

Sorry if I'm asking too much.

Thanks a lot,
Murat

On 7/29/06, Derrick Oswald <Der...@ro...> wrote:
>
> Murat,
>
> I'm not sure what you mean by 'pure' text.
> The stringextractor program uses the StringBean under the hood.
> It only collects text which would be presented in a browser - or at
> least it's supposed to.
> The stringextractor program has an option (-links) to output the links
> within angle brackets. Make sure this is not used.
> If you want to remove text within <a></a> pairs you will need to
> override the default LinkTag to not do this and register it with the
> PrototypicalNodeFactory.
>
> Derrick
>
> kavorka wrote:
>
> > Hi Oswald,
> > I have another question. In HTMLPARSER, is it possible to extract only
> > the text in the webpage? The stringextractor program also extracts
> > the link text in the page; I want to extract "pure" text. Can I do it?
> > thanks
> > Murat
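
For readers following the thread, the idea of "collect page text but skip whatever sits between <a> and </a>" can be sketched as below. This is only an illustration and it does not use HTMLParser: it swaps in the JDK's built-in javax.swing.text.html parser so that it runs without extra jars. The class names NonLinkText and Collector are made up for the example. With HTMLParser itself, the route Derrick describes (a LinkTag subclass registered with the PrototypicalNodeFactory) would play the role of the depth counter here.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical example class; not part of HTMLParser.
class NonLinkText {

    /** Collects text chunks, ignoring any text seen while inside an <a> element. */
    static class Collector extends HTMLEditorKit.ParserCallback {
        private final StringBuilder out = new StringBuilder();
        private int anchorDepth = 0; // > 0 while we are between <a> and </a>

        @Override
        public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
            if (tag == HTML.Tag.A) {
                anchorDepth++;
            }
        }

        @Override
        public void handleEndTag(HTML.Tag tag, int pos) {
            if (tag == HTML.Tag.A && anchorDepth > 0) {
                anchorDepth--;
            }
        }

        @Override
        public void handleText(char[] data, int pos) {
            if (anchorDepth > 0) {
                return; // link text: skip it
            }
            String chunk = new String(data).trim();
            if (chunk.isEmpty()) {
                return; // whitespace-only chunk
            }
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(chunk);
        }

        String text() {
            return out.toString();
        }
    }

    /** Returns the text of the given HTML with link text left out. */
    static String extract(String html) throws IOException {
        Collector collector = new Collector();
        // The third argument tells the parser to ignore any charset declaration.
        new ParserDelegator().parse(new StringReader(html), collector, true);
        return collector.text();
    }

    public static void main(String[] args) throws IOException {
        String html = "<html><body><p>Hello <a href=\"x.html\">click here</a> world</p></body></html>";
        System.out.println(extract(html));
    }
}
```

The links are still seen by the parser (handleStartTag fires for each <a>), so a link list could be collected in the same callback if needed; only their text is kept out of the output.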