Re: [Htmlparser-user] finding meta data

Hi Oswald,
Yes, I want to remove the text within <a></a>. I'll try to do what you
suggested, but I'm a newbie Java coder and I didn't clearly understand it.
I tried to override LinkTag so that it does not take the text within
<a></a>, but now myLinkTag doesn't find links at all. How can I get the
text other than what is inside <a></a>?
Sorry if I'm asking too much.
Thanks a lot,
Murat

On 7/29/06, Derrick Oswald <Der...@ro...> wrote:
>
> Murat,
>
> I'm not sure what you mean by 'pure' text.
> The stringextractor program uses the StringBean under the hood.
> It only collects text which would be presented in a browser - or at
> least it's supposed to.
> The stringextractor program has an option (-links) to output the links
> within angle brackets. Make sure this is not used.
> If you want to remove text within <a></a> pairs you will need to
> override the default LinkTag to not do this and register it with the
> PrototypicalNodeFactory.
>
> Derrick
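
The underlying idea in Derrick's reply (collect only the text that is not inside an <a></a> pair) can be sketched as a small depth counter: count up on every opening <a>, count down on every closing </a>, and keep text only while the counter is zero. The sketch below is an illustration under stated assumptions, not the HTMLParser API: it uses the HTML parser bundled with the JDK (javax.swing.text.html) so that it runs without extra jars, and the class name NonLinkText and method extract are hypothetical. With HTMLParser the same bookkeeping would presumably live in a custom LinkTag or a visitor, as Derrick describes.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper: NOT part of HTMLParser. It uses the JDK's own
// HTML parser only to illustrate the "skip text inside <a>" technique.
final class NonLinkText {

    // Returns the text of the page, leaving out anything inside <a>...</a>.
    static String extract(String html) {
        final StringBuilder out = new StringBuilder();

        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            private int linkDepth = 0; // > 0 while we are inside an <a> element

            @Override
            public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                if (tag == HTML.Tag.A) {
                    linkDepth++;
                }
            }

            @Override
            public void handleEndTag(HTML.Tag tag, int pos) {
                if (tag == HTML.Tag.A && linkDepth > 0) {
                    linkDepth--;
                }
            }

            @Override
            public void handleText(char[] data, int pos) {
                if (linkDepth > 0) {
                    return; // link text: skip it
                }
                String chunk = new String(data).trim();
                if (chunk.isEmpty()) {
                    return;
                }
                if (out.length() > 0) {
                    out.append(' '); // separate chunks with a single space
                }
                out.append(chunk);
            }
        };

        try {
            // 'true' tells the parser to ignore any charset declared in the page.
            new ParserDelegator().parse(new StringReader(html), callback, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String html = "<html><body><p>Hello <a href=\"x.html\">click here</a> world</p></body></html>";
        System.out.println(extract(html));
    }
}
```

The counter, rather than a simple boolean flag, is a defensive choice so that stray or nested <a> tags in malformed pages do not switch text collection back on too early.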
>
> kavorka wrote:
>
> > Hi Oswald,
> > I have another question. In HTMLParser, is it possible to extract only
> > the text of a web page? The stringextractor program also extracts the
> > link text in the page, but I want to extract "pure" text. Can I do that?
> > Thanks,
> > Murat
> >