Re: [Htmlparser-user] extracting only certain links

The second issue that you mentioned is already fixed in release 1.1.
Do you have the latest release?

Regards,
Somik
----- Original Message -----
From: "Raghavender Srimantula" <kin...@ho...>
To: <htm...@li...>
Sent: Wednesday, May 01, 2002 4:15 AM
Subject: Re: [Htmlparser-user] extracting only certain links

> Hi Somik,
> I have tried the URLs www.nba.com and www.yahoo.com, which seem to have a
> lot of links; yahoo.com had 191 links when I tried it. I wrote a small class,
> Parser.java, which I am mailing as an attachment. Every time I run my project
> in JBuilder, it throws an OutOfMemoryError after a series of parses. Since
> I am using JBuilder, I haven't set any -ms or -mx parameters to run
> Parser.java, so you might want to try it out.
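
Before tuning the heap options mentioned above, it can help to see what limits the JVM is actually running with. The following is a minimal sketch, not part of the original thread; the class name HeapReport is made up for illustration, and it assumes a modern JVM, where the heap options are spelled -Xms and -Xmx.

```java
class HeapReport {
    // Converts a byte count to whole megabytes.
    static long toMegabytes(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory(): the most memory the JVM will attempt to use
        System.out.println("max heap (MB):   " + toMegabytes(rt.maxMemory()));
        // totalMemory(): memory currently reserved by the JVM
        System.out.println("total heap (MB): " + toMegabytes(rt.totalMemory()));
        // freeMemory(): unused part of the currently reserved memory
        System.out.println("free heap (MB):  " + toMegabytes(rt.freeMemory()));
    }
}
```

Running it from the IDE and again from a command line with an explicit limit (for example "java -Xmx256m HeapReport") shows whether the IDE launches the program with a smaller heap than expected.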
> The other thing I noticed while running Parser.java is that in
> LinkScanner's extractLink() method, for a particular <a> tag, I get
> relativeLink as null. When we then do
> "return (new HTMLLinkProcessor()).extract(relativeLink,url);"
> it throws a NullPointerException in that method, since relativeLink is null.
> The exact place it throws the NullPointerException is
> "if (link.indexOf("http://")==-1 && link.indexOf("mailto:")==-1 && url != null)"
> in the "checkIfLinkIsRelative" method of HTMLLinkProcessor. This could be
> fixed, and I have fixed it, but the OutOfMemoryError seems potentially
> dangerous.
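
The NullPointerException described above comes from calling indexOf() on a null link. A null guard in front of the quoted condition avoids it. The sketch below is only an illustration: RelativeLinkCheck and isRelative are hypothetical names, and this is not the code from HTMLLinkProcessor or the fix that went into release 1.1.

```java
class RelativeLinkCheck {
    // Hypothetical stand-in for the condition quoted above.
    // Returns true when the link carries no scheme of its own
    // and a base URL is available to resolve it against.
    static boolean isRelative(String link, String url) {
        if (link == null) {
            return false; // nothing to resolve, so never dereference it
        }
        return link.indexOf("http://") == -1
                && link.indexOf("mailto:") == -1
                && url != null;
    }

    public static void main(String[] args) {
        String base = "http://www.yahoo.com";
        System.out.println(isRelative(null, base));                 // false
        System.out.println(isRelative("news/index.html", base));    // true
        System.out.println(isRelative("http://www.nba.com", base)); // false
    }
}
```

Treating a null link as "not relative" is one possible choice; a caller could equally skip the tag before it ever reaches this check.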
>
> Thanks,
> Raghav
>
> >From: "Somik Raha" <so...@ya...>
> >Reply-To: htm...@li...
> >To: <htm...@li...>
> >Subject: Re: [Htmlparser-user] extracting only certain links
> >Date: Tue, 30 Apr 2002 11:44:17 +0900
> >
> >Semantic analysis...
> >Write a conditional to process the tag contents. You will have code like
> >this:
> >
> >if (node instanceof HTMLLinkTag) {
> >     HTMLLinkTag linkTag = (HTMLLinkTag) node;
> >     String link = linkTag.getLink();
> >     // keep only the search-result links (srd), not the rd links
> >     if (link != null && link.startsWith("http://srd.yahoo.com")) {
> >         // print the tag or display it however you want
> >     }
> >}
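
The prefix test can also be tried on plain strings, without the parser on the classpath. This is a minimal sketch under that assumption: ResultLinkFilter and keepWithPrefix are made-up names, and the sample URLs are invented for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ResultLinkFilter {
    // Keeps only the links that start with the given prefix; null entries are skipped.
    static List<String> keepWithPrefix(List<String> links, String prefix) {
        List<String> kept = new ArrayList<>();
        for (String link : links) {
            if (link != null && link.startsWith(prefix)) {
                kept.add(link);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Made-up sample links, for illustration only.
        List<String> links = Arrays.asList(
                "http://srd.yahoo.com/result/1",
                "http://rd.yahoo.com/redirect/2",
                null,
                "http://srd.yahoo.com/result/3");
        // Prints the two srd links and drops the rd link and the null entry.
        System.out.println(keepWithPrefix(links, "http://srd.yahoo.com"));
    }
}
```

Using startsWith() rather than indexOf(...) == 0 states the intent directly, and the null check keeps a tag without a usable link from stopping the loop.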
> >
> >Regards
> >Somik
> >----- Original Message -----
> >From: "Sodergren, M.G." <mg...@le...>
> >To: <htm...@li...>
> >Sent: Tuesday, April 30, 2002 2:19 AM
> >Subject: [Htmlparser-user] extracting only certain links
> >
> >
> >Hello.
> >When I enter a URL like http://search.yahoo.com/bin/search?p=SEARCHENTERED
> >(the Yahoo result page for SEARCHENTERED), the program extracts all the
> >links from the HTML page, but I just want it to extract the links that
> >Yahoo returns as the result of my search. For example (with Yahoo), I want
> >all the links beginning with <a href="http://srd.yahoo.com
> >but not the links beginning with <a href="http://rd.yahoo.com/
> >In other words, all the links with srd and not rd.
> >How would I solve this problem? What code do I put, and where?
> >
> >Thanks
> >Mats
> >
> >_______________________________________________
> >Htmlparser-user mailing list
> >Htm...@li...
> >https://lists.sourceforge.net/lists/listinfo/htmlparser-user