I'm trying to modify the sample LinkExtractor application and I'm having difficulty extracting the href attribute.
The sample application LinkExtractor prints each node with:
System.out.println (list.elementAt(i).toHtml());
I get values such as:
<a href="index.htm">Home</a>
Do I use BaseHrefTag getBaseUrl()?
How do I get at the href attribute?
Is there any other sample code?
Great package! Many thanks,
Perren
Check out the wiki sample code: http://htmlparser.sourceforge.net/wiki/index.php/LinkExtraction
I think that is what you are looking for.
Matt
LinkTag.getLink() has already applied any <BASE> URL, so it returns an absolute URL. If you want the raw HREF attribute, use LinkTag.getAttribute ("HREF");
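To make the difference between the two values concrete, here is a minimal sketch that does not use the HTMLParser classes at all. It only imitates the resolution step with java.net.URI from the JDK. The class name HrefDemo, the resolve helper, and the base URL http://example.com/docs/ are assumptions made up for illustration; only the href value index.htm comes from the question.

```java
import java.net.URI;

public class HrefDemo {
    // Resolve a raw href against a base URL. This approximates what an
    // "absolute link" getter returns; it is not HTMLParser's own code.
    static String resolve(String baseUrl, String rawHref) {
        return URI.create(baseUrl).resolve(rawHref).toString();
    }

    public static void main(String[] args) {
        String base = "http://example.com/docs/"; // hypothetical page or <BASE> URL
        String raw = "index.htm";                 // raw HREF as written in the tag

        System.out.println("raw href: " + raw);
        System.out.println("resolved: " + resolve(base, raw));
    }
}
```

With the library itself, the raw value would correspond to LinkTag.getAttribute ("HREF") and the resolved one to LinkTag.getLink(), as described above.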