I tried to scan the page http://java.sun.com/j2se/index.jsp, but the parser does not seem to retrieve the LinkTag nodes. It showed me no LinkTag at all:
if (node instanceof LinkTag)
{
    LinkTag linkTag = (LinkTag) node;
    textArea.setText(textArea.getText() + linkTag.getLink() + "\r\n");
}
When I print all the nodes, some of them look like this:
<TD><A HREF="http://developers.sun.com">developers.sun.com</A></TD>
Perhaps the link is nested inside the <TD> tag, so it can't be parsed?
The code I posted above is meant to print each LinkTag, but it didn't print any link at all.
I think you aren't looking at the nodes recursively.
In the parser's output, tags contain other tags, so you need to look in the lists returned from getChildren() as well. See the example code at:
http://htmlparser.sourceforge.net/wiki/index.php/LinkExtraction
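To show what "looking recursively" means, here is a rough, self-contained sketch. It does not use the HTMLParser classes: Tag, href and collectLinks are hypothetical stand-ins made up for this illustration, and the children list plays the role of what getChildren() returns. The point is only that the link check has to be applied to every child list, not just to the top-level nodes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class LinkWalk {
    /** Hypothetical stand-in for a parsed tag: a name, an optional href, and child tags. */
    static class Tag {
        final String name;
        final String href; // null unless this tag is a link
        final List<Tag> children = new ArrayList<>();

        Tag(String name, String href) {
            this.name = name;
            this.href = href;
        }

        Tag add(Tag child) {
            children.add(child);
            return this;
        }
    }

    /** Collects the href of every link tag, however deeply it is nested. */
    static List<String> collectLinks(List<Tag> nodes) {
        List<String> links = new ArrayList<>();
        collect(nodes, links);
        return links;
    }

    private static void collect(List<Tag> nodes, List<String> out) {
        for (Tag tag : nodes) {
            if (tag.href != null) {
                out.add(tag.href);
            }
            // Descend into the children; a top-level-only loop stops at <TD> here.
            collect(tag.children, out);
        }
    }

    public static void main(String[] args) {
        // Mirrors <TD><A HREF="http://developers.sun.com">...</A></TD>
        Tag td = new Tag("TD", null).add(new Tag("A", "http://developers.sun.com"));
        for (String link : collectLinks(Collections.singletonList(td))) {
            System.out.println(link);
        }
    }
}
```

With the real parser the same shape should apply: test each node with instanceof LinkTag, then repeat the test on the list returned from getChildren() when that list is not null.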