Hi,
I am trying to extract the links from a table. As you suggested, Derrick, I can now access the table. What I actually need is a bit different, though: I want the links, but I also want the parent of each link, so that I can group the links based on the size of their parents. Links with the same parent should go in the same group.
So, how do I do that?
Thanks in advance, and cheers to HTMLParser.
Every node has the getStartPosition() and getEndPosition() methods. The value returned is the character (not byte) offset into the HTML content.
I don't think I can help you write your program, but perhaps using the size isn't the right way to do it.
Every link will have the same parent somewhere up the tree (recursively), since all nodes are contained by the <HTML> tag and hence have it as a parent (of a parent, of a parent, and so on).
Besides a recursive option, the HasParentFilter takes another filter as a parameter, which could be an IsEqualFilter that is true only for one specific node. You will need to determine which node it is that you are looking for.
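In case it helps, here is a rough sketch of grouping by immediate parent rather than by size. It deliberately does not use the HTMLParser classes: SketchNode and LinkGrouper are made-up stand-ins so the example runs on its own. With the real library you would presumably read the parent from each link node and take the offsets from getStartPosition() and getEndPosition(); treat that mapping as an assumption to check against the javadoc.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a parsed node; NOT the HTMLParser Node interface.
class SketchNode {
    final String name;        // tag name, e.g. "a" or "tr"
    final int start;          // character offset where the node begins
    final int end;            // character offset where the node ends
    final SketchNode parent;  // null for the root

    SketchNode(String name, int start, int end, SketchNode parent) {
        this.name = name;
        this.start = start;
        this.end = end;
        this.parent = parent;
    }

    // "Size" in the sense of the question: number of characters spanned.
    int size() {
        return end - start;
    }
}

class LinkGrouper {
    // Groups links by the identity of their immediate parent.
    // SketchNode does not override equals/hashCode, so two parents that
    // merely have the same size are still treated as different groups.
    static Map<SketchNode, List<SketchNode>> groupByParent(List<SketchNode> links) {
        Map<SketchNode, List<SketchNode>> groups = new LinkedHashMap<>();
        for (SketchNode link : links) {
            groups.computeIfAbsent(link.parent, k -> new ArrayList<>()).add(link);
        }
        return groups;
    }
}
```

Keying on the parent node itself avoids the problem that two unrelated parents can happen to have the same size. If you really do want size-based groups, you could key the map on parent.size() instead.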