Re: [Htmlparser-developer] about lexer.getNextNode().getParent()

The parent field points to the enclosing composite tag, but composite tags 
are *not* returned by the lexer. The lexer produces a linear stream of 
simple lexemes with no composite structure, so a node that comes from the 
lexer has no parent set and getParent() returns null. You would need to use 
a parser instead.
That is, in the example <A href="yadda"><IMG href="baffa"></A>, the 
image tag has the link tag as its parent only for nodes produced by the 
parser (this would be one link node with one child). You could use the same 
logic as below, but you would need to dig recursively into each node 
returned to do your checking. If the text is always in a table, you need 
only register the table scanner; there would be less digging to do, since 
all non-table nodes would then be just simple nodes (again with no 
children).
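
A rough sketch of the recursive digging, in case it helps. It uses a made-up 
SimpleNode class in place of the real htmlparser types, because the actual 
Parser and Node calls depend on the version you have; SimpleNode, find() and 
sampleTable() are illustrative names, not library API. The tree is built by 
hand to resemble what a parser might return for the table in your message.

```java
import java.util.ArrayList;
import java.util.List;

public class ParentLookupSketch {

    /** Hypothetical stand-in for a parsed node; NOT the htmlparser Node interface. */
    static final class SimpleNode {
        final String text;
        SimpleNode parent;                                  // stays null for top-level nodes
        final List<SimpleNode> children = new ArrayList<>();

        SimpleNode(String text) {
            this.text = text;
        }

        /** Attaches a child, sets its parent pointer, and returns this node. */
        SimpleNode add(SimpleNode child) {
            child.parent = this;
            children.add(child);
            return this;
        }
    }

    /** Depth-first search for the first node whose text contains the needle. */
    static SimpleNode find(SimpleNode node, String needle) {
        if (node.text.contains(needle)) {
            return node;
        }
        for (SimpleNode child : node.children) {
            SimpleNode hit = find(child, needle);
            if (hit != null) {
                return hit;
            }
        }
        return null;
    }

    /** Hand-built tree, assumed to resemble a parser's output for the sample table. */
    static SimpleNode sampleTable() {
        SimpleNode b = new SimpleNode("b").add(new SimpleNode("Closing Time"));
        SimpleNode p = new SimpleNode("p class=tablehead").add(b);
        SimpleNode td = new SimpleNode("td").add(p);
        SimpleNode tr = new SimpleNode("tr").add(td);
        return new SimpleNode("table").add(tr);
    }

    public static void main(String[] args) {
        SimpleNode hit = find(sampleTable(), "Closing Time");
        // Check for null before dereferencing: a top-level node has no parent.
        if (hit != null && hit.parent != null) {
            System.out.println("parent node=" + hit.parent.text);
        }
    }
}
```

The null check before using the parent is the part worth carrying over to 
your loop: a node with no enclosing composite tag has nothing to return.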

Derrick

du du wrote:

> Hello everyone:
>  
> I'd like to locate a specific string in an HTML page and then process 
> the information around it. The page looks like this:
>  
> <html> <head>...</head>
> <body><table>
> <tr><td><p class=tablehead><b>Closing Time</b> </p></td></tr>
> <tr>.....</tr>
> </table>
> </body></html>
> I can locate "Closing Time" and the lexer node that contains it, 
> so I expected to be able to reach its parent node or child nodes from 
> there. But when I use aNode.getParent(), I always get a null pointer 
> error. Part of the code:
>  
> ...
> Node aNode = lexer.nextNode();
>       Node bNode;
>       while(aNode != null){
>         if (aNode.getText().indexOf("Closing Time")!=-1){
>           bNode = aNode.getParent();
>           System.out.println("current node="+bNode.getText());
>         }
>         aNode = lexer.nextNode();
>       }
> ...
>  
> I would really appreciate it if somebody could help.
>  
> henry