Re: [Htmlparser-developer] about lexer.getNextNode().getParent()
From: Derrick O. <Der...@Ro...> - 2003-11-03 22:57:10
The parent field points to the enclosing composite tag, but composite tags are *not* returned by the lexer. The lexer produces a linear stream of simple lexemes with no composite structure, which is why getParent() gives you null there. To get parent links you would need to use a parser.

For example, in <A href="yadda"><IMG href="baffa"></A> the image tag has the link tag as its parent, but only for nodes produced by the parser (which would return one link node with one child).

You could use the same logic as below, but you would need to dig recursively into each node returned to do your checking. If the string is always in a table, you need only register the table scanner. There would then be less digging to do, since all non-table nodes would be just simple nodes (again with no children).

Derrick

du du wrote:

> Hello everyone:
>
> I'd like to locate a specific string in an HTML page and then process
> the information around it. The whole scenario looks like:
>
> <html> <head>...</head>
> <body><table>
> <tr><td><p class=tablehead><b>Closing Time</b> </p></td></tr>
> <tr>.....</tr>
> </table>
> </body></html>
>
> In fact, I can locate "Closing Time", as well as its lexer node,
> and thus I should be able to locate its parent node or child nodes. But
> when I use aNode.getParent(), it always throws a null pointer error.
> Part of the code looks like:
>
> ...
> Node aNode = lexer.nextNode();
> Node bNode;
> while(aNode != null){
>     if (aNode.getText().indexOf("Closing Time")!=-1){
>         bNode = aNode.getParent();
>         System.out.println("current node="+bNode.getText());
>     }
>     aNode = lexer.nextNode();
> }
> ...
>
> I would appreciate it very much if somebody could help.
>
> henry
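
As a rough illustration of the recursive digging described in the reply, here is a minimal Java sketch. It deliberately does not use the htmlparser classes: SimpleNode, add(), sampleTree() and findMatches() are hypothetical stand-ins invented for this example, assuming only that a parsed node exposes its text, its parent and its children. The tree built in sampleTree() mirrors the table from the question.

```java
import java.util.ArrayList;
import java.util.List;

class ParentSearchSketch {

    // Hypothetical stand-in for a parsed node; NOT the htmlparser Node type.
    static final class SimpleNode {
        final String text;
        SimpleNode parent;  // stays null for a top-level node
        final List<SimpleNode> children = new ArrayList<>();

        SimpleNode(String text) {
            this.text = text;
        }

        // Attach a child, set its parent link, and return this for chaining.
        SimpleNode add(SimpleNode child) {
            child.parent = this;
            children.add(child);
            return this;
        }
    }

    // Depth-first search: collect every node whose own text contains needle.
    static void findMatches(SimpleNode node, String needle, List<SimpleNode> out) {
        if (node.text.contains(needle)) {
            out.add(node);
        }
        for (SimpleNode child : node.children) {
            findMatches(child, needle, out);
        }
    }

    // Assumed shape of the parsed table from the question.
    static SimpleNode sampleTree() {
        return new SimpleNode("table").add(
            new SimpleNode("tr").add(
                new SimpleNode("td").add(
                    new SimpleNode("p class=tablehead").add(
                        new SimpleNode("b").add(
                            new SimpleNode("Closing Time"))))));
    }

    public static void main(String[] args) {
        List<SimpleNode> hits = new ArrayList<>();
        findMatches(sampleTree(), "Closing Time", hits);
        for (SimpleNode hit : hits) {
            // Guard against a missing parent instead of dereferencing null.
            String parentText = (hit.parent == null) ? "(none)" : hit.parent.text;
            System.out.println("current node=" + hit.text + ", parent=" + parentText);
        }
    }
}
```

The null check in main() is the important design choice: a top-level node has no parent, so the parent should be tested before calling anything on it. With the real library the same pattern would apply to the nodes returned by the parser, using whatever accessors that version provides for children and parent.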