[Htmlparser-user] HTMLParser 1.6 : Unexpected behavior in getNext/getPrevSibling()
From: Madhur K. T. <mad...@gm...> - 2005-12-07 12:21:24
Hi,

I'm facing a problem using HTMLParser 1.6 (integration release) to parse an HTML document, described here. I'm using the getNextSibling and getPreviousSibling methods from the new Node interface to move backward and forward from a text node.

The snippet of the HTML page causing the problem is here (table tag inserted into a body tag).

><body>
><TABLE WIDTH="651" CELLPADDING="0" CELLSPACING="0" BORDER="0"> <TR VALIGN="TOP"> <TD BGCOLOR="#FFFFFF" ALIGN="LEFT"> <FONT face="helvetica, arial" size="1">
><IMG SRC="http://www.comics.com/comics/dilbert/daily_dilbert/images/bullet2.gif" WIDTH="14" HEIGHT="11" ALT="" BORDER="0">
><A HREF="https://members.comics.com/members/registration/showDilbertLogin.do?aid=1" target="_blank"> Unsubscribe </A>/
><A HREF="https://members.comics.com/members/registration/showDilbertLogin.do?aid=1" target="_blank"
>> Modify </A></FONT></TD></TR></TABLE></body>

The code that I am using is as follows (in my custom visitor class):

>public void visitStringNode(Text string) {
>    if(string.getText().contains("Unsubscribe")) {
>        Node prevSibling = string; //.getPreviousSibling();
>        while(prevSibling != null) {
>            System.out.println("Prev Sibling " + prevSibling);
>            prevSibling = prevSibling.getPreviousSibling();
>        }
>
>        Node nextSibling = string;
>        while(nextSibling != null) {
>            System.out.println("Next Sibling " + nextSibling);
>            nextSibling = nextSibling.getNextSibling();
>        }
>    }
>}

However, the output that is seen when the code runs is as follows:

>String : Unsubscribe
>Prev Sibling Txt (389[3,100],402[3,113]): Unsubscribe
>Next Sibling Txt (389[3,100],402[3,113]): Unsubscribe

Each loop prints only the text node itself, so both getPreviousSibling() and getNextSibling() return null for it. I expected that the parser would treat the <A> tag and the <IMG> tag just before the text "Unsubscribe" as siblings and would return those.

Could you please tell me where I'm going wrong? Or is it that the parser is not correctly getting the siblings?

Thanks,
--
Madhur Kumar Tanwani
"If opportunity knocks only once then build more doors"......
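
One possible reading of the output above, offered as a sketch and not as a confirmed diagnosis: if the parser nests the text "Unsubscribe" inside its <A> tag, the text is that tag's only child and has no siblings, while the <IMG> is a sibling of the <A> tag itself. The Java below models this with a hypothetical stand-in class N (it is not the HTMLParser Node interface, and it leaves out any whitespace text nodes a real parse might produce between tags).

```java
import java.util.ArrayList;
import java.util.List;

public class SiblingSketch {

    /** Hypothetical stand-in for a parsed node; NOT org.htmlparser.Node. */
    public static final class N {
        public final String label;
        public N parent;
        public final List<N> children = new ArrayList<>();

        public N(String label) { this.label = label; }

        /** Appends a child and returns this node so calls can be chained. */
        public N add(N child) {
            child.parent = this;
            children.add(child);
            return this;
        }

        /** Node before this one under the same parent, or null. */
        public N previousSibling() {
            if (parent == null) return null;
            int i = parent.children.indexOf(this);
            return i > 0 ? parent.children.get(i - 1) : null;
        }

        /** Node after this one under the same parent, or null. */
        public N nextSibling() {
            if (parent == null) return null;
            int i = parent.children.indexOf(this);
            return i < parent.children.size() - 1 ? parent.children.get(i + 1) : null;
        }
    }

    /** Builds FONT -> [IMG, A -> [Unsubscribe], "/", A -> [Modify]] and returns the "Unsubscribe" text. */
    public static N buildSample() {
        N unsubscribeText = new N("Unsubscribe");
        N unsubscribeLink = new N("A").add(unsubscribeText);
        N modifyLink = new N("A").add(new N("Modify"));
        new N("FONT").add(new N("IMG")).add(unsubscribeLink).add(new N("/")).add(modifyLink);
        return unsubscribeText;
    }

    static String label(N n) { return n == null ? "null" : n.label; }

    public static void main(String[] args) {
        N text = buildSample();
        // The text is the sole child of its <A>, so it has no siblings of its own.
        System.out.println("text prev:   " + label(text.previousSibling()));
        System.out.println("text next:   " + label(text.nextSibling()));
        // Stepping up to the enclosing <A> first reaches the neighbouring nodes.
        System.out.println("parent prev: " + label(text.parent.previousSibling()));
        System.out.println("parent next: " + label(text.parent.nextSibling()));
    }
}
```

Under that assumption, the visitor in the post would need to step up from the text node to its enclosing tag before walking siblings to reach the <IMG> and the second <A>.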