Re: [Htmlparser-user] Parsing Partial HTML text
From: Derrick O. <der...@ro...> - 2007-09-27 01:15:32
The tbody tag you are getting is a generic tag - because the parser doesn't know about tbody.
Hence it has no children because it is not a composite node.
You can make your own tbody composite node as described here: http://htmlparser.sourceforge.net/faq.html#composite


----- Original Message ----
From: "mic...@no..." <michaeld.jones@novartis.com>
To: htm...@li...
Sent: Wednesday, September 26, 2007 10:30:45 AM
Subject: [Htmlparser-user] Parsing Partial HTML text

I am having trouble parsing html tagged text. It seems that I can retrieve a node but that element does not have the child nodes as expected.

    String table =
        "<tbody>\n" +
        "<tr>\n" +
        "<td><span>brain_normal_GSM80627</span></td>\n" +
        "<td><span>normal</span></td>\n" +
        "<td><span>cerebral cortex</span></td>\n" +
        "<td><span>brain</span></td>\n" +
        "</tr>\n" +
        "</tbody>\n";

    Parser parser = new Parser(new Lexer(table));
    try {
        Node tBodyNode = parser.extractAllNodesThatMatch(new TagNameFilter("tbody")).elementAt(0);
        System.out.println(tBodyNode.getChildren()); // Prints null <---------------
    } catch (ParserException e) {
        e.printStackTrace(); //To change body of catch statement use File | Settings | File Templates.
    }

Does HTML Parser not handle text input or partial html files well?
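
To illustrate the point in the reply (a tag the parser does not know as composite ends up as a flat, childless node), here is a small toy model in plain Java. It is not HTML Parser's implementation and uses none of its classes: ToyNode, ToyParser and the compositeNames set are made-up names for this sketch, and it assumes well-formed input and ignores text and attributes. In the library itself the corresponding step is, as far as I know, a CompositeTag subclass registered with the parser's node factory; the FAQ entry linked in the reply is the authority on that.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Toy node (hypothetical): children stays null for generic, non-composite tags. */
class ToyNode {
    final String name;
    final List<ToyNode> children;

    ToyNode(String name, boolean composite) {
        this.name = name;
        this.children = composite ? new ArrayList<ToyNode>() : null;
    }
}

/** Toy parser (hypothetical): only tags registered as composite collect children. */
class ToyParser {
    private static final Pattern TAG = Pattern.compile("<(/?)([a-zA-Z]+)[^>]*>");

    /** Scans tags only (text is ignored) and returns the top-level nodes. */
    static List<ToyNode> parse(String html, Set<String> compositeNames) {
        List<ToyNode> top = new ArrayList<ToyNode>();
        Deque<ToyNode> open = new ArrayDeque<ToyNode>(); // currently open composite tags
        Matcher m = TAG.matcher(html);
        while (m.find()) {
            boolean closing = !m.group(1).isEmpty();
            final String name = m.group(2).toLowerCase();
            if (closing) {
                // Close the innermost open composite with this name, if there is one.
                // End tags of generic tags are simply dropped in this toy model.
                if (open.stream().anyMatch(n -> n.name.equals(name))) {
                    ToyNode popped;
                    do {
                        popped = open.pop();
                    } while (!popped.name.equals(name));
                }
                continue;
            }
            ToyNode node = new ToyNode(name, compositeNames.contains(name));
            if (open.isEmpty()) {
                top.add(node);
            } else {
                open.peek().children.add(node);
            }
            if (node.children != null) {
                open.push(node); // only composite tags can receive children
            }
        }
        return top;
    }

    public static void main(String[] args) {
        String html = "<tbody>\n<tr>\n<td><span>brain</span></td>\n</tr>\n</tbody>\n";

        // tbody unknown: it is a flat node and tr becomes its sibling.
        Set<String> known = new HashSet<String>(Arrays.asList("tr", "td", "span"));
        System.out.println(parse(html, known).get(0).children); // prints null

        // tbody registered as composite: tr is now nested inside it.
        known.add("tbody");
        System.out.println(parse(html, known).get(0).children.size()); // prints 1
    }
}
```

With tbody missing from the set, the first node has null children, which mirrors the "Prints null" in the original question; once tbody is in the set, the tr is nested under it. So the partial HTML input is not the problem; what matters is which tags the parser treats as composite.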