Re: [Htmlparser-user] Not CompositeNode ("ADDRESS", "CENTER" TAG, etc...)
From: Derrick O. <Der...@Ro...> - 2006-02-07 18:34:13
Not every tag has its own composite class because, in practice, the stricter rule of treating every tag as a composite tag fails to parse the bad HTML found in the wild.
You are welcome to try replacing the default tag prototype (see PrototypicalNodeFactory.setTagPrototype()) with a composite tag that is closed by the end tag of the same name, but my guess is that it will parse very poorly.

加藤 千典 wrote:

>Hi, all.
>
>I notice that correct way.
>
>I created a AddressTag.java that is almost copy of ParagraphTag.java
>
>And add same code like this.
>
>    PrototypicalNodeFactory factory = new PrototypicalNodeFactory();
>    factory.registerTag (new AddressTag());
>    parser.setNodeFactory (factory);
>
>It's ok, but Should I create another alot of HTML tag classes ?
>
>I think that there are almost Html Tag classes already.
>How can I get ?
>
>Thank you, all.
>
>>Hi, all.
>>
>>I parsed a html, and create a dom , using
>>HTMLParser Version 1.6 (Integration Build Nov 12, 2005)
>>
>>The "P" tag has "P" END TAG as child.
>>(It's is same at "HEAD", "TITLE", "BODY", etc...)
>>
>>The othe hand, there are 2 "ADDRESS" Tag ("ADDRESS" and "/ADDRESS")
>>on the same level in dom.
>>(It's the same thing at "CENTER" tag.)
>>
>>I expected that ADDRESS tag become like "P" tag, but not.
>>
>>Why the reason ?
>>
>>How can I that the paser recognize ADDRESS tag as a single
>>CompositeTag.
>>
>>Thank you, all. Sorry my poor english.
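
The difference discussed in this thread can be pictured with a small toy model in plain Java. This is only a sketch and does not use or reproduce the HTMLParser classes: CompositeSketch, Node, register() and parse() are made-up names, and the flat list of upper-case tag names stands in for the lexer output. A name that has been registered collects everything up to its matching end tag as children; any other name becomes a plain node, so its end tag lands on the same level.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model only: none of these types belong to the HTMLParser library.
class CompositeSketch {
    // A node is a tag name plus a (possibly empty) list of children.
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();
        Node(String name) { this.name = name; }
        @Override public String toString() {
            return children.isEmpty() ? name : name + children;
        }
    }

    // Names treated as composite; a stand-in for registered tag prototypes.
    private final Set<String> composite = new HashSet<>();

    void register(String name) { composite.add(name.toUpperCase()); }

    // Builds a tree from a flat token stream such as "P", "/P", "ADDRESS".
    List<Node> parse(List<String> tokens) {
        List<Node> out = new ArrayList<>();
        int[] pos = {0};
        while (pos[0] < tokens.size()) {
            out.add(next(tokens, pos));
        }
        return out;
    }

    private Node next(List<String> tokens, int[] pos) {
        Node node = new Node(tokens.get(pos[0]++));
        if (!composite.contains(node.name)) {
            return node; // unregistered name: plain node without children
        }
        String end = "/" + node.name;
        while (pos[0] < tokens.size()) {
            Node child = next(tokens, pos);
            node.children.add(child);
            if (child.name.equals(end)) {
                break; // the matching end tag closes the composite
            }
        }
        return node;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("P", "/P", "ADDRESS", "/ADDRESS");
        CompositeSketch parser = new CompositeSketch();
        parser.register("P");
        System.out.println(parser.parse(tokens)); // [P[/P], ADDRESS, /ADDRESS]
        parser.register("ADDRESS");
        System.out.println(parser.parse(tokens)); // [P[/P], ADDRESS[/ADDRESS]]
    }
}
```

The first printed line mirrors the tree described in the original question (P holds its end tag, ADDRESS and /ADDRESS are siblings), and the second shows the effect of registering one more name. The model assumes well-formed input; with a missing end tag a registered name would swallow the rest of the stream, which hints at why making every tag composite goes badly on real pages.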