Re: [Htmlparser-user] Elements and text nodes
From: Derrick O. <Der...@Ro...> - 2006-04-17 11:48:04
Bastian,

The CODE tag has not been added as a subclass of CompositeTag, so you're getting the default behaviour: just a simple TagNode that has the name CODE. Perhaps the 'phrase elements' (EM, STRONG, DFN, CODE, SAMP, KBD, VAR, CITE, ABBR, and ACRONYM; see http://www.w3.org/TR/html4/struct/text.html#h-9.2.1) should be added.

You can raise this as a request for enhancement (RFE), or you can do it yourself: copy another tag based on CompositeTag, edit it a bit, and then register the new tag with the PrototypicalNodeFactory:

    PrototypicalNodeFactory factory = new PrototypicalNodeFactory ();
    factory.registerTag (new MyCodeTag ());
    parser.setNodeFactory (factory);

See for example PrototypicalNodeFactory.registerTags().

The problem becomes detecting when the tag doesn't have a </CODE> like it should, so getEnders() and getEndTagEnders() should probably have all the block level tag names.

Derrick

Bastian Hoesch wrote:
> Hello,
>
> given this text string
>
> "<html><body><a href="xy">test</a></body></html>"
>
> HTMLParser creates this nodelist:
>
> Tag (0[0,0],6[0,6]): html
> Tag (6[0,6],12[0,12]): body
> Tag (12[0,12],25[0,25]): a href="xy"
> Txt (25[0,25],29[0,29]): test
> End (29[0,29],33[0,33]): /a
> End (33[0,33],40[0,40]): /body
> End (40[0,40],47[0,47]): /html
>
> So, the text "test" is a child element of the tag node for the element
> <A>. I like this behaviour and I think that's the correct way to do that.
>
> But:
>
> from this text string
>
> "<html><body><code>test</code></body></html>"
>
> the parser creates the following node list:
>
> Tag (0[0,0],6[0,6]): html
> Tag (6[0,6],12[0,12]): body
> Tag (12[0,12],18[0,18]): code
> Txt (18[0,18],22[0,22]): test
> End (22[0,22],29[0,29]): /code
> End (29[0,29],36[0,36]): /body
> End (36[0,36],43[0,43]): /html
>
> so, the text "test" is not a child element of the tag <code>.
> Why does this happen? Is it a bug or feature?
>
> Thank you for your help,
>
> greetings
> Bastian Hoesch
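
For readers who want to see why registration changes the shape of the tree, here is a minimal toy model of the mechanism described in the reply. It is a sketch only and does not use HTMLParser: the class ToyTagParser and its methods registerComposite, addEndTagEnder, parse and render are hypothetical names invented for illustration, and the input is assumed to be already split into tag and text tokens. A tag that is registered as composite collects the following nodes as children until its own end tag; an unregistered tag stays a flat node; and an "end tag ender" closes a composite whose own end tag is missing, which is the role the reply assigns to getEnders() and getEndTagEnders().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Toy model only: this class is hypothetical and is NOT part of HTMLParser.
class ToyTagParser {
    static class Node {
        final String text;                              // "<code>", "test", "</code>", ...
        final List<Node> children = new ArrayList<>();
        Node(String text) { this.text = text; }
    }

    private final Set<String> composite = new HashSet<>();     // tags that own children
    private final Set<String> endTagEnders = new HashSet<>();  // end tags that force a close
    private List<String> tokens;
    private int pos;

    void registerComposite(String name) { composite.add(name.toUpperCase(Locale.ROOT)); }
    void addEndTagEnder(String name) { endTagEnders.add(name.toUpperCase(Locale.ROOT)); }

    List<Node> parse(List<String> input) {
        tokens = input;
        pos = 0;
        return parseUntil(null);
    }

    // Collect nodes until the end tag named `closer` (null at top level).
    private List<Node> parseUntil(String closer) {
        List<Node> out = new ArrayList<>();
        while (pos < tokens.size()) {
            String tok = tokens.get(pos);
            boolean endTag = tok.startsWith("</") && tok.endsWith(">");
            if (closer != null && endTag) {
                String name = nameOf(tok);
                if (name.equals(closer)) { pos++; return out; }   // own end tag: consume it
                if (endTagEnders.contains(name)) { return out; }  // ender: leave it for the caller
            }
            pos++;
            Node node = new Node(tok);
            boolean openTag = !endTag && tok.startsWith("<") && tok.endsWith(">");
            if (openTag && composite.contains(nameOf(tok))) {
                node.children.addAll(parseUntil(nameOf(tok)));
            }
            out.add(node);
        }
        return out;
    }

    // Upper-cased tag name without brackets, slash, or attributes.
    private static String nameOf(String tag) {
        int start = tag.startsWith("</") ? 2 : 1;
        String inner = tag.substring(start, tag.length() - 1).trim();
        int space = inner.indexOf(' ');
        return (space < 0 ? inner : inner.substring(0, space)).toUpperCase(Locale.ROOT);
    }

    // One node per line; children are indented two spaces under their parent.
    static String render(List<Node> nodes, String indent) {
        StringBuilder sb = new StringBuilder();
        for (Node n : nodes) {
            sb.append(indent).append(n.text).append('\n');
            sb.append(render(n.children, indent + "  "));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("<body>", "<code>", "test", "</code>", "</body>");

        ToyTagParser flat = new ToyTagParser();          // nothing registered: flat list
        System.out.print(render(flat.parse(input), ""));

        ToyTagParser nested = new ToyTagParser();        // CODE registered: "test" becomes a child
        nested.registerComposite("CODE");
        nested.addEndTagEnder("BODY");
        System.out.print(render(nested.parse(input), ""));
    }
}
```

In this toy, the end tag of a composite is consumed rather than listed, so the nested output shows "test" indented under <code> with no separate </code> line. With the real library, the equivalent step would be a MyCodeTag class derived from CompositeTag and registered as shown in the reply; the exact methods to override should be checked against an existing tag class in the HTMLParser sources.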