Re: [Htmlparser-user] htmlparser 1.4 html to xml howto - thanks Derrick Oswald
From: moedusa <mo...@in...> - 2004-07-01 10:26:07
Derrick, thank you very much for your answer.
Derrick Oswald wrote:
> Hi,
>
> I think the 'missing information' here is that the node for which
> isEndTag() is true is included as a child of the tag it belongs to.
> So,
> NodeList children = tag.getChildren ();
> // non-null for composite tags like <html> that have contents,
> // null for singletons like <p>
> if (null != children)
> {
>     Node last = children.elementAt (children.size () - 1);
>     if (last instanceof Tag) // usually the case
>     {
>         // true in general; I can't think when it wouldn't be,
>         // since missing end tags are added as virtual end tags
>         boolean isEnd = ((Tag)last).isEndTag ();
>     }
> }
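
For the html-to-xml case, the point above suggests a simple recursion: write the opening tag, recurse into the children, skip any child that is an end tag, and write the closing tag yourself. What follows is only a rough, self-contained sketch of that idea. SketchNode, EndTagSketch and toXml are hypothetical stand-ins invented for illustration, not htmlparser classes, and text escaping and attributes are left out.

```java
import java.util.Arrays;
import java.util.List;

class EndTagSketch {
    /** Hypothetical stand-in for a parsed node; NOT an htmlparser class. */
    static final class SketchNode {
        final String text;               // tag name, or literal text for text nodes
        final boolean isTag;
        final boolean isEndTag;
        final List<SketchNode> children; // null for childless nodes, mirroring getChildren()

        SketchNode(String text, boolean isTag, boolean isEndTag, List<SketchNode> children) {
            this.text = text;
            this.isTag = isTag;
            this.isEndTag = isEndTag;
            this.children = children;
        }

        static SketchNode tag(String name, SketchNode... kids) {
            return new SketchNode(name, true, false, kids.length == 0 ? null : Arrays.asList(kids));
        }

        static SketchNode endTag(String name) {
            return new SketchNode(name, true, true, null);
        }

        static SketchNode text(String s) {
            return new SketchNode(s, false, false, null);
        }
    }

    /** Serializes the tree, assuming each end tag is stored as a child of the tag it closes. */
    static void toXml(SketchNode node, StringBuilder out) {
        if (!node.isTag) {
            out.append(node.text); // no escaping in this sketch
            return;
        }
        if (node.isEndTag) {
            return; // the enclosing tag writes its own closing tag below
        }
        if (node.children == null) {
            out.append('<').append(node.text).append("/>"); // childless tag becomes an empty element
            return;
        }
        out.append('<').append(node.text).append('>');
        for (SketchNode child : node.children) {
            toXml(child, out);
        }
        out.append("</").append(node.text).append('>');
    }

    static String toXml(SketchNode root) {
        StringBuilder out = new StringBuilder();
        toXml(root, out);
        return out.toString();
    }
}
```

Writing the closing tag from the parent, rather than echoing the end-tag child, keeps the output balanced whether the end tag was real or a virtual one.
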
>
> The isEndTag() method is useful when using the Lexer to return nodes in
> a linear fashion (not nested like above).
> In this case, the </whatever> tag has a raw tag name starting with a
> slash, and isEndTag() returns true.
>
> Yes RemarkTag is for comments: <!-- this is a comment -->.
>
> The null return from getChildren() is just to avoid allocating empty
> nodelists, but you're right, we could have a singleton, empty, read-only
> node list that's returned wherever null would otherwise be returned.
> Editing might be more problematic then, but that is surely the rarer case.
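
To make the suggestion concrete, here is a minimal sketch of the shared empty, read-only list idea using plain java.util collections. SketchTag, getChildren and addChild are hypothetical names for illustration only; this is not how htmlparser's NodeList is implemented, and it only assumes that callers would rather iterate over an empty list than test for null.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class EmptyChildrenSketch {
    /** Hypothetical tag holder; NOT an htmlparser class. */
    static final class SketchTag {
        private List<String> children; // stays null until the first child is added

        /** Never returns null: childless tags share one immutable empty list. */
        List<String> getChildren() {
            if (children == null) {
                return Collections.<String>emptyList(); // no allocation per tag
            }
            return Collections.unmodifiableList(children);
        }

        /** Editing goes through the tag, so the backing list is created lazily. */
        void addChild(String child) {
            if (children == null) {
                children = new ArrayList<>();
            }
            children.add(child);
        }
    }
}
```

The trade-off mentioned above shows up here too: the returned list cannot be edited directly, so any modification has to go through a method such as addChild.
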
>
> Derrick