Re: [Htmlparser-user] htmlparser 1.4 html to xml howto - thanks Derrick Oswald

Derrick, thank you very much for your answer.

Derrick Oswald wrote:
> Hi,
> 
> I think the 'missing information' here is that the node that isEndTag() 
> is included as a child of the tag it belongs to.
> So,
>    NodeList children = tag.getChildren ();
>    if (null != children) // true for composite tags like <html> that have
>                          // contents, but not for singletons like <p>
>    {
>        Node last = children.elementAt (children.size () - 1);
>        if (last instanceof Tag) // usually the case
>            ((Tag)last).isEndTag (); // true in general; I can't think when
>                                     // it wouldn't be, since virtual end
>                                     // tags are added for missing ones
>    }
> 
> The isEndTag() method is useful when using the Lexer to return nodes in 
> a linear fashion (not nested like above).
> In this case, the </whatever> tag has a raw tag name starting with a 
> slash, and isEndTag() returns true.
> 
> Yes, RemarkTag is for comments: <!-- this is a comment -->.
> 
> The null return from getChildren() is just to avoid allocating empty 
> node lists, but you're right, we could have a singleton, empty, read-only 
> node list that's returned wherever null would otherwise be. Editing might 
> be more problematic then, but that is surely the rarer case.
> 
> Derrick
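
For anyone reading this in the archive, here is a rough sketch of the "singleton, empty, read-only node list" idea, and of how it would simplify the last-child check quoted above. It does not use htmlparser at all: SketchTag, NO_CHILDREN, addChild and lastChildIsEndTag are names made up for illustration, and java.util.List stands in for NodeList. It only assumes, as described above, that an end tag has a raw name starting with a slash and sits as the last child of its composite tag.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class EmptyChildrenSketch {

    /** Hypothetical stand-in for a parsed tag; NOT the htmlparser Tag class. */
    static class SketchTag {
        // One shared, immutable list used by every tag that has no children.
        private static final List<SketchTag> NO_CHILDREN = Collections.emptyList();

        private final String rawName;
        private List<SketchTag> children = NO_CHILDREN;

        SketchTag(String rawName) {
            this.rawName = rawName;
        }

        /** End tags carry a raw name that starts with a slash, e.g. "/html". */
        boolean isEndTag() {
            return rawName.startsWith("/");
        }

        /** Never returns null; childless tags hand out the shared empty list. */
        List<SketchTag> getChildren() {
            return children == NO_CHILDREN
                    ? NO_CHILDREN
                    : Collections.unmodifiableList(children);
        }

        /** Editing allocates a private list on first use, so the shared one stays empty. */
        void addChild(SketchTag child) {
            if (children == NO_CHILDREN) {
                children = new ArrayList<>();
            }
            children.add(child);
        }
    }

    /** True when the last child is an end tag; no null check is needed. */
    static boolean lastChildIsEndTag(SketchTag tag) {
        List<SketchTag> kids = tag.getChildren();
        return !kids.isEmpty() && kids.get(kids.size() - 1).isEndTag();
    }

    public static void main(String[] args) {
        SketchTag html = new SketchTag("html");
        html.addChild(new SketchTag("body"));
        html.addChild(new SketchTag("/html"));
        SketchTag br = new SketchTag("br");

        System.out.println(lastChildIsEndTag(html)); // true: composite tag closed by </html>
        System.out.println(lastChildIsEndTag(br));   // false: singleton with no children
    }
}
```

The "editing might be more problematic" point shows up in addChild: the shared list can never be modified, so the first edit has to swap in a private list. Callers that only read never pay for an allocation.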