Re: [Htmlparser-developer] Method to check if TextNode is just whitespace
From: Axel <ax...@gm...> - 2005-11-02 22:07:16
On 11/1/05, Ian Macfarlane <ian...@gm...> wrote:
> I was thinking it might be worthwhile adding a method to Text/TextNode
> along the lines of:
>
> boolean isWhiteSpace()
>
> Which would return if the TextNode consisted of solely white space
> characters (or was the empty String).
>
> Now this could simply be done using String.trim().equals(""), however
> that wouldn't account for:
>
> - the non-breaking space character (#160)
> - The HTML code &nbsp; (also &nbsp as Firefox/IE do)
> - The HTML code &#160; (also &#160 as Firefox/IE do)
>
> So my question is, do you think this method should treat those
> as spaces and remove/ignore them also for purposes of determining if
> the TextNode is white space? Or should it only trim normal whitespace
> (space, tab, carriage return, etc.)?
I think isWhiteSpace() should return true if Character#isWhitespace()
is true for every character in the TextNode (with entities first
converted to their Unicode characters).
IMO the TextNode shouldn't be trimmed automatically; only a dedicated
method should do that.
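
A rough sketch of that check, not actual htmlparser code: the class
name WhitespaceCheck and the treatNbspAsSpace flag are made up for
illustration, and the text is assumed to have its entities decoded
already. One thing to keep in mind is that Character.isWhitespace()
returns false for the non-breaking space (U+00A0), so that character
has to be accepted explicitly if it should count as white space.

```java
// Hypothetical helper, not part of the htmlparser API.
final class WhitespaceCheck {

    private WhitespaceCheck() {
    }

    /**
     * Returns true if the text is empty or consists solely of white space.
     *
     * @param text             text whose entities are assumed to be decoded already
     * @param treatNbspAsSpace hypothetical flag: also accept U+00A0, which
     *                         Character.isWhitespace() rejects
     */
    static boolean isWhiteSpace(String text, boolean treatNbspAsSpace) {
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            boolean ok = Character.isWhitespace(cp)
                    || (treatNbspAsSpace && cp == 0x00A0);
            if (!ok) {
                return false; // found a character that is not white space
            }
            i += Character.charCount(cp); // step over surrogate pairs correctly
        }
        return true; // empty text, or white space only
    }
}
```

The text itself is left untouched, so nothing gets trimmed as a side
effect of asking the question.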
--
Axel Kramer
http://www.plog4u.org - Wikipedia Eclipse Plugin