Re: [Htmlparser-developer] Method to check if TextNode is just whitespace
From: Derrick O. <Der...@Ro...> - 2005-11-04 12:48:31
IMHO, the text shouldn't ever be null, but if it is, toHtml() would (should?) return an empty string, so isWhitespace() should also return true.

Ian Macfarlane wrote:

>>Conversion of character references like &nbsp; is already performed by the util.Translate class.
>
>Oh good! No need for me to write it then :)
>
>>There is no &tab; character reference as far as I'm aware
>
>You're right, I just guessed a whitespace entity name, typed it into
>Google and found references to it. Sorry, I ought to have checked it
>out a bit better first.
>
>Derrick, for an isWhiteSpace() method, what do you think it ought to do
>when the String is null?
>
>Ian
>
>On 11/3/05, Derrick Oswald <Der...@ro...> wrote:
>
>>Conversion of character references like &nbsp; is already performed by
>>the util.Translate class.
>>There is no &tab; character reference as far as I'm aware (see
>>http://www.w3.org/TR/REC-html40/sgml/entities.html).
>>
>>Ian Macfarlane wrote:
>>
>>>Thanks for your reply,
>>>
>>>I wasn't suggesting trimming the actual text of the text nodes
>>>permanently, merely wondering if using the trim() method to see if the
>>>resulting string was empty would be sufficient, or whether we should
>>>also look for various white-space HTML entities (e.g. &tab; also) for
>>>purposes of determining this.
>>>
>>>Now I think about it some more, white space alone is probably what we
>>>want to do. If we want to get things like &tab; we ought to write some
>>>sort of method that would replace those types of HTML character
>>>references with the actual characters, if that's feasible.
>>>
>>>The only other question I've got - what do you all think should happen
>>>if the contents of the text node are null? Should it return true
>>>(because there are no characters), false (because it's not actually a
>>>white space String) or throw a NullPointerException (which would
>>>negate the value of this method by forcing the end-user to write lots
>>>of code to use this method)? Can a text node ever be null without the
>>>user changing the text to be null?
>>>
>>>Ian
>>>
>>>String is immutable so String.trim().equals("") won't change the
>>>original String object.
>>>
>>>On 11/2/05, Axel <ax...@gm...> wrote:
>>>
>>>>On 11/1/05, Ian Macfarlane <ian...@gm...> wrote:
>>>>
>>>>>I was thinking it might be worthwhile adding a method to Text/TextNode
>>>>>along the lines of:
>>>>>
>>>>>boolean isWhiteSpace()
>>>>>
>>>>>Which would return true if the TextNode consisted solely of white space
>>>>>characters (or was the empty String).
>>>>>
>>>>>Now this could simply be done using String.trim().equals(""), however
>>>>>that wouldn't account for:
>>>>>
>>>>>- the non-breaking space character (#160)
>>>>>- The HTML code &nbsp; (also &nbsp as Firefox/IE do)
>>>>>- The HTML code &#160; (also &#160 as Firefox/IE do)
>>>>>
>>>>>So my question is, do you think this method should treat those
>>>>>as spaces and remove/ignore them also for purposes of determining if
>>>>>the TextNode is white space? Or should it only trim normal whitespace
>>>>>(space, tab, carriage returns, etc).
>>>>>
>>>>I think, if every character (or entity converted to a
>>>>unicode-character) in the TextNode is true for
>>>>Character#isWhitespace() the boolean isWhiteSpace() should return
>>>>true;
>>>>IMO the TextNode shouldn't be trimmed automatically. Only a special
>>>>function should allow this.
>>>>
>>>>--
>>>>Axel Kramer
>>>>http://www.plog4u.org - Wikipedia Eclipse Plugin
>>>>
>>>>-------------------------------------------------------
>>>>SF.Net email is sponsored by:
>>>>Tame your development challenges with Apache's Geronimo App Server. Download
>>>>it for free - -and be entered to win a 42" plasma tv or your very own
>>>>Sony(tm)PSP. Click here to play: http://sourceforge.net/geronimo.php
>>>>_______________________________________________
>>>>Htmlparser-developer mailing list
>>>>Htm...@li...
>>>>https://lists.sourceforge.net/lists/listinfo/htmlparser-developer
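For anyone following the thread, here is a rough sketch of the check being discussed, not the actual TextNode code. The class name WhitespaceCheck and the static helper are made up for illustration. It assumes the text has already had its character references decoded (the thread says util.Translate handles that), it returns true for null as Derrick suggests, and it counts the non-breaking space (#160) as white space, which is one possible answer to Ian's question rather than something the thread settles. Note that Character.isWhitespace() on its own reports false for the non-breaking space, so it is checked separately.

```java
// Hypothetical helper for illustration only; not part of the htmlparser API.
final class WhitespaceCheck {
    private WhitespaceCheck() {}

    /** Non-breaking space, decimal 160. Character.isWhitespace() returns false for it. */
    private static final char NBSP = (char) 160;

    /**
     * Returns true if the text is null, empty, or made up only of
     * white space characters (including the non-breaking space).
     * Assumes character references were decoded beforehand.
     */
    static boolean isWhitespace(String text) {
        if (text == null) {
            return true; // treated like an empty string, per Derrick's reply
        }
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (!Character.isWhitespace(c) && c != NBSP) {
                return false; // found a visible character
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String nbspOnly = String.valueOf(NBSP);
        System.out.println(isWhitespace(null));     // true
        System.out.println(isWhitespace(" \t\r\n")); // true
        System.out.println(isWhitespace(nbspOnly));  // true
        System.out.println(isWhitespace(" a "));     // false
    }
}
```

Unlike String.trim().equals(""), this never builds a second String, and it follows Axel's suggestion of testing each character with Character#isWhitespace(). Dropping the NBSP comparison gives the "normal whitespace only" behaviour.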