From: Tom J. <tom...@gm...> - 2009-10-06 23:03:54
On Tue, Oct 6, 2009 at 10:06 AM, George Petasis <pe...@ii...> wrote:
>
> TIP #358: SUPPRESS EMPTY LIST ELEMENT GENERATION FROM THE SPLIT COMMAND
> =========================================================================
> Version: $Revision: 1.2 $
> Author: George Petasis <petasis_at_iit.demokritos.gr>
> State: Draft
> Type: Project
> Tcl-Version: 8.7
> Vote: Pending
> Created: Sunday, 04 October 2009
> URL: http://purl.org/tcl/tip/358.html
> WebEdit: http://purl.org/tcl/tip/edit/358
> Post-History:
>
> -------------------------------------------------------------------------
>
> ABSTRACT
> ==========
>
> The *split* command will create empty list elements when adjacent split
> characters are found in the input. In some cases these empty list
> elements are not desired, so this TIP proposes a new switch to disable
> their generation.
>
> RATIONALE
> ===========
>
> The idea for this TIP came from a discussion in comp.lang.tcl:
> [<URL:http://groups.google.gr/group/comp.lang.tcl/browse_thread/thread/8d46b0f10e7a5750/d7844cc739aa4310>]
> and the (non obvious) suggestions on how tokens can be extracted from a
> string can be performed efficiently.
>
> It should be noted that this will allow the *split* command to be used
> in a fashion that is very similar to how splitting works in many other
> languages (e.g., Perl, awk, Unix shells).
>
> SPECIFICATION
> ===============
>
> This TIP proposes a new optional switch (*-noemptyelements*) to the
> *split* command:
>
> *split -noemptyelements* /string/ ?/splitChars/?
>
> If this option is present, then *split* will not produce an empty list
> element when the /string/ contains adjacent characters that are present
> in /splitChars/.

I think that [split] is best reserved for well-formed inputs. In fact, if the split chars are whitespace, then [split] does what most Tcl programmers would consider to be the wrong thing: it creates empty elements between extra whitespace chars.
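
To make the complaint concrete, here is a small sketch of what [split] currently does with repeated whitespace. The sample string is made up for illustration:

```tcl
# "a", two spaces, "b", three spaces, "c": five separator chars in total.
set s "a  b   c"

# With no splitChars argument, [split] splits on every single whitespace
# char, so each extra space contributes an empty element.
set parts [split $s]

puts $parts            ;# a {} b {} {} c
puts [llength $parts]  ;# 6
```

Five separators always give six elements, three of which are empty here.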

The solution could be something more generally useful: maybe whitespace normalization? We have [string trim], so maybe something like [string normalize (whitespace)]. The result would be a string where adjacent internal whitespace chars are collapsed into one space char, and leading and trailing whitespace is removed. Then the problem would be solved like this:

    set mylist [split [string normalize $mystring]]
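
Since [string normalize] does not exist, here is a rough sketch of how the idea could be tried out today in plain Tcl. The proc name normalize_ws and the sample input are invented for illustration only, and this is one possible reading of the behaviour described above, not a proposed implementation:

```tcl
# Hypothetical stand-in for the suggested [string normalize]; the name
# normalize_ws is an assumption, not an existing Tcl command.
proc normalize_ws {str} {
    # Collapse every run of whitespace chars into a single space ...
    regsub -all {\s+} $str { } str
    # ... then drop the one space that may remain at either end.
    return [string trim $str]
}

# Made-up input with leading, trailing and repeated whitespace.
set mystring "  alpha   beta\t\tgamma  "
set mylist [split [normalize_ws $mystring]]

puts $mylist            ;# alpha beta gamma
puts [llength $mylist]  ;# 3
```

This only covers the whitespace case. Collapsing repeats of arbitrary split chars would need a different approach.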