From: SourceForge.net <no...@so...> - 2010-09-10 12:43:52
Bugs item #3063568, was opened at 2010-09-10 12:57
Message generated for change (Comment added) made by dkf
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=110894&aid=3063568&group_id=10894

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.

Category: 43. Regexp
Group: current: 8.6b1
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Jasper (jaspertheperson)
>Assigned to: Donal K. Fellows (dkf)
Summary: regsub example doesn't perform as advertized

Initial Comment:
It's this one:

===========================================
Convert all non-ASCII and Tcl-significant characters into \u escape sequences by using regsub and subst in combination:

# This RE is just a character class for everything "bad"
set RE {[][{};#\\\$\s\u0080-\uffff]}

# We will substitute with a fragment of Tcl script in brackets
set substitution {[format \\\\u%04x [scan "\\&" %c]]}

# Now we apply the substitution to get a subst-string that
# will perform the computational parts of the conversion.
set quoted [subst [regsub -all $RE $string $substitution]]
===========================================

The RE for "bad" characters includes \s which matches all whitespace characters including newline. However, inserting \ before newline before calling subst on a string containing it does not preserve the newline, it causes it to be replaced by a space, so the whole procedure replaces newlines with \u0020

Not sure how to get this to work properly! Little help?

----------------------------------------------------------------------

>Comment By: Donal K. Fellows (dkf)
Date: 2010-09-10 13:43

Message:
Good catch. Newlines need to be handled specially. :-(

----------------------------------------------------------------------
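
One way to handle newlines specially, as the comment suggests, is sketched below. It is only an illustration and not necessarily the fix that was applied: the proc name escapeUnsafe is hypothetical, and the sketch assumes it is acceptable to narrow \s to space, carriage return and tab so that newline never reaches the "\\&" fragment, where backslash-newline would be read as a space. Newlines are instead escaped separately with string map before subst runs.

```tcl
# Hypothetical helper name; not part of Tcl or of the original report.
proc escapeUnsafe {string} {
    # Same class as the reported example, but with \s narrowed to
    # space, CR and tab so that newline is NOT matched here.
    set RE {[][{};#\\\$ \r\t\u0080-\uffff]}

    # Unchanged from the reported example.
    set substitution {[format \\\\u%04x [scan "\\&" %c]]}

    # Escape newlines on their own. After subst collapses the doubled
    # backslash, each newline becomes the literal text \u000a.
    set script [string map [list \n {\\u000a}] \
            [regsub -all $RE $string $substitution]]
    return [subst $script]
}

# The space is escaped as before; the newline is no longer lost.
puts [escapeUnsafe "a b\nc"]
# prints: a\u0020b\u000ac
```

With the original RE, the same input would come out as a\u0020b\u0020c, which is the behaviour described in the initial comment. Other whitespace that \s matched (form feed, vertical tab) is left unescaped by this sketch.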