From: maxwell <ma...@um...> - 2017-11-03 22:23:04
I'm trying to use regexes that include Unicode code points in both the
search and the replace parts. It works fine for search, but not for
replace.
Specifically, I can enter the sequence
\x{016B}\x{0304}
in the "Search for" box in the "Search and Replace" dialog box, or
equivalently
\u016B\u0304
and it finds the characters (ū̄) correctly. (FWIW, this is a
pre-composed u+macron followed by a combining macron, which is of course
redundant...which is why I'm trying to replace it.)
If I then enter the string
\x{016B}
or equivalently
\u016B
in the "Replace with" box in that same "Search and Replace" dialog box,
I would expect the search string in the buffer to be replaced by
ū
But instead it gets replaced by
u016B
I would consider this a bug. If Unicode code points work in the Search
box, they should also work in the Replace box.
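For what it's worth, the same asymmetry can be reproduced outside the editor with plain java.util.regex. This is only a sketch under an assumption the post does not confirm: that the dialog hands both strings, as typed, to Pattern.compile() and Matcher.replaceAll(). The class name and the replaceAll() helper are hypothetical. In java.util.regex the pattern syntax understands \x{h...h} and \uhhhh, but in a replacement string a backslash only makes the next character literal, so \u016B comes out as u016B.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; not taken from any editor's source code.
class ReplaceEscapeDemo {
    // Hypothetical helper: runs a regex replace with both strings passed
    // through exactly as a user would type them into a dialog.
    static String replaceAll(String input, String search, String replacement) {
        return Pattern.compile(search).matcher(input).replaceAll(replacement);
    }

    public static void main(String[] args) {
        // U+016B (u with macron) followed by U+0304 (combining macron).
        String text = "\u016B\u0304";

        // Backslashes are doubled so that the regex engine, not the Java
        // compiler, receives the escape sequences.
        String search = "\\x{016B}\\x{0304}";
        String replacement = "\\u016B";

        // The pattern matches, but the replacement is taken almost
        // literally: the backslash is dropped and "u016B" is inserted.
        System.out.println(replaceAll(text, search, replacement)); // prints u016B
    }
}
```

If that assumption holds, the result would be a consequence of how replacement strings are parsed rather than a problem with the search side.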
Before I submit a bug report, though, I want to find out whether I'm
doing something wrong. I do of course have the "Regular expressions" box
checked in the dialog box (else the search wouldn't work). Does anyone
have any suggestions? Or a reason why this is a feature rather than a
bug?
BTW, there is a workaround in this case. Since the output is contained
in the input, I can use a group in the input
(\u016B)\u0304
and refer to the group in the output:
$1
There are of course situations where that won't work. Alternatively, I
could copy-paste the actual desired output chars from the Character Map
plugin into a temp buffer, and copy them from there into the Replace
box. But I'd rather not :-).
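The group workaround described above can be checked the same way with plain java.util.regex. Again this is a sketch, assuming (unconfirmed) that the dialog passes its strings to Pattern.compile() and Matcher.replaceAll(). The class name and helper are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; not taken from any editor's source code.
class GroupWorkaroundDemo {
    // Hypothetical helper: regex replace with user-typed strings.
    static String replaceAll(String input, String search, String replacement) {
        return Pattern.compile(search).matcher(input).replaceAll(replacement);
    }

    public static void main(String[] args) {
        // U+016B (u with macron) followed by U+0304 (combining macron).
        String text = "\u016B\u0304";

        // Capture the precomposed character, match the extra combining
        // macron after it, and put back only the captured group.
        String result = replaceAll(text, "(\\u016B)\\u0304", "$1");

        // One code unit remains: U+016B on its own.
        System.out.println(result.length()); // prints 1
    }
}
```

This only helps when the desired output already occurs in the matched text, which is the limitation noted above.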
Mike Maxwell