The replace function $1 in a regex search and replace makes the following transcription mistakes:
A carriage return character is substituted for the two character string '\n' in the original string.
A tab character is substituted for the two character string '\t' in the original string.
A carriage return character is substituted for the two character string '$2' in the original string.
regex find => replace sequences
'/\*((\n|[^\n])+?)\*/' => '$1'
Input text
/* Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea -----File: 148.png---\newprofer\tsombody\P3\F1\F2\---- commodo consequat. Duis aute irure dolor in reprehenderit in $2 voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum. */
Output text (used 4 spaces to simulate a tab character)
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea -----File: 148.png---
ewprofer    sombody\P3\F1\F2\---- commodo consequat. Duis aute irure dolor in reprehenderit in
voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
Anonymous
Surely, in a regex search string, 'backslash'-'en' etc. is expected to match the single character 'new-line'.
If you want to find actual text 'backslash'-'en', you have to have 'backslash'-'backslash'-'en' in your search string. That's how regex works.
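To illustrate the escaping rule above, here is a minimal sketch in Python rather than in Guiguts itself (Python's `re` module follows the same convention for these two patterns). The sample text is made up for the illustration: it holds one literal backslash-en pair and one real new-line.

```python
import re

# One literal backslash + "n" (raw string), followed by one real new-line.
text = r"abcdef\nghijkl" + "\n" + "second line"

# The pattern \n matches the single new-line character.
newline_hits = re.findall(r"\n", text)

# The pattern \\n matches the two-character text: backslash, then "n".
literal_hits = re.findall(r"\\n", text)

print(len(newline_hits), len(literal_hits))
```

Each pattern finds exactly one hit, and they are different hits: the first is the line break, the second is the two visible characters.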
Last edit: Tony Browne 2016-11-20
Sorry, should have realized the search string may be confusing.
The search string
will produce the same results in the replacement.
For completeness, please include info about your OS and GG version.
A somewhat simpler test case seems to be:
Input:
abcdef\nghijkl
S:
(bc.*jk)
R:
$1
Expected output is the original text unchanged:
abcdef\nghijkl
Observed output:
hanne,
OS=Windows 10 64 bit
GG=1.0.25
You're correct about the simpler test case. If the string '\t' were substituted, the result would be
input:
abcdef\tghijkl
S:
(bc.*jk)
R:
$1
Expected output is the original text unchanged:
abcdef\tghijkl
Observed output:
abcdef ghijkl
and with $2 substituted the result is
input:
abcdef$2ghijkl
S:
(bc.*jk)
R:
$1
Expected output is the original text unchanged:
abcdef$2ghijkl
Observed output:
abcdef
ghijkl
It doesn't quite seem to work for the $2 case, but here's one:
Input:
abcdef$2ghijkl
S:
(b(c).*jk)
R:
$1
Observed output:
abcdefcghijl
So basically the $2 in the input text gets replaced by whatever's in the second captured expression, and something similar seems to happen for $3 and presumably higher.
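One way this kind of behaviour can arise is if the replacement text is scanned for `$N` a second time after `$1` has already been filled in. The sketch below is only a guess at the mechanism, written in Python and not taken from the Guiguts source; `expand_once` and `expand_twice` are hypothetical names invented for the illustration.

```python
import re

def expand_once(template, match):
    """Substitute $N references found in the template, in a single pass."""
    def group_text(ref):
        n = int(ref.group(1))
        # Unknown or unmatched groups expand to nothing in this sketch.
        if n > match.re.groups or match.group(n) is None:
            return ""
        return match.group(n)
    return re.sub(r"\$(\d)", group_text, template)

def expand_twice(template, match):
    """Hypothetical faulty expander: rescans the already expanded text."""
    return expand_once(expand_once(template, match), match)

text = "abcdef$2ghijkl"
match = re.search(r"(b(c).*jk)", text)

def replace_with(expander):
    # Splice the expanded replacement back into the surrounding text.
    return text[:match.start()] + expander("$1", match) + text[match.end():]

correct = replace_with(expand_once)   # text comes back unchanged
faulty = replace_with(expand_twice)   # the "$2" in the text becomes "c"
print(correct)
print(faulty)
```

With a single pass the captured text is inserted verbatim; with the second pass the `$2` that came from the document is treated as a group reference and turns into the contents of the second capture.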
Richard - please be very careful when you're copy-pasting error reports. I feel pretty sure you haven't actually observed the output you claim for the $2 case...
It may be of interest that the problem doesn't seem to arise with ReplaceAll, only with an individual Replace.
My OS=Windows10(original); no difference between current standard GG1.0.25 & my private version sometimes identified as 1.0.26 (as known to hanne).
Try this
Input:
/*
abcdef$2ghijkl*/
S:
/\*\n((\n|[^\n])+)\*/
R:
$1
Observed output:
abcdef
ghijl
I tested this and it happens every time.
Suggest removing the proofers information before running your RegEx with:
search:
(-----File: \d+.png---).+?$
replace:
$1
Wayne
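For anyone who wants to try the separator clean-up outside Guiguts, here is a rough Python equivalent of the search and replace above. It is a sketch, not the Guiguts implementation: the sample page text is made up, the dot before `png` is escaped as a small tightening, and `re.MULTILINE` is assumed so that `$` stops at the end of each line.

```python
import re

# Made-up sample: a page separator with proofer names, then body text.
page = (
    "-----File: 148.png---\\newprofer\\tsombody\\P3\\F1\\F2\\----\n"
    "commodo consequat."
)

# Keep the separator itself (group 1) and drop the rest of that line.
SEPARATOR = re.compile(r"(-----File: \d+\.png---).+?$", re.MULTILINE)

cleaned = SEPARATOR.sub(r"\1", page)
print(cleaned)
```

Only the trailing proofer information on the separator line is removed; the following lines are left alone.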
Actually I do this before running my regex:
search
\[nt]
replace:
\
Then search
\$
replace
¥
The last thing is to replace the yen with dollars after I've finished all processing.
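The same mask-and-restore idea can be sketched in Python. This is only an illustration of the workaround described above, under two assumptions: the text contains no yen signs of its own, and losing the `n` or `t` after a backslash is acceptable (the first step is not reversible). The function names `mask` and `unmask` are invented for the example.

```python
import re

def mask(text):
    """Hide the sequences that get mangled by the regex replace."""
    # Backslash followed by n or t becomes a lone backslash (lossy).
    text = re.sub(r"\\[nt]", lambda m: "\\", text)
    # Dollar signs are parked as yen signs until processing is done.
    return text.replace("$", "¥")

def unmask(text):
    """Restore the dollar signs after all other processing."""
    return text.replace("¥", "$")

original = r"\newprofer\tsombody in $2 voluptate"
masked = mask(original)
restored = unmask(masked)
print(masked)
print(restored)
```

Any regex processing would run between `mask` and `unmask`, at which point the text holds neither `\n`, `\t`, nor `$` for the replace step to misread.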
The only problem with replacing only the n or t is the large number of other characters that affect the text.
http://www.pgdp.net/wiki/Regex_Cookbook#List_of_Ingredients.2C_or.2C_how_to_read_regex_code
I don't recall seeing any backslashes except in the proofer/formatter tags at the page numbers.
Regards,
Wayne
----- Original Message -----
From: "Richard Tonsing" okrick@users.sf.net
To: "[guiguts:bugs]" 134@bugs.guiguts.p.re.sf.net
Sent: Monday, December 19, 2016 12:44:43 AM
Subject: [guiguts:bugs] Re: #134 Replace error with regex Search and replace of \n, \t, & $2.
[bugs:#134] Replace error with regex Search and replace of \n, \t, & $2.
Status: open
Group: v1.0.26
Created: Sun Nov 20, 2016 06:42 PM UTC by Richard Tonsing
Last Updated: Mon Dec 19, 2016 01:48 AM UTC
Owner: nobody
Actually the first search term should have been
\\[nt]
Odd, I see now how that happened. I copied the term with the double backslash and pasted it here, but one of the backslashes disappears when pasting. Must be a SourceForge error-checking thing. Anyway, I search for a backslash followed by either a t or an n and replace it with a backslash. I've never encountered that combination in the text.
Should have caught the error, but I guess I expected to see two backslashes, so that's what I thought I saw.
Sorry about the confusion.