
#134 Replace error with regex Search and replace of \n, \t, & $2.

Group: v1.0.26
Status: open
Owner: nobody
Labels: None
Priority: 5
Updated: 2016-12-19
Created: 2016-11-20
Private: No

The replace function $1 in a regex search and replace makes the following transcription mistakes:

A carriage return character is substituted for the two-character string '\n' in the original string.
A tab character is substituted for the two-character string '\t' in the original string.
A carriage return character is substituted for the two-character string '$2' in the original string.

regex find => replace sequences

'/\*((\n|[^\n])+?)\*/'                   =>           '$1'

Input text

/*
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea
-----File: 148.png---\newprofer\tsombody\P3\F1\F2\----
commodo consequat. Duis aute irure dolor in reprehenderit in $2 voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
*/

Output text (used 4 spaces to simulate a tab character)

Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea
-----File: 148.png---
ewprofer    sombody\P3\F1\F2\----
commodo consequat. Duis aute irure dolor in reprehenderit in 
voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.


Discussion

  • Tony Browne

    Tony Browne - 2016-11-20

    Surely, in a regex search string, 'backslash'-'en' etc is expected to match the single character 'new-line'.
    If you want to find actual text 'backslash'-'en', you have to have 'backslash'-'backslash'-'en' in your search string. That's how regex works.
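The distinction can be checked in a few lines. This is a minimal sketch using Python's `re` module as a stand-in; Guiguts itself is written in Perl, so the sample string and variable names here are illustrative assumptions, not anything taken from the program.

```python
import re

# The input holds a literal backslash followed by 'n' (two characters),
# not a newline character.
text = "abcdef\\nghijkl"

# The pattern backslash-n matches a newline character, so it finds nothing here.
newline_hits = re.findall(r"\n", text)

# The pattern backslash-backslash-n matches the literal two-character sequence.
literal_hits = re.findall(r"\\n", text)

print(newline_hits)   # []
print(literal_hits)   # one hit: a backslash followed by 'n'
```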

     

    Last edit: Tony Browne 2016-11-20
  • Anonymous

    Anonymous - 2016-11-20

    Sorry, should have realized the search string may be confusing.

    The search string

    '/\*(([^*])+?)\*/'                   =>           '$1'  
    

    will produce the same results in the replacement.

     
  • hannne

    hannne - 2016-11-20

    For completeness, please include info about your OS and GG version.

    A somewhat simpler test case seems to be:
    Input:
    abcdef\nghijkl
    S:
    (bc.*jk)
    R:
    $1
    Expected output is the original text unchanged:
    abcdef\nghijkl
    Observed output:

    abcdef
    ghijkl
    
     
  • Richard Tonsing

    Richard Tonsing - 2016-11-20

    hanne,

    OS=Windows 10 64 bit
    GG=1.0.25

    You're correct about the simpler test case. If the string '\t' were substituted the result would be

    input:
    abcdef\tghijkl
    S:
    (bc.*jk)
    R:
    $1
    Expected output is the original text unchanged:
    abcdef\tghijkl
    Observed output:

    abcdef ghijkl

    and with $2 substituted the result is

    input:
    abcdef$2ghijkl
    S:
    (bc.*jk)
    R:
    $1
    Expected output is the original text unchanged:
    abcdef$2ghijkl
    Observed output:

    abcdef
    ghijkl

     
  • hannne

    hannne - 2016-11-20

    It doesn't quite seem to work for the $2 case, but here's one:
    Input:
    abcdef$2ghijkl
    S:
    (b(c).*jk)
    R:
    $1
    Observed output:
    abcdefcghijkl
    So basically the $2 in the input text gets replaced by whatever's in the second captured expression, and something similar seems to happen for $3 and presumably higher.
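One way to picture this is a replacement template that is expanded twice: once to insert $1, and then again over the text that was just inserted. The sketch below only illustrates that hypothesis in Python. It is not Guiguts' code, and `expand_once` is a hypothetical helper introduced for the example.

```python
import re


def expand_once(template: str, match: re.Match) -> str:
    """Replace each $N in template with capture group N of match."""
    return re.sub(r"\$(\d+)", lambda m: match.group(int(m.group(1))), template)


text = "abcdef$2ghijkl"
match = re.search(r"(b(c).*jk)", text)
head, tail = text[:match.start()], text[match.end():]

# Single pass: $1 is expanded and the captured text is treated as plain data.
single = head + expand_once("$1", match) + tail

# Hypothetical faulty behaviour: the expanded text is scanned a second time,
# so the literal "$2" that came from the input is expanded as well.
double = head + expand_once(expand_once("$1", match), match) + tail

print(single)  # abcdef$2ghijkl
print(double)  # abcdefcghijkl
```

A single pass leaves the text unchanged, which is the expected result. The second pass is what turns the `$2` from the input into the contents of the second group.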

    Richard - please be very careful when you're copy-pasting error reports. I feel pretty sure you haven't actually observed the output you claim for the $2 case...

     
  • Tony Browne

    Tony Browne - 2016-11-20

    It may be of interest that the problem doesn't seem to arise with ReplaceAll, only with an individual Replace.
    My OS=Windows10(original); no difference between current standard GG1.0.25 & my private version sometimes identified as 1.0.26 (as known to hanne).

     
  • Richard Tonsing

    Richard Tonsing - 2016-11-21

    Try this

    Input:
    /*
    abcdef$2ghijkl
    */

    S:
    /\*\n((\n|[^\n])+)\*/

    R:
    $1

    Observed output:
    abcdef
    ghijkl

    I tested this and it happens every time.
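A possible explanation for the line break in this case: a capture group inside a repetition keeps only the text from its last iteration. Assuming the input is delimited by /* and */ and the search string escapes both asterisks, the last character consumed by the inner group is the newline before the closing delimiter. The sketch below checks this with Python's `re` as a stand-in; whether Guiguts' regex engine behaves the same way is an assumption.

```python
import re

# Assumed input: the sample line wrapped in /* and */ on their own lines.
text = "/*\nabcdef$2ghijkl\n*/"

# Assumed search string, with both asterisks escaped.
match = re.search(r"/\*\n((\n|[^\n])+)\*/", text)

outer = match.group(1)  # everything between the delimiters
inner = match.group(2)  # only the last iteration of the repeated group

print(repr(outer))  # 'abcdef$2ghijkl\n'
print(repr(inner))  # '\n'
```

If group 2 holds a newline, then a `$2` in the captured text that gets expanded a second time would come out as a line break.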

     
  • Wayne Hammond

    Wayne Hammond - 2016-12-19

    Suggest removing the proofers information before running your RegEx with:

    search:
    (-----File: \d+.png---).+?$

    replace:
    $1

    Wayne
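The suggested clean-up can be tried outside Guiguts. This is a rough sketch in Python, where the back-reference is written `\1` rather than `$1` and `re.MULTILINE` is assumed so that `$` matches at the end of each line. The sample text is adapted from the report, and the pattern is used exactly as posted.

```python
import re

sample = (
    "ut aliquip ex ea\n"
    "-----File: 148.png---\\newprofer\\tsombody\\P3\\F1\\F2\\----\n"
    "commodo consequat."
)

# Keep the page separator up to the third dash after ".png" and drop
# the proofer names that follow it on the same line.
cleaned = re.sub(r"(-----File: \d+.png---).+?$", r"\1", sample, flags=re.MULTILINE)

print(cleaned)
```

After this step the separator line carries no backslashes, so a later `$1` replacement has nothing on that line to misread.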

     
    • Richard Tonsing

      Richard Tonsing - 2016-12-19

      Actually I do this before running my regex:
      search
      \[nt]
      replace:
      \

      Then search
      \$
      replace
      ¥

      The last thing is to replace the yen with dollars after I've finished all processing.
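The same round trip can be sketched in Python. The first search is taken to mean a backslash followed by n or t, and the yen sign is assumed not to occur anywhere in the text. `protect` and `restore` are hypothetical helper names. Note that the first step is lossy: the n or t after the backslash is dropped and is not restored.

```python
import re

PLACEHOLDER = "¥"  # yen sign, assumed absent from the text being processed


def protect(text: str) -> str:
    """Collapse backslash-n and backslash-t to a lone backslash, then hide dollar signs."""
    text = re.sub(r"\\[nt]", lambda m: "\\", text)
    return text.replace("$", PLACEHOLDER)


def restore(text: str) -> str:
    """Turn the placeholder back into a dollar sign once all processing is done."""
    return text.replace(PLACEHOLDER, "$")


sample = "148.png---\\newprofer\\tsombody in $2 voluptate"
hidden = protect(sample)
final = restore(hidden)

print(hidden)  # 148.png---\ewprofer\sombody in ¥2 voluptate
print(final)   # 148.png---\ewprofer\sombody in $2 voluptate
```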

       
      • Wayne Hammond

        Wayne Hammond - 2016-12-19

        The only problem with replacing only the n or t is the large number of other characters that affect the text.
        http://www.pgdp.net/wiki/Regex_Cookbook#List_of_Ingredients.2C_or.2C_how_to_read_regex_code

        I don't recall seeing any backslashes except in the proofer/formatter tags at the page numbers.

        Regards,
        Wayne

  • Richard Tonsing

    Richard Tonsing - 2016-12-19

    Actually the first search term should have been

    \\[nt]

    Odd, I see now how that happened. I copied the term with the double backslash and pasted it here, but one of the backslashes disappeared when pasting. It must be a SourceForge note error-checking thing. Anyway, I search for a backslash followed by either a t or an n and replace it with a backslash. I've never encountered that combination in the text.

    I should have caught the error, but I guess I expected to see two backslashes, so that's what I thought I saw.

    Sorry about the confusion.

     
