Wildcard question

Anonymous
2009-03-10
2012-11-13
  • Anonymous

    Anonymous - 2009-03-10

    Hi...

    I'm new to Notepad++.  I think it's a great program, but it's going to take a little getting used to, as I haven't used anything else with the same features.

    Anyway..I have a question about wildcard find and replace.  I'm currently editing some fairly large text files which contain thousands of IP addresses.  I want to remove certain entries such as:
    (NET-*-*-*-*-*)

    * denotes numbers which vary from line to line.

    At present, I'm removing the entries line by line, but it's proving a little time-consuming, so I was wondering whether I could use a wildcard to do all the work for me?

    Is this possible?

    Thanks for any advice.

     
    • Justin Time

      Justin Time - 2009-03-18

      I found the above very useful for a project I'm doing, and while staying on the same question:

      I currently have two parameters beginning and ending various strings, and I would like to remove all of the content between those params.

      Viz; \B "content" \b

      Using the previous posted examples I can fudge the RegExp strings to remove most, but still have to do that with "Replace One at a time" because the RegExp over-runs the first occurrence of \b if there are two or more in one line :(

      Here's a few examples of what I'm using ;)

      (\\B[^-\d]+)([^-\d]+\b)
      (\\B[^-\d]+)-([^-\d]+\b)
      (\\B[^-\d]+)94([^-\d]+\b)

      And there are others :(

      How do I strip what is between \B and \b without adding any variations to the RegExp and without over-running the \b at the end of each query?

      Thanks for any help.

       
      • Fool4UAnyway

        Fool4UAnyway - 2009-03-18

        Assuming this is an example line:

        Viz; \B "content" \b

        I suggest you do the following.

        1. UNcheck Recursive Replacement, which was needed in the other instruction because of individual repetitions. This asked for a non-greedy matching method.

        What you want is "remove any text between #A and #B".

        2. Find #A and #B including anything in-between. If #A and #B can be found, "anything" may use a greedy regular expression.

        In the Find field, enter:
        ^Viz; (\\B).*(\\b)

        ^Viz; ____ the example line starts with this, but may be omitted
        _________ you may want to keep the ; as identifier or separator though
        \\B ______ this is your #A match
        .* _______ match any character, any times: this is greedy
        \\b ______ #B match, notice the double backslashes here also!

        Because a #B match is specified, the engine can't "eat" the whole line when digesting .*.

        3. Because we grouped the #A and #B matches, we can easily replace them.

        In the Replace field, enter:
        Viz; \1 \2

        Viz; _____ as in the example line, but may be omitted as well (no ^ here: in a replacement it would be inserted literally)

        \1 _______ re-place the #A match
        _ ________ add a space character in-between
        \2 _______ re-place the #B match
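
As a rough illustration of the Find/Replace pair above, here is a minimal sketch using Python's re module rather than Notepad++ (an assumption: the two engines differ in details). The sample lines and the helper name strip_between are made up for the example. It also shows that a greedy .* runs to the last \b on a line, while Python's lazy .*? stops at the first one; the lazy form is a Python feature and is not claimed to be available in the dialog discussed here.

```python
import re

# Literal markers "\B" and "\b": the backslash is doubled so it matches itself.
GREEDY = re.compile(r'(\\B).*(\\b)')
LAZY = re.compile(r'(\\B).*?(\\b)')  # assumption: lazy quantifier, Python syntax


def strip_between(line: str, pattern: re.Pattern = GREEDY) -> str:
    """Keep both markers and replace everything between them with one space."""
    return pattern.sub(r'\1 \2', line)


# Hypothetical sample lines, modelled on the example in the post.
one_pair = r'Viz; \B "content" \b'
two_pairs = r'\B "one" \b keep \B "two" \b'

print(strip_between(one_pair))
print(strip_between(two_pairs))        # greedy: the text between the pairs goes too
print(strip_between(two_pairs, LAZY))  # lazy: each pair is handled separately
```

With only one pair of markers per line, the greedy and lazy forms give the same result.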

         
    • Greg

      Greg - 2009-03-10

      Please give at least one specific example. Do you really mean to have the fifth asterisk (*)? If so, is it preceded by a colon (:) or a hyphen (-)?

       
    • Anonymous

      Anonymous - 2009-03-10

      Thanks for the reply.

      Here's an example as it appears in the list:
      (NET-12-163-132-0-1)

       
    • Greg

      Greg - 2009-03-10

      Okay, before you try this make a backup.

      N++ has regular expressions, which are a form of wildcards, but wildcards on steroids.

      Place the cursor at the top of the file.
      Press Ctrl+H (N++ Find and Replace).
      In the Find what box, enter \(NET-[0-9]+-[0-9]+-[0-9]+-[0-9]+-[0-9]+\)
      Leave the Replace with box empty.
      Select Regular expression in the Search Mode box.
      Click Find, then Replace All.

      The above will delete the string.
      If you want to delete the lines that contain your string, then the Find what string is:

      ^.*\(NET-[0-9]+-[0-9]+-[0-9]+-[0-9]+-[0-9]+\).*$
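
For anyone who wants to try the two expressions outside the editor, the following is a minimal sketch in Python's re module (an assumption: Python's engine is not the one Notepad++ uses, though these two patterns use only syntax common to both). The function names and the sample text are hypothetical.

```python
import re

# Same expression as in the post; [0-9]+ means "one or more digits".
ENTRY = r'\(NET-[0-9]+-[0-9]+-[0-9]+-[0-9]+-[0-9]+\)'


def remove_entries(text: str) -> str:
    """Delete only the (NET-...) strings, leaving the rest of each line."""
    return re.sub(ENTRY, '', text)


def blank_matching_lines(text: str) -> str:
    """Empty every line that contains an entry; the line break itself stays."""
    return re.sub(r'^.*' + ENTRY + r'.*$', '', text, flags=re.MULTILINE)


# Hypothetical two-line sample.
sample = "10.0.0.1 (NET-12-163-132-0-1)\n10.0.0.2 plain\n"
print(repr(remove_entries(sample)))
print(repr(blank_matching_lines(sample)))
```

In this sketch the second expression empties the line but leaves its line break behind, so an empty line remains where the entry was.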

       
    • Anonymous

      Anonymous - 2009-03-10

      Wildcards on steroids ;)

      Thanks for the help...I will give it a shot :)

       
    • Anonymous

      Anonymous - 2009-03-10

      Awesome...worked like a charm.  Really appreciate the help :)

       
    • Anonymous

      Anonymous - 2009-03-11

      Thanks for that....really saved a lot of time :)

      I've got another query about wildcards though.

      The text files contain many lines where words are separated with a - instead of a regular space so I would like to remove those entries en masse.  If I choose to replace all - with 'space' then that would mess up the IP addresses which are also separated with a -.

      I've tried a few various combinations but can't seem to find the right syntax.

      Say for example I have:
      BT-CENTRAL-PLUS:

      I would like to make it:
      BT CENTRAL PLUS:

      I have no problem making the find value:
      ([A-Z]+-[A-Z]+-[A-Z]+:)

      But the replace value is where I'm becoming stuck.  Won't post the examples I've tried for fear of being ridiculed, as some couldn't have been more wrong if I typed them blindfolded whilst drunk! ;)

      Any ideas please?

       
      • Greg

        Greg - 2009-03-11

        You have to 'capture' the good bits in the search string. You do this with what are called sub-strings, which are denoted by left and right parentheses ().

        Hence, your search string should be:
        ([A-Z]+)-([A-Z]+)-([A-Z]+:)

        In the replacement string the first sub-string is denoted by \1, the second by \2, and the third by \3.

        Hence the replacement string you want is:
        \1 \2 \3

        Knowing that parentheses have special meanings in search strings, I'll leave you to figure out what your search string meant.

        --Greg

        Remember, Don has three children to feed.
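
The search and replacement strings above can be tried in Python's re module, sketched here as an assumption-laden stand-in for the Notepad++ dialog (the function name hyphens_to_spaces is hypothetical). Each pair of parentheses captures one word, and the hyphens sit outside the groups, so they are dropped when the groups are written back.

```python
import re

# Three capture groups, one per word; the hyphens are matched but not captured.
FIND = r'([A-Z]+)-([A-Z]+)-([A-Z]+:)'
REPLACE = r'\1 \2 \3'


def hyphens_to_spaces(text: str) -> str:
    """Rewrite WORD-WORD-WORD: as WORD WORD WORD: using the three groups."""
    return re.sub(FIND, REPLACE, text)


print(hyphens_to_spaces('BT-CENTRAL-PLUS:'))
```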

         
        • Fool4UAnyway

          Fool4UAnyway - 2009-03-11

          > Hence, your search string should be:
          > ([A-Z]+)-([A-Z]+)-([A-Z]+:)

          > In the replacement string the first sub-string is denoted by
          > \1, the second by \2.

          > Hence the replacement string you want is:
          > \1 \2 \3

          Let me be the least fool I can be and break into this discussion.

          I can imagine you want to replace all - characters between _any_ two words. There may be more or fewer than exactly 3 words on each line.

          Let's translate word into "not-a-number", although perhaps any "word" may contain digits (I don't expect this).

          Let's turn any - in-between words into a space character.

          Use the Text FX advanced replace dialog, Ctrl+R.

          Check the regular expression option.
          Uncheck the wrap option.
          Check the recursive replacement option. ! This is important !

          In the Find field, enter:
          ([^-\d+]+)-([^-d]+)

          In the Replace field, enter:
          \1 \2

          This is how it works:

          \d _______ = any digit
          [^...] ___ = match any character _not_ listed after ^
          [^-\d] ___ = match any character that is not a digit or -
          + ________ = match any string of those characters, require at least one
          () _______ = group an expression to re-use (the found match) later

          The - between two words will be replaced by a space character.

          The Recursive Replacement then makes the search engine start again from the position where it found the match.

          It won't find the replaced - anymore, but will match (and replace) any other - between the second and third word, and so on.

           
          • Fool4UAnyway

            Fool4UAnyway - 2009-03-11

            An alternative method would be:

            1. Replace all -'s by space characters
            2. Replace all space characters in-between numbers by -'s again

            For part 2 you could use the following expressions.

            Find:
            (\d+) (\d+)

            Replace:
            \1-\2

            Of course, in this case, also the Recursive Replacement option should be checked.

            But: there may be -'s that are not surrounded by two words that will be replaced by this method, but won't be by the previous.
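
The two-step alternative can be sketched in Python as follows. This is only an approximation: the Recursive Replacement option is emulated by repeating the substitution until nothing changes, which is an assumption about how that option behaves, and the helper names are hypothetical.

```python
import re


def replace_until_stable(pattern: str, repl: str, text: str) -> str:
    """Re-run a substitution until it no longer finds anything to replace."""
    while True:
        text, count = re.subn(pattern, repl, text)
        if count == 0:
            return text


def alternative_method(text: str) -> str:
    # Step 1: turn every '-' into a space.
    text = text.replace('-', ' ')
    # Step 2: put '-' back wherever a space sits between two numbers.
    return replace_until_stable(r'(\d+) (\d+)', r'\1-\2', text)


print(alternative_method('BT-CENTRAL-PLUS: (NET-12-163-132-0-1)'))
```

The sample line also shows the caveat mentioned above: the hyphen between NET and the first number is not surrounded by two numbers, so it stays a space. Numbers that were separated by a space in the original text would be joined by a hyphen as well.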

             
          • Fool4UAnyway

            Fool4UAnyway - 2009-03-11

            > In the Find field, enter:
            > ([^-\d+]+)-([^-d]+)

            Correction:

            ([^-\d]+)-([^-\d]+)
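
A minimal Python sketch of the corrected expression follows. It uses Python's re module instead of the TextFX dialog, and it emulates the recursive replacement option by looping until the text stops changing; both are assumptions made for illustration, and the function name join_words is hypothetical.

```python
import re

# A "word" here is any run of characters that are neither '-' nor a digit.
WORD_HYPHEN_WORD = re.compile(r'([^-\d]+)-([^-\d]+)')


def join_words(text: str) -> str:
    """Replace '-' between two words with a space, repeating until stable."""
    previous = None
    while text != previous:
        previous = text
        text = WORD_HYPHEN_WORD.sub(r'\1 \2', text)
    return text


line = 'BT-CENTRAL-PLUS: (NET-12-163-132-0-1)'
print(WORD_HYPHEN_WORD.sub(r'\1 \2', 'BT-CENTRAL-PLUS:'))  # a single pass
print(join_words(line))                                    # repeated passes
```

A single pass replaces only the first hyphen, because the second word is consumed by that match and cannot start the next one. Repeating the pass picks up the remaining hyphen, while the hyphens next to digits are never matched.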

             
            • Fool4UAnyway

              Fool4UAnyway - 2009-03-11

              Announcement: an improvement will be available soon.

              You might want to _exclude_ space characters as well for any "word" match and _remove_ any (number of) space characters between the two words and the - character itself.

              Improved Find regular expression:
              ([^- \d]+) *- *([^- \d]+)

              Remark: when undo-ing something, the cursor caret will be placed _after_ the selection on which the change was performed. This is not convenient for undo-ing replace changes, when wanting to directly execute them again to see the effect. The match won't be found anymore, because the cursor caret has already "passed" it.

               
    • Anonymous

      Anonymous - 2009-03-11

      That's awesome...thanks very much Greg.

      I think I see where I was going wrong with my approach.  Rather than dealing with separate objects, my search string was only targeting one aspect of what I wanted to change....so Don would've only been able to feed one kid.

      Is that about right?

       
      • Greg

        Greg - 2009-03-11

        That's it, Nick. In your 'search for' string you had one set of parentheses.

        ([A-Z]+-[A-Z]+-[A-Z]+:)

        So in the replace string \1 would represent the search string itself, hyphens included.

        You've learned quickly, well done!
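
The difference between one set of parentheses and three can be checked with a short Python sketch (Python's re module is an assumption here, standing in for the Notepad++ dialog). It prints what each group captures and what \1 expands to in the single-group case.

```python
import re

text = 'BT-CENTRAL-PLUS:'

# One group around everything: \1 is the whole match, hyphens included.
one_group = re.fullmatch(r'([A-Z]+-[A-Z]+-[A-Z]+:)', text)
# Three groups: each word is captured separately, hyphens are left out.
three_groups = re.fullmatch(r'([A-Z]+)-([A-Z]+)-([A-Z]+:)', text)

print(one_group.groups())
print(three_groups.groups())
print(one_group.expand(r'\1'))  # what a replacement of \1 would write back
```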