Total Newb Needs Help With Notepad++

  • Trevor Zacek

    Hi everyone,

    I have a large text file open in Notepad++ that contains many occurrences of opening <Title> and closing </Title> tags. There are about 23 thousand of them.

    I need to copy the text between these tags. As output I need 23 thousand lines, each containing the text that appeared between one pair of Title tags in the original file.

    Can anyone tell me how to do that?



  • Anonymous

    You can replace all occurrences of <Title> and </Title> with an empty string.
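
For comparison outside Notepad++, here is a minimal Python sketch of that suggestion. The function name and the sample string are made up for illustration. It shows that removing the tags alone keeps whatever text surrounded them, so it only gives the wanted result if the file contains nothing but Title elements.

```python
def strip_title_tags(text):
    """Delete the literal <Title> and </Title> markers, leaving all other text."""
    return text.replace("<Title>", "").replace("</Title>", "")

# Hypothetical sample line, not taken from the original file.
sample = "id=7 <Title>Moby Dick</Title> 1851"
print(strip_title_tags(sample))  # id=7 Moby Dick 1851
```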

  • I think you can combine the "Macro" tool with a regexp search. Something like:
    -Start macro recording
    -Search with a regular expression that finds the next text block between <Title> and </Title>
    -Cut it
    -Go to the end of the file
    -Paste it
    -Stop macro recording
    -Replay the macro as many times as you want (or "to the end of file")
    -Replace <Title> and </Title> with an empty string.

  • srbs

    My suggestion is to use Textcrawler's Extract tool with a regular expression.

    This regex should work:

  • Jan Schreiber

    On the Mark tab of the find dialog, check "Regular expression" and "Mark line". Enter "<Title>(.+)</Title>" (w/o the quotes) as search term, then click "Find All." This step will add bookmarks to all the lines that match the regEx.
    Then do Search -> Bookmark -> Copy Bookmarked Lines and paste to a new document. Finally, on the Replace tab of the Find and Replace dialog reuse the above regEx and use "$1" (without quotes) as replace term. This step will remove "<Title>" and "</Title>".
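
The same idea can be tried outside the editor. The following Python sketch is only an approximation of the steps above, assuming that "." does not match line breaks (Python's default); the function name and sample text are invented for the example. It keeps the lines that match the pattern, then replaces each match with its first group.

```python
import re

# Same pattern as in the post; (.+) is greedy and stays on one line.
TITLE_LINE = re.compile(r"<Title>(.+)</Title>")

def extract_titles_per_line(text):
    """Keep lines that contain a Title pair, then reduce each match to group 1."""
    kept = [line for line in text.splitlines() if TITLE_LINE.search(line)]
    return [TITLE_LINE.sub(r"\1", line) for line in kept]

# Hypothetical sample, not taken from the original file.
sample = "junk\n<Title>First</Title>\nmore junk\n<Title>Second</Title>\n"
print(extract_titles_per_line(sample))  # ['First', 'Second']
```

Two limits are worth knowing: text outside the tags on a matching line is kept ("abc<Title>X</Title>def" becomes "abcXdef"), and a title that spans several lines is not matched at all.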


    Hello tekamolo,

    I think we can do the job with ONLY ONE search/replacement !

      1) COPY ALL the text BETWEEN the TWO lines '------------', below, in a NEW file

    <Title>THIS IS A </Title>very small <Title>TEXT TO SEE</Title>67890
    12345<Title>IF ALL</Title>
    NICE !
    IT WORKS FINE !</Title>………..
    no good text

    Just notice TWO facts :

      - ALL the text you need to extract, in this example, is UPPERCASE text !

      - The LAST block <Title>……</Title> is a MULTI-LINE BLOCK  ( It doesn't matter ! )

      2) In this NEW tab, type CTRL-H to open the SEARCH-REPLACEMENT dialog

      3) SELECT the radio button 'Regular expression'

      4) SELECT the box 'Wrap around'
      5) Do the SEARCH-REPLACEMENT, below, on the text of the NEW file :

    SEARCH :         (?s).*?<Title>(.*?)</Title>(\R)?|.*\z

    REPLACE :       (?1\1(?2\2:\r\n))

      => Finally, we obtain, below, ALL UPPERCASE text which is INSIDE the ZONES  <Title>…..</Title>


    IF ALL
    NICE !

    Once again, notice TWO facts :

      - The EMPTY forms  <Title></Title>  generate a BLANK line

      - The cursor, BEFORE the SEARCH-REPLACEMENT, can be at ANY POSITION of the file !
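
To check the expected result independently of Notepad++, here is a rough Python equivalent. It is a sketch, not a translation: Python's re module has no conditional replacement like (?1…), so it collects the non-greedy matches with findall instead. The function name and the sample text are made up for the example.

```python
import re

# DOTALL plays the role of (?s): "." also matches line breaks.
# The lazy (.*?) stops at the first closing tag, so pairs are never merged.
TITLE_BLOCK = re.compile(r"<Title>(.*?)</Title>", re.DOTALL)

def extract_title_blocks(text):
    """Return the text of every <Title>...</Title> block, in order."""
    return TITLE_BLOCK.findall(text)

# Hypothetical sample with two pairs on one line, an empty pair,
# and a block that spans two lines.
sample = (
    "<Title>ALPHA </Title>filler <Title>BETA</Title>123\n"
    "456<Title></Title>\n"
    "<Title>GAMMA\nDELTA</Title>tail\n"
)
blocks = extract_title_blocks(sample)
print(blocks)  # ['ALPHA ', 'BETA', '', 'GAMMA\nDELTA']
print("\n".join(blocks))
```

As in the search/replacement above, an empty pair yields an empty entry (a blank line once joined), and a multi-line block keeps its internal line break.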

    I've made a TUTORIAL about the PCRE Regular Expressions ( Perl Compatible Regular Expressions ),
      used in Notepad++, from the 6.0 version.

    As I'm French, the whole manual is written in French, but you can still pick up some tricks or
      explanations from the lists and examples throughout the tutorial.

    Christian Cuvier ( cchris ), a very well-known contributor, allowed me to put my tutorial
      on his personal site.

    So, you can download this TUTORIAL, in 3 versions, (.txt .pdf .html), at the address below :


    I hope it'll be useful to you

    Cheers !


    P.S. You can also find some documentation about the new PCRE Regular Expressions, used by N++, at the
           two addresses below :

         The FIRST one concerns the syntax of regular expressions in the SEARCH part

         The SECOND one concerns the syntax of regular expressions in the REPLACEMENT part