remove duplicate lines?

2006-07-26
2012-11-13
  • Nobody/Anonymous

    hi all

    is there a button that can scan a text file & remove any duplicate lines of text?

    thanks for notepad++ i love it!

    dave

     
    • Nobody/Anonymous

      I'm looking for it too !!

       
    • Anonymous - 2006-08-19

      And another one. :)

       
    • Don HO

      Don HO - 2006-08-19

      Use "Replace in the selection" feature in Find/Replace Dialog to do what you want.

      Don

       
    • Anonymous - 2006-08-19

      OK, I see now how to do it: first sort (Plugins ... TextFX Tools ... Sort), then select lines, then replace with null.

      That works fine. 

      However, it would still be a minor feature request to have this as built-in function.  EditPlus provides 'Sort ... Remove Duplicates', so it's a one-step procedure.

      Maybe put this near bottom of feature list? ;)
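
      For readers who want to script the two-step procedure described above (sort, then remove the repeated lines), here is a minimal Python sketch. It is not a Notepad++ or TextFX feature; the function name is hypothetical, and it assumes that "duplicate" means an exact, case-sensitive match.

```python
from itertools import groupby


def sort_and_drop_adjacent_duplicates(lines):
    """Sort the lines, then keep one line from each run of identical neighbours."""
    ordered = sorted(lines)  # after sorting, duplicates sit next to each other
    # groupby yields one (value, run) pair per run of equal adjacent items
    return [line for line, _run in groupby(ordered)]


if __name__ == "__main__":
    sample = ["XYZZY", "The Cave", "XYZZY", "XYZZY", "The Cave"]
    for line in sort_and_drop_adjacent_duplicates(sample):
        print(line)
```

      The original line order is lost, which matches the limitation of the sort-based approach discussed in this thread.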

       
    • Nobody/Anonymous

      Don

      thanks for the answer on how to manually remove duplicate lines. when i finish browsing line by line for duplicates through this 10,000 line text file, i'll be sure to post your answer as a bonus feature of notepad++

      :D

       
    • Nobody/Anonymous

      Man, that is the lamest response ever. It should be a built-in function! UltraEdit, ACDSee & EditPlus already do it. Get with the program!

       
    • Nobody/Anonymous

      Lame? Talk about looking the proverbial gift horse in the mouth. I think we should all have paid $$ for Notepad++, then you'd have reason to ask for something. You get with the program.

       
    • Chris Severance

      Chris Severance - 2007-02-02

      Adding line numbers, an extra sort, and removing line numbers will remove duplicates without losing the sort order. The process is described in NPPTextFXdemo.TXT.

       
      • Nobody/Anonymous

        I have been following NPPTextFXdemo.TXT, but I probably don't know how to read it...

        So here is what I'm doing, step by step.

        keyboard : ctrl-A : select all lines
        XYZZY
        The Cave
        XYZZY
        XYZZY
        The Cave

        menu : TextFX --> TextFX Tools --> Insert Line Numbers
        00000001 XYZZY
        00000002 The Cave
        00000003 XYZZY
        00000004 XYZZY
        00000005 The Cave

        at this point all lines are selected

        menu : TextFX --> TextFX Tools --> Sort outputs only UNIQUE (at column) lines (a check mark appears on the menu)
        00000001 XYZZY
        00000002 The Cave
        00000003 XYZZY
        00000004 XYZZY
        00000005 The Cave

        keyboard/mouse : shift-alt-left click : select column line numbers.
        00000001
        00000002
        00000003
        00000004
        00000005
        only the above is highlighted

        menu : TextFX --> TextFX Tools --> Sort lines case sensitive (at column)
        00000005 The Cave
        00000004 XYZZY
        00000003 XYZZY
        00000002 The Cave
        00000001 XYZZY

        keyboard : ctrl-A : select all lines

        menu : TextFX --> TextFX Tools --> Delete Line Numbers or First Word
        The Cave
        XYZZY
        XYZZY
        The Cave
        XYZZY

        As you can see, I still have all the lines, so clearly I'm doing something wrong. If someone could give me directions on how to interpret NPPTextFXdemo.TXT, that would be great. Thank you.

        NPPTextFXdemo.TXT :

        "...
        Line 1 XYZZY
        Line 2 The Cave
        Line 3 XYZZY
        Line 4 XYZZY
        Line 5 The Cave

        If you sort the above lines at the column starting XYZZY outputting only unique lines, only two lines will be output. Since a tool to insert and remove line numbers is provided, you can sort unique lines then return them back to their original order in 4 steps.

        1) Insert line numbers
        2) Sort unique after the line numbers
        3) Sort the line numbers
        4) Remove the line numbers
        ..."
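
        The four steps quoted above can be sketched in Python as follows. This is only an illustration of the idea, not TextFX's actual implementation; the function name is hypothetical, and it assumes that the first occurrence of each duplicated line is the one that is kept.

```python
def unique_keep_order(lines):
    """Remove duplicate lines while keeping the original order of first occurrences."""
    # 1) Insert line numbers
    numbered = list(enumerate(lines, start=1))

    # 2) Sort unique after the line numbers: order by text (line number breaks ties),
    #    then keep only the first entry for each distinct text
    numbered.sort(key=lambda pair: (pair[1], pair[0]))
    unique = []
    for number, text in numbered:
        if not unique or unique[-1][1] != text:
            unique.append((number, text))

    # 3) Sort the line numbers to restore the original order
    unique.sort(key=lambda pair: pair[0])

    # 4) Remove the line numbers
    return [text for _number, text in unique]


if __name__ == "__main__":
    sample = ["XYZZY", "The Cave", "XYZZY", "XYZZY", "The Cave"]
    for line in unique_keep_order(sample):
        print(line)
```

        With the five sample lines from the demo file, only two lines remain, in their original relative order.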

         
    • Chris Severance

      Chris Severance - 2007-02-08

      Most text editors will only operate on the exact text marked. TextFX cannot follow this standard and still provide its functionality. For Sort, TextFX will expand the selection to include entire lines, but not until it has noted the column number that started the selection. The column number on the first marked line is what TextFX uses to determine sort order and uniqueness for every line.

      >00000001 XYZZY
      >00000002 The Cave
      >00000003 XYZZY
      >00000004 XYZZY
      >00000005 The Cave
      >keyboard/mouse : shift-alt-left click : select column line numbers.

      Don't use column marking here. Make a normal selection from the first XYZZY to beyond the end of the text. TextFX will note the starting column, which will exclude the line numbers from the sort, and expand the selection to the entire line before performing the sort.

      In a later step you will mark the entire text, including the line numbers, to make TextFX sort on the line numbers.
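
      To make the role of the starting column concrete, here is a simplified Python model of the behaviour described above. It is an assumption-laden sketch, not TextFX code: the function name and the 0-based `start_column` parameter are hypothetical, and it assumes the first line seen for each key is the one kept.

```python
def sort_unique_at_column(lines, start_column):
    """Sort whole lines by the text from start_column onward, one line per distinct key."""
    first_line_for_key = {}
    for line in lines:
        key = line[start_column:]                 # only this part decides order and uniqueness
        first_line_for_key.setdefault(key, line)  # keep the first line seen for each key
    return [first_line_for_key[key] for key in sorted(first_line_for_key)]


if __name__ == "__main__":
    numbered = [
        "00000001 XYZZY",
        "00000002 The Cave",
        "00000003 XYZZY",
        "00000004 XYZZY",
        "00000005 The Cave",
    ]
    # Selection starts at the first XYZZY (column 9): line numbers are ignored
    print(sort_unique_at_column(numbered, 9))
    # Selection starts at column 0: the line numbers make every line unique
    print(sort_unique_at_column(numbered, 0))
```

      Starting at column 0 leaves all five lines in place, which is the result reported in the step-by-step attempt earlier in the thread.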

       
    • Nobody/Anonymous

      yeah this feature is definitely needed!!

       
      • Mike R. Haller

        Mike R. Haller - 2008-03-14

        +1 for this feature request:

        Delete duplicate lines

         
        • Nobody/Anonymous

          This needs to be added. It's so simple, yet many programs don't have this feature.

           
    • Greg Bullock

      Greg Bullock - 2006-09-01

      If you can accept the sorting on top of removing duplicate lines, then TextFX can already do this.  Check the item

        +Sort outputs only UNIQUE (at column) lines

      before doing the sort.

      Regards.
      Greg

       
    • Nobody/Anonymous

      I did it as described above:
      Select text --> Plugins --> TextFX Tools --> check the sorting option
      Thanks for Notepad++  :)

       
