Working down a file in Python.

Ron Jensen
2013-09-16
2014-03-15
  • Ron Jensen

    Ron Jensen - 2013-09-16

    I have a text file that is line wrapped. The first line in each block ends in a tilde (~). This is an example:

    Now is the time for all~
    good men to come to the
    aid of their party.
    Short line.~
    The quick brown fox~
    jumped over the lazy
    yellow dog.
    What light through~
    yonder window breaks?
    ~
    

    I need it to come out looking like this:

    Now is the time for all good men to come to the aid of their party.
    Short line.
    The quick brown fox jumped over the lazy yellow dog.
    What light through yonder window breaks?
    

    I recorded a macro that will do it, but I have to run the macro manually for each line.

    My question is, how can I wrap this file together in Python, either by running the macro in a for() loop until it finishes or entirely in Python?

    Thanks,

    Ron

     
  • Ron Jensen

    Ron Jensen - 2013-09-16

    FYI,
    The macro I recorded does:
    - regex find and select "~\R([^~]*)"
    - Scintilla command 2303 LineUpExtend
    - Notepad command "Edit Join Lines"
    - Scintilla command 2304 CharLeft
    - Scintilla command 2180 Clear

     
  • Ron Jensen

    Ron Jensen - 2013-09-16

    O.K., this is what I came up with:

    # Start an undo action so that a single Ctrl-Z undoes the whole script.
    editor.beginUndoAction()
    
    done = False
    while not done:
        endPos = editor.getLength()
        # findText returns a (start, end) tuple, or None when nothing is found.
        firstTilde = editor.findText(0, 0, endPos, "~")
        if firstTilde is None:
            break
    
        # firstTilde is checked above, so it is safe to index it here.
        secondTilde = editor.findText(0, firstTilde[1], endPos, "~")
        if secondTilde is None:
            # Last block: select from the tilde to the end of the file.
            editor.setSelection(endPos, firstTilde[0])
            done = True
        else:
            editor.setSelection(secondTilde[0], firstTilde[0])
    
        editor.lineUpExtend()
        notepad.runMenuCommand("Edit", "Join Lines")
        # Delete the tilde that marked the start of this block.
        editor.setSelection(firstTilde[1], firstTilde[0])
        editor.clear()
    
    # End the undo action.
    editor.endUndoAction()
    # Make sure undo collection is enabled.
    editor.setUndoCollection(1)
    

    Constructive criticism and Python tips appreciated.

    Ron
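For comparison, the same result can be reached in plain Python without driving the editor at all. The following is only a minimal sketch, not part of the script above: the name `join_wrapped` is made up for illustration, and it assumes the whole file fits in memory and that a trailing tilde always marks the first line of a block.

```python
def join_wrapped(text):
    """Join line-wrapped blocks; a line ending in '~' starts a new block."""
    blocks = []
    for line in text.splitlines():
        if line.endswith("~"):
            # A trailing tilde marks the first line of a block: start a new one.
            blocks.append([line[:-1]])
        elif blocks:
            # Continuation line: attach it to the current block.
            blocks[-1].append(line)
        else:
            # Text before the first tilde line is kept as its own block.
            blocks.append([line])
    # Join the non-empty pieces of each block with single spaces.
    joined = (" ".join(p.strip() for p in block if p.strip()) for block in blocks)
    # Drop empty blocks, such as the one created by a lone "~" at the end.
    return "\n".join(j for j in joined if j)


sample = (
    "Now is the time for all~\n"
    "good men to come to the\n"
    "aid of their party.\n"
    "Short line.~\n"
    "The quick brown fox~\n"
    "jumped over the lazy\n"
    "yellow dog.\n"
    "What light through~\n"
    "yonder window breaks?\n"
    "~\n"
)
print(join_wrapped(sample))
```

Inside the Python Script plugin, something like `editor.setText(join_wrapped(editor.getText()))` should apply it to the current buffer, though that combination is untested here.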

     
  • THEVENOT Guy

    THEVENOT Guy - 2013-11-11

    Hi Ron,

    It's a bit late to answer, isn't it? But never mind!

    I think it can also be done with a search/replace in regular expression mode.

    What are we looking for?

    • If there is a dot, possibly followed by a tilde (~), before the EOL character(s), we just copy the dot again, followed by the EOL character(s).

    • If NOT, we must replace the EOL character(s), possibly preceded by a tilde (~), with ONE space (\x20).

    Then the search/replace could be something like this:

    SEARCH : ((\.)|)~?(\R)

    REPLACE : (?2.\3: )

    NOTES:

    • In Search, there is an alternation that means a single DOT or nothing, followed by an optional tilde, followed by any kind of EOL (\r\n for Windows, \n for Unix or \r for old Mac).

    • In Replacement, there is a conditional form that means: if Group 2 (the DOT) exists, rewrite it, followed by Group 3 (\R). If NOT (the part after the colon), just write a single space.

    • Tested with version 6.5.1 of N++ => everything works fine :-)
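The same rule can be approximated with Python's `re` module. This is a hedged sketch rather than a direct port: `re` has no `\R` and no conditional replacement format, so the EOL is spelled out as `\r\n|\n|\r` and the condition moves into a callback. The names `rewrap` and `_replace` are made up for illustration.

```python
import re

# Optional dot (group 1), optional tilde, then any kind of EOL (group 2).
EOL_PATTERN = re.compile(r"(\.)?~?(\r\n|\n|\r)")


def _replace(match):
    # Dot before the EOL: keep the dot and the line break.
    if match.group(1):
        return "." + match.group(2)
    # Otherwise the EOL (and any tilde before it) becomes one space.
    return " "


def rewrap(text):
    """Apply the dot-or-space rule to every line ending in the text."""
    return EOL_PATTERN.sub(_replace, text)


print(rewrap("The quick brown fox~\njumped over the lazy\nyellow dog.\n"))
```

Because the rule keys on a trailing dot, a block that ends with other punctuation, such as the "?" in the sample, is followed by a space rather than a line break, so trailing spaces may need trimming afterwards.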

    Enjoy our beloved editor,

    Best regards,

    guy038,

    P.S.:

    You can find good documentation on the new PCRE regular expressions, used by N++ since version 6.0, at the two addresses below:

    http://www.boost.org/doc/libs/1_48_0/libs/regex/doc/html/boost_regex/syntax/perl_syntax.html

    http://www.boost.org/doc/libs/1_48_0/libs/regex/doc/html/boost_regex/format/boost_format_syntax.html

    The FIRST link covers the regular expression syntax used in SEARCH.

    The SECOND link covers the format syntax used in REPLACEMENT.

     
    Last edit: THEVENOT Guy 2013-11-11
  • Dane

    Dane - 2014-03-15

    So, four months later, I just want to say that those are excellent links on PCRE. Also, that's a pretty insightful use of regex to solve the problem.

     
