regex simple pattern fails

Help
s1w
2011-06-04
2013-01-25
  • s1w
    2011-06-04

    The problem looks easy. I haven't had an issue like this in other languages, and I can't find the reason. This is PythonScript for Notepad++:

    Non-greedy HTML comment removal for this line:
        <!- <input id="file_upload" type="file"/> -> ass ->

    Python script (the ?P<begin> and ?P<end> group names are only there for clarity):

    import re
    line = editor.getCurLine()
    p = re.compile('(?P<begin><!--+)(?P<between>.*?)(?P<end>--+>)')
    if '<!--' in line or '-->' in line:
        console.write(p.sub(r'\g<between>', line)) 
    else:
        console.write('not commented.. ' + line)
    

    results:
          <input id="file_upload" type="file"/>  ass ->

    works satisfactorily.

    But if I make <begin> or <end> optional, with ? or {0,1}:
    1) '(?P<begin><!-+)?(?P<between>.*?)(?P<end>-+>)?'
    2) '(<!-+)?(?P<between>.*?)(-+>)?'
    the result looks greedy, or as if every occurrence were replaced. The count=x argument doesn't help: it disables the replacement altogether.

    Other patterns I tried fail with the same result:
    3) '(<!-+){0,1}(?P<between>.*?)(-+>){0,1}'  replacement =  r'\g<between>'
    4) '(<!-+)?(?:.*?)(-+>)?'  replacement =  ''
    5) '(?:<!-+)?(.*?)(?:-+>)?'  replacement =  '\\1'

    all results are:
    <input id="file_upload" type="file"/>  ass

    (The last '->' should have stayed.) Any clue?
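
A likely explanation, sketched under assumptions: once both delimiters are optional and the middle part is lazy, the pattern can succeed without consuming anything, so re.sub finds a match at every position. Each opener and each closer is then removed by its own separate match, which looks like greedy behavior. The sample line below is made up and assumed to start with indentation. In that case the very first match is empty, which would explain why count=1 appears to do nothing.

```python
import re

# Hypothetical sample line (indented, as a line taken from an editor often is).
line = '  <!-- <input id="file_upload" type="file"/> --> ass -->'

# Original pattern from the post: both delimiters are required.
strict = re.compile(r'(?P<begin><!--+)(?P<between>.*?)(?P<end>--+>)')
# Variant 1 from the post: both delimiters are optional.
optional = re.compile(r'(?P<begin><!-+)?(?P<between>.*?)(?P<end>-+>)?')

strict_result = strict.sub(r'\g<between>', line)
optional_result = optional.sub(r'\g<between>', line)

# The optional pattern succeeds at position 0 without consuming anything.
first_match = optional.match(line)

# With count=1 only that empty match is replaced, so the line is unchanged.
count_one_result = optional.sub(r'\g<between>', line, count=1)

print(repr(strict_result))
print(repr(optional_result))
print(repr(first_match.group(0)))
print(count_one_result == line)
```

Keeping at least one delimiter mandatory in each alternative stops the pattern from matching the empty string.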

     
  • s1w
    2011-06-04

    The easy way: import string and use three commands (I really wanted to avoid doing it this way):

    import re, string
    line = editor.getCurLine()
    if '<!--' in line or '-->' in line:
        line = re.sub('(?<=<!)-{3,}', '--', line)
        line = re.sub('-{3,}(?=>)', '--', line)
        line = line.replace("<!--", "", 1).replace("-->", "", 1)
    else:
        line = re.sub(r'(\s*)(.*)\s*\n', r'\1<!-- \2 -->', line)
    

    result:
    + allows conditional instances of '<!-' and '->' in NON-GREEDY way
    + shortens '<!------' and '------->' cases
    - unfortunately also removes comment markers that appear in reversed order, "-> . . . <!-"
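
To make the three points above easy to check outside Notepad++, here is a small sketch that wraps the uncomment branch in a function. The name uncomment_simple and the sample strings are made up for illustration. Note that replace is a built-in str method, so this sketch needs no string import.

```python
import re

def uncomment_simple(line):
    """Hypothetical wrapper around the three commands from the post."""
    # Collapse runs of three or more dashes after '<!' down to two.
    line = re.sub(r'(?<=<!)-{3,}', '--', line)
    # Collapse runs of three or more dashes before '>' down to two.
    line = re.sub(r'-{3,}(?=>)', '--', line)
    # Remove only the first opener and the first closer.
    return line.replace('<!--', '', 1).replace('-->', '', 1)

samples = [
    '<!-- a --> b -->',     # the later '-->' is kept
    '<!------ a ------->',  # long dash runs are shortened first
    '--> a <!--',           # reversed markers are removed as well
]
for sample in samples:
    print(repr(sample), '->', repr(uncomment_simple(sample)))
```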

    But of course I've created a magic pattern that does all of this, and I challenge anyone to shorten it:

    import re
    line = editor.getCurLine()
    p1 = re.compile('^(.*?)((?P<lt><!--+)|(?P<rt>--+>))(?P<block>.*?)((?(rt)|(?(lt)--+>|<!--+))|$)')
    p2 = re.compile(r'(\s*)(.*)\s*\n')
    if '<!--' in line or '-->' in line:
        line = p1.sub(r'\1\g<block>', line)
    else:
        line = p2.sub('\\1<!-- \\2 -->', line)
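
The pattern above relies on conditional sub-patterns of the form (?(name)yes|no), which try "yes" if the named group took part in the match and "no" otherwise. As a smaller illustration of the same mechanism, here is a toy pattern that requires a closing '>' only when an opening '<' was seen. The pattern and the names balanced and is_balanced are invented for this sketch and are not part of the script.

```python
import re

# Hypothetical toy pattern: 'open' is optional; the conditional at the end
# demands '>' if 'open' matched, and end-of-string otherwise.
balanced = re.compile(r'^(?P<open><)?(?P<name>\w+)(?(open)>|$)')

def is_balanced(text):
    """Return True when the brackets around the word are consistent."""
    return balanced.match(text) is not None

for text in ['<user>', 'user', '<user', 'user>']:
    print(repr(text), is_balanced(text))
```

In p1 the same idea decides what may follow the block, depending on whether the lt or the rt group matched.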
    
     
  • s1w
    2011-06-04

    Missing explanation: the purpose of this script is to quickly process a single line without a selection. To check the validity of HTML comments across multiple lines, I am going to write something else.

     
  • s1w
    2011-06-05

    one more correction:

    p1 = re.compile('^(.*?)((?P<lt><!--+)|(?P<rt>--+>))(?P<block>.*?)((?(rt)|(?(lt)((?=<!--+)|--+>)))|$)')
    

    + now it also properly handles the "-> . . . <!-" and "<!- .  . <!- .  . ->" cases; the behavior should be self-explanatory in use

     
  • s1w
    2011-06-05

    There really should be an edit option. There was a small error in the code. Here is the full, tested HTML comment script for Notepad++:

    import re
    line = editor.getCurLine()
    p1 = re.compile(r'^(.*?)((?P<lt><!--+\s*)|(?P<rt>\s*--+>))(?P<block>.*?)((?(rt)|(?(lt)((?=<!--+\s*)|\s*--+>)))|$)')
    p2 = re.compile(r'\r?(\s*)(.*)\s*\n')
    if '<!--' in line or '-->' in line:
        line = p1.sub(r'\1\g<block>', line).rstrip()
    else:
        line = p2.sub(r'\1<!-- \2', line).rstrip() + ' -->'
    

    + quickly comments/uncomments a single line without a selection
    + allows optional instances of '<!-' and '->' in a NON-GREEDY way
    + shortens '<!------' and '------->' cases
    + correctly handles comment markers in reversed order, "-> . . . <!-", and recognizes "<!- .  . <!- .  . ->" situations
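
Since editor exists only inside PythonScript, one way to try the logic elsewhere is to move the two branches into a plain function. The sketch below does that; the name toggle_comment and the sample lines are hypothetical, and the patterns are copied from the script as raw strings.

```python
import re

# Patterns copied from the script above, written as raw strings.
P1 = re.compile(r'^(.*?)((?P<lt><!--+\s*)|(?P<rt>\s*--+>))(?P<block>.*?)'
                r'((?(rt)|(?(lt)((?=<!--+\s*)|\s*--+>)))|$)')
P2 = re.compile(r'\r?(\s*)(.*)\s*\n')

def toggle_comment(line):
    """Hypothetical helper: same two branches as the script, minus the editor."""
    if '<!--' in line or '-->' in line:
        # Uncomment: keep the leading text and the block between the markers.
        return P1.sub(r'\1\g<block>', line).rstrip()
    # Comment: keep the indentation and wrap the rest of the line.
    return P2.sub(r'\1<!-- \2', line).rstrip() + ' -->'

print(repr(toggle_comment('  <!-- a --> b\n')))
print(repr(toggle_comment('  hello\n')))
```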

     
  • s1w
    2011-06-05

    I forgot about writing the result back to the editor (add this to the bottom of the script):

    currentLine = editor.lineFromPosition(editor.getCurrentPos())
    editor.replaceLine(currentLine, line)