
rereplace() and unicode

Help
2014-07-13
2014-08-14
  • David Instone-Brewer

    I really love the Python plugin for Notepad++ for complex work on large Hebrew texts.
    But I'm having problems with Unicode and the new version.

    For example, the old version works fine with editor.pymlreplace("cafe", "café")

    But in the new version I tried editor.rereplace("cafe", "café") and got: cafe => caf

    Do I need to do something complicated with .decode('utf8') or is there a simple fix?

     
    • Dave Brotherstone

      Your best bet is to save your Python script as UTF-8, then place a "u"
      before the replacement string to mark it as a Unicode string. The new
      version handles this sensibly: if the document is Unicode it uses
      Unicode, otherwise it attempts to convert the string to ANSI. You might
      also need an encoding comment at the top of your script:

      # -*- coding: utf-8 -*-
      

      See the replace tests for an example -
      https://github.com/davegb3/PythonScript/blob/master/PythonScript/python_tests/tests/ReplaceUTF8TestCase.py

      Cheers
      Dave
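
A minimal sketch of the advice above, for anyone who wants to try it outside Notepad++. The `editor` object only exists inside the plugin, so this uses Python's standard `re.sub` as a stand-in for `editor.rereplace`; the names `document_text` and `replaced` are made up for illustration, and the file is assumed to be saved as UTF-8.

```python
# -*- coding: utf-8 -*-
# The comment above declares the source encoding; the file must also be saved as UTF-8.
import re

# Hypothetical stand-in for the text of the open document.
# Inside Notepad++ the equivalent call would be: editor.rereplace(u"cafe", u"café")
document_text = u"cafe au lait"

# The u prefix makes both literals Unicode strings rather than byte strings.
replaced = re.sub(u"cafe", u"café", document_text)

print(replaced)
# Character count versus UTF-8 byte count of the replacement string.
print(len(u"café"), len(u"café".encode("utf-8")))
```

The second print shows why the encoding matters: the accented letter is one character but takes more than one byte in UTF-8, so the script and the document have to agree on how those bytes are read.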

       
  • David Instone-Brewer

    I don't think I've got it.

    I wrote a file (test.py) and saved it as a Notepad++ plugin script,
    then restarted Notepad++ so I could edit it and run it.

    The file consisted of:
    **

        # -*- coding: utf-8 -*-
        editor.rereplace(r"X\d", "XäXüXö") 
        #X2
    

    **

    When I saved this as ANSI and ran it on itself it worked fine: the last line changed to #XäXüXö
    When I converted this to UTF-8 and ran it on itself, the last line changed to #XXX
    I tried u"XäXüXö" but this didn't help.
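
For illustration only: one way symptoms like these can arise in general is when text encoded in an ANSI code page is later read as UTF-8 and the invalid bytes are silently skipped. The sketch below uses plain Python, not the plugin, and the helper name `misread_ansi_as_utf8` is hypothetical. It is an assumption that this resembles what happens here; nothing in this thread confirms the actual cause inside the plugin.

```python
# -*- coding: utf-8 -*-

def misread_ansi_as_utf8(text):
    """Encode text as Windows-1252 ("ANSI"), then decode those bytes as UTF-8,
    skipping any bytes that are not valid UTF-8."""
    ansi_bytes = text.encode("cp1252")
    return ansi_bytes.decode("utf-8", "ignore")

# Assumption: this only imitates the symptom; it is not the plugin's code path.
for sample in (u"café", u"XäXüXö"):
    print("%r -> %r" % (sample, misread_ansi_as_utf8(sample)))
```

Each accented letter is a single byte in Windows-1252, and that byte on its own is not a valid UTF-8 sequence, so it is dropped while the plain ASCII letters survive.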

     
  • David Instone-Brewer

    BTW, the reason I need it in UTF-8 is that I'm aiming for things like:
    **

    # -*- coding: utf-8 -*-
    editor.rereplace(r"X\d", "Xאבג") 
    #X2
    

    **

    I can, of course, use the old version with pyreplace() but I was looking forward to the increased speed. (My texts sometimes have 300K lines).
    Even so, your plugin makes Notepad++ into a wonderful tool

     
    • Dave Brotherstone

      That should work with a Unicode string (the u before the literal). I'm not
      in front of a PC at the moment, but I'll check it and post an example.

      Cheers
      Dave

       
      • José Calvo

        José Calvo - 2014-08-13

        Hi, I am having exactly the same problem, I wanted to clean some of the characters changing the three dots for the character "…" (it looks the same, but it ain't) and put some nice — instead of hyphen (again, very close, but not). I tried everything I saw on the web, but nothing worked when I called the script.

        Basically I am trying:

        editor.rereplace(u"-", u"—")

        Thanks for the answers, Python Script is a great tool :)
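
A small sketch of this kind of clean-up in plain Python, using `re.sub` as a stand-in for `editor.rereplace` (which is only available inside Notepad++). The function name `tidy_punctuation` and the two constants are hypothetical. Writing the replacement characters as `\u` escapes is an optional choice that keeps the script itself pure ASCII, so the encoding it is saved in no longer matters.

```python
# -*- coding: utf-8 -*-
import re

ELLIPSIS = u"\u2026"  # single-character horizontal ellipsis
EM_DASH = u"\u2014"   # em dash

def tidy_punctuation(text):
    """Replace three consecutive dots with an ellipsis and every hyphen with an em dash."""
    text = re.sub(u"\\.\\.\\.", ELLIPSIS, text)
    # Like the call above, this replaces every hyphen, including those inside words.
    text = re.sub(u"-", EM_DASH, text)
    return text

print(tidy_punctuation(u"Wait... no - yes"))
```

Inside the plugin, the same Unicode literals would be passed straight to `editor.rereplace`.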

         
        • Dave Brotherstone

          Which version are you using? This was fixed in 1.0.7.

           
          • José Calvo

            José Calvo - 2014-08-14

            You are right, yes!! :)
            I downloaded the plugin yesterday through the Plugin Manager and it was still the 1.6 version. Now it works fine with your Hebrew example. Thanks again!
            Grüße

             
  • Dave Brotherstone

    Ok, I can reproduce it and see the error in the code. I'll get a fix out as soon as possible.

     
  • Dave Brotherstone

    Ok, fixed.

    The plan is to fix another issue with startup.py, then release a new version.

    Many thanks for reporting this.

    Cheers,
    Dave.

     
  • David Instone-Brewer

    Thank you so much for putting so much time into this excellent software

     
