notepad.Open file name with French characters

  • mrpaul1

    mrpaul1 - 2011-02-04


    When I call notepad.open with a filename that contains French characters, it brings up a prompt saying it cannot find the file (and shows the special character as a square). Any idea how to work around this? I tried escaping the characters and using hex values, but nothing seems to work.

    For example: notepad.open("C:\\école.html")

    Thank you in advance.

  • Dave Brotherstone

    You need to pass it as a UTF-8 encoded path, so, assuming your script is saved as UTF-8: notepad.open(u"C:\\école.html".encode('utf8'))

    If the script isn't saved as UTF-8, you'll need to replace the accented e with the proper Unicode escape sequence \uXXXX, where XXXX is the Unicode code point (\u00e9 for é).
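
To make the suggestion above concrete, here is a minimal sketch using the file name from the question. It spells é as an escape so the result does not depend on how the script is saved. The notepad.open call is left as a comment because the notepad object only exists inside the plugin; the variable names are just for illustration.

```python
# Spell the accented e as the escape \u00e9 so the script's own
# encoding does not matter.
path = u"C:\\\u00e9cole.html"

# Encode the unicode string to UTF-8 bytes before handing it over.
utf8_path = path.encode("utf8")

# Inside the plugin this would then be (not runnable outside it):
# notepad.open(utf8_path)
```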


  • mrpaul1

    mrpaul1 - 2011-02-04

    Worked like a charm… thanks again!

  • mrpaul1

    mrpaul1 - 2011-02-07

    One other quick question: what would be the syntax when the French characters are contained within a variable? (My Python script is saved as UTF-8.)

    Here is my code… not sure how to implement your suggestion above:

    for root, subFolders, files in os.walk(filePathSrc): # searches file path recursively
        console.write("Begin scanning path: " + root + "\n")
        for file in files:
            filePath = os.path.join(root,file)
            if file[-4:] == '.htm' or file[-5:] == '.html':
                # this throws an error when the filePath contains french characters
                #<code snippet omitted>
  • Dave Brotherstone

    (Sorry for the delay, and thanks for the poke; I thought I'd answered everything.)

    You've got two options. The docs for os.walk and os.listdir say that if you pass a unicode filename in, you get unicode filenames out, so just pass a unicode string in (that is, make filePathSrc a unicode string).
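
A rough sketch of this first option, assuming the same .htm/.html filter as the loop in the question. The function name find_html_files is made up for illustration, and the notepad.open call is only shown as a comment since it needs the plugin.

```python
import os

def find_html_files(root):
    """Walk root and return the paths of all .htm/.html files.

    root should be a unicode string (u"..."), so that the names
    coming back from os.walk are unicode as well.
    """
    matches = []
    for folder, _subfolders, files in os.walk(root):
        for name in files:
            if name.endswith((u".htm", u".html")):
                matches.append(os.path.join(folder, name))
    return matches

# Inside the plugin, each result could then be opened with:
# notepad.open(path.encode("utf8"))
```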

    Alternatively, you can convert the string that comes out to UTF-8, but that means first converting it to unicode from whatever encoding listdir returns it in (which is sys.getfilesystemencoding()). On my system that's 'mbcs', but it could be different on yours: notepad.open(filePath.decode('mbcs').encode('utf8'))

    Really, you should change the hard-coded 'mbcs' to sys.getfilesystemencoding().
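
The conversion described above could be wrapped up roughly like this. The helper name to_utf8 is hypothetical; it asks sys.getfilesystemencoding() for the encoding instead of hard-coding 'mbcs', and it passes unicode strings straight through to the encode step.

```python
import sys

def to_utf8(path):
    """Return path as UTF-8 encoded bytes.

    Byte strings (what os.walk gives back for a byte-string root)
    are first decoded with the filesystem encoding; unicode strings
    are encoded directly.
    """
    if isinstance(path, bytes):
        path = path.decode(sys.getfilesystemencoding())
    return path.encode("utf8")

# Inside the plugin, the call from the loop would become:
# notepad.open(to_utf8(filePath))
```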

    Hope that helps,

