When I call notepad.open with a filename that contains French characters, it brings up a prompt saying it cannot find the file (and shows the special character as a square). Any idea how to work around this? I tried escaping the characters and using hex values, but nothing seems to work.
For example:
notepad.open("C:\\école.html")
Thank you in advance.
You need to pass it as a UTF8 encoded path, so, assuming your script is saved as UTF8:
notepad.open(u"C:\\école.html".encode('utf8'))
If the script isn't saved as UTF8, you'll need to change the accented e to the proper unicode escape sequence \uXXXX, where XXXX is the character's Unicode code point in hex (\u00e9 for é).
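As a small sketch of the escape-sequence form, the following builds the same path using only ASCII characters in the source file and then encodes it to UTF-8. The variable names are just for illustration, and the notepad.open call is left as a comment because the notepad object only exists inside the plugin.

```python
# The accented e is U+00E9, so the path can be written with ASCII only.
path = u"C:\\\u00e9cole.html"

# Encode the unicode string to UTF-8 bytes before passing it on.
utf8_path = path.encode('utf8')

# Inside the plugin, this would then be handed over as before:
# notepad.open(utf8_path)
```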
Dave.
One other quick question: what would be the syntax when the French characters are contained within a variable? (my python script is saved as UTF-8)
Here is my code… not sure how to implement your suggestion above:
for root, subFolders, files in os.walk(filePathSrc):  # searches file path recursively
    console.write("Begin scanning path: " + root + "\n")
    for file in files:
        filePath = os.path.join(root, file)
        if file[-4:] == '.htm' or file[-5:] == '.html':
            # this throws an error when the filePath contains french characters
            notepad.open(filePath)
            #<code snippet omitted>
            notepad.close()
(Sorry for the delay, and thanks for the poke, I thought I'd answered everything)
You've got 2 options - the docs for os.walk and os.listdir say that if you pass a unicode filename in, you get unicode filenames out, so just pass a unicode string in (so make filePathSrc a unicode string).
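A rough sketch of that first option, assuming filePathSrc is made a unicode string: the names coming back from os.walk are then unicode too, and each path can be encoded to UTF-8 just before it is handed to notepad.open. The helper name utf8_html_paths is made up for this example, and the notepad calls are left out of the block because that object only exists inside the plugin.

```python
import os


def utf8_html_paths(file_path_src):
    """Yield UTF-8 encoded paths of .htm/.html files under file_path_src.

    file_path_src is expected to be a unicode string, so that os.walk
    hands back unicode names as well.
    """
    for root, sub_folders, files in os.walk(file_path_src):
        for name in files:
            if name.endswith((u'.htm', u'.html')):
                # Keep the path as unicode and encode it only at the end.
                yield os.path.join(root, name).encode('utf8')
```

In the loop from the question, each value yielded by utf8_html_paths(filePathSrc) would take the place of filePath in the notepad.open(filePath) call.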
Alternatively, you can convert the string that comes out to utf8, but that means decoding it to unicode first from whatever encoding listdir returns it in (which is sys.getfilesystemencoding()). On my system that's 'mbcs', but it could be different on yours.
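The second option could look roughly like this. The helper name to_utf8 and its fs_encoding parameter are invented for this sketch; by default it asks sys.getfilesystemencoding() instead of relying on a fixed value such as 'mbcs'.

```python
import sys


def to_utf8(name, fs_encoding=None):
    """Return name as UTF-8 encoded bytes.

    Byte strings are first decoded from the filesystem encoding
    (or from fs_encoding, if one is given); unicode strings are
    encoded directly.
    """
    if isinstance(name, bytes):
        name = name.decode(fs_encoding or sys.getfilesystemencoding())
    return name.encode('utf8')
```

With this, the call in the question's loop would become notepad.open(to_utf8(filePath)).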
Worked like a charm… thanks again!
Really, if you hard-code 'mbcs' in that conversion, you should change it to sys.getfilesystemencoding().
Hope that helps,
Dave.