Bugs item #2618277, was opened at 2009-02-20 12:34
Message generated for change (Comment added) made by mhammond
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=551954&aid=2618277&group_id=78018
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: pythonwin
Group: None
>Status: Closed
>Resolution: Fixed
Priority: 5
Private: No
Submitted By: markt (metolone)
Assigned to: Nobody/Anonymous (nobody)
Summary: backspacing over multibyte characters raises exception
Initial Comment:
In Pythonwin from pywin32-213, and on both the 2.6 and 3.0 versions, when backspacing over the Chinese characters in the attached file (or any UTF-8 multibyte character), the first backspace displays the remaining UTF-8 code bytes instead of deleting the entire character. Trying to save the file at this point or backspacing a second time throws exceptions.
The backspace exception (from 2.6) is:
Firing event '<<smart-backspace>>' failed.
Traceback (most recent call last):
  File "C:\dev\python\Lib\site-packages\pythonwin\pywin\scintilla\bindings.py", line 142, in fire
    rc = binding.handler(*args)
  File "C:\dev\python\Lib\site-packages\pythonwin\pywin\idle\AutoIndent.py", line 133, in smart_backspace_event
    chars = text.get("insert linestart", "insert")
  File "C:\dev\python\Lib\site-packages\pythonwin\pywin\scintilla\IDLEenvironment.py", line 343, in get
    ret = self.edit.GetTextRange(start, end)
  File "C:\dev\python\Lib\site-packages\pythonwin\pywin\scintilla\control.py", line 362, in GetTextRange
    ret = ret.decode(default_scintilla_encoding)
  File "C:\dev\python\lib\encodings\utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 11-12: unexpected end of data
Saving in 2.6 gives:
Traceback (most recent call last):
  File "C:\dev\python\Lib\site-packages\pythonwin\pywin\framework\editor\document.py", line 77, in OnSaveDocument
    self.SaveFile(fileName)
  File "C:\dev\python\Lib\site-packages\pythonwin\pywin\scintilla\document.py", line 54, in SaveFile
    ok = view.SaveTextFile(fileName)
  File "C:\dev\python\Lib\site-packages\pythonwin\pywin\scintilla\view.py", line 394, in SaveTextFile
    doc._SaveTextToFile(self, f)
  File "C:\dev\python\Lib\site-packages\pythonwin\pywin\scintilla\document.py", line 137, in _SaveTextToFile
    s = view.GetTextRange() # already decoded from scintilla's encoding
  File "C:\dev\python\Lib\site-packages\pythonwin\pywin\scintilla\control.py", line 362, in GetTextRange
    ret = ret.decode(default_scintilla_encoding)
  File "C:\dev\python\lib\encodings\utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 28-30: invalid data
win32ui.error: OnSaveDocument() virtual handler (<bound method SyntEditDocument.OnSaveDocument of <pywin.framework.editor.color.coloreditor.SyntEditDocument instance at 0x00F27D00>>) raised an exception
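Both tracebacks end in the same decode step. The following is a minimal Python 3 sketch of that failure mode, assuming the buffer is left holding a partial UTF-8 sequence after a single byte is deleted. The variable names are hypothetical, and the snippet does not call Pythonwin or Scintilla; it only shows why decoding the damaged bytes raises.

```python
# Two Chinese characters, three UTF-8 bytes each (6 bytes total).
encoded = "中文".encode("utf-8")

# Simulate a backspace that removes one byte instead of one character.
truncated = encoded[:-1]

decode_failed = False
error_start = None
try:
    truncated.decode("utf-8")
except UnicodeDecodeError as exc:
    decode_failed = True
    error_start = exc.start  # byte offset where the bad sequence begins

print(decode_failed, error_start)
```

The first character still decodes cleanly; the error begins at the first byte of the second, now incomplete, character.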
----------------------------------------------------------------------
>Comment By: Mark Hammond (mhammond)
Date: 2009-03-30 23:03
Message:
Thanks!
Checking in pythonwin/pywin/scintilla/IDLEenvironment.py;
new revision: 1.15; previous revision: 1.14
----------------------------------------------------------------------
Comment By: markt (metolone)
Date: 2009-02-20 13:26
Message:
I made a fix to pythonwin\pywin\scintilla\IDLEenvironment.py that appears
to correct the problem. I found a function, _fix_eol_indexes(), that
corrected for partial deletion of \r\n. The function, now renamed
_fix_indexes(), corrects for partial deletion of UTF-8 characters as well. I
uploaded the file. It works by making sure the start and end variables
point to the first byte of a valid UTF-8 character. In UTF-8, a byte with
bit 7 set and bit 6 clear (10xxxxxx) is a continuation byte, so an index
that lands on one is in the middle of a character.
----------------------------------------------------------------------