[pywin32-bugs] [ pywin32-Bugs-2618277 ] backspacing over multibyte characters raises exception
From: SourceForge.net <no...@so...> - 2009-03-30 12:03:13
Bugs item #2618277, was opened at 2009-02-20 12:34
Message generated for change (Comment added) made by mhammond
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=551954&aid=2618277&group_id=78018

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.

Category: pythonwin
Group: None
>Status: Closed
>Resolution: Fixed
Priority: 5
Private: No
Submitted By: markt (metolone)
Assigned to: Nobody/Anonymous (nobody)
Summary: backspacing over multibyte characters raises exception

Initial Comment:
In Pythonwin from pywin32-213, and on both the 2.6 and 3.0 versions, when
backspacing over the Chinese characters in the attached file (or any UTF-8
multibyte character), the first backspace displays the remaining UTF-8 code
bytes instead of deleting the entire character. Trying to save the file at
this point or backspacing a second time throws exceptions.

The backspace exception (from 2.6) is:

Firing event '<<smart-backspace>>' failed.
Traceback (most recent call last):
  File "C:\dev\python\Lib\site-packages\pythonwin\pywin\scintilla\bindings.py", line 142, in fire
    rc = binding.handler(*args)
  File "C:\dev\python\Lib\site-packages\pythonwin\pywin\idle\AutoIndent.py", line 133, in smart_backspace_event
    chars = text.get("insert linestart", "insert")
  File "C:\dev\python\Lib\site-packages\pythonwin\pywin\scintilla\IDLEenvironment.py", line 343, in get
    ret = self.edit.GetTextRange(start, end)
  File "C:\dev\python\Lib\site-packages\pythonwin\pywin\scintilla\control.py", line 362, in GetTextRange
    ret = ret.decode(default_scintilla_encoding)
  File "C:\dev\python\lib\encodings\utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 11-12: unexpected end of data

Saving in 2.6 gives:

Traceback (most recent call last):
  File "C:\dev\python\Lib\site-packages\pythonwin\pywin\framework\editor\document.py", line 77, in OnSaveDocument
    self.SaveFile(fileName)
  File "C:\dev\python\Lib\site-packages\pythonwin\pywin\scintilla\document.py", line 54, in SaveFile
    ok = view.SaveTextFile(fileName)
  File "C:\dev\python\Lib\site-packages\pythonwin\pywin\scintilla\view.py", line 394, in SaveTextFile
    doc._SaveTextToFile(self, f)
  File "C:\dev\python\Lib\site-packages\pythonwin\pywin\scintilla\document.py", line 137, in _SaveTextToFile
    s = view.GetTextRange() # already decoded from scintilla's encoding
  File "C:\dev\python\Lib\site-packages\pythonwin\pywin\scintilla\control.py", line 362, in GetTextRange
    ret = ret.decode(default_scintilla_encoding)
  File "C:\dev\python\lib\encodings\utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 28-30: invalid data
win32ui.error: OnSaveDocument() virtual handler (<bound method SyntEditDocument.OnSaveDocument of <pywin.framework.editor.color.coloreditor.SyntEditDocument instance at 0x00F27D00>>) raised an exception
----------------------------------------------------------------------

>Comment By: Mark Hammond (mhammond)
Date: 2009-03-30 23:03

Message:
Thanks!

Checking in pythonwin/pywin/scintilla/IDLEenvironment.py;
new revision: 1.15; previous revision: 1.14

----------------------------------------------------------------------

Comment By: markt (metolone)
Date: 2009-02-20 13:26

Message:
I made a fix to pythonwin\pywin\scintilla\IDLEenvironment.py that appears
to correct the problem. I found a function, _fix_eol_indexes(), that
corrected for partial deletion of \r\n. The function, now named
_fix_indexes(), corrects for partial deletion of UTF-8 characters as well.
I uploaded the file.

It works by making sure the start and end variables point to the start of
a valid UTF-8 character. In UTF-8, a byte with bit 7 on and bit 6 off is an
intermediate (continuation) byte.

----------------------------------------------------------------------
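
For readers following the thread, here is a minimal sketch of the boundary check described in the comment above. It is not the code checked in to IDLEenvironment.py. It works on a plain Python 3 bytes object rather than on the Scintilla control, the names is_continuation_byte and snap_to_char_boundaries are hypothetical, and the choice to move start backward and end forward is an assumption. It first reproduces the same kind of UnicodeDecodeError shown in the tracebacks, then shows the range decoding cleanly once the indexes are adjusted.

```python
def is_continuation_byte(b):
    """Return True if b (an int 0-255) is a UTF-8 continuation byte (10xxxxxx)."""
    # Bit 7 on and bit 6 off: mask the top two bits and compare with binary 10.
    return (b & 0xC0) == 0x80


def snap_to_char_boundaries(buf, start, end):
    """Widen [start, end) so neither index falls inside a multibyte character.

    Hypothetical helper for illustration; not the pywin32 _fix_indexes() code.
    """
    # Move start left until it no longer points at a continuation byte.
    while 0 < start < len(buf) and is_continuation_byte(buf[start]):
        start -= 1
    # Move end right past the remaining bytes of a partially included character.
    while end < len(buf) and is_continuation_byte(buf[end]):
        end += 1
    return start, end


# 'a', U+4E2D (three bytes in UTF-8: E4 B8 AD), 'b'
buf = "a\u4e2db".encode("utf-8")

try:
    buf[0:2].decode("utf-8")  # slice stops right after the lead byte 0xE4
    raised = False
except UnicodeDecodeError:
    raised = True  # same class of failure as in the tracebacks above

start, end = snap_to_char_boundaries(buf, 0, 2)
text = buf[start:end].decode("utf-8")  # decodes without error after snapping
```

Widening the range rather than shrinking it means a caller asking for part of a character gets the whole character back, which matches the reported expectation that one backspace should remove the entire character.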