Scite 3.5.4 : read file of utf16 of 4 four bytes can cause troubles


#1710 Scite 3.5.4 : read file of utf16 of 4 four bytes can cause troubles

Milestone: Bug

Status: closed-fixed

Owner: Neil Hodgson

Labels: scite (198) unicode (4)

Priority: 5

Updated: 2015-05-26

Created: 2015-04-16

Creator: Olivier

Private: No

In SciTE 3.5.4, if a UTF-16LE file contains a character encoded in 4 bytes that starts at position 131070 (the 2 wchar values are in the middle of the file read buffer), it causes a buffer overrun and the character is displayed incorrectly.
In the attached file uni2.txt, the last character in the file, the valid Unicode character U+1D11E, is displayed incorrectly.
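
The layout described in this report can be sketched in Python (an illustration only, not SciTE's C++ code). The 131072-byte block size is an assumption chosen so that byte position 131070 leaves the two surrogates on either side of a block boundary; the ticket itself only gives the position. The helper names are hypothetical. The sketch builds the data, shows where the split falls, and decodes block by block with a decoder that keeps state between blocks.

```python
import codecs

BLOCK_SIZE = 131072   # assumed read-block size (128 KiB); not stated in the ticket
CHAR_OFFSET = 131070  # byte position of the 4-byte character, as reported

def build_sample() -> bytes:
    """Return UTF-16LE data (no BOM) with U+1D11E starting at CHAR_OFFSET."""
    filler = "a" * (CHAR_OFFSET // 2)  # each 'a' takes 2 bytes in UTF-16LE
    return (filler + "\U0001D11E").encode("utf-16-le")

def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> list:
    """Cut the data into fixed-size blocks, as a block-wise file reader would."""
    return [data[i:i + size] for i in range(0, len(data), size)]

def decode_blockwise(blocks) -> str:
    """Decode block by block, carrying an unfinished surrogate pair over."""
    decoder = codecs.getincrementaldecoder("utf-16-le")()
    parts = [decoder.decode(block) for block in blocks]
    parts.append(decoder.decode(b"", final=True))
    return "".join(parts)

data = build_sample()
blocks = split_blocks(data)
text = decode_blockwise(blocks)
```

Under these assumptions the first block ends with the lead surrogate D834 and the second block holds only the trail surrogate DD1E, so a reader that converts each block independently sees two unpaired halves. Writing `data` to a file gives a test input similar in shape to the attachment.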

1 Attachment

unit2.txt

Discussion

Neil Hodgson - 2015-04-16

labels: --> scite, unicode

status: open --> open-accepted

assigned_to: Neil Hodgson

Neil Hodgson - 2015-04-21

While this should be fixed properly, a quicker part-fix would be to ensure that both surrogates making up this character, U+1D11E (D834, DD1E), are encoded as UTF-8 inside Scintilla as (ED, A0, B4, ED, B4, 9E) so that the file will at least be saved out the same as it was read in. Currently the character becomes the bytes (9D, 87, BD, ED, B4, 9E) and is saved out with further mangling as UTF-16 (9D, 87, BD, DD1E).

It's unlikely I will look at this again for several weeks.
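
The byte values quoted in this comment can be checked with a short Python sketch. It is an illustration of the arithmetic only, not Scintilla's code, and the function names are hypothetical. It splits U+1D11E into its two surrogates, encodes each surrogate on its own with the 3-byte UTF-8 bit pattern, and compares that with the standard 4-byte UTF-8 form. Python's `surrogatepass` error handler is used to show that the per-surrogate bytes convert back to the original UTF-16LE bytes.

```python
def utf16_units(cp: int) -> tuple:
    """Split a code point above U+FFFF into (lead, trail) surrogates."""
    assert 0x10000 <= cp <= 0x10FFFF
    v = cp - 0x10000
    return 0xD800 + (v >> 10), 0xDC00 + (v & 0x3FF)

def encode_unit_3byte(unit: int) -> bytes:
    """Encode one 16-bit value (0x800..0xFFFF) with the 3-byte UTF-8 pattern."""
    return bytes([
        0xE0 | (unit >> 12),
        0x80 | ((unit >> 6) & 0x3F),
        0x80 | (unit & 0x3F),
    ])

lead, trail = utf16_units(0x1D11E)

# Each surrogate encoded separately, as proposed for the part-fix.
per_surrogate = encode_unit_3byte(lead) + encode_unit_3byte(trail)

# Standard UTF-8 for the same character, for comparison.
proper = chr(0x1D11E).encode("utf-8")

# Round trip: per-surrogate bytes back to UTF-16LE, keeping lone surrogates.
round_trip = per_surrogate.decode("utf-8", "surrogatepass").encode(
    "utf-16-le", "surrogatepass")

print(hex(lead), hex(trail))
print(per_surrogate.hex(" ").upper())
print(proper.hex(" ").upper())
```

The per-surrogate form is not well-formed UTF-8, which is why it is described as a part-fix: it preserves the file contents on save without making the character a single 4-byte sequence.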


Neil Hodgson - 2015-04-21

status: open-accepted --> open-later

Neil Hodgson - 2015-05-15

Partial fix that avoids file corruption committed as [9403f1].

Related

Commit: [9403f1]


Neil Hodgson - 2015-05-16

Should be fixed with [0a9464]. It's a complex change so please check.

Related

Commit: [0a9464]


Neil Hodgson - 2015-05-16

status: open-later --> open-fixed

Olivier - 2015-05-25

Tested preview 3.5.6 without any problems
Thanks


Neil Hodgson - 2015-05-26

status: open-fixed --> closed-fixed