#243 Patch for bug 3028356, reading UTF-8 files

closed
Don HO
None
9
2010-09-25
2010-08-12
No

A file is read in blocks of constant size. As of version 5.7, the bytes of a multi-byte UTF-8 character can straddle the boundary between two blocks, so the character is inserted into the Scintilla buffer in two halves. This sometimes deletes the character, because the byte sequence is invalid while only half of it is in the buffer.
This patch keeps the last multi-byte character of a block and inserts it together with the next block, so that it is never split.
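
The idea of carrying an unfinished character over to the next block can be sketched as follows. This is only a minimal illustration in standard C++, not the code from the attached patch: the names `incompleteUtf8Tail` and `splitOnCharBoundaries` are hypothetical, the input is a `std::string` rather than a file, and the actual insertion into Scintilla is not shown. The sketch assumes that a block ending in a lead byte with too few continuation bytes should have those bytes held back, and that bytes which cannot start a valid sequence are passed through unchanged.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns how many bytes at the end of data[0..len)
// form an incomplete multi-byte UTF-8 sequence that should be carried into
// the next block. Returns 0 when the block ends on a character boundary or
// on bytes that cannot start a valid sequence.
std::size_t incompleteUtf8Tail(const unsigned char* data, std::size_t len)
{
    // A UTF-8 sequence has at most 4 bytes, so an incomplete one has at most 3.
    const std::size_t maxBack = len < 3 ? len : 3;
    for (std::size_t back = 1; back <= maxBack; ++back) {
        const unsigned char c = data[len - back];
        if ((c & 0xC0) == 0x80)
            continue;                        // continuation byte, keep scanning back
        std::size_t need = 0;                // total length announced by the lead byte
        if ((c & 0xE0) == 0xC0)      need = 2;
        else if ((c & 0xF0) == 0xE0) need = 3;
        else if ((c & 0xF8) == 0xF0) need = 4;
        else return 0;                       // ASCII or invalid lead: nothing to carry
        return need > back ? back : 0;       // carry only if bytes are still missing
    }
    return 0;                                // only continuation bytes: do not hold back
}

// Hypothetical driver: cuts input into blocks of blockSize bytes and moves any
// incomplete trailing sequence to the front of the following block.
std::vector<std::string> splitOnCharBoundaries(const std::string& input,
                                               std::size_t blockSize)
{
    assert(blockSize > 0);
    std::vector<std::string> chunks;
    std::string carry;
    for (std::size_t pos = 0; pos < input.size(); pos += blockSize) {
        std::string block = carry + input.substr(pos, blockSize);
        carry.clear();
        const bool last = pos + blockSize >= input.size();
        if (!last) {
            const std::size_t tail = incompleteUtf8Tail(
                reinterpret_cast<const unsigned char*>(block.data()), block.size());
            carry = block.substr(block.size() - tail);
            block.resize(block.size() - tail);
        }
        // The final block is emitted as-is, so truncated input bytes are kept.
        if (!block.empty())
            chunks.push_back(block);
    }
    return chunks;
}
```

For example, splitting the three bytes `a`, `0xC3`, `0xA9` ("aé") with a block size of 2 yields the chunks `a` and `0xC3 0xA9` instead of cutting the two-byte character in half.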

Discussion

  • François-R Boyer

    Test files for DBCS and null characters

     
  • François-R Boyer

    After looking deeper into the load/save code, I found a problem when a file contains embedded null characters. The new version 2 patch now handles null characters correctly, keeps all invalid UTF-8 byte sequences, and also fixes DBCS characters being split at the end of a buffer. Note that it does not yet properly handle DBCS files containing invalid characters.

     
  • Don HO

    Don HO - 2010-08-15

    Thank you for the fix.
    So which patch should I apply? fileLoadSave_patch.zip or Buffer.patch ?

    Don

     
  • Don HO

    Don HO - 2010-08-15
    • assigned_to: nobody --> donho
    • priority: 5 --> 9
     
  • François-R Boyer

    fileLoadSave_patch.zip is the one to apply. test_files.zip would fail to load/save correctly with the older patch (that I've now deleted).

     
  • Don HO

    Don HO - 2010-08-15

    François,

    Could you redo fileLoadSave_patch.zip with the latest revision (649), please?

    Thank you.
    Don

     
  • Don HO

    Don HO - 2010-08-15

    Sorry, the latest revision is 650.

    Don

     
  • François-R Boyer

    Unified patch against revision 650

     
  • François-R Boyer

    Merged patch with revision 650. I tried P4Merge to do an interactive merge... not bad.

     
  • Don HO

    Don HO - 2010-08-16

    Nice. Committed. Thank you.

     
  • Don HO

    Don HO - 2010-09-25
    • status: open --> closed
     
