KDiff3

UTF-8-BOM has rendering / diff artifacts

A graphical text difference analyzer

Brought to you by: arondel, joachim99
This project can now be found here.
#158 UTF-8-BOM has rendering / diff artifacts

Status: open
Owner: nobody
Labels: None
Priority: 5
Updated: 2012-09-23
Created: 2011-06-02
Creator: Benjamin Schroeder
Private: No

This occurs on Windows XP x86 and Windows 7 x86, either with 'Auto Detect Unicode' enabled in the Regional Settings or with the File Encoding explicitly set to UTF-8-BOM.

Diff the attached file against a copy of itself and you will see three question marks at the start of line three.

Using a hex editor (or another editor that does not modify line endings, the BOM, etc.), remove the first line, leaving the BOM intact.
Diff this file against the original, and the modified file will now render correctly.
However, the multibyte character will now also incorrectly show as a difference.
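
For readers without the attachment, a stand-in file can be generated along the following lines. This is only a sketch under assumptions: the helper name `build_repro`, the padding letters, and the choice of `€` (three bytes in UTF-8) are all hypothetical, and the real attachment may differ. It places the multibyte character at the start of line three, 0x3FFE bytes past the BOM, which is the offset the report gives.

```python
import codecs

def build_repro(offset: int = 0x3FFE, char: str = "\u20ac") -> bytes:
    """Return UTF-8-BOM bytes with `char` at `offset` bytes past the BOM.

    Hypothetical stand-in for the ticket's attachment: two ASCII padding
    lines fill exactly `offset` bytes, so `char` starts line three.
    """
    half = offset // 2
    line1 = "a" * (half - 1) + "\n"           # `half` bytes including newline
    line2 = "b" * (offset - half - 1) + "\n"  # the remaining padding bytes
    body = (line1 + line2 + char + " end\n").encode("utf-8")
    return codecs.BOM_UTF8 + body

if __name__ == "__main__":
    # Write the stand-in file; the filename is arbitrary.
    with open("utf8_bom_repro.txt", "wb") as handle:
        handle.write(build_repro())
```

Writing in binary mode keeps the BOM and the line endings exactly as constructed, so the offsets are not disturbed by newline translation.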

Although the attached file is a little contrived, the important thing to note is the offset of the multibyte character.
The problem appears to be due to the multibyte character being clipped by a buffer, as it recurs at regular intervals beyond that offset.
I.e., the multibyte character in the example is at offset 0x3FFE from the end of the BOM. The problem also occurs at offsets 0x3FFF, 0x13FFE, 0x13FFF, 0x23FFE, etc.
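
The suspected failure mode can be illustrated outside KDiff3. The sketch below is not KDiff3's code. It assumes a hypothetical buffer size of 0x4000 bytes (the real size is unknown) and a three-byte character `€`, and it leaves out the BOM because the ticket's offsets are measured from the end of the BOM. Decoding each buffer independently splits the character across two buffers and yields replacement characters, while an incremental decoder carries the partial sequence over the boundary.

```python
import codecs

CHUNK = 0x4000  # hypothetical buffer size, chosen to straddle offset 0x3FFE

# '€' occupies bytes 0x3FFE..0x4000, so it crosses the 0x4000 boundary.
text = "a" * 0x3FFE + "\u20ac" + "b" * 10
data = text.encode("utf-8")

def decode_naive(raw: bytes, chunk: int = CHUNK) -> str:
    """Decode every chunk on its own; a split character is lost."""
    return "".join(
        raw[i:i + chunk].decode("utf-8", errors="replace")
        for i in range(0, len(raw), chunk)
    )

def decode_incremental(raw: bytes, chunk: int = CHUNK) -> str:
    """Decode chunk by chunk while keeping partial sequences between calls."""
    decoder = codecs.getincrementaldecoder("utf-8")(errors="replace")
    parts = [decoder.decode(raw[i:i + chunk]) for i in range(0, len(raw), chunk)]
    parts.append(decoder.decode(b"", final=True))  # flush any leftover bytes
    return "".join(parts)

if __name__ == "__main__":
    print("naive matches:      ", decode_naive(data) == text)
    print("incremental matches:", decode_incremental(data) == text)
```

If the cause is what the report suspects, the usual remedy is of this second kind: either carry the trailing incomplete bytes into the next read or decode the file as a whole.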

Discussion

    Benjamin Schroeder - 2011-06-02
     
    multibyte issue