#1024 Unicode formatting characters mess up document layout

v1.25
pending-fixed
nobody
None
v1.23.1
5
2014-08-14
2014-01-16
No

Text in an XML document is drawn over itself because of Unicode formatting characters.

The text uses Unicode characters such as <U+2028>, LINE SEPARATOR (http://www.fileformat.info/info/unicode/char/2028/index.htm), and <U+2029>, PARAGRAPH SEPARATOR (http://www.fileformat.info/info/unicode/char/2029/index.htm).

With these characters, Geany is unusable: there is no way to view the document with "normal" formatting, and editing is guesswork.

Geany should at least add an option to "unformat" these "separator" characters, or to show them in "unicode-encoded" form (http://www.fileformat.info/info/unicode/char/search.htm?q=separator&preview=entity), so they don't mess up the file layout.
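
To make the request concrete, here is a minimal sketch of what a "unicode-encoded" view of the two separators could look like. It is plain Python for illustration only, not Geany or Scintilla code, and the helper name show_separators is hypothetical.

```python
# Hypothetical helper for illustration; not part of Geany or Scintilla.
SEPARATORS = {
    "\u2028": "<U+2028>",  # LINE SEPARATOR
    "\u2029": "<U+2029>",  # PARAGRAPH SEPARATOR
}


def show_separators(text: str) -> str:
    """Replace the two Unicode separators with visible ASCII escapes."""
    return text.translate(str.maketrans(SEPARATORS))


sample = "<a>first\u2028second</a>\u2029<b/>"
print(show_separators(sample))
```

With a view like this, the separators would stay visible in the text instead of affecting how the lines are drawn.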

I'm using Debian jessie with Geany 1.23.1. Attached is an example of the problem.

1 Attachments

Discussion

  • Lex Trotman

    Lex Trotman - 2014-03-18

    The editing component Geany uses (Scintilla, www.scintilla.org) supports Unicode line ends only as a provisional feature. Once it is fully supported, Geany can add an option to turn it on (if someone provides a pull request).
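
As a rough illustration of what "Unicode line ends" means, the sketch below compares splitting a string on "\n" only with Python's str.splitlines(), which also treats U+2028 and U+2029 as line boundaries. This shows Python's built-in behaviour and is only an analogy; it makes no claim about how Scintilla implements the feature.

```python
text = "line one\u2028line two\u2029line three\nline four"

# Splitting on "\n" only: U+2028 and U+2029 stay inside the first line.
ascii_lines = text.split("\n")

# str.splitlines() also treats U+2028 and U+2029 as line boundaries.
unicode_lines = text.splitlines()

print(len(ascii_lines), ascii_lines)
print(len(unicode_lines), unicode_lines)
```

A component that recognises only the ASCII line ends sees two lines here, while a Unicode-aware one sees four.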

     
  • Colomban Wendling

    • status: open --> pending-fixed
     
  • Colomban Wendling

    With the latest Geany (and Scintilla), U+2028 shows as an "LS" control character and U+2029 as a "PS" one. This probably isn't the expected way to display them (e.g. it doesn't cause a line break), but it doesn't break the display.

    AFAIK our Scintilla version now has optional support for those Unicode separators, so a patch enabling them (possibly as an option, if it has a performance impact) would be acceptable.

     
  • Colomban Wendling

    • Fixed in: None --> v1.25
     