
#363 Upgrade to Unicode 9.0

Unknown
closed-fixed
nobody
None
v4.3
5
2018-01-10
2017-02-14
No

Unicode 9.0 changed quite a few previously single-width glyphs to double-width. The width_table in unicat.c should be updated accordingly. (There might be other changes as well, I don't know, so ideally all the other tables should be regenerated too.)

This change is causing quite a few headaches right now. glibc hasn't updated yet as of the just-released version 2.25 (https://sourceware.org/bugzilla/show_bug.cgi?id=20313), although Fedora seems to claim they have already patched it (https://fedoraproject.org/wiki/Changes/Unicode_9.0).

glib, surprisingly, updated in the micro release 2.50.1.

Most terminal emulators use glibc's methods (e.g. wcwidth()), but gnome-terminal (and other VTE based emulators) use glib's methods.

Obviously if an app's behavior doesn't match the terminal emulator's, display corruption is bound to happen. See e.g. https://bugzilla.gnome.org/show_bug.cgi?id=772812 (and the pages linked from there).

Sooner or later glibc will update, and then these problems will disappear – well, not for joe until it also updates.

(A nicer approach would be if joe relied on glibc's (or glib's but I guess you don't want another dependency) methods rather than its own implementation.)

Related

Bugs: #380

Discussion

  • Egmont Koblinger

    For this crazy transitional period (quite a few years from now until you can safely assume all OSes have upgraded and Unicode 8.0 is left behind for good) joe could ship both versions (Unicode 8.0 and 9.0) and make it easy to choose (let's say via a configure flag).
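
A minimal sketch of what shipping both table versions and choosing at build time could look like. Everything here is hypothetical: the `DEMO_UNICODE_MAJOR` macro, the `demo_*` names, and the three-entry table are illustrative only, not joe's actual unicat.c code, and the table is a tiny excerpt rather than real generated data. The Emoticons range is used as the example because, as far as I understand the EastAsianWidth.txt files, it is neutral in 8.0 and wide in 9.0.

```c
#include <stddef.h>

/* Hypothetical macro; a build system could pass -DDEMO_UNICODE_MAJOR=8. */
#ifndef DEMO_UNICODE_MAJOR
#define DEMO_UNICODE_MAJOR 9
#endif

struct demo_range { unsigned long first, last; };

/* Illustrative excerpt only, sorted by code point. NOT a complete table. */
static const struct demo_range demo_wide[] = {
    { 0x1100, 0x115F },    /* Hangul Jamo initial consonants */
    { 0x4E00, 0x9FFF },    /* CJK Unified Ideographs block */
#if DEMO_UNICODE_MAJOR >= 9
    { 0x1F600, 0x1F64F },  /* Emoticons: wide starting with Unicode 9.0 */
#endif
};

/* Binary search over the sorted, non-overlapping ranges. */
static int demo_is_wide(unsigned long c)
{
    size_t lo = 0, hi = sizeof demo_wide / sizeof demo_wide[0];
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (c < demo_wide[mid].first)
            hi = mid;
        else if (c > demo_wide[mid].last)
            lo = mid + 1;
        else
            return 1;
    }
    return 0;
}

/* Column width of a code point: 2 if listed as wide, otherwise 1. */
int demo_width(unsigned long c)
{
    return demo_is_wide(c) ? 2 : 1;
}
```

Built with the default, `demo_width(0x1F600)` reports 2; built with `-DDEMO_UNICODE_MAJOR=8`, the same call reports 1, which is exactly the kind of disagreement with the terminal that this ticket is about.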

     
  • Egmont Koblinger

    ftp://ftp.unicode.org/Public/8.0.0/ucd/EastAsianWidth.txt
    ftp://ftp.unicode.org/Public/9.0.0/ucd/EastAsianWidth.txt

     

    Last edit: Egmont Koblinger 2017-02-14
  • John J. Jordan

    John J. Jordan - 2017-02-15

    Just want to point out that JOE shouldn't depend specifically on glibc (IMO the problem with the Linux monoculture). The configure flag isn't a terrible idea.

     
  • Egmont Koblinger

    wcwidth() (and its prerequisite setlocale()) is POSIX, so I assume it's available on most Unixes, maybe on Windows too? (I've no clue about Windows programming.) glibc is just the Linux implementation, so to speak. You could check whether this function is available and use it if it is. Or, of course, you can stick to the current approach: just have two sets of tables and pick one at compile time.
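
The "use it if it is available" idea could be sketched roughly as follows. `setlocale()` and `wcwidth()` are the real POSIX functions; `DEMO_HAVE_WCWIDTH` is a hypothetical macro standing in for whatever a configure check would define, and `demo_builtin_width()` is a made-up placeholder for the editor's own table lookup. This assumes a platform where `wchar_t` holds full Unicode code points.

```c
#define _XOPEN_SOURCE 700   /* asks the headers to declare wcwidth() */
#include <locale.h>
#include <wchar.h>

/* Hypothetical result of a configure-time check for wcwidth(). */
#ifndef DEMO_HAVE_WCWIDTH
#define DEMO_HAVE_WCWIDTH 1
#endif

/* Placeholder for a built-in table lookup (illustrative only). */
static int demo_builtin_width(wchar_t c)
{
    return (c >= 0x4E00 && c <= 0x9FFF) ? 2 : 1;
}

/* Call once at startup: wcwidth() answers according to LC_CTYPE.
 * Returns nonzero if the environment's locale could be selected. */
int demo_init_locale(void)
{
    return setlocale(LC_CTYPE, "") != NULL;
}

/* Prefer the C library's answer; fall back to the built-in table
 * when wcwidth() is missing or reports -1 (not printable/unknown). */
int demo_char_width(wchar_t c)
{
#if DEMO_HAVE_WCWIDTH
    int w = wcwidth(c);
    if (w >= 0)
        return w;
#endif
    return demo_builtin_width(c);
}
```

The trade-off is that the result then depends on the active locale and on the libc's Unicode data, so it tracks the system rather than a version fixed at build time.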

     
  • Egmont Koblinger

    Just an update:

    glibc has Unicode 9.0 support in its git repository; it will be released as part of the forthcoming version 2.26.

    Both Fedora and Debian have patched their older glibc to ship Unicode 9.0. I'm not sure about the exact versions, but the patch has made it into Ubuntu 17.04 Zesty as well, so its glibc ships Unicode 9.0.

    I'm not sure it's worth the trouble to ship both versions in joe. By the time this issue is addressed, a tarball is released, and it makes it into distributions, pretty much all of them will have upgraded to Unicode 9.0. Maybe it's simpler if joe just goes ahead and upgrades and forgets about 8.0 for good. (That's what e.g. less-487 did too.)

     
  • Joe Allen

    Joe Allen - 2017-12-08

    I've added a configure environment variable that lets you choose the Unicode version:

    ./configure UNICODE_VERSION=8.0.0
    ./configure UNICODE_VERSION=9.0.0
    ./configure UNICODE_VERSION=10.0.0

    I set the default to 9.0.0

    I wonder if there is a way to determine which Unicode version the system's glibc implements? If so, then the default could be based on the system.
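
I'm not aware of a standard call that reports this directly, but one possible heuristic is to probe `wcwidth()` for a character whose width changed between versions. The sketch below is only an assumption-laden illustration: it uses U+1F600, which (as far as I understand the data) is neutral in Unicode 8.0 and wide in 9.0, and the `demo_*` names are hypothetical. It only gives a meaningful answer in a UTF-8 locale on a platform where `wchar_t` holds full code points, and it cannot tell 9.0 from 10.0.

```c
#define _XOPEN_SOURCE 700   /* asks the headers to declare wcwidth() */
#include <locale.h>
#include <wchar.h>

/* Interpret the width a C library reports for U+1F600. */
const char *demo_guess_from_width(int w)
{
    if (w == 2)
        return "9.0 or later";
    if (w == 1)
        return "8.0 or earlier";
    return "unknown";   /* -1: unassigned, or locale cannot represent it */
}

/* Heuristic probe of the running system's C library.
 * Note: this switches LC_CTYPE to the environment's locale. */
const char *demo_probe_libc(void)
{
    if (setlocale(LC_CTYPE, "") == NULL)
        return "unknown";
    return demo_guess_from_width(wcwidth((wchar_t)0x1F600));
}
```

A configure script could compile and run such a probe to pick the default `UNICODE_VERSION`, though that would not work when cross-compiling, and the build machine's libc may differ from the one on the machine where the editor eventually runs.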

     
  • Joe Allen

    Joe Allen - 2018-01-10
    • status: open --> closed-fixed
     
