#359 joe should consider Unicode private use area to be printable

v4.4
closed-fixed
None
v4.3
5
2017-09-26
2016-11-06
Josip Rodin
No

Adam Borowski submitted this at https://bugs.debian.org/822074

As joe does its own character classification, rather than using glibc's
iswfoo() as everything else does, sometimes its interpretation differs.
In particular, joe fails to display any of the private use area characters
(U+E000..U+F8FF, U+F0000..U+FFFFD, U+100000..U+10FFFD).
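
For reference, the three private use ranges (Unicode general category Co) can be written as a simple range check. This is only an illustrative sketch: the function name `is_private_use` is hypothetical and is not part of joe or glibc.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not joe's code): returns true if cp lies in one of
 * the three Unicode private use ranges, i.e. general category Co. */
bool is_private_use(uint32_t cp)
{
    return (cp >= 0xE000   && cp <= 0xF8FF)     /* BMP Private Use Area */
        || (cp >= 0xF0000  && cp <= 0xFFFFD)    /* Supplementary PUA-A, plane 15 */
        || (cp >= 0x100000 && cp <= 0x10FFFD);  /* Supplementary PUA-B, plane 16 */
}
```

Note that the last two code points of planes 15 and 16 (U+FFFFE, U+FFFFF, U+10FFFE, U+10FFFF) are noncharacters, so they are deliberately outside the ranges.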

Classification returned by glibc:
width 1 punct graph print

While the Unicode standard says only that codepoints in that range are "not
noncharacters" without defining their properties, there's no way to sanely
give them a control function, thus making "printable" the only remaining
option. That's what glibc does -- and that's how all programs other than
joe treat these characters.
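
One way to check what the C library itself reports is to query the standard `<wctype.h>` predicates directly. The sketch below is an assumption about how such a probe might look, not the tool used for the report; the `CLASS_*` constants and the `libc_classes` name are made up for illustration, and the width column is left out.

```c
#include <wchar.h>
#include <wctype.h>

/* Hypothetical bit flags for this sketch only. */
enum { CLASS_PRINT = 1, CLASS_GRAPH = 2, CLASS_PUNCT = 4 };

/* Collects the C library's answers for one wide character into a bitmask.
 * The result depends on the current LC_CTYPE locale. */
int libc_classes(wint_t wc)
{
    int classes = 0;
    if (iswprint(wc)) classes |= CLASS_PRINT;
    if (iswgraph(wc)) classes |= CLASS_GRAPH;
    if (iswpunct(wc)) classes |= CLASS_PUNCT;
    return classes;
}
```

To probe a PUA character, a caller would first select a UTF-8 locale with `setlocale(LC_CTYPE, "")` and then call, for example, `libc_classes(0xE000)`. According to the report, glibc sets all three flags for such characters; other C libraries or a non-UTF-8 locale may answer differently.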

Here's a minimal patch that fixes iswprint(PUA):

--- joe-4.1.orig/joe/unicode.c
+++ joe-4.1/joe/unicode.c
@@ -321,6 +321,7 @@ void joe_iswinit()
     cclass_union(cclass_print, unicode("N"));
     cclass_union(cclass_print, unicode("P"));
     cclass_union(cclass_print, unicode("Zs"));
+    cclass_union(cclass_print, unicode("Co"));
     cclass_opt(cclass_print);
 
     /* Graphical characters (no spaces) */

I wonder about iswpunct() -- glibc somehow returns true for PUA characters,
so it might be a good idea to be consistent with it (even if I don't see why
it's set). As for iswgraph(), joe defines this function but never uses it.
Joe's wcwidth() assumes 1 for all characters not explicitly listed, so
that's the same as glibc.

Discussion

  • Josip Rodin

    Josip Rodin - 2016-11-06
    • summary: joe should considers Unicode private use area to be printable --> joe should consider Unicode private use area to be printable
     
  • Joe Allen

    Joe Allen - 2016-12-08

    This is now fixed in Mercurial. I'm also marking the PUA as graphical even though JOE doesn't use this class. I don't understand the rationale for marking them as punctuation, so I'm ignoring that for now.

     
  • John J. Jordan

    John J. Jordan - 2017-09-26
    • status: open --> closed-fixed
    • assigned_to: Joe Allen
    • Group: Unknown --> v4.4
     
