#121 how to display undisplayable or invalid characters

open
nobody
None
5
2005-06-11
2005-06-11
Egmont Koblinger
No

When a file contains a character that cannot be displayed on the
terminal (e.g. a UTF-8 encoded file contains a Euro sign, the
terminal is Latin-1, and joe is told that the file is in UTF-8), a
question mark is displayed. I'd like to request a symbol that is
visually distinguishable from a normal question mark and from any
other normal character, e.g. one shown in inverse color or something
similar.
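
To illustrate the request, here is a minimal sketch in Python (not joe's actual code, which is written in C). It assumes the terminal understands the standard ANSI reverse-video escape sequence; the function name render_for_terminal and its terminal_charset parameter are made up for this example.

```python
# Hypothetical sketch: mark characters the terminal charset cannot show.
REVERSE = "\x1b[7m"  # ANSI SGR 7: reverse video
RESET = "\x1b[0m"    # ANSI SGR 0: reset attributes


def render_for_terminal(text, terminal_charset="latin-1"):
    """Return text with undisplayable characters shown as an inverse '?'."""
    out = []
    for ch in text:
        try:
            # If the character can be encoded, the terminal can display it.
            ch.encode(terminal_charset)
            out.append(ch)
        except UnicodeEncodeError:
            # An inverse-video '?' cannot be mistaken for a literal '?'.
            out.append(f"{REVERSE}?{RESET}")
    return "".join(out)
```

With this, a literal question mark in the file stays plain, while the Euro sign on a Latin-1 terminal comes out as a highlighted one.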

(However, at least ^K <space> correctly tells me what character
that is.)

Similar but different:

If there's an invalid byte sequence in a file (e.g. the file is not valid
UTF-8, but the terminal is UTF-8 and joe is also set to assume
UTF-8 for the file), an X is printed. Here, too, I'd like to request
a different symbol that is visually distinguishable from normal
characters. If the terminal is UTF-8, the replacement character
(U+FFFD) serves exactly this purpose, so that would be the best
choice.
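
A rough sketch of this behaviour in Python (again only an illustration, not joe's implementation): the standard "replace" error handler of bytes.decode substitutes U+FFFD for invalid input. The function name decode_for_display, the terminal_is_utf8 flag, and the inverse-video X fallback for non-UTF-8 terminals are assumptions made for the example.

```python
# Hypothetical sketch: show invalid UTF-8 as U+FFFD on a UTF-8 terminal.
REPLACEMENT = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER


def decode_for_display(data, terminal_is_utf8=True):
    """Decode file bytes, marking invalid sequences distinctly."""
    # errors="replace" substitutes U+FFFD for bytes that are not valid UTF-8.
    text = data.decode("utf-8", errors="replace")
    if terminal_is_utf8:
        return text
    # Assumed fallback for other terminals: an inverse-video X.
    # Note: a genuine U+FFFD in the file is marked the same way here.
    return text.replace(REPLACEMENT, "\x1b[7mX\x1b[0m")
```

A real X in the file passes through unchanged in both modes, so it can no longer be confused with an invalid sequence.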

Also, in this case ^K <space> lies to me: it says 88(0130/0x58),
which is the ASCII code of the X character. So a real uppercase X
is absolutely indistinguishable from an invalid UTF-8 sequence
unless I switch to another charset with ^T E. For invalid
sequences, ^K <space> should list the raw bytes that the displayed
symbol covers rather than the code of the replacement character,
which isn't helpful.
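
One way the requested ^K <space> behaviour could work, sketched in Python under the assumption that the editor keeps the raw bytes behind every displayed cell. The names decode_cells and describe are hypothetical, and the wording of the message for invalid bytes is invented; only the 88(0130/0x58) format for valid characters is taken from joe's current output.

```python
# Hypothetical sketch: keep the raw bytes behind each displayed symbol.
def decode_cells(data):
    """Split bytes into (display_char, raw_bytes, is_valid) cells."""
    cells = []
    pos = 0
    while pos < len(data):
        try:
            text = data[pos:].decode("utf-8")
            cells.extend((ch, ch.encode("utf-8"), True) for ch in text)
            break
        except UnicodeDecodeError as err:
            # err.start and err.end are offsets into data[pos:].
            good = data[pos:pos + err.start].decode("utf-8")
            cells.extend((ch, ch.encode("utf-8"), True) for ch in good)
            bad = data[pos + err.start:pos + err.end]
            cells.append(("\ufffd", bad, False))
            pos += err.end
    return cells


def describe(cell):
    """Status-line text for the cell under the cursor."""
    ch, raw, is_valid = cell
    if is_valid:
        cp = ord(ch)
        return f"{cp}(0{cp:o}/0x{cp:x})"
    # For invalid input, report the bytes rather than the symbol's code.
    return "invalid UTF-8 bytes: " + " ".join(f"0x{b:02x}" for b in raw)
```

For the bytes X, 0xff, X, the first and last cells report 88(0130/0x58) as before, while the middle one reports the offending byte 0xff.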

Discussion