EXT:*KEYBOARD-INPUT* issues

Hi CLISP devels,

The usual preface: I do _NOT_ think of the SCREEN/*KEYBOARD-INPUT* topic
as really urgent, but Sam had asked in the "feature request" tracker, so
here is a detailed list of problems I had with EXT:*KEYBOARD-INPUT* in
the past.

Sorry for posting this on the devel list, but it's a bit too long for a
comment in the feature tracker.

Sam's original comment from the feature tracker was:

 > SDS wrote:
 >
 > in general, clisp screen/keyboard interaction facilities are ancient
 > and probably not used too much (if at all) by the users.
 > this is obvious from the fact that (read-char *keyboard-input*) returns
 > a SYS::INPUT-CHARACTER whose accessors are not exported.
 >
 > Proposal:
 >
 > - move screen & keyboard streams from stream.d into a separate module
 > (window-stream and keyboard stream do not have to be lisp streams, and
 > if you think they have to, you can use gray streams) together with
 > xcharin.lisp.
 >
 > - make keyboard input recognize _all_ keyboard events (e.g., now f1 is
 > recognized but ctrl-f1 is not).
 >
 > - resurrect the ancient src/editor.lsp (?) in the same module(?) and
 > make it more emacs-compatible.

https://sourceforge.net/tracker/?func=detail&atid=351355&aid=1339718&group_id=1355

To demonstrate the problems I use the LOOP example from the CLISP Impnotes,
Chapter 21.2.2, "Macro EXT:WITH-KEYBOARD", which prints all keystrokes on
the screen until the user hits the spacebar:

(ext:with-keyboard
   (loop :for char = (read-char ext:*keyboard-input*)
         :for key = (or (ext:char-key char) (character char))
         :do (print (list char key))
         :when (eql key #\Space) :return (list char key)))

1.) The EXT:*KEYBOARD-INPUT* stream is byte-oriented and does not
produce the expected results with multi-byte Unicode characters.

A single-byte ASCII character #\a produces the correct result:

(#S(SYSTEM::INPUT-CHARACTER :CHAR #\a :BITS 0 :FONT 0 :KEY NIL) #\a)

A two-byte Unicode character #\ä produces two wrong results:

(#S(SYSTEM::INPUT-CHARACTER :CHAR #\LATIN_CAPITAL_LETTER_A_WITH_TILDE
        :BITS 0 :FONT 0 :KEY NIL) #\LATIN_CAPITAL_LETTER_A_WITH_TILDE)
(#S(SYSTEM::INPUT-CHARACTER :CHAR #\CURRENCY_SIGN :BITS 0 :FONT 0 :KEY NIL)
        #\CURRENCY_SIGN)

while the correct answer would be:

(character #\ä) => #\LATIN_SMALL_LETTER_A_WITH_DIAERESIS
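
The two wrong results are what one gets when the UTF-8 encoding of #\ä,
the bytes #xC3 #xA4, is read one byte per character. A minimal sketch in
portable Common Lisp, assuming an implementation with Unicode characters;
DECODE-UTF-8-PAIR is a hypothetical helper written for this mail, not a
CLISP function:

```lisp
;; Hypothetical helper, not part of CLISP: combine the two bytes of a
;; two-byte UTF-8 sequence (110xxxxx 10xxxxxx) into one character.
(defun decode-utf-8-pair (byte1 byte2)
  (assert (= (ldb (byte 3 5) byte1) #b110))  ; lead byte of a 2-byte sequence
  (assert (= (ldb (byte 2 6) byte2) #b10))   ; continuation byte
  (code-char (logior (ash (ldb (byte 5 0) byte1) 6)
                     (ldb (byte 6 0) byte2))))

;; Byte-by-byte interpretation, which matches the output shown above:
;; two separate characters with codes #xC3 and #xA4.
(list (code-char #xC3) (code-char #xA4))

;; Decoding both bytes together gives the single character with code #xE4.
(decode-utf-8-pair #xC3 #xA4)
```

Whether the C code behind EXT:*KEYBOARD-INPUT* really does the byte-wise
step is my guess from the output; I have not checked it in stream.d.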

2.) An arrow-key pressed with no other key produces the correct result:

(#S(SYSTEM::INPUT-CHARACTER :CHAR NIL :BITS 8 :FONT 0 :KEY :LEFT) :LEFT)

while an arrow-key if pressed together with Shift, Control, or Meta [Alt]
produces an escape-sequence:

(#S(SYSTEM::INPUT-CHARACTER :CHAR #\Escape :BITS 0 :FONT 0 :KEY NIL) #\Escape)
(#S(SYSTEM::INPUT-CHARACTER :CHAR #\[ :BITS 0 :FONT 0 :KEY NIL) #\[)
(#S(SYSTEM::INPUT-CHARACTER :CHAR #\1 :BITS 0 :FONT 0 :KEY NIL) #\1)
(#S(SYSTEM::INPUT-CHARACTER :CHAR #\; :BITS 0 :FONT 0 :KEY NIL) #\;)  ; <-[!]
(#S(SYSTEM::INPUT-CHARACTER :CHAR #\5 :BITS 0 :FONT 0 :KEY NIL) #\5)
(#S(SYSTEM::INPUT-CHARACTER :CHAR #\D :BITS 0 :FONT 0 :KEY NIL) #\D)

Many escape-sequences contain a semicolon, which in a Lisp character stream
produces very nasty side-effects, because in a character stream a semicolon
is understood as the beginning of a comment, and a #\Newline or END-OF-FILE
is understood as the end of a comment. Because escape-sequences are usually
_NOT_ terminated by a #\Newline or END-OF-FILE, the Lisp reader gets stuck
in "comment mode" until a #\Newline or END-OF-FILE appears in the stream
by accident. Because EXT:*KEYBOARD-INPUT* is an INPUT stream, I cannot write
an artificial #\Newline character to it to terminate the escape-sequence.
This way it's impossible to write an escape-sequence parser on the Lisp level.
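
For illustration, this is roughly the kind of Lisp-level parser I have in
mind, if the six characters shown above arrived as plain data. It is only
a sketch: PARSE-MODIFIED-ARROW is a hypothetical name, it works on a list
of characters rather than on EXT:*KEYBOARD-INPUT*, and it assumes the
xterm convention that the number after the semicolon is 1 plus a bitmask
(Shift = 1, Alt = 2, Control = 4):

```lisp
;; Hypothetical sketch, not CLISP code: parse an xterm-style sequence
;; ESC [ 1 ; <modifier> <final> given as a list of six characters.
(defun parse-modified-arrow (chars)
  "Return (KEY . MODIFIERS), e.g. (:LEFT :CONTROL), or NIL if CHARS does not match."
  (when (and (= (length chars) 6)
             (char= (first chars) (code-char 27))   ; Escape
             (char= (second chars) #\[)
             (char= (third chars) #\1)
             (char= (fourth chars) #\;))            ; plain data here, not a comment
    (let ((mask (let ((digit (digit-char-p (fifth chars))))
                  ;; xterm sends 1 + bitmask, so subtract 1 again
                  (and digit (plusp digit) (1- digit))))
          (key (case (sixth chars)
                 (#\A :up) (#\B :down) (#\C :right) (#\D :left))))
      (when (and mask key)
        (cons key
              (append (when (logbitp 0 mask) '(:shift))
                      (when (logbitp 1 mask) '(:meta))     ; the Alt key
                      (when (logbitp 2 mask) '(:control))))))))

;; The Control-Left sequence from the output above:
(parse-modified-arrow (list (code-char 27) #\[ #\1 #\; #\5 #\D))
;; => (:LEFT :CONTROL)
```

Only single-digit modifiers and the four arrow keys are handled; other
terminals may send different sequences.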

The detailed problems with EXT:*KEYBOARD-INPUT* in CLISP 2.49+ CVS HEAD are:

READ-CHAR

- Recognizes multi-byte Unicode characters as multiple single-byte
characters and reads them as several characters if invoked sequentially.

- With escape-sequences containing a semicolon, READ-CHAR does _NOT_
consider the semicolon as the beginning of a comment [which is exactly the
opposite of the behaviour of all other functions below]. Instead the
semicolon and everything after it is read as ordinary single-byte ASCII
characters without hanging.

READ-CHAR-NO-HANG

- Recognizes multi-byte unicode characters correctly, but reads only the
first byte and returns wrong results with multi-byte characters. There
currently seems to be no way to find out whether the return value of
READ-CHAR-NO-HANG is correct or not.

- With escape-sequences containing a semicolon, READ-CHAR-NO-HANG considers
the semicolon as the end of the stream [and probably everything afterwards
as a comment], with the consequence that the semicolon and everything
after it is left in the EXT:*KEYBOARD-INPUT* stream and re-appears at the
next invocation of READ-CHAR.

- A READ-CHAR-NO-HANG return value of NIL does not necessarily mean that
the *keyboard-input* stream is empty.

READ-CHAR-WILL-HANG-P

- Returns T, even if there are comment characters in the stream [left from
an escape-sequence containing a semicolon], which can be read by READ-CHAR
without hanging.

PEEK-CHAR

- PEEK-CHAR with an EOF-ERROR-P argument of NIL cannot be used to test for
the end of EXT:*KEYBOARD-INPUT*, because if EXT:*KEYBOARD-INPUT* is empty,
PEEK-CHAR hangs until a new SYS::INPUT-CHARACTER appears in the stream.

CLEAR-INPUT

- (CLEAR-INPUT EXT:*KEYBOARD-INPUT*) does not reliably clear
EXT:*KEYBOARD-INPUT*. Comment characters [left from an escape-sequence
containing a semicolon] are still in the stream afterwards, probably
because the comment is not terminated by a #\Newline or END-OF-FILE
and is understood as an "unterminated comment".

READ, UNREAD-CHAR, and READ-LINE

- All three only work with Common Lisp standard characters and signal a
"wrong-type" error with SYS::INPUT-CHARACTERs.

Summary:

IMO the main problem is that EXT:*KEYBOARD-INPUT* is implemented as a
Lisp character stream with a ton of exception handling on the C level
(e.g. lots of terminal escape sequences etc., but obviously still not
enough of them).

Does it really make sense to overload the "exception handling" even more,
or would it be better to implement EXT:*KEYBOARD-INPUT* as a byte-stream,
which would make it much easier to write custom parsers on the Lisp level?

A Lisp parser is not necessarily less work or less complicated than a
parser written in C, but a Lisp programmer would have the chance to
adapt the parser much more easily to his/her own needs.

Thanks,

- edgar
