#68 Can't read multibyte unicode with ConsoleRunner

open
nobody
None
5
2011-04-26
2011-04-26
No

When a Java program is started through jline.ConsoleRunner and reads from System.in, multibyte Unicode characters are flattened to individual bytes.

Attached is a test case that demonstrates this. When run alone (no ConsoleRunner), the Unicode character is echoed back accurately. When run through ConsoleRunner, it is not.
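The flattening described above can be illustrated in isolation. The following is only a minimal sketch, not the attached test case and not ConsoleRunner's actual code: the class name `UnicodeFlattenSketch` and both helper methods are hypothetical. It assumes the pasted line arrives as UTF-8 bytes and compares decoding the whole sequence against widening each byte to its own character.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; names here are not part of jline.
public class UnicodeFlattenSketch {

    // Decodes the whole byte sequence as UTF-8, so multibyte characters survive.
    static String decodeAsUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Widens every byte to its own char: a multibyte character becomes
    // several unrelated single characters (the "flattening").
    static String decodeBytePerChar(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append((char) (b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+2014 (em dash) occupies three bytes in UTF-8.
        String expected = "Your Head\u2014a Bottleneck";
        byte[] raw = expected.getBytes(StandardCharsets.UTF_8);

        String whole = decodeAsUtf8(raw);
        String flattened = decodeBytePerChar(raw);

        System.out.println("UTF-8 decode matches:    " + expected.equals(whole));
        System.out.println("byte-wise decode matches: " + expected.equals(flattened));
        System.out.println("lengths: " + expected.length() + " vs " + flattened.length());
    }
}
```

In this sketch the byte-wise result is two characters longer than the input, because the three bytes of the dash become three separate characters. Whether ConsoleRunner loses the character in exactly this way is an assumption; the sketch only shows the kind of mismatch the test case checks for.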

Example:

$ java -cp target/classes/:target/test-classes/ jline.TestUnicodeInput
copy and past this line: Your Head—a Bottleneck
Your Head—a Bottleneck
Your Head—a Bottleneck
bug demo failed, expected input matches actual input

$ java -cp target/classes/:target/test-classes/ jline.ConsoleRunner jline.TestUnicodeInput
copy and past this line: Your Head—a Bottleneck
Your Head—a Bottleneck
Your Heada Bottleneck
bug demo successful, expected input does not match actual input

Note: in the ConsoleRunner output above, the Unicode character is unprintable and does not appear in this bug report.

I came across this while using ConsoleRunner with the Clojure REPL. ConsoleRunner works well there otherwise, but Unicode characters pasted into the REPL do not come through as expected.

Discussion

  • Adam Lehenbauer - 2011-04-26

    test case that shows the bug

  • Adam Lehenbauer - 2011-04-26
    • summary: Can't ready multibyte unicode with ConsoleRunner --> Can't read multibyte unicode with ConsoleRunner