#238 unexpected char: 0xA9

Milestone: release_3.5
Status: closed
Owner: Rick Giles
Priority: 5
Updated: 2012-10-10
Created: 2004-06-18
Creator: Frank Harper
Private: No

When I run Checkstyle from the command line on Java
source containing accented characters such as é or č, I
get the following exception:
Exception thrown : unexpected char: 0xA9

Of course I try to avoid typing accented characters in
my Java sources, but sometimes I forget. After a while
I start wondering why Checkstyle isn't giving me any
more warnings. Then I'm in for a needle-in-a-haystack
search for the place where I used an accented character.

I'm using Checkstyle 3.4 on Windows NT. I'm using UTF-8
for the character encoding.

I couldn't find any option to tell Checkstyle to use
UTF-8 as the character encoding.

Example of code that causes the exception:

public class TestUnicode
{
    int é = 1;

    public int geté()
    {
        return é;
    }
}

Discussion

  • Rick Giles
    2004-06-20

    Logged In: YES
    user_id=539926

    I was unable to reproduce this problem through copy/paste of
    your example. Could you upload a jar or zip file containing
    your example?

     
  • Frank Harper
    2004-06-21

    Logged In: YES
    user_id=24772

    The jar is attached.

     
  • Frank Harper
    2004-06-21

    Attachments: UnicodeTest.jar

  • Rick Giles
    2004-06-21

    Logged In: YES
    user_id=539926

    Thanks for attaching the jar. Here is the problem:

    File TestUnicode doesn't compile:

    C:\temp\UnicodeTest>javac TestUnicode.java
    TestUnicode.java:3: illegal character: \169
    int é = 1;
    ^
    TestUnicode.java:5: illegal character: \169
    public int geté()
    ^
    TestUnicode.java:7: illegal character: \169
    return é;
    ^
    3 errors

    The offending character, \169 (0xA9, the copyright symbol), is
    the second character of three identifiers, and is not a valid
    Java identifier part, according to
    Character.isJavaIdentifierPart.

    Checkstyle requires source files that compile.
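
The 0xA9 in the error message is consistent with the file's UTF-8 bytes being decoded with a single-byte default encoding. The sketch below illustrates that effect in Java. It assumes ISO-8859-1 as a stand-in for the platform default, since the thread does not state which encoding was actually in effect. The names EncodingDemo, misdecode, and isIdentifier are hypothetical and are not Checkstyle code.

```java
import java.nio.charset.StandardCharsets;

final class EncodingDemo {
    /** Encodes text as UTF-8, then decodes the bytes as ISO-8859-1 (the wrong charset). */
    public static String misdecode(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    /** Checks each char against the Java identifier rules (BMP characters only). */
    public static boolean isIdentifier(String s) {
        if (s.isEmpty() || !Character.isJavaIdentifierStart(s.charAt(0))) {
            return false;
        }
        for (int i = 1; i < s.length(); i++) {
            if (!Character.isJavaIdentifierPart(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String right = "\u00e9";          // e with acute accent, one char
        String wrong = misdecode(right);  // UTF-8 bytes C3 A9 read as two chars

        System.out.println(wrong.length());                         // 2
        System.out.println(Integer.toHexString(wrong.charAt(1)));   // a9
        System.out.println(isIdentifier(right));                    // true
        System.out.println(isIdentifier(wrong));                    // false
    }
}
```

Under that assumption, the one-character identifier becomes two characters, and the second one (0xA9) fails Character.isJavaIdentifierPart, which matches the javac output above.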

     
  • Frank Harper
    2004-06-21

    Logged In: YES
    user_id=24772

    It compiles just fine for me with the following :

    javac -encoding UTF-8 TestUnicode.java

     
  • Rick Giles
    2004-06-21

    Logged In: YES
    user_id=539926

    Thanks for the extra information. I've re-opened it as a "bug".

     
  • Rick Giles
    2004-06-22

    Logged In: YES
    user_id=539926

    Added charset property to TreeWalker module in 3.5 CVS. To
    use UTF-8 encoding, set the property value to UTF-8.
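
As a rough illustration of what a charset setting changes, the sketch below reads source bytes through a reader built from a charset name. It is a minimal example under stated assumptions, not Checkstyle's implementation: CharsetAwareReader and readLines are hypothetical names, and the charset name is passed in directly rather than taken from a TreeWalker property.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class CharsetAwareReader {
    /** Reads all lines from the stream, decoding bytes with the named charset. */
    public static List<String> readLines(InputStream in, String charsetName) {
        // Charset.forName throws an unchecked exception for unknown or illegal names.
        Charset charset = Charset.forName(charsetName);
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        // One line of the reporter's example, stored as UTF-8 bytes.
        byte[] source = "int \u00e9 = 1;\n".getBytes(StandardCharsets.UTF_8);

        List<String> asUtf8 = readLines(new ByteArrayInputStream(source), "UTF-8");
        List<String> asLatin1 = readLines(new ByteArrayInputStream(source), "ISO-8859-1");

        System.out.println(asUtf8.get(0).length());    // 10
        System.out.println(asLatin1.get(0).length());  // 11
    }
}
```

The same bytes yield a different character sequence depending on the charset name, which is why the parser needs to be told the file encoding explicitly when it differs from the platform default.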

     
  • Frank Harper
    2004-06-22

    Logged In: YES
    user_id=24772

    Thanks, Rick. Any idea when this might make it into a release?