#1 Bug in i18n support

Status: closed
Owner: nobody
Labels: None
Priority: 5
Updated: 2002-08-12
Created: 2002-05-21
Creator: Anonymous
Private: No

The main problem is in
"jace\source\c++\jnilib\source\jace\java\lang\String.cpp":

string String::getCString() const {

  JNIEnv* env = helper::attach();

  jstring thisString = static_cast<jstring>( getJavaObject() );

  const char* utfString = env->GetStringUTFChars( thisString, NULL );
  ...

jstring String::createString( const string& str ) {

  JNIEnv* env = helper::attach();

  jstring javaString = env->NewStringUTF( str.c_str() );
  ...

Creating strings this way is clearly wrong. You are using the
JNI "*StringUTF*" functions to convert from C++ "char*" to
jstring (and vice versa), but a "char*" is not a UTF-8 string!
For ASCII characters there is no problem (the first 128 values
are the same in all ASCII-compatible encodings, including
UTF-8), but any national characters will be converted
incorrectly.
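
To illustrate the point, here is a minimal sketch of a standalone check for well-formed standard UTF-8. The function name isValidUtf8 is hypothetical and not part of Jace or JNI. Note also that the JNI "*StringUTF*" functions actually expect "modified UTF-8", which differs from standard UTF-8 for U+0000 and for supplementary characters, so this is only an approximation of what JNI accepts.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of Jace or JNI): returns true if `bytes`
// is well-formed standard UTF-8. JNI's modified UTF-8 differs slightly.
bool isValidUtf8(const std::string& bytes) {
    const std::size_t n = bytes.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        std::size_t extra = 0;        // continuation bytes expected
        unsigned long cp = 0;         // decoded code point
        unsigned long minValue = 0;   // smallest code point for this length

        if (lead < 0x80) {
            cp = lead;
        } else if ((lead & 0xE0) == 0xC0) {
            extra = 1; cp = lead & 0x1F; minValue = 0x80;
        } else if ((lead & 0xF0) == 0xE0) {
            extra = 2; cp = lead & 0x0F; minValue = 0x800;
        } else if ((lead & 0xF8) == 0xF0) {
            extra = 3; cp = lead & 0x07; minValue = 0x10000;
        } else {
            return false;             // stray continuation or invalid lead byte
        }

        if (n - i - 1 < extra) return false;   // sequence is truncated

        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(bytes[i + k]);
            if ((c & 0xC0) != 0x80) return false;   // not a continuation byte
            cp = (cp << 6) | (c & 0x3F);
        }

        if (cp < minValue) return false;                  // overlong encoding
        if (cp > 0x10FFFF) return false;                  // beyond Unicode range
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;   // surrogate code point

        i += extra + 1;
    }
    return true;
}
```

With this check, plain ASCII such as "hello" passes, while "café" encoded in Latin-1 (bytes "caf\xE9") or "Привет" encoded in windows-1251 is rejected. Those are the kinds of strings that a char* in the system encoding may contain and that NewStringUTF would misinterpret.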

Discussion

  • Toby Reyelts

    Toby Reyelts - 2002-06-06

    Logged In: YES
    user_id=149785

    To be honest, I'm not incredibly familiar with
    internationalization in C++, and the lack of support for
    internationalization in Jace is not so much a bug as a
    purposely overlooked feature. I'm pretty sure standard C++
    supports wchar_t and wstring as 16-bit character types, but I
    don't know if they demand a particular encoding or not.

    How do you propose we add i18n support to Jace, without
    making it platform specific?

    God bless,
    -Toby

     
  • Nobody/Anonymous

    Logged In: NO

    >How do you propose we add i18n support to Jace, without
    making it platform specific?

    It is very easy. Just use the default system encoding (file.encoding)
    and the standard String constructor :-)

    Here is an example:

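    // Conversion from "char*" (bytes in the default system encoding) to jstring.
    // Requires <jni.h> and <cstring>.
    jstring newString(JNIEnv *env, const char *buf)
    {
        if (!buf) return NULL;

        jsize bufLen = static_cast<jsize>(strlen(buf));

        // Copy the raw bytes into a Java byte[].
        jbyteArray jbuf = env->NewByteArray(bufLen);
        if (!jbuf) return NULL;  // allocation failed, an exception is pending
        env->SetByteArrayRegion(jbuf, 0, bufLen, (jbyte*)buf);

        // Call String(byte[] bytes, int offset, int length), which decodes
        // the bytes using the platform's default charset.
        jclass cls = env->FindClass("java/lang/String");
        jmethodID init = cls ? env->GetMethodID(cls, "<init>", "([BII)V") : NULL;
        jstring jstr = init
            ? static_cast<jstring>(env->NewObject(cls, init, jbuf, 0, bufLen))
            : NULL;

        // Release local references that are no longer needed.
        if (cls) env->DeleteLocalRef(cls);
        env->DeleteLocalRef(jbuf);

        return jstr;
    }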

    The conversion from jstring to "char*" can be done in a similar
    way (by invoking getBytes()). For better support, you could add
    an optional parameter naming the encoding of the bytes.

    Of course, creating strings this way is slower, but at least
    in most cases it converts bytes to Unicode correctly.
    Continuing to use the standard UTF functions is a BUG in any
    non-ASCII environment, not an RFE. At least deprecate them or
    note this issue in the documentation. Otherwise most programs
    that use JACE will be i18n-hostile.

    Sergey Astakhov (sergeya@comita.spb.ru)

     
  • Toby Reyelts

    Toby Reyelts - 2002-08-12
    • status: open --> closed
     
  • Toby Reyelts

    Toby Reyelts - 2002-10-22

    Logged In: YES
    user_id=149785

    This should be taken care of as of Jace 1.1 beta 2.

    God bless,
    -Toby Reyelts
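
For reference, the reverse conversion suggested in the discussion (jstring to "char*" by invoking getBytes()) might look roughly like the sketch below. The function name toPlatformString is hypothetical, and this is not necessarily how Jace 1.1 beta 2 implements it. It assumes the no-argument String.getBytes(), which encodes using the JVM's default charset, and it needs a JNI environment to compile and run.

```cpp
#include <jni.h>
#include <cstddef>
#include <string>

// Hypothetical helper (not Jace's actual code): converts a jstring to bytes
// in the JVM's default charset by calling String.getBytes().
// Returns an empty string on failure; a Java exception may then be pending.
std::string toPlatformString(JNIEnv* env, jstring jstr) {
    std::string result;
    if (!jstr) return result;

    jclass cls = env->FindClass("java/lang/String");
    if (!cls) return result;

    // byte[] getBytes(): encodes with the platform's default charset.
    jmethodID getBytes = env->GetMethodID(cls, "getBytes", "()[B");
    if (getBytes) {
        jbyteArray jbuf =
            static_cast<jbyteArray>(env->CallObjectMethod(jstr, getBytes));
        if (!env->ExceptionCheck() && jbuf) {
            const jsize len = env->GetArrayLength(jbuf);
            result.resize(static_cast<std::size_t>(len));
            if (len > 0) {
                // Copy the encoded bytes out of the Java array.
                env->GetByteArrayRegion(
                    jbuf, 0, len, reinterpret_cast<jbyte*>(&result[0]));
            }
        }
        if (jbuf) env->DeleteLocalRef(jbuf);
    }
    env->DeleteLocalRef(cls);
    return result;
}
```

Returning a std::string rather than a raw "char*" avoids the question of who frees the buffer. Supporting an explicit encoding name, as proposed above, would mean calling getBytes(String) with the signature "(Ljava/lang/String;)[B" instead.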

     
