#1 Bug in i18n support

Status: closed

Owner: nobody

Labels: None

Priority: 5

Updated: 2002-08-12

Created: 2002-05-21

Creator: Anonymous

Private: No

The main problem is in
"jace\source\c++\jnilib\source\jace\java\lang\String.cpp":

string String::getCString() const {
  JNIEnv* env = helper::attach();
  jstring thisString = static_cast<jstring>( getJavaObject() );
  const char* utfString = env->GetStringUTFChars( thisString, NULL );
  ...

jstring String::createString( const string& str ) {
  JNIEnv* env = helper::attach();
  jstring javaString = env->NewStringUTF( str.c_str() );
  ...

Creating strings this way is clearly wrong. You are using the
JNI "*StringUTF*" functions to convert between a C++ "char*"
and a jstring (and vice versa), but a "char*" is not a UTF-8
string! For ASCII characters there is no problem (the first
128 values are the same in all ASCII-compatible encodings,
including UTF-8), but any national characters will be
converted incorrectly.
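
To make the point about the first 128 values concrete, here is a small standalone C++ sketch (no JNI involved). It assumes, purely for illustration, that the "char*" holds ISO-8859-1 (Latin-1) bytes, which is only one of many possible platform encodings. The function names are hypothetical and are not part of Jace, and the UTF-8 check is structural only (it does not reject overlong forms or surrogates).

```cpp
#include <cstddef>
#include <string>

// True if every byte is 7-bit ASCII. Such a buffer means the same thing
// in Latin-1, windows-1251, UTF-8 and other ASCII-compatible encodings.
bool isPureAscii(const std::string& bytes) {
    for (unsigned char c : bytes) {
        if (c >= 0x80) return false;
    }
    return true;
}

// True if the buffer is structurally valid UTF-8: each lead byte is
// followed by the expected number of continuation bytes (10xxxxxx).
// Simplified on purpose: overlong forms and surrogates are not rejected.
bool looksLikeUtf8(const std::string& bytes) {
    std::size_t i = 0;
    while (i < bytes.size()) {
        unsigned char c = static_cast<unsigned char>(bytes[i]);
        std::size_t extra;
        if (c < 0x80) extra = 0;
        else if ((c & 0xE0) == 0xC0) extra = 1;
        else if ((c & 0xF0) == 0xE0) extra = 2;
        else if ((c & 0xF8) == 0xF0) extra = 3;
        else return false;  // stray continuation byte or invalid lead byte
        if (extra > bytes.size() - i - 1) return false;  // truncated sequence
        for (std::size_t k = 1; k <= extra; ++k) {
            unsigned char cont = static_cast<unsigned char>(bytes[i + k]);
            if ((cont & 0xC0) != 0x80) return false;
        }
        i += extra + 1;
    }
    return true;
}

// Converts Latin-1 bytes to UTF-8. Latin-1 byte values map directly to
// the Unicode code points U+0000..U+00FF, so two bytes are always enough.
std::string latin1ToUtf8(const std::string& latin1) {
    std::string out;
    for (unsigned char c : latin1) {
        if (c < 0x80) {
            out.push_back(static_cast<char>(c));
        } else {
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    return out;
}
```

With these helpers, the Latin-1 bytes for "café" ("caf\xE9") are neither pure ASCII nor well-formed UTF-8, so handing them to a UTF-oriented function cannot produce the intended character. After latin1ToUtf8 the buffer becomes "caf\xC3\xA9", which is well-formed. Note also that the JNI "*StringUTF*" functions use Java's modified UTF-8, which differs from standard UTF-8 for U+0000 and for characters outside the Basic Multilingual Plane.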

Discussion

Toby Reyelts - 2002-06-06

To be honest, I'm not incredibly familiar with
internationalization in C++, and the lack of support for
internationalization in Jace is not so much a bug as a
purposely overlooked feature. I'm pretty sure standard C++
supports wchar_t and wstring as 16-bit character types, but I
don't know if they demand a particular encoding or not.

How do you propose we add i18n support to Jace, without
making it platform specific?

God bless,
-Toby

Nobody/Anonymous - 2002-06-12

> How do you propose we add i18n support to Jace, without
> making it platform specific?

It is very easy. Just use the default system encoding (file.encoding)
and the standard String constructor :-)

Here is an example:

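#include <jni.h>
#include <cstring>

// Conversion from "char*" (bytes in the default platform encoding) to jstring
jstring newString(JNIEnv *env, const char *buf)
{
    if( !buf ) return NULL;

    jsize bufLen = (jsize)strlen(buf);
    jbyteArray jbuf = env->NewByteArray(bufLen);
    if( !jbuf ) return NULL;  // allocation failed, exception is pending
    env->SetByteArrayRegion(jbuf, 0, bufLen, (const jbyte*)buf);

    // String(byte[] bytes, int offset, int length) decodes the bytes
    // using the platform's default charset.
    jclass cls = env->FindClass("java/lang/String");
    if( !cls ) { env->DeleteLocalRef(jbuf); return NULL; }
    jmethodID init = env->GetMethodID(cls, "<init>", "([BII)V");
    jstring jstr = (jstring)env->NewObject(cls, init, jbuf, 0, bufLen);

    env->DeleteLocalRef(cls);
    env->DeleteLocalRef(jbuf);
    return jstr;
}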

Conversion from jstring to "char*" can be done in a similar way
(by invoking getBytes()). For better support you could add an
optional parameter naming the encoding of the bytes.

Of course, creating strings this way would be slower, but at least
in most cases the conversion from bytes to Unicode would be
correct. Continuing to use the standard UTF functions is a BUG
in any non-ASCII environment, not an RFE. At least deprecate
them or document the issue. Otherwise most programs that use
Jace will be i18n-hostile.

Sergey Astakhov (sergeya@comita.spb.ru)
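
The reverse direction mentioned in the comment above (jstring to "char*" through getBytes()) might look roughly like the following. This is only a sketch under stated assumptions: the helper name toPlatformBytes is hypothetical and not part of Jace, it returns a std::string rather than a raw "char*" to avoid manual memory management, and it needs a JDK's jni.h and a live JVM to build and run. String.getBytes() with no arguments encodes the string using the platform's default charset.

```cpp
#include <jni.h>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of Jace): converts a jstring to bytes in
// the platform's default encoding by calling java.lang.String.getBytes().
// Returns an empty string on any failure; a Java exception may be pending.
std::string toPlatformBytes(JNIEnv* env, jstring jstr) {
    if (!jstr) return std::string();

    jclass cls = env->FindClass("java/lang/String");
    if (!cls) return std::string();

    // byte[] getBytes(): encodes using the platform's default charset.
    jmethodID getBytes = env->GetMethodID(cls, "getBytes", "()[B");
    env->DeleteLocalRef(cls);
    if (!getBytes) return std::string();

    jbyteArray bytes =
        static_cast<jbyteArray>(env->CallObjectMethod(jstr, getBytes));
    if (env->ExceptionCheck() || !bytes) return std::string();

    // Copy the encoded bytes out of the Java array into the result.
    jsize len = env->GetArrayLength(bytes);
    std::string result(static_cast<std::size_t>(len), '\0');
    if (len > 0) {
        env->GetByteArrayRegion(bytes, 0, len,
                                reinterpret_cast<jbyte*>(&result[0]));
    }
    env->DeleteLocalRef(bytes);
    return result;
}
```

To support the optional encoding parameter suggested above, one could instead look up the getBytes overload with the signature "(Ljava/lang/String;)[B" and pass the charset name as a jstring. That overload can throw UnsupportedEncodingException, so the exception check after the call matters even more in that variant.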

Toby Reyelts - 2002-08-12

status: open --> closed

Toby Reyelts - 2002-10-22

This should be taken care of as of Jace 1.1 beta 2.

God bless,
-Toby Reyelts
