The main problem is in
"jace\source\c++\jnilib\source\jace\java\lang\String.cpp":
string String::getCString() const {
  JNIEnv* env = helper::attach();
  jstring thisString = static_cast<jstring>( getJavaObject() );
  const char* utfString = env->GetStringUTFChars( thisString, NULL );
  ...
jstring String::createString( const string& str ) {
  JNIEnv* env = helper::attach();
  jstring javaString = env->NewStringUTF( str.c_str() );
  ...
Creating strings this way is clearly wrong. You are using the
JNI "*StringUTF*" functions to convert a C++ "char*" to a
jstring (and vice versa), but a "char*" is not a UTF-8 string!
For ASCII characters there is no problem (the first 128 values
are the same in ASCII-compatible encodings, including UTF-8),
but any national characters will be converted incorrectly.
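To make the ASCII-versus-national-characters point concrete,
here is a minimal sketch in plain C++ (no JNI needed). The helper
names isPureAscii and isWellFormedUtf8 are made up for this
illustration and are not part of Jace. The second helper checks
only the byte structure of standard UTF-8; it does not reject
overlong forms and does not model the "modified UTF-8" variant
that the JNI functions use.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: true when every byte is 7-bit ASCII, the one case
// where platform-encoded bytes and UTF-8 bytes are identical.
bool isPureAscii(const std::string& s) {
    for (unsigned char c : s) {
        if (c > 0x7F) return false;
    }
    return true;
}

// Hypothetical helper: structural check for standard UTF-8
// (lead byte followed by the right number of 10xxxxxx bytes).
bool isWellFormedUtf8(const std::string& s) {
    std::size_t i = 0;
    while (i < s.size()) {
        unsigned char c = static_cast<unsigned char>(s[i]);
        std::size_t extra;
        if (c <= 0x7F)                extra = 0;  // ASCII
        else if ((c & 0xE0) == 0xC0)  extra = 1;  // 2-byte sequence
        else if ((c & 0xF0) == 0xE0)  extra = 2;  // 3-byte sequence
        else if ((c & 0xF8) == 0xF0)  extra = 3;  // 4-byte sequence
        else return false;                        // stray continuation byte
        if (s.size() - i - 1 < extra) return false;  // truncated sequence
        for (std::size_t k = 1; k <= extra; ++k) {
            unsigned char next = static_cast<unsigned char>(s[i + k]);
            if ((next & 0xC0) != 0x80) return false;
        }
        i += extra + 1;
    }
    return true;
}
```

With these, "caf\xE9" (the word "café" in ISO-8859-1) is neither
pure ASCII nor well-formed UTF-8, while "caf\xC3\xA9" (the same
word in UTF-8) is well-formed. Only the second byte sequence is
something a UTF-8 based function could be expected to decode
correctly.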
Logged In: YES
user_id=149785
To be honest, I'm not incredibly familiar with
internationalization in C++, and the lack of support for
internationalization in Jace is not so much a bug as a
purposely overlooked feature. I'm pretty sure standard C++
supports wchar_t and wstring as 16-bit character types, but I
don't know if they demand a particular encoding or not.
How do you propose we add i18n support to Jace, without
making it platform specific?
God bless,
-Toby
Logged In: NO
>How do you propose we add i18n support to Jace, without
>making it platform specific?
It is very easy. Just use the default system encoding
(file.encoding) and the standard String constructor :-)
Here is an example:
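#include <jni.h>
#include <cstring>

// Conversion from "char *" to jstring using the default platform encoding
jstring newString(JNIEnv *env, const char *buf)
{
  if( !buf ) return NULL;
  jsize bufLen = static_cast<jsize>(strlen(buf));
  jbyteArray jbuf = env->NewByteArray(bufLen);
  if( !jbuf ) return NULL;  // allocation failed, a Java exception is pending
  env->SetByteArrayRegion(jbuf,0,bufLen,(jbyte*)buf);
  jclass cls = env->FindClass("java/lang/String");
  if( !cls ) return NULL;   // class lookup failed, a Java exception is pending
  // String(byte[] bytes, int offset, int length) decodes with the default charset
  jmethodID init = env->GetMethodID(cls,"<init>","([BII)V");
  jstring jstr = init ? (jstring)env->NewObject(cls,init,jbuf,0,bufLen) : NULL;
  env->DeleteLocalRef(cls);
  env->DeleteLocalRef(jbuf);
  return jstr;
}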
The conversion from jstring to "char*" can be done in a similar
way (by invoking getBytes()). For better support you can add an
optional parameter for the name of the encoding of the bytes.
Of course, creating strings this way is slower, but at least
in most cases the conversion from bytes to Unicode will be
correct. Continuing to use the standard UTF functions is a BUG
for any non-ASCII environment, not an RFE. At least deprecate
them or note this issue in the documentation. Otherwise most
programs that use Jace will be i18n-hostile.
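The reverse direction mentioned above (jstring to bytes through
getBytes()) might look roughly like the sketch below. The
function name toPlatformString is hypothetical and is not taken
from Jace; the JNI calls themselves are standard. It assumes the
no-argument String.getBytes(), which encodes with the JVM's
default charset, and it returns an empty string on any failure,
leaving a pending Java exception for the caller to handle.

```cpp
#include <jni.h>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of Jace): jstring -> bytes in the JVM's
// default platform encoding, obtained by calling String.getBytes().
std::string toPlatformString(JNIEnv* env, jstring jstr) {
    if (!jstr) return std::string();

    jclass cls = env->FindClass("java/lang/String");
    if (!cls) return std::string();            // exception pending
    jmethodID getBytes = env->GetMethodID(cls, "getBytes", "()[B");
    env->DeleteLocalRef(cls);
    if (!getBytes) return std::string();       // exception pending

    jbyteArray jbuf =
        static_cast<jbyteArray>(env->CallObjectMethod(jstr, getBytes));
    if (env->ExceptionCheck() || !jbuf) return std::string();

    // Copy the encoded bytes out of the Java array into a C++ string.
    jsize len = env->GetArrayLength(jbuf);
    std::string result(static_cast<std::size_t>(len), '\0');
    if (len > 0) {
        env->GetByteArrayRegion(jbuf, 0, len,
                                reinterpret_cast<jbyte*>(&result[0]));
    }
    env->DeleteLocalRef(jbuf);
    return result;
}
```

To support an explicit encoding name, the same pattern should
work with the getBytes(String) overload (signature
"(Ljava/lang/String;)[B"), passing the charset name as a jstring.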
Sergey Astakhov (sergeya@comita.spb.ru)
Logged In: YES
user_id=149785
This should be taken care of since Jace 1.1 beta 2.
God bless,
-Toby Reyelts