From: Richard F. <rj...@fi...> - 2001-07-23 00:05:43
On Sun, 22 Jul 2001, Eric Lee Green wrote:

> Re: Internationalization: Will think about that. You're right, we need to
> do something there. Agree about the feebleness of the C++ String class,
> even the Java String class is better (at least it can represent all known
> international character sets!). More after I've thunk on it :-).

C++ already handles wide characters. string is really a typedef of
basic_string<char>, and wstring is a typedef of basic_string<wchar_t>.
All we really need there is a couple of simple definitions:

    #include <string>

    #ifdef UNICODE
    #define TCHAR wchar_t   /* wide characters */
    #define _T(x) L##x      /* turns "abc" into L"abc" */
    #else
    #define TCHAR char      /* narrow characters */
    #define _T(x) x
    #endif

    typedef std::basic_string<TCHAR> tstring;

Then we just use tstring in all places we would use string, and wrap all of
our string constants in the _T macro. This way, we can switch between 8-bit
and wide characters just by defining UNICODE. (wchar_t is 16 bits on Windows,
but typically 32 bits on Linux.) There are probably a few other things to
treat this way, like input and output streams, etc. But they are fairly easy
to handle.

FYI, this is (sort of) what Windows programs do to maintain source
portability between NT/2000, which support Unicode throughout, and 95/98,
which have very limited Unicode support.

The trick is that we need an enhanced basic_string class that can handle the
useful operations like parameter substitution and search-and-replace. And it,
or a super-class, should also be able to handle the transparent language
translation.

What I don't know is if Java can handle ANSI strings passed to it from C++
code. Any comments?

BTW, there are only a couple of problems with wide character strings. The
most important is that length() returns the number of characters, not the
number of bytes. If you want to know the actual byte length of a tstring s
for an IO operation, you have to compute s.length() * sizeof(TCHAR).

--
Richard Fish, Unix/Linux Software Engineer, rj...@fi...
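
For illustration, here is a minimal sketch of the tstring idea together with the byte-length calculation and a simple search-and-replace. It is not code from the project. The names tchar_t, TSTR, byte_length, replace_all and example_usage are hypothetical; tchar_t and TSTR stand in for the TCHAR and _T of the message above, because identifiers that begin with an underscore followed by a capital letter are reserved for the implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical names: tchar_t and TSTR play the role of TCHAR and _T.
#ifdef UNICODE
typedef wchar_t tchar_t;
#define TSTR(x) L##x
#else
typedef char tchar_t;
#define TSTR(x) x
#endif

typedef std::basic_string<tchar_t> tstring;

// Number of bytes occupied by the characters of s (terminator not counted).
template <typename Ch>
std::size_t byte_length(const std::basic_string<Ch>& s) {
    return s.length() * sizeof(Ch);
}

// Replace every occurrence of `from` in `s` with `to`, scanning left to
// right. An empty `from` leaves the string unchanged.
template <typename Ch>
std::basic_string<Ch> replace_all(std::basic_string<Ch> s,
                                  const std::basic_string<Ch>& from,
                                  const std::basic_string<Ch>& to) {
    if (from.empty()) return s;
    typename std::basic_string<Ch>::size_type pos = 0;
    while ((pos = s.find(from, pos)) != std::basic_string<Ch>::npos) {
        s.replace(pos, from.length(), to);
        pos += to.length();  // skip the inserted text so it is not rescanned
    }
    return s;
}

// Usage: the same source compiles with or without -DUNICODE.
void example_usage() {
    tstring greeting = TSTR("hello %1, bye %1");
    tstring out = replace_all(greeting, tstring(TSTR("%1")),
                              tstring(TSTR("world")));
    assert(out == TSTR("hello world, bye world"));
    assert(byte_length(out) == out.length() * sizeof(tchar_t));
}
```

Writing the helpers as templates over the character type, rather than against tstring directly, means they also work on plain std::string and std::wstring regardless of whether UNICODE is defined.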