Re: [Cppunit-devel] Toward True Unicode Code... Help requested

----- Original Message -----
From: "Duane Murphy" <dua...@ma...>
To: "CppUnit Developers" <cpp...@li...>
Sent: Sunday, April 14, 2002 7:35 PM
Subject: Re: [Cppunit-devel] Toward True Unicode Code... Help requested

> My two cents, for what it's worth, is to do nothing.
>
> There is no "True" Unicode. There are several unicode formats. UTF-8,
> UTF-16, and the upcoming UTF-32 among others. I have been told by some
> associates that keep up with such things that UTF-16, while currently
> being used, is on its way out in favor of UTF-32. It is most often
> recommended to use the simple UTF-8 for most applications.
>
> UTF-8 will likely satisfy most of us and require absolutely no changes.
> UTF-8 is completely compatible with ASCII (for characters < 128). UTF-8
> fits nicely in a standard string. Anyone that is concerned about such
> things has already worked around any problems involved in using UTF-8
> with std::string. This mostly involved parsing and locating character
> separations which is of little concern to CppUnit.

Just a question on the side: does that mean that if you split a string into
many lines using the '\n' character, you can use the same algorithm for ANSI
and UTF-8? (i.e. even two- or three-byte character encodings never use '\n')
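
For UTF-8 at least, the answer should be yes: every byte of a multi-byte UTF-8 sequence has its high bit set (0x80 or above), so the byte 0x0A can only ever be a real '\n'. Below is a minimal sketch of a byte-wise splitter that relies on this. The name splitLines is hypothetical and not part of CppUnit, and it only covers '\n' (no "\r\n" handling).

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not a CppUnit function): split a byte string on '\n'.
// The same loop works for plain ASCII and for UTF-8, because no lead or
// continuation byte of a multi-byte UTF-8 sequence can equal 0x0A.
std::vector<std::string> splitLines(const std::string &text)
{
    std::vector<std::string> lines;
    std::string::size_type start = 0;
    for (;;) {
        std::string::size_type pos = text.find('\n', start);
        if (pos == std::string::npos) {
            // Last (possibly empty) line, with no trailing '\n'.
            lines.push_back(text.substr(start));
            break;
        }
        lines.push_back(text.substr(start, pos - start));
        start = pos + 1;
    }
    return lines;
}
```

Multi-byte characters are never cut in half by this, since the split points are always at a '\n' byte. Counting or truncating characters inside a line is a different matter and would need UTF-8-aware code.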

>
> Another reason to do nothing is that I would hope that the C++ standards
> committee at least makes some statement about Unicode or
> internationalization. They have done lots of work to put in
> infrastructure that very few people really understand. I believe that
> they need to make some statement or show some examples of how to truly
> deal with Unicode.
>
> My recommendation is to do nothing.
>
> Is there some other driving factor behind this decision?

My original thought was that it would make things easier for the outputters:
AFAIK you cannot set a code page saying that you're working in UTF-8 (let me
know if it is possible).

Since there are APIs such as fwprintf, wcerr... it shouldn't be a problem to
display the output in Unicode. So I did some testing: trying to display a
few hiragana in the VC++ output window. I tried two different ways:
- running the test application as a post-build step, and printing with
fwprintf
- from a VC++ add-in, using IApplication::PrintToOutputWindow, which takes a
Unicode string as argument.

Same result for both: a few '?' characters, meaning that a conversion
occurred from Unicode to multi-byte characters and failed to find a match for
the Unicode characters (even though the font used for the output window
supports those Unicode characters).

Basically, that means using Unicode doesn't make anything easier: even if
you have Unicode, you need to write a special application to display the
result. The same applies to UTF-8, but...

For UTF-8, we already have the XmlOutputter (thanks to Fumiki's suggestions,
we can now specify the encoding).
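
To illustrate how characters such as the hiragana from the test above can be carried in a plain std::string as UTF-8, here is a minimal sketch of encoding a single code point. The function appendUtf8 is hypothetical and not part of CppUnit. It assumes the input is already a Unicode code point, so UTF-16 surrogate pairs would have to be combined first.

```cpp
#include <string>

// Hypothetical helper (not a CppUnit function): append the UTF-8 encoding of
// one Unicode code point to 'out'. Returns false, leaving 'out' untouched,
// for surrogate values and for anything above U+10FFFF.
bool appendUtf8(std::string &out, unsigned long cp)
{
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return false;

    if (cp < 0x80) {
        // 1 byte: identical to ASCII.
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx (hiragana fall in this range)
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return true;
}
```

A string built this way is just bytes as far as std::string is concerned, which is why nothing in CppUnit itself needs to change. Only whatever finally displays the text has to know it is UTF-8.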

So I agree, let's not change CppUnit. It already supports UTF-8 and that's
enough. If anything needs to be changed, it would be the GUI TestRunner, to
support UTF-8 and font selection.

Thanks for your feedback Duane,
Baptiste.