On Tue, Apr 20, 2010 at 5:25 PM, Baptiste Lepilleur <baptiste.lepilleur@gmail.com> wrote:
Unicode awareness in JsonCpp is fairly recent (e.g. handling of Unicode escape sequences). And IMHO we still need more tests (such as testing whether we correctly handle surrogate escape sequences).

If i may recommend:

http://utfcpp.sourceforge.net/

is VERY easy to use and is Public Domain. It can convert/verify utf8/16/32 and has a very handy iterator class which lets you iterate over utf8/16/32 strings in a sane manner (each iteration returns one logical character, regardless of its real length).
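To illustrate what iterating by logical character means, here is a minimal hand-rolled sketch in plain C++. It is not utfcpp's implementation or API: the names nextCodePoint and codePoints are hypothetical, and validation is deliberately minimal (overlong encodings and encoded surrogates are not rejected).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not part of utfcpp or JsonCpp): decodes the code point
// that starts at s[pos] and advances pos past all of its bytes.
std::uint32_t nextCodePoint(const std::string& s, std::size_t& pos) {
    assert(pos < s.size());
    const unsigned char lead = static_cast<unsigned char>(s[pos]);
    std::size_t extra = 0;  // number of continuation bytes that follow
    std::uint32_t cp = 0;
    if (lead < 0x80)              { extra = 0; cp = lead; }         // 0xxxxxxx
    else if ((lead >> 5) == 0x06) { extra = 1; cp = lead & 0x1F; }  // 110xxxxx
    else if ((lead >> 4) == 0x0E) { extra = 2; cp = lead & 0x0F; }  // 1110xxxx
    else if ((lead >> 3) == 0x1E) { extra = 3; cp = lead & 0x07; }  // 11110xxx
    else throw std::runtime_error("invalid UTF-8 lead byte");

    if (pos + extra >= s.size())
        throw std::runtime_error("truncated UTF-8 sequence");

    for (std::size_t i = 1; i <= extra; ++i) {
        const unsigned char c = static_cast<unsigned char>(s[pos + i]);
        if ((c & 0xC0) != 0x80)  // continuation bytes look like 10xxxxxx
            throw std::runtime_error("invalid UTF-8 continuation byte");
        cp = (cp << 6) | (c & 0x3F);
    }
    pos += extra + 1;
    return cp;
}

// Collects every code point of a UTF-8 string, one per iteration,
// regardless of how many bytes each one occupies.
std::vector<std::uint32_t> codePoints(const std::string& s) {
    std::vector<std::uint32_t> out;
    for (std::size_t pos = 0; pos < s.size(); )
        out.push_back(nextCodePoint(s, pos));
    return out;
}
```

With this sketch, codePoints("a\xC3\xA9") yields two values (0x61 and 0xE9) although the string is three bytes long.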
 
implementation: typically 2 bytes on MSVC, and 4 bytes with gcc). Side note: the next C++ standard introduces char16_t and char32_t types to make this explicit,

Yeah!!!!
 
but those are not yet widely available in the industry, so I'd rather we do not rely on this yet.

:(
 
- What should asString() and asCString() return when initialized with a std::wstring? The string converted to UTF-8?

utfcpp makes the conversion to utf8 trivial:

utf8::utf16to8( inputIteratorBegin, inputIteratorEnd, outputIterator )

e.g., something like:

// assumes #include "utf8.h" (utfcpp) and <iterator>, and a wstr holding UTF-16
std::string u8;
utf8::utf16to8( wstr.begin(), wstr.end(), std::back_inserter( u8 ) );
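For readers without utfcpp at hand, the following is a rough, dependency-free sketch of what such a conversion involves. It is not utfcpp's code and not a proposed JsonCpp API: appendUtf8 and wideToUtf8 are hypothetical names. Because wchar_t is typically 2 bytes on MSVC and 4 bytes with gcc, the sketch accepts both UTF-16 surrogate pairs and plain 32-bit code points, which is an assumption about how a wstring-initialized value might be treated.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper: appends the UTF-8 encoding of one code point to out.
void appendUtf8(std::uint32_t cp, std::string& out) {
    assert(cp <= 0x10FFFF);
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical helper: converts a wide string to UTF-8.
// Surrogate pairs (UTF-16) are combined; unpaired surrogates and values
// above U+10FFFF are rejected.
std::string wideToUtf8(const std::wstring& w) {
    std::string out;
    for (std::size_t i = 0; i < w.size(); ++i) {
        std::uint32_t cp = static_cast<std::uint32_t>(w[i]);
        if (cp >= 0xD800 && cp <= 0xDBFF) {  // high surrogate: needs a partner
            if (i + 1 >= w.size())
                throw std::runtime_error("unpaired high surrogate");
            const std::uint32_t lo = static_cast<std::uint32_t>(w[i + 1]);
            if (lo < 0xDC00 || lo > 0xDFFF)
                throw std::runtime_error("unpaired high surrogate");
            cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
            ++i;  // the low surrogate has been consumed
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            throw std::runtime_error("unpaired low surrogate");
        } else if (cp > 0x10FFFF) {
            throw std::runtime_error("code point out of range");
        }
        appendUtf8(cp, out);
    }
    return out;
}
```

The surrogate branch is the part the thread says still needs tests: a pair such as 0xD834 0xDD1E has to come out as the single four-byte sequence for U+1D11E rather than as two three-byte sequences.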

i only recently started using utfcpp, but i'm very impressed with how easy it is to use (i'm no Unicode expert, so i need tools like this to help me :).

--
----- stephan beal
http://wanderinghorse.net/home/stephan/