From: Stephan B. <sg...@go...> - 2010-04-20 15:39:33
On Tue, Apr 20, 2010 at 5:25 PM, Baptiste Lepilleur <bap...@gm...> wrote:

> Unicode awareness in JsonCpp is fairly recent (e.g. handling of unicode
> escape sequence). And IMHO we still need more tests (such as testing if we
> correctly handle surrogate escape sequences).

If i may recommend: http://utfcpp.sourceforge.net/ is VERY easy to use and is Public Domain. It can convert/verify utf8/16/32 and has a very handy iterator class which lets you iterate over utf8/16/32 strings in a sane manner (each iteration returns one logical character, regardless of its real length).

> implementation: typically 2 bytes on MSVC, and 4 bytes with gcc). Side
> note: the next C++ standard introduces utf16_t and utf32_t types to make
> this explicit,

Yeah!!!!

> but those are not yet widely available in the industry, so I'd rather we do
> not rely on this yet.

:(

> - What asString() and asCString() should return when initialized with an
> std::wstring? The string converted in utf-8?

utfcpp makes the conversion to utf8 trivial:

    utf16to8( inputIteratorBegin, inputIteratorEnd, outputIterator)

e.g., something like:

    std::string u8;
    utf16to8( wstr.begin(), wstr.end(), std::back_inserter( u8 ) );

i only recently started using utfcpp, but i'm very impressed with how easy it is to use (i'm no Unicode expert, so i need tools like this to help me :).

--
----- stephan beal
http://wanderinghorse.net/home/stephan/
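
For readers who want to see what a UTF-16 to UTF-8 conversion like the one above involves, including the surrogate pairs mentioned in the quoted text, here is a minimal standalone sketch. It is not utfcpp's code and not anything in JsonCpp: the names `appendUtf8` and `utf16ToUtf8Sketch` are hypothetical, and it assumes a C++11 compiler with `std::u16string`, which the thread itself notes was not yet widely available. Note also that, per the quoted text, `wchar_t` is typically 4 bytes with gcc, so feeding a `std::wstring` to a UTF-16 conversion only makes sense where `wchar_t` is 2 bytes.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Append one Unicode code point to `out`, encoded as 1 to 4 UTF-8 bytes.
// (Hypothetical helper, for illustration only.)
static void appendUtf8(char32_t cp, std::string& out) {
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Convert UTF-16 code units to a UTF-8 byte string.
// Throws std::invalid_argument on an unpaired or truncated surrogate.
// (Hypothetical helper; a sketch, not a library API.)
std::string utf16ToUtf8Sketch(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: it must be followed by a low surrogate.
            if (i + 1 >= in.size()) {
                throw std::invalid_argument("truncated surrogate pair");
            }
            const char32_t low = in[i + 1];
            if (low < 0xDC00 || low > 0xDFFF) {
                throw std::invalid_argument("invalid low surrogate");
            }
            // Combine the two 10-bit halves into one code point.
            cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
            ++i;  // the low surrogate has been consumed
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            throw std::invalid_argument("unpaired low surrogate");
        }
        appendUtf8(cp, out);
    }
    return out;
}
```

Calling `utf16ToUtf8Sketch` on a string holding the two code units 0xD834 0xDD1E (U+1D11E) yields the four bytes F0 9D 84 9E. A library such as utfcpp is still the better choice in practice, since it also covers validation and iteration.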