From: hung n. <and...@ya...> - 2009-11-25 18:45:01
Hi,

I'm working on a project where I have to support Double Byte Character Sets (DBCS). The input string I receive from a user can be in English (SBCS) or Japanese (DBCS). I then have to convert this string to UTF-8 before saving it in a UTF-8 database. I don't have any problem with SBCS input, but when the user provides DBCS input, the string gets corrupted by the conversion.

The function I am using is:

    int32_t ucnv_fromUChars(UConverter *cnv, char *dest, int32_t destCapacity, const UChar *src, int32_t srcLength, UErrorCode *pErrorCode)

An example of input, in Japanese:

    src in hexadecimal: 65e5 672c 8a9e
    dest in hexadecimal after the call: 93 fa 96 7b 8c ea fc fc fc fc fc fc

I printed "dest" after the call, using "cout << dest;" and also using a loop to print each character, and I saw garbage after the first three characters.

Why does the string become longer after the conversion (from 3 to 12)? Can someone tell me if this function works with DBCS? What am I doing wrong here?

Thanks for your help!
-Andy.
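
For reference, here is a minimal standalone sketch of what a UTF-8 encoding of the three code units above looks like. It is plain C++ and does not use ICU; `utf16ToUtf8` and `checkExample` are hypothetical helper names, not ICU functions. The expected result is nine bytes: e6 97 a5 e6 9c ac e8 aa 9e. The first six bytes reported above (93 fa 96 7b 8c ea) instead match the Shift-JIS encoding of the same three characters, which would be consistent with a converter that was not opened for UTF-8. That is only an assumption, since the post does not show how `cnv` was created.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not part of ICU): encode UTF-16 code units as UTF-8.
// Unpaired surrogates are replaced with U+FFFD.
std::string utf16ToUtf8(const char16_t* src, std::size_t length) {
    std::string out;
    for (std::size_t i = 0; i < length; ++i) {
        std::uint32_t cp = src[i];
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < length &&
            src[i + 1] >= 0xDC00 && src[i + 1] <= 0xDFFF) {
            // Combine a valid surrogate pair into one code point.
            cp = 0x10000 + ((cp - 0xD800) << 10) + (src[i + 1] - 0xDC00);
            ++i;
        } else if (cp >= 0xD800 && cp <= 0xDFFF) {
            cp = 0xFFFD;  // unpaired surrogate
        }
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}

// Usage: the three code units from the post encode to nine UTF-8 bytes.
void checkExample() {
    const char16_t src[] = {0x65E5, 0x672C, 0x8A9E};
    const std::string utf8 = utf16ToUtf8(src, 3);
    assert(utf8 == "\xE6\x97\xA5\xE6\x9C\xAC\xE8\xAA\x9E");
}
```

Note that `ucnv_fromUChars` returns the length of the converted output, so only that many bytes of `dest` are meaningful. The trailing fc bytes may simply be buffer contents beyond that length.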