From: Venu A. <ven...@an...> - 2006-06-29 18:24:19
This is what is needed to find the character length (strCharLen). Someone
can correct me if I am wrong. You need to pass the total number of bytes
(limit). Maybe it should be added as a standard function in ICU.

#include <string.h>

// Byte length of the UTF-8 sequence starting at buf, judged from the
// lead byte only. Assumes well-formed UTF-8.
int CharLen(const unsigned char *buf)
{
    unsigned char c = *buf;
    if (c & 0x80) { // non-ASCII: part of a multi-byte sequence
        if ((c >= 0xc0) && (c <= 0xdf)) { // 2 bytes
            return 2;
        } else if ((c >= 0xe0) && (c <= 0xef)) { // 3 bytes
            return 3;
        } else if ((c >= 0xf0) && (c <= 0xf7)) { // 4 bytes
            return 4;
        }
    }
    return 1; // ASCII, or an unexpected byte that is stepped over singly
}

// Number of characters in the first limit bytes of p.
// Pass limit == -1 if p is null-terminated.
unsigned int strCharLen(const unsigned char *p, int limit)
{
    int i, cLen = 0;
    if (limit == -1)
        limit = (int)strlen((const char *)p);
    for (i = 0; i < limit; i += CharLen(&p[i])) {
        cLen++;
    }
    return (unsigned int)cLen;
}

-----Original Message-----
From: icu...@li... [mailto:icu...@li...] On Behalf Of Kalyan Gunda
Sent: Thursday, June 29, 2006 8:39 AM
To: icu...@li...
Subject: Re: [icu-support] UTF-8 length

I understand that storage space can be found using strlen(). I am more
interested in finding the number of characters in a UTF-8 string. The
purpose is to limit the number of characters in a filename (represented
by utf8name) to, say, 255.

icu...@li... wrote on 06/29/2006 11:35:54 AM:

> If the UTF-8 string is null-terminated, you can just use strlen() +
> 1 to get the storage space in bytes. If it is not null-terminated, you
> need to get its byte length from somewhere else! :-)
>
> Regarding the number of characters, you don't state your purpose so
> it's hard to say, but if you want to count an accented letter as one
> character, you may have to ensure that the text is in normalized form
> NFC or NFKC.
>
> Erik
> On 6/29/06, Kalyan Gunda <kg...@us...> wrote:
> Is there a routine in ICU that can determine the number of characters in a
> UTF-8 string, and also a routine to determine the storage space of a UTF-8
> string? I see U8_LENGTH(c) but the input is a UTF-16 code point... I have a
> UTF-8 string that I would like to find how many characters it has?
>
> Thanks
> kalyan
>
> _______________________________________________
> icu-support mailing list - icu...@li...
> To Un/Subscribe: https://lists.sourceforge.net/lists/listinfo/icu-support
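
For the goal stated in this thread (capping a UTF-8 filename at, say, 255 characters), here is a minimal sketch in plain C. It is not an ICU API: the names utf8_count and utf8_prefix_bytes are hypothetical, and the code assumes the input is already well-formed UTF-8. It counts code points by counting every byte that is not a continuation byte (10xxxxxx), so, as Erik notes above, an accented letter stored as base letter plus combining mark still counts as two unless the text is normalized first.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of ICU): number of code points in the
   first nbytes bytes of s. Every byte that is not a continuation byte
   (bit pattern 10xxxxxx) starts a new code point.
   Assumes well-formed UTF-8. */
size_t utf8_count(const char *s, size_t nbytes)
{
    size_t i, n = 0;
    assert(s != NULL);
    for (i = 0; i < nbytes; i++) {
        if (((unsigned char)s[i] & 0xC0) != 0x80)
            n++;
    }
    return n;
}

/* Hypothetical helper (not part of ICU): byte length of the longest
   prefix of s that holds at most max_chars code points. The cut always
   falls on a code point boundary, so no sequence is split.
   Assumes well-formed UTF-8. */
size_t utf8_prefix_bytes(const char *s, size_t nbytes, size_t max_chars)
{
    size_t i = 0, n = 0;
    assert(s != NULL);
    while (i < nbytes) {
        if (((unsigned char)s[i] & 0xC0) != 0x80) {
            if (n == max_chars)
                break;          /* next code point would exceed the cap */
            n++;
        }
        i++;
    }
    return i;
}
```

To cap a name at 255 characters, one could keep only the first utf8_prefix_bytes(name, strlen(name), 255) bytes of it. Counting non-continuation bytes avoids stepping past the end of the buffer when the last sequence is cut short, which a lead-byte jump table can do.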