From: Nattapong S. <na...@th...> - 2008-06-11 02:24:54
Hi all,

Regarding this bug ticket: http://bugs.icu-project.org/trac/ticket/6317

My concern is that, if this really is the intended ICU design, it is not the behavior expected for Thai. When using CharacterBreakIterator, a break is expected immediately after a Thai leading vowel (U+0E40-U+0E44).

Given this string: "กูกินกุ้งปิ้งอยู่ในถ้ำ"

Expected outcome: "กู", "กิ", "น", "กุ้", "ง", "ปิ้", "ง", "อ", "ยู่", "ใ", "น", "ถ้ำ"
ICU4J outcome: "กู", "กิ", "น", "กุ้", "ง", "ปิ้", "ง", "อ", "ยู่", "ใน", "ถ้ำ"

Note: The difference is "ใ" versus "ใน", where "ใ" is a Thai leading vowel.

Best regards,
Nattapong Sirilappanich
Engineer, National Language Development - GCoC & TSC Thailand Software Group, IBM Thailand Co., Ltd.
National Phone: 02-273-4715, Fax: 02-619-2030
International Phone: +66 2 273 4715, Fax: +66 2 619 2030
e-mail: na...@th...
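
For anyone who wants to reproduce the comparison, here is a minimal sketch of how the character boundaries of the string above can be listed. It is only an illustration: the class name ThaiCharBreakDemo and the helper segments() are made up for this example, and it uses the JDK's java.text.BreakIterator so that it runs without extra libraries. ICU4J's com.ibm.icu.text.BreakIterator offers the same getCharacterInstance/setText/first/next calls, so changing the import should be enough to run it against ICU4J. The two implementations may not produce the same segments, so no expected output is shown.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical demo class, not part of ICU4J or the bug ticket.
public class ThaiCharBreakDemo {

    // Splits text at every boundary reported by a character BreakIterator.
    static List<String> segments(String text, Locale locale) {
        BreakIterator it = BreakIterator.getCharacterInstance(locale);
        it.setText(text);

        List<String> out = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            out.add(text.substring(start, end));
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "กูกินกุ้งปิ้งอยู่ในถ้ำ";
        // Prints one segment per boundary pair; compare against the lists in the mail.
        for (String s : segments(text, Locale.forLanguageTag("th"))) {
            System.out.println("\"" + s + "\"");
        }
    }
}
```

If a run shows "ใน" as a single segment, it reproduces the behavior described above; a run that shows "ใ" and "น" separately matches the expected outcome.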