From: Markus S. <mar...@gm...> - 2007-04-20 23:16:53
On 4/18/07, Deborah Goldsmith <gol...@ap...> wrote:
> const CompactTrieDictionary
> *ICULanguageBreakFactory::loadDictionaryFor(const char *script,
> int32_t breakType);

Why not use the existing UScriptCode and UBreakIteratorType enum types?

And related to what George asked: Does an RBBI/DBBI remember the locale ID
it was created for, if any? If it does, it _might_ be used for dictionary
selection, or as a kind of hint while using the dictionary.

> An alternative API would be to have loadDictionaryFor take a language
> ("th", "lo") instead of a script ("Thai", "Laoo"). Any thoughts on
> that? The only requirement is that different scripts/languages cannot
> overlap in terms of character membership. They must be disjoint.

It seems like the main parameter should be the script because it's
unambiguously known from the text. The language, if we have it available,
could be an additional parameter and might be ignored by the implementation.

markus
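To make the suggestion concrete, here is a minimal sketch of a lookup keyed on typed script and break-type values, with the language passed only as an optional hint. Everything in it is hypothetical: `Script`, `BreakType`, `Dictionary`, and `DictionaryRegistry` are stand-ins for ICU's script-code enum, break-iterator-type enum, `CompactTrieDictionary`, and the break factory, not the real ICU declarations.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-ins for ICU's script-code and break-iterator-type enums.
enum class Script { Thai, Lao };
enum class BreakType { Word, Line };

// Hypothetical stand-in for CompactTrieDictionary.
struct Dictionary {
    std::string name;
};

// Hypothetical registry playing the role of the language break factory.
class DictionaryRegistry {
public:
    // Register a dictionary for one (script, break type) pair.
    void add(Script script, BreakType type, Dictionary dict) {
        table_[std::make_pair(script, type)] = std::move(dict);
    }

    // The script is the primary key because it can be determined from the
    // text itself. The language hint is optional, and this sketch ignores it,
    // as an implementation would be allowed to do.
    const Dictionary *loadDictionaryFor(Script script, BreakType type,
                                        const char *languageHint = nullptr) const {
        (void)languageHint;  // accepted but unused in this sketch
        auto it = table_.find(std::make_pair(script, type));
        return it == table_.end() ? nullptr : &it->second;
    }

private:
    std::map<std::pair<Script, BreakType>, Dictionary> table_;
};
```

A caller would register, say, a Thai word-break dictionary once and then call `loadDictionaryFor(Script::Thai, BreakType::Word, "th")`. A null return means no dictionary is registered for that pair. Typed enums also let the compiler reject a misspelled script, which a `const char *` parameter cannot do.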