From: Frank T. (譚. <ft...@go...> - 2023-07-01 01:47:52
Dear Rich,

Thanks for your reply.

I also want to provide some additional historical context that I forgot to mention. The ICU C++ LocaleBuilder API tries to match, feature for feature and signature for signature, what has been in Java for years. See
https://docs.oracle.com/javase/8/docs/api/java/util/Locale.Builder.html
What I propose here also tries to match that, so that we do not end up with yet another model, used only in C, that builds locales differently.

As for your comment

"I think you might want an API that has getters, or an equivalent of the “Locale” class in C++"

I think those getters already exist in
https://unicode-org.github.io/icu-docs/apidoc/dev/icu4c/uloc_8h.html#a490a02bfc1dd2c007911680a9cd87b73

“If this locale has the rg subtag, I want to get rid of it and move its value into the region field”

The current uloc.h API can be used together with the locale builder to achieve that:

    char localeID[ULOC_FULLNAME_CAPACITY];
    // .... copy locale to localeID
    UErrorCode status = U_ZERO_ERROR;
    ULocaleBuilder builder = ulb_open();
    ulb_setLocale(builder, localeID);

    // If this locale has the rg subtag
    char value[ULOC_FULLNAME_CAPACITY];
    if (uloc_getKeywordValue(localeID, uloc_toLegacyKey("rg"), value,
                             ULOC_FULLNAME_CAPACITY, &status) > 0 &&
        U_SUCCESS(status)) {
        // get rid of it and
        ulb_setUnicodeLocaleKeyword(builder, "rg", NULL);  // to remove "rg"
        // move its value into the region field
        // (note: an rg value has the form "uszzzz", so real code would
        // pass only its region part here)
        ulb_setRegion(builder, value);
    }

    // “If this locale ID doesn’t have a script code,
    if (uloc_getScript(localeID, value,
                       ULOC_FULLNAME_CAPACITY, &status) == 0 &&
        uloc_getScript(otherLocaleID, value,
                       ULOC_FULLNAME_CAPACITY, &status) > 0 &&
        U_SUCCESS(status)) {
        // copy in the script code from this other locale ID."
        ulb_setScript(builder, value);
    }
    int32_t length = ulb_build(builder, localeID,
                               ULOC_FULLNAME_CAPACITY, &status);
    ulb_close(builder);

The uloc_* calls (red in the original message) have been in ICU for a long time. The ulb_* calls (blue) are what I am proposing here.

On Fri, 30 Jun 2023 at 17:08, Rich Gillam <ric...@ap...> wrote:

> Frank--
>
> I think the basic issue here is that our APIs for manipulating locale IDs
> in C can be really painful to use if you’re doing anything interesting, and
> that we want to get rid of as much of that pain as we can. I think that’s
> more important than whether we match the C++ interface or conform to some
> particular definition of “builder.” That said, I’m starting to come around
> to the idea that what you’re proposing really is about as good as we’re
> going to get.
>
> I’m going to withdraw my earlier suggestion. I don’t think you’re quite
> getting what I’m saying, but my suggestion (using a struct) really only
> works well if you have language/script/region/variant; as soon as you try
> to do anything with key-value pairs, you’d need to use a hash table (or
> something) instead of a struct, and not only does that mean you still have
> the heap allocation I was trying to get rid of, but it actually becomes
> harder to use than what you’re proposing.
>
> I still think there might be value in having a struct-based API that just
> handled language/script/region/variant, but maybe that’s just me, and we
> certainly couldn’t go with JUST that, so it’s probably not worth the
> trouble.
>
> I still tend to like Steven’s suggestion to have one setter with a
> selector code of some kind. Your point that it would require a switch
> statement is valid, but I don’t think it’s actually that much of a
> performance hit in practice. But if you’re trying to be as parallel as
> possible to the C++ interface, this works against that, I think. I’m not
> going to argue this one either way.
>
> I do want to raise one other issue, though: I think you might want an API
> that has getters, or an equivalent of the “Locale” class in C++. A lot of
> the use cases for a locale builder are “I want this locale ID, but I want
> to change this one field value,” or “I want to change this one field value,
> but only if this other field or attribute has such-and-such a value”. For
> example: “If this locale has the rg subtag, I want to get rid of it and
> move its value into the region field” or “If this locale ID doesn’t have a
> script code, copy in the script code from this other locale ID." If we
> still use things like uloc_getName() to do the “if this field has this
> value” part of the logic, we still have a problem.
>
> That could be a separate ticket and API proposal, of course, but I think
> we want to think about it too.
>
> --Rich
>
> On Jun 30, 2023, at 3:53 PM, Frank Tang (譚永鋒) via icu-design <
> icu...@li...> wrote:
>
> ---------- Forwarded message ---------
> From: Frank Tang (譚永鋒) <ft...@go...>
> Date: Fri, 30 Jun 2023 at 15:53
> Subject: Re: [icu-design] C API for LocaleBuilder
> To: Rich Gillam <ric...@ap...>
>
> On Fri, 30 Jun 2023 at 11:00, Rich Gillam <ric...@ap...>
> wrote:
>
>> Frank--
>>
>> That would be a different approach to construct a locale. However, I
>> would like to point out several things here.
>> 1. That approach won't be a "builder"
>>
>> But it would solve the problem we’re being asked to solve.
>>
>
> No, it does not. What the bug asked for is a C builder API, and an API
> that does not follow the "builder" design pattern does not solve the
> problem it asked for. A builder needs to work step by step, so that the
> decision about each field can be made at different times and in different
> places in the code. A single function with a struct does not allow that,
> and therefore does not fulfill the request and requirement.
>
> I am open to a concrete counter-proposal. It is tempting to think the
> issue could be solved by a simple set() with a constant if you consider
> only 1) language, 2) script, 3) region, 4) variant, 5) languageTag, and
> 6) localeId as the inputs. But once you start to consider 7) attributes,
> 8) Unicode locale keyword/value, and 9) extensions other than "-u-" (for
> example "-t-" or anything in other_extension), things become complicated.
> The validation and override/replacement logic for each of them will be
> different.
>
>> but what you describe would require the caller to resolve that before
>> putting it into the struct.
>>
>> No, it wouldn’t. For building locales, there’s nothing special going on
>> while you’re setting up the builder; it’s just a bag of fields.
>>
>
> No, that is not true. It is NOT a bag of fields; it is an ordered list of
> fields, so the order matters. In bag semantics, the order does not matter.
>
> Consider the following:
>
> languageTag "en-Latn-GB-u-ca-roc"
> language "ja"
> script "Thai"
> region "CN"
> variant "hepburn-heploc"
> unicode extension "ca" = "japanese"
> language "sl"
> variant "bohoric"
> unicode extension "ca" = "coptic"
>
> which will produce "sl-Thai-CN-bohoric-u-ca-coptic"
>
> but if you treat them as a bag and apply them in the order
>
> language "ja"
> script "Thai"
> region "CN"
> variant "hepburn-heploc"
> unicode extension "ca" = "japanese"
> language "sl"
> languageTag "en-Latn-GB-u-ca-roc"
> variant "bohoric"
> unicode extension "ca" = "coptic"
>
> you will get "en-Latn-GB-bohoric-u-ca-coptic"
>
> which is a different value.
>
>> All the magic happens when you run the build() method.
>>
>
> That should be implementation-dependent and should not be mandated by the
> API. There is nothing wrong with building progressively as each setter
> is called.
>
>> I’m just saying that maybe instead of adding a new opaque class with a
>> bunch of individual getters and setters for all those fields, we just use a
>> simple struct (or a hash table, or something).
>>
>
> As I mentioned, what you said only makes sense when the underlying problem
> is *a bag of fields*, but this problem is not just a *bag of fields*. The
> pieces of information passed into the builder overlap with each other; in
> particular, the information inside localeID and languageTag is not
> mutually exclusive with language/script/region/variant/attribute/Unicode
> locale keyword+value and other extensions. They overlap, and part of one
> can be overridden by another. If all the inputs were mutually exclusive,
> the problem would be much simpler and could possibly be solved by "just
> use a simple struct". But that is NOT the case (as the example above
> shows).
>
>> The approach you mentioned seems to require the caller to do a lot of
>> work to fill that struct. and the caller also needs to learn how to program
>> UHashTable while using that API, is that better?
>>
>> Good point. UHashTable is a pain in the butt. Maybe that’s enough right
>> there to swing me back around to what you’ve already proposed…
>>
>> That said, I do kind of like Steven’s suggestion to just have one set()
>> method and just have constants for the individual fields...
>>
>
> If we look at the pre-existing locale C API in uloc.h, we have separate
> functions instead of a single get with a constant:
> uloc_getLanguage(....
> uloc_getScript(....
> uloc_getCountry(....
> uloc_getVariant(....
> uloc_getName(....
> uloc_getBaseName(...
>
> Notice there is no API to get attributes.
>
> We could also have used a uloc_get() with a constant, right? But we didn't,
> years ago. So... should we follow the same design pattern, or go to a set
> with a constant?
>
> We have the following different kinds of operations on the builder:
>
> 1. language (w/o script and region), region, script, variant, language tag
> (w/ script or region) => set
> 2. attribute => add and remove (there could be multiple attributes in a
> locale)
> 3. unicode extension => set on a key ("ca", "japanese")
> 4. other extension => set on an extension ('a', "abc-efg")
> 5. clear extension (say you start from "en-u-ca-japanese-nu-Thai" and you
> want to clear the calendar and change nu to "Arab", to get "en-u-nu-Arab")
>
> A simple set-with-constant method could simplify #1 above, but it is not
> powerful enough to deal with #2, #3, #4 and #5.
> Also, I personally dislike the set + enum => internal switch(enum)
> approach, because almost all the time the caller knows at compile time
> which operation needs to be performed, yet this approach forces the
> function to switch internally on a value that is fixed in the program but
> cannot be easily optimized by the compiler.
>
> Consider the following case:
>
> setA(context, value) {
>   context.a = value;
> }
>
> setB(context, value) {
>   context.b = value;
> }
>
> and the caller
>
> setA(context, 3);
> setB(context, 5);
>
> vs
>
> set(context, field, value) {
>   switch(field) {
>     case A: context.a = value; break;
>     case B: context.b = value; break;
>   }
> }
>
> and the caller
>
> set(context, A, 3);
> set(context, B, 5);
>
> The switch/case inside set() simply cannot be optimized by the
> compiler.
>
> Surely the first approach exposes more functions, but that exposure lets
> the compiler avoid the unnecessary run-time cost of a switch/case that
> could be resolved at compile time rather than at run time.
>
>> --Rich
>>
>> On Jun 30, 2023, at 12:44 AM, Frank Tang (譚永鋒) <ft...@go...> wrote:
>>
>> Dear Richard
>>
>> On Thu, 29 Jun 2023 at 18:31, Richard Gillam <ric...@ap...>
>> wrote:
>>
>>> Frank--
>>>
>>> This looks fine, but also kind of heavy-weight. I wonder if it’d make
>>> sense to just have a struct with all the fields and utility methods to
>>> convert between a locale ID string and that struct?
>>>
>>
>> That would be a different approach to construct a locale. However, I
>> would like to point out several things here.
>> 1. That approach won't be a "builder"
>> The definition of the Builder design pattern is
>> "Builder is a creational design pattern that lets you construct complex
>> objects step by step."
>> "Problem
>> Imagine a complex object that requires laborious, step-by-step
>> initialization of many fields and nested objects. Such initialization code
>> is usually buried inside a monstrous constructor with lots of parameters.
>> Or even worse: scattered all over the client code."
>>
>> In other words, the key concept of "builder" itself is "step by step",
>> which allows the builder to be passed to different parts of the system so
>> that each part of the software can add information to the builder,
>> without a centralized place that gathers all the information together.
>> see https://refactoring.guru/design-patterns/builder
>>
>> 2. The step-by-step processing also means that the order in which
>> information is put in matters, and later information overrides earlier
>> information.
>> For example, I could have
>> ULocaleBuilder builder = ulb_open();
>> ulb_setLanguageTag(builder, "en-Latn-GB-u-ca-islamic");
>> ulb_setRegion(builder, "FR");
>> ulb_setExtension(builder, 'x', "abcd");
>> .....
>> ulb_setUnicodeLocaleKeyword(builder, "ca", "japanese");
>> ulb_setRegion(builder, "DE");
>> ulb_setUnicodeLocaleKeyword(builder, "ca", "roc");
>> ulb_build(builder...)
>> ulb_close(builder)
>>
>> to build a locale "en-Latn-DE-u-ca-roc-x-abcd"
>>
>> but what you describe would require the caller to resolve that before
>> putting it into the struct.
>>
>>> I guess the weak point in that approach would be the key-value pairs at
>>> the end of the locale ID; an alternative approach might be to use a
>>> UHashTable (or whatever it's called) as the bearer of fields (or to provide
>>> both the struct and the hash table, depending on whether the caller cares
>>> about the key-value fields). The beauty of an approach like this is that
>>> you save a few heap allocations and don’t have to add as many new API
>>> functions.
>>>
>>
>> The approach you mentioned seems to require the caller to do a lot of
>> work to fill that struct, and the caller also needs to learn how to program
>> UHashTable while using that API. Is that better?
>>
>>> On the other hand, what you have is a thin wrapper around the existing
>>> C++ LocaleBuilder class, so you’re not having to implement the
>>> locale-mangling code all over again.
>>>
>>> I don’t know; what does everybody else think?
>>>
>>> --Rich
>>>
>>> On Jun 29, 2023, at 5:09 PM, Frank Tang (譚永鋒) via icu-design <
>>> icu...@li...> wrote:
>>>
>>> Dear ICU team & users,
>>>
>>> I would like to propose the following for: ICU 74
>>>
>>> Please provide feedback by: Next Wednesday, July 5, or any time
>>> sufficiently in advance of the feature freeze
>>>
>>> Designated API review: Gary Wade
>>>
>>> Issue: https://unicode-org.atlassian.net/browse/ICU-22365
>>> A draft PR and implementation (not including tests yet) can be found at
>>> https://github.com/unicode-org/icu/pull/2520
>>>
>>> This is to follow up the C++ API in ICU 64 and provide a C API requested
>>> in
>>>
>>> Issues to be discussed
>>>
>>> 1. Should the prefix be ulb_ or something else?
>>> 2. What should ulb_build build? The value as returned by
>>> Locale::getName() or Locale::toLanguageTag()?
>>> 3. Is it OK to use the files ulocbuilder.{h,cpp}, or should it be
>>> something else?
>>>
>>> Add C API for ULocaleBuilder
>>>
>>> https://unicode-org.atlassian.net/browse/ICU-22365
>>>
>>> File : icu4c/source/common/ulocbuilder.h
>>>
>>> // © 2023 and later: Unicode, Inc. and others.
>>> // License & terms of use: http://www.unicode.org/copyright.html
>>> #ifndef __ULOCBUILDER_H__
>>> #define __ULOCBUILDER_H__
>>>
>>> #include "unicode/utypes.h"
>>>
>>> /**
>>>  * \file
>>>  * \brief C API: Builder API for Locale
>>>  */
>>>
>>> #ifndef U_HIDE_DRAFT_API
>>>
>>> /**
>>>  * Opaque type for a Locale builder.
>>>  * @draft ICU 74
>>>  */
>>> typedef void *ULocaleBuilder;
>>>
>>> /**
>>>  * <code>ULocaleBuilder</code> is used to build a string of a valid
>>>  * <code>locale</code> from values configured by the setters.
>>>  * The <code>ULocaleBuilder</code> checks if a value configured by a
>>>  * setter satisfies the syntax requirements defined by the
>>>  * <code>Locale</code> class. A locale string created by a
>>>  * <code>ULocaleBuilder</code> is well-formed and can be transformed to
>>>  * a well-formed IETF BCP 47 language tag without losing information.
>>>  *
>>>  * <p>The following example shows how to create a <code>locale</code>
>>>  * string with the <code>ULocaleBuilder</code>.
>>>  * <blockquote>
>>>  * <pre>
>>>  *     UErrorCode err = U_ZERO_ERROR;
>>>  *     char buffer[ULOC_FULLNAME_CAPACITY];
>>>  *     ULocaleBuilder builder = ulb_open();
>>>  *     ulb_setLanguage(builder, "sr");
>>>  *     ulb_setScript(builder, "Latn");
>>>  *     ulb_setRegion(builder, "RS");
>>>  *     int32_t length = ulb_build(
>>>  *         builder, buffer, ULOC_FULLNAME_CAPACITY, &err);
>>>  *     ulb_close(builder);
>>>  * </pre>
>>>  * </blockquote>
>>>  *
>>>  * <p>ULocaleBuilders can be reused; <code>ulb_clear()</code> resets all
>>>  * fields to their default values.
>>>  *
>>>  * <p>ULocaleBuilder tracks errors in an internal UErrorCode. All
>>>  * setters except ulb_setLanguageTag and ulb_setLocale return
>>>  * immediately if the internal UErrorCode is in an error state.
>>>  * To reset the internal state and error code, call ulb_clear().
>>>  * ulb_setLanguageTag and ulb_setLocale first clear the internal
>>>  * UErrorCode, then record any error from validating the input
>>>  * parameter in the internal UErrorCode.
>>>  *
>>>  * @draft ICU 74
>>>  */
>>>
>>> /**
>>>  * Constructs an empty ULocaleBuilder. The default value of all
>>>  * fields, extensions, and private use information is the
>>>  * empty string. The created builder should be destroyed by calling
>>>  * ulb_close().
>>>  *
>>>  * @draft ICU 74
>>>  */
>>> U_CAPI ULocaleBuilder U_EXPORT2
>>> ulb_open(void);
>>>
>>> /**
>>>  * Closes the builder and destroys its internal state.
>>>  * @param builder the builder
>>>  * @draft ICU 74
>>>  */
>>> U_CAPI void U_EXPORT2
>>> ulb_close(ULocaleBuilder builder);
>>>
>>> /**
>>>  * Resets the <code>ULocaleBuilder</code> to match the provided
>>>  * <code>locale</code>. Existing state is discarded.
>>>  *
>>>  * <p>All fields of the locale must be well-formed.
>>>  * <p>This method clears the internal UErrorCode.
>>>  *
>>>  * @param builder the builder
>>>  * @param locale the locale
>>>  *
>>>  * @draft ICU 74
>>>  */
>>> U_CAPI void U_EXPORT2
>>> ulb_setLocale(ULocaleBuilder builder, const char* locale);
>>>
>>> /**
>>>  * Resets the ULocaleBuilder to match the provided IETF BCP 47 language
>>>  * tag. Discards the existing state.
>>>  * The empty string causes the builder to be reset, like ulb_clear().
>>>  * Legacy language tags (marked as “Type: grandfathered” in BCP 47)
>>>  * are converted to their canonical form before being processed.
>>>  * Otherwise, the <code>language tag</code> must be well-formed,
>>>  * or else the ulb_build() method will later report a
>>>  * U_ILLEGAL_ARGUMENT_ERROR.
>>>  *
>>>  * <p>This method clears the internal UErrorCode.
>>>  *
>>>  * @param builder the builder
>>>  * @param tag the language tag, defined as IETF BCP 47 language tag.
>>>  * @draft ICU 74
>>>  */
>>> U_CAPI void U_EXPORT2
>>> ulb_setLanguageTag(ULocaleBuilder builder, const char* tag);
>>>
>>> /**
>>>  * Sets the language. If <code>language</code> is the empty string, the
>>>  * language in this <code>ULocaleBuilder</code> is removed. Otherwise,
>>>  * the <code>language</code> must be well-formed, or else the
>>>  * ulb_build() method will later report a U_ILLEGAL_ARGUMENT_ERROR.
>>>  *
>>>  * <p>The syntax of language value is defined as
>>>  * [unicode_language_subtag](http://www.unicode.org/reports/tr35/tr35.html#unicode_language_subtag).
>>>  *
>>>  * @param builder the builder
>>>  * @param language the language
>>>  * @draft ICU 74
>>>  */
>>> U_CAPI void U_EXPORT2
>>> ulb_setLanguage(ULocaleBuilder builder, const char* language);
>>>
>>> /**
>>>  * Sets the script. If <code>script</code> is the empty string, the
>>>  * script in this <code>ULocaleBuilder</code> is removed.
>>>  * Otherwise, the <code>script</code> must be well-formed, or else the
>>>  * ulb_build() method will later report a U_ILLEGAL_ARGUMENT_ERROR.
>>>  *
>>>  * <p>The script value is a four-letter script code as
>>>  * [unicode_script_subtag](http://www.unicode.org/reports/tr35/tr35.html#unicode_script_subtag)
>>>  * defined by ISO 15924
>>>  *
>>>  * @param builder the builder
>>>  * @param script the script
>>>  * @draft ICU 74
>>>  */
>>> U_CAPI void U_EXPORT2
>>> ulb_setScript(ULocaleBuilder builder, const char* script);
>>>
>>> /**
>>>  * Sets the region. If region is the empty string, the region in this
>>>  * <code>ULocaleBuilder</code> is removed. Otherwise, the
>>>  * <code>region</code> must be well-formed, or else the ulb_build()
>>>  * method will later report a U_ILLEGAL_ARGUMENT_ERROR.
>>>  *
>>>  * <p>The region value is defined by
>>>  * [unicode_region_subtag](http://www.unicode.org/reports/tr35/tr35.html#unicode_region_subtag)
>>>  * as a two-letter ISO 3166 code or a three-digit UN M.49 area code.
>>>  *
>>>  * <p>The region value in the <code>Locale</code> created by the
>>>  * <code>ULocaleBuilder</code> is always normalized to upper case.
>>>  *
>>>  * @param builder the builder
>>>  * @param region the region
>>>  * @draft ICU 74
>>>  */
>>> U_CAPI void U_EXPORT2
>>> ulb_setRegion(ULocaleBuilder builder, const char* region);
>>>
>>> /**
>>>  * Sets the variant. If variant is the empty string, the variant in this
>>>  * <code>ULocaleBuilder</code> is removed. Otherwise, the
>>>  * <code>variant</code> must be well-formed, or else the ulb_build()
>>>  * method will later report a U_ILLEGAL_ARGUMENT_ERROR.
>>>  *
>>>  * <p><b>Note:</b> This method checks if <code>variant</code>
>>>  * satisfies the
>>>  * [unicode_variant_subtag](http://www.unicode.org/reports/tr35/tr35.html#unicode_variant_subtag)
>>>  * syntax requirements, and normalizes the value to lowercase letters.
>>>  * However, the <code>Locale</code> class does not impose any syntactic
>>>  * restriction on variant. To set an ill-formed variant, use a Locale
>>>  * constructor.
>>>  * If there are multiple unicode_variant_subtag, the caller must
>>>  * concatenate them with '-' as separator (ex: "foobar-fibar").
>>>  *
>>>  * @param builder the builder
>>>  * @param variant the variant
>>>  * @draft ICU 74
>>>  */
>>> U_CAPI void U_EXPORT2
>>> ulb_setVariant(ULocaleBuilder builder, const char* variant);
>>>
>>> /**
>>>  * Sets the extension for the given key. If the value is the empty
>>>  * string, the extension is removed. Otherwise, the <code>key</code> and
>>>  * <code>value</code> must be well-formed, or else the ulb_build()
>>>  * method will later report a U_ILLEGAL_ARGUMENT_ERROR.
>>>  *
>>>  * <p><b>Note:</b> The key ('u') is used for the Unicode locale
>>>  * extension. Setting a value for this key replaces any existing Unicode
>>>  * locale key/type pairs with those defined in the extension.
>>>  *
>>>  * <p><b>Note:</b> The key ('x') is used for the private use code. To be
>>>  * well-formed, the value for this key needs only to have subtags of one
>>>  * to eight alphanumeric characters, not two to eight as in the general
>>>  * case.
>>>  *
>>>  * @param builder the builder
>>>  * @param key the extension key
>>>  * @param value the extension value
>>>  * @draft ICU 74
>>>  */
>>> U_CAPI void U_EXPORT2
>>> ulb_setExtension(ULocaleBuilder builder, char key, const char* value);
>>>
>>> /**
>>>  * Sets the Unicode locale keyword type for the given key. If the type
>>>  * is NULL, the keyword is removed.
>>>  * If the type is the empty string, the keyword is set without type
>>>  * subtags.
>>>  * Otherwise, the key and type must be well-formed, or else the
>>>  * ulb_build() method will later report a U_ILLEGAL_ARGUMENT_ERROR.
>>>  *
>>>  * <p>Keys and types are converted to lower case.
>>>  *
>>>  * <p><b>Note</b>: Setting the 'u' extension via ulb_setExtension()
>>>  * replaces all Unicode locale keywords with those defined in the
>>>  * extension.
>>>  *
>>>  * @param builder the builder
>>>  * @param key the Unicode locale key
>>>  * @param type the Unicode locale type
>>>  * @draft ICU 74
>>>  */
>>> U_CAPI void U_EXPORT2
>>> ulb_setUnicodeLocaleKeyword(ULocaleBuilder builder,
>>>                             const char* key, const char* type);
>>>
>>> /**
>>>  * Adds a unicode locale attribute, if not already present, otherwise
>>>  * has no effect. The attribute must not be the empty string and must be
>>>  * well-formed, or U_ILLEGAL_ARGUMENT_ERROR will be set to status
>>>  * during the ulb_build() call.
>>>  *
>>>  * @param builder the builder
>>>  * @param attribute the attribute
>>>  * @draft ICU 74
>>>  */
>>> U_CAPI void U_EXPORT2
>>> ulb_addUnicodeLocaleAttribute(ULocaleBuilder builder,
>>>                               const char* attribute);
>>>
>>> /**
>>>  * Removes a unicode locale attribute, if present, otherwise has no
>>>  * effect. The attribute must not be the empty string and must be
>>>  * well-formed, or U_ILLEGAL_ARGUMENT_ERROR will be set to status during
>>>  * the ulb_build() call.
>>>  *
>>>  * <p>Attribute comparison for removal is case-insensitive.
>>>  *
>>>  * @param builder the builder
>>>  * @param attribute the attribute
>>>  * @draft ICU 74
>>>  */
>>> U_CAPI void U_EXPORT2
>>> ulb_removeUnicodeLocaleAttribute(ULocaleBuilder builder,
>>>                                  const char* attribute);
>>>
>>> /**
>>>  * Resets the builder to its initial, empty state.
>>>  * <p>This method clears the internal UErrorCode.
>>>  *
>>>  * @param builder the builder
>>>  * @draft ICU 74
>>>  */
>>> U_CAPI void U_EXPORT2
>>> ulb_clear(ULocaleBuilder builder);
>>>
>>> /**
>>>  * Resets the extensions to their initial, empty state.
>>>  * Language, script, region and variant are unchanged.
>>>  *
>>>  * @param builder the builder
>>>  * @draft ICU 74
>>>  */
>>> U_CAPI void U_EXPORT2
>>> ulb_clearExtensions(ULocaleBuilder builder);
>>>
>>> /**
>>>  * Builds the locale string from the fields set on this builder.
>>>  * If a memory allocation fails in any setter or during the ulb_build()
>>>  * call, err is set to U_MEMORY_ALLOCATION_ERROR.
>>>  * If any of the fields set by the setters are not well-formed, err
>>>  * is set to U_ILLEGAL_ARGUMENT_ERROR. The state of the builder does
>>>  * not change after the ulb_build() call, and the caller is free to keep
>>>  * using the same builder to build more locales.
>>>  *
>>>  * @param builder the builder
>>>  * @param buffer the output buffer that receives the locale string
>>>  * @param bufferCapacity the capacity of buffer
>>>  * @param err the error code
>>>  * @return the length of the locale id in buffer
>>>  * @draft ICU 74
>>>  */
>>> U_CAPI int32_t U_EXPORT2
>>> ulb_build(ULocaleBuilder builder, char* buffer, int32_t bufferCapacity,
>>>           UErrorCode* err);
>>>
>>> /**
>>>  * Sets the UErrorCode if an error occurred while recording sets.
>>>  * Preserves older error codes in the outErrorCode.
>>>  *
>>>  * @param builder the builder
>>>  * @param outErrorCode Set to an error code that occurred while setting
>>>  *                     subtags. Unchanged if there is no such error or
>>>  *                     if outErrorCode already contained an error.
>>>  * @return true if U_FAILURE(*outErrorCode)
>>>  * @draft ICU 74
>>>  */
>>> U_CAPI UBool U_EXPORT2
>>> ulb_copyErrorTo(ULocaleBuilder builder, UErrorCode *outErrorCode);
>>>
>>> #endif /* U_HIDE_DRAFT_API */
>>>
>>> #endif // __ULOCBUILDER_H__
>>> --
>>> Frank Yung-Fong Tang
>>> 譚永鋒 / 🌭🍊
>>> Sr. Software Engineer
>>> _______________________________________________
>>> icu-design mailing list
>>> icu...@li...
>>> To Un/Subscribe: https://lists.sourceforge.net/lists/listinfo/icu-design
>>>

--
Frank Yung-Fong Tang
譚永鋒 / 🌭🍊
Sr. Software Engineer
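
To make the "ordered list of fields, not a bag" argument from the quoted thread concrete, here is a minimal, self-contained C sketch. It is only a toy model and not the proposed ulb_* implementation: the ToyBuilder struct and every toy_* function are hypothetical names invented for this illustration, only the "ca" keyword is modeled, and toy_set_tag() takes pre-split subtags instead of parsing a BCP 47 tag. The one behavior it assumes is what the thread describes: each setter overwrites a single field, and setting a language tag discards all existing state first.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Toy stand-in for a locale builder. Hypothetical; NOT the ICU code. */
typedef struct {
    char language[9];
    char script[5];
    char region[4];
    char variant[32];
    char calendar[16]; /* value of the "ca" Unicode locale keyword only */
} ToyBuilder;

/* Copies src into a fixed-size field, truncating if it does not fit. */
static void toy_copy(char *dst, size_t cap, const char *src) {
    snprintf(dst, cap, "%s", src);
}

void toy_clear(ToyBuilder *b) { memset(b, 0, sizeof *b); }

void toy_set_language(ToyBuilder *b, const char *v) {
    toy_copy(b->language, sizeof b->language, v);
}
void toy_set_script(ToyBuilder *b, const char *v) {
    toy_copy(b->script, sizeof b->script, v);
}
void toy_set_region(ToyBuilder *b, const char *v) {
    toy_copy(b->region, sizeof b->region, v);
}
void toy_set_variant(ToyBuilder *b, const char *v) {
    toy_copy(b->variant, sizeof b->variant, v);
}
void toy_set_calendar(ToyBuilder *b, const char *v) {
    toy_copy(b->calendar, sizeof b->calendar, v);
}

/* Models "set language tag": discard ALL state, then set every field. */
void toy_set_tag(ToyBuilder *b, const char *language, const char *script,
                 const char *region, const char *calendar) {
    toy_clear(b);
    toy_set_language(b, language);
    toy_set_script(b, script);
    toy_set_region(b, region);
    toy_set_calendar(b, calendar);
}

/* Appends prefix+value to out, but only when value is non-empty. */
static void toy_append(char *out, size_t cap,
                       const char *prefix, const char *value) {
    size_t used = strlen(out);
    if (value[0] != '\0' && used < cap) {
        snprintf(out + used, cap - used, "%s%s", prefix, value);
    }
}

/* Serializes the current fields into out (capacity cap, must be > 0). */
void toy_build(const ToyBuilder *b, char *out, size_t cap) {
    assert(cap > 0);
    snprintf(out, cap, "%s", b->language);
    toy_append(out, cap, "-", b->script);
    toy_append(out, cap, "-", b->region);
    toy_append(out, cap, "-", b->variant);
    toy_append(out, cap, "-u-ca-", b->calendar);
}

/* The nine inputs from the thread, with the language tag applied first. */
void toy_sequence_a(char *out, size_t cap) {
    ToyBuilder b;
    toy_clear(&b);
    toy_set_tag(&b, "en", "Latn", "GB", "roc");
    toy_set_language(&b, "ja");
    toy_set_script(&b, "Thai");
    toy_set_region(&b, "CN");
    toy_set_variant(&b, "hepburn-heploc");
    toy_set_calendar(&b, "japanese");
    toy_set_language(&b, "sl");
    toy_set_variant(&b, "bohoric");
    toy_set_calendar(&b, "coptic");
    toy_build(&b, out, cap);
}

/* The same nine inputs, with the language tag applied seventh. */
void toy_sequence_b(char *out, size_t cap) {
    ToyBuilder b;
    toy_clear(&b);
    toy_set_language(&b, "ja");
    toy_set_script(&b, "Thai");
    toy_set_region(&b, "CN");
    toy_set_variant(&b, "hepburn-heploc");
    toy_set_calendar(&b, "japanese");
    toy_set_language(&b, "sl");
    toy_set_tag(&b, "en", "Latn", "GB", "roc");
    toy_set_variant(&b, "bohoric");
    toy_set_calendar(&b, "coptic");
    toy_build(&b, out, cap);
}
```

Called with a 64-byte buffer, toy_sequence_a() yields "sl-Thai-CN-bohoric-u-ca-coptic" and toy_sequence_b() yields "en-Latn-GB-bohoric-u-ca-coptic", the two results given in the thread. The inputs are identical; only the call order differs, which is why a struct filled in arbitrary order cannot stand in for the builder unless the caller resolves the overrides first.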