From: Andy H. <and...@gm...> - 2023-07-01 20:04:33
A few quick comments.

This C API tracks very closely to the C++ LocaleBuilder class, which makes good sense.

The definition of the ULocaleBuilder type,

> /**
>  * Opaque type for a Locale builder.
>  * @draft ICU 74
>  */
> typedef void *ULocaleBuilder;

doesn't follow the pattern used in similar ICU plain C APIs. For example, from uldnames.h:45,

/**
 * Opaque C service object type for the locale display names API
 * @stable ICU 4.4
 */
struct ULocaleDisplayNames;

/**
 * C typedef for struct ULocaleDisplayNames.
 * @stable ICU 4.4
 */
typedef struct ULocaleDisplayNames ULocaleDisplayNames;

With this changed, all of the uses of ULocaleBuilder would become explicitly pointers

> U_CAPI void U_EXPORT2
> ulb_close(ULocaleBuilder builder);

would be

U_CAPI void U_EXPORT2
ulb_close(ULocaleBuilder *builder);

====

We should define a LocalULocaleBuilderPointer type. See other existing C APIs for how to define it, e.g. uldnames.h:84 - 101.

U_DEFINE_LOCAL_OPEN_POINTER(LocalULocaleBuilderPointer, ULocaleBuilder, ulb_close);

====

ulb_open() will presumably return NULL on failure, for example, if creating the underlying LocaleBuilder fails.
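
A minimal sketch of the opaque-struct pattern described above, assuming a made-up `ToyBuilder` type rather than anything in ICU. The names `ToyBuilder`, `toy_open`, `toy_close`, `toy_setRegion`, `toy_getRegion` and `toy_demo` are hypothetical and exist only for illustration. The sketch shows two things: an incomplete struct type gives callers a distinct pointer type that the compiler can check (a `void *` typedef accepts any pointer), and the open function can signal failure by returning NULL.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* "Public header" part: callers only ever see an incomplete type. */
struct ToyBuilder;
typedef struct ToyBuilder ToyBuilder;

ToyBuilder *toy_open(void);
void toy_close(ToyBuilder *builder);
void toy_setRegion(ToyBuilder *builder, const char *region);
const char *toy_getRegion(const ToyBuilder *builder);

/* "Implementation" part: the layout stays private to this file. */
struct ToyBuilder {
    char region[4];
};

ToyBuilder *toy_open(void) {
    /* calloc returns NULL on failure, and so does toy_open(). */
    return (ToyBuilder *)calloc(1, sizeof(ToyBuilder));
}

void toy_close(ToyBuilder *builder) {
    free(builder); /* free(NULL) is a no-op */
}

void toy_setRegion(ToyBuilder *builder, const char *region) {
    if (builder == NULL || region == NULL) return;
    /* Copy at most 3 characters and always terminate. */
    strncpy(builder->region, region, sizeof(builder->region) - 1);
    builder->region[sizeof(builder->region) - 1] = '\0';
}

const char *toy_getRegion(const ToyBuilder *builder) {
    return builder ? builder->region : "";
}

/* Returns 1 if a round trip through the toy API works. */
int toy_demo(void) {
    ToyBuilder *b = toy_open();
    int ok;
    if (b == NULL) return 0; /* allocation failed */
    toy_setRegion(b, "DE");
    ok = strcmp(toy_getRegion(b), "DE") == 0;
    assert(ok);
    toy_close(b);
    return ok;
}
```

With this shape, passing an unrelated pointer such as a `char *` to `toy_close()` draws a compiler diagnostic, which would not happen if `ToyBuilder` were a typedef for `void *`.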

====

ulb_copyErrorTo(ULocaleBuilder builder, UErrorCode *outErrorCode);

The builder should be const

ulb_copyErrorTo(const ULocaleBuilder *builder, UErrorCode *outErrorCode);

  -- Andy

On Fri, Jun 30, 2023 at 9:24 PM Rich Gillam via icu-design <icu...@li...> wrote:

> Frank--
>
> I think those getters already exist in
> https://unicode-org.github.io/icu-docs/apidoc/dev/icu4c/uloc_8h.html#a490a02bfc1dd2c007911680a9cd87b73
>
> if (uloc_getKeywordValue(localeID, uloc_toLegacyKey("rg"),
>         value, ULOC_FULLNAME_CAPACITY, &status) > 0 && U_SUCCESS(status) ) {
>
>
> Right. My point is that those APIs are a pain in the butt to use because
> you have to allocate a static buffer to receive the characters, even if
> you’re just checking that the field is empty. A call to one of those
> functions tends to take about three lines of code. If we’re already going
> to have an object there that has internally allocated the storage for the
> characters and can keep us from writing a bunch of that boilerplate (like
> the Locale object in C++), it’d be nice to leverage that to make the code
> snippet you show below even simpler.
>
> --Rich
>
> On Jun 30, 2023, at 6:47 PM, Frank Tang (譚永鋒) <ft...@go...> wrote:
>
> Dear Rich
>
> Thanks for your reply.
> I also want to provide some additional historical context that I forgot
> to mention.
>
> The ICU C++ LocaleBuilder API tries to match in C++, by feature and
> signature, what was in Java for years. See
> https://docs.oracle.com/javase/8/docs/api/java/util/Locale.Builder.html
> What I propose here also tries to match that, so we do not have yet another
> model, only for C, that builds locales differently.
>
>
> As for your comment, "I think you might want an API that has getters,
> or an equivalent of the “Locale” class in C++":
>
> I think those getters already exist in
> https://unicode-org.github.io/icu-docs/apidoc/dev/icu4c/uloc_8h.html#a490a02bfc1dd2c007911680a9cd87b73
>
>
> “If this locale has the rg subtag, I want to get rid of it and move its
> value into the region field”
> The current uloc.h API could be used with the locale builder to achieve that:
>
> char localeID[ULOC_FULLNAME_CAPACITY];
> // .... copy locale to localeID
> UErrorCode status = U_ZERO_ERROR;
> ULocaleBuilder builder = ulb_open();
> ulb_setLocale(builder, localeID);
> // If this locale has the rg subtag
> char value[ULOC_FULLNAME_CAPACITY];
> if (uloc_getKeywordValue(localeID, uloc_toLegacyKey("rg"),
>         value, ULOC_FULLNAME_CAPACITY, &status) > 0 && U_SUCCESS(status) ) {
>     // get rid of it and
>     ulb_setUnicodeLocaleKeyword(builder, "rg", nullptr); // to remove "rg"
>     // move its value into the region field
>     ulb_setRegion(builder, value);
> }
> // “If this locale ID doesn’t have a script code,
> if (uloc_getScript(localeID, value, ULOC_FULLNAME_CAPACITY, &status) == 0 &&
>     uloc_getScript(otherLocaleID, value, ULOC_FULLNAME_CAPACITY,
>         &status) > 0 && U_SUCCESS(status) ) {
>     // copy in the script code from this other locale ID."
>     ulb_setScript(builder, value);
> }
> int32_t length = ulb_build(builder, localeID, ULOC_FULLNAME_CAPACITY, &status);
> ulb_close(builder);
>
> The uloc_* calls (red in the original message) have been in ICU for a long time.
> The ulb_* calls (blue in the original message) are what I am proposing here.
>
>
>
> On Fri, 30 Jun 2023 at 17:08, Rich Gillam <ric...@ap...> wrote:
>
>> Frank--
>>
>> I think the basic issue here is that our APIs for manipulating locale IDs
>> in C can be really painful to use if you’re doing anything interesting, and
>> that we want to get rid of as much of that pain as we can. I think that’s
>> more important than whether we match the C++ interface or conform to some
>> particular definition of “builder.” That said, I’m starting to come around
>> to the idea that what you’re proposing really is about as good as we’re
>> going to get.
>>
>> I’m going to withdraw my earlier suggestion. I don’t think you’re quite
>> getting what I’m saying, but my suggestion (using a struct) really only
>> works well if you have language/script/region/variant -- as soon as you try to
>> do anything with key-value pairs, you’d need to use a hash table (or
>> something) instead of a struct, and not only does that mean you still have
>> the heap allocation I was trying to get rid of, but it actually becomes
>> harder to use than what you’re proposing.
>>
>> I still think there might be value in having a struct-based API that just
>> handled language/script/region/variant, but maybe that’s just me, and we
>> certainly couldn’t go with JUST that, so it’s probably not worth the
>> trouble.
>>
>> I still tend to like Steven’s suggestion to have one setter with a
>> selector code of some kind. Your point that it would require a switch
>> statement is valid, but I don’t think it’s actually that much of a
>> performance hit in practice. But if you’re trying to be as parallel as
>> possible to the C++ interface, this works against that, I think. I’m not
>> going to argue this one either way.
>>
>> I do want to raise one other issue, though: I think you might want an API
>> that has getters, or an equivalent of the “Locale” class in C++. A lot of
>> the use cases for a locale builder are “I want this locale ID, but I want
>> to change this one field value,” or “I want to change this one field value,
>> but only if this other field or attribute has such-and-such a value”. For
>> example: “If this locale has the rg subtag, I want to get rid of it and
>> move its value into the region field” or “If this locale ID doesn’t have a
>> script code, copy in the script code from this other locale ID.” If we
>> still use things like uloc_getName() to do the “if this field has this value”
>> part of the logic, we still have a problem.
>>
>> That could be a separate ticket and API proposal, of course, but I think
>> we want to think about it too.
>>
>> --Rich
>>
>> On Jun 30, 2023, at 3:53 PM, Frank Tang (譚永鋒) via icu-design
>> <icu...@li...> wrote:
>>
>>
>>
>> ---------- Forwarded message ---------
>> From: Frank Tang (譚永鋒) <ft...@go...>
>> Date: Fri, 30 Jun 2023 at 15:53
>> Subject: Re: [icu-design] C API for LocaleBuilder
>> To: Rich Gillam <ric...@ap...>
>>
>>
>>
>>
>> On Fri, 30 Jun 2023 at 11:00, Rich Gillam <ric...@ap...> wrote:
>>
>>> Frank--
>>>
>>> That would be a different approach to construct a locale. However, I
>>> would like to point out several things here.
>>> 1. That approach won't be a "builder"
>>>
>>>
>>> But it would solve the problem we’re being asked to solve.
>>>
>>
>> No, it does not. What the bug asked for is a C builder API, and an API
>> that does not follow the "builder" design pattern does not solve the
>> problem it asked for. A builder needs to work step by step, so that the
>> decision about each field can be made at different times and in different
>> places in the code. A single function with a struct does not allow that,
>> and therefore does not fulfill the request and requirement.
>>
>> I am open to a concrete counter proposal. It is easy to think the issue
>> could be solved by a simple set() w/ a constant if you only consider 1)
>> language, 2) script, 3) region, 4) variant, 5) languageTag, and 6) localeId
>> as the only set of inputs. But once you start to consider 7) attributes, 8)
>> unicode locale keyword/value, and 9) extensions other than "-u-" (for
>> example "-t-" or anything in other_extension), things become complicated.
>> The validation and override/replacement logic for each of them will be
>> different.
>>
>>
>>
>>>
>>> but what you describe would require the caller to resolve that before
>>> putting it into the struct.
>>>
>>>
>>> No, it wouldn’t. For building locales, there’s nothing special going on
>>> while you’re setting up the builder -- it’s just a bag of fields.
>>>
>>
>> No, that is not true. It is NOT a bag of fields, it is an ordered list of
>> fields, not a bag => the order matters.
>> In bag semantics, the order does not matter.
>>
>> Consider the following
>>
>> languageTag "en-Latn-GB-u-ca-roc"
>> language "ja"
>> script "Thai"
>> region "CN"
>> variant "hepburn-heploc"
>> unicode extension "ca"= "japanese"
>> language "sl"
>> variant "bohoric"
>> unicode extension "ca"= "coptic"
>>
>> which will produce "sl-Thai-CN-bohoric-u-ca-coptic"
>>
>> but if you treat them as a bag and build them in the order
>> language "ja"
>> script "Thai"
>> region "CN"
>> variant "hepburn-heploc"
>> unicode extension "ca"= "japanese"
>> language "sl"
>> languageTag "en-Latn-GB-u-ca-roc"
>> variant "bohoric"
>> unicode extension "ca"= "coptic"
>>
>> you will get
>> "en-Latn-GB-bohoric-u-ca-coptic"
>>
>> which is a different value.
>>
>>
>>
>>
>>> All the magic happens when you run the build() method.
>>>
>>
>> That should be implementation dependent and should not be mandated by the
>> API. There is nothing wrong with building progressively as each setter
>> method gets called.
>>
>>
>>> I’m just saying that maybe instead of adding a new opaque class with a
>>> bunch of individual getters and setters for all those fields, we just use a
>>> simple struct (or a hash table, or something).
>>>
>>
>> As I mentioned, what you said only makes sense when the underlying problem
>> is *a bag of fields*, but this problem is not just *a bag
>> of fields*. The pieces of information passed into the builder overlap with
>> each other. In particular, the information inside localeID and languageTag
>> is not mutually exclusive from
>> language/script/region/variant/attribute/unicode locale keyword+value and
>> other extensions. They overlap, and part of one can be overridden by
>> another. If all the inputs were mutually exclusive from each other, then the
>> problem would be much simpler and could possibly be solved by "just use a simple
>> struct". But that is NOT the case (as the example I show above demonstrates).
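
The ordering argument above can be illustrated with a toy model. This is only a sketch under loud assumptions: it is not the ICU implementation, it tracks only language, script and region, and it assumes a "tag" is always exactly `language-Script-REGION`. The names `ToyLocale`, `toy_setTag`, `toy_tagFirst` and `toy_tagLast` are hypothetical. It mirrors the base-subtag part of the example: setting a whole tag discards what came before, so the same set of calls in a different order yields a different result.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Toy model of a builder's state: three base subtags only. */
typedef struct {
    char language[9];
    char script[5];
    char region[4];
} ToyLocale;

/* Copies len characters of src into dst, truncating to fit cap. */
static void copy_field(char *dst, size_t cap, const char *src, size_t len) {
    if (len >= cap) len = cap - 1;
    memcpy(dst, src, len);
    dst[len] = '\0';
}

void toy_setLanguage(ToyLocale *l, const char *v) {
    copy_field(l->language, sizeof l->language, v, strlen(v));
}
void toy_setScript(ToyLocale *l, const char *v) {
    copy_field(l->script, sizeof l->script, v, strlen(v));
}
void toy_setRegion(ToyLocale *l, const char *v) {
    copy_field(l->region, sizeof l->region, v, strlen(v));
}

/* Setting a whole tag discards every field that was set earlier. */
void toy_setTag(ToyLocale *l, const char *tag) {
    const char *p1 = strchr(tag, '-');
    const char *p2 = p1 ? strchr(p1 + 1, '-') : NULL;
    memset(l, 0, sizeof *l);
    if (p1 == NULL || p2 == NULL) return; /* toy: other shapes ignored */
    copy_field(l->language, sizeof l->language, tag, (size_t)(p1 - tag));
    copy_field(l->script, sizeof l->script, p1 + 1, (size_t)(p2 - p1 - 1));
    copy_field(l->region, sizeof l->region, p2 + 1, strlen(p2 + 1));
}

void toy_build(const ToyLocale *l, char *out, size_t cap) {
    snprintf(out, cap, "%s-%s-%s", l->language, l->script, l->region);
}

/* Tag first, then individual overrides. */
void toy_tagFirst(char *out, size_t cap) {
    ToyLocale l;
    memset(&l, 0, sizeof l);
    toy_setTag(&l, "en-Latn-GB");
    toy_setLanguage(&l, "ja");
    toy_setScript(&l, "Thai");
    toy_setRegion(&l, "CN");
    toy_setLanguage(&l, "sl");
    toy_build(&l, out, cap);
}

/* Same calls, but the tag arrives last and wipes the earlier fields. */
void toy_tagLast(char *out, size_t cap) {
    ToyLocale l;
    memset(&l, 0, sizeof l);
    toy_setLanguage(&l, "ja");
    toy_setScript(&l, "Thai");
    toy_setRegion(&l, "CN");
    toy_setLanguage(&l, "sl");
    toy_setTag(&l, "en-Latn-GB");
    toy_build(&l, out, cap);
}
```

In this toy, `toy_tagFirst` produces "sl-Thai-CN" and `toy_tagLast` produces "en-Latn-GB", which is the same divergence the two orderings in the message show for the base subtags.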
>>
>>
>>
>>> The approach you mentioned seems to require the caller to do a lot of
>>> work to fill that struct. and the caller also needs to learn how to program
>>> UHashTable while using that API, is that better?
>>>
>>>
>>> Good point. UHashTable is a pain in the butt. Maybe that’s enough
>>> right there to swing me back around to what you’ve already proposed…
>>>
>>> That said, I do kind of like Steven’s suggestion to just have one set()
>>> method and just have constants for the individual fields...
>>>
>>
>> If we look at the pre-existing locale C API in uloc.h, we have different
>> functions instead of a get w/ a constant:
>> uloc_getLanguage(....
>> uloc_getScript(....
>> uloc_getCountry(....
>> uloc_getVariant(....
>> uloc_getName(....
>> uloc_getBaseName(...
>>
>> Notice there is no API to get attributes.
>>
>> We could also have used a uloc_get() with a constant, right? But we didn't,
>> years ago. So... should we follow the same design pattern, or go to a set w/
>> constant?
>>
>>
>>
>> We have the following different kinds of operations on the builder:
>>
>> 1. language (w/o script and region), region, script, variant, language
>> tag (w/ script or region) => set
>> 2. attribute => add and remove (there could be multiple attributes in a
>> locale)
>> 3. unicode extension - set on a key ("ca" , "japanese")
>> 4. other extension - set on an extension ('a', "abc-efg")
>> 5. clear extension (say you start from "en-u-ca-japanese-nu-Thai" and you
>> want to clear the calendar and change nu to "Arab" to get "en-u-nu-Arab")
>>
>> A simple set w/ constant method could simplify #1 above, but is not powerful
>> enough to deal with #2 #3 #4 and #5.
>> Also, I personally dislike the set + enum => internal switch(enum)
>> approach, because almost all the time the caller knows at compile time what
>> operation needs to be performed, but this approach mandates that the function
>> internally switch on a value which is fixed in the program but cannot
>> be easily optimized by the compiler.
>>
>> Consider the following case
>>
>>
>> setA(context, value) {
>>   context.a = value;
>> }
>>
>> setB(context, value) {
>>   context.b = value;
>> }
>> and the caller
>>
>> setA(context, 3);
>> setB(context, 5);
>>
>> vs
>> set(context, field, value) {
>>   switch(field) {
>>     case A: context.a = value; break;
>>     case B: context.b = value; break;
>>   }
>> }
>> and the caller
>> set(context, A, 3);
>> set(context, B, 5);
>>
>> The switch / case inside the set() simply cannot be optimized by the
>> compiler.
>>
>> Surely the first approach exposes more functions, but such exposure allows
>> the compiler to avoid an unnecessary run-time hit from the switch/case, which
>> could be resolved at compile time, not at run time.
>>
>>
>>
>>> --Rich
>>>
>>> On Jun 30, 2023, at 12:44 AM, Frank Tang (譚永鋒) <ft...@go...> wrote:
>>>
>>> Dear Richard
>>>
>>> On Thu, 29 Jun 2023 at 18:31, Richard Gillam <ric...@ap...> wrote:
>>>
>>>> Frank--
>>>>
>>>> This looks fine, but also kind of heavy-weight. I wonder if it’d make
>>>> sense to just have a struct with all the fields and utility methods to
>>>> convert between a locale ID string and that struct?
>>>>
>>>
>>> That would be a different approach to construct a locale. However, I
>>> would like to point out several things here.
>>> 1. That approach won't be a "builder"
>>> The definition of a Builder design pattern is
>>> "Builder is a creational design pattern that lets you construct complex
>>> objects step by step. "
>>> "Problem
>>> Imagine a complex object that requires laborious, step-by-step
>>> initialization of many fields and nested objects. Such initialization code
>>> is usually buried inside a monstrous constructor with lots of parameters.
>>> Or even worse: scattered all over the client code."
>>>
>>> In other words, the key concept of "builder" itself is "step by step",
>>> and that allows the builder to be passed to different parts of the system
>>> so that different parts of the software can add information to the
>>> builder, without a centralized place to gather all the information
>>> together.
>>> see https://refactoring.guru/design-patterns/builder
>>>
>>> 2. The step-by-step processing also means that the timing of the information
>>> put in matters, and later information overrides earlier
>>> information.
>>> For example, I could have
>>> ULocaleBuilder builder = ulb_open();
>>> ulb_setLanguageTag(builder, "en-Latn-GB-u-ca-islamic");
>>> ulb_setRegion(builder, "FR");
>>> ulb_setExtension(builder, 'x', "abcd");
>>> .....
>>> ulb_setUnicodeLocaleKeyword(builder, "ca", "japanese");
>>> ulb_setRegion(builder, "DE");
>>> ulb_setUnicodeLocaleKeyword(builder, "ca", "roc");
>>> ulb_build(builder...)
>>> ulb_close(builder)
>>>
>>> to build a locale "en-Latn-DE-u-ca-roc-x-abcd"
>>>
>>> but what you describe would require the caller to resolve that before
>>> putting it into the struct.
>>>
>>>> I guess the weak point in that approach would be the key-value pairs at
>>>> the end of the locale ID -- an alternative approach might be to use a
>>>> UHashTable (or whatever it's called) as the bearer of fields (or to provide
>>>> both the struct and the hash table, depending on whether the caller cares
>>>> about the key-value fields). The beauty of an approach like this is that
>>>> you save a few heap allocations and don’t have to add as many new API
>>>> functions.
>>>>
>>>
>>> The approach you mentioned seems to require the caller to do a lot of
>>> work to fill that struct. and the caller also needs to learn how to program
>>> UHashTable while using that API, is that better?
>>>
>>>
>>>> On the other hand, what you have is a thin wrapper around the existing
>>>> C++ LocaleBuilder class, so you’re not having to implement the
>>>> locale-mangling code all over again.
>>>>
>>>> I don’t know -- what does everybody else think?
>>>>
>>>> --Rich
>>>>
>>>> On Jun 29, 2023, at 5:09 PM, Frank Tang (譚永鋒) via icu-design
>>>> <icu...@li...> wrote:
>>>>
>>>>
>>>>
>>>> Dear ICU team & users,
>>>>
>>>>
>>>> I would like to propose the following for: ICU 74
>>>>
>>>> Please provide feedback by: Next Wednesday, July 5, or any time
>>>> sufficiently in advance of the feature freeze
>>>>
>>>> Designated API review: Gary Wade
>>>>
>>>> Issue: https://unicode-org.atlassian.net/browse/ICU-22365
>>>> A draft PR and implementation (not including tests yet) can be found at
>>>> https://github.com/unicode-org/icu/pull/2520
>>>>
>>>> This is to follow up the C++ API in ICU 64 and provide a C API
>>>> requested in
>>>>
>>>> Issues to be discussed
>>>>
>>>> 1. Should the prefix be ulb_ or something else?
>>>> 2. What should ulb_build build? The value as returned by
>>>> Locale::getName() or Locale::toLanguageTag()?
>>>> 3. Is it OK to use the file ulocbuilder.{h,cpp}? Or should it be something
>>>> else?
>>>>
>>>> Add C API for ULocaleBuilder
>>>>
>>>> https://unicode-org.atlassian.net/browse/ICU-22365
>>>>
>>>> File : icu4c/source/common/ulocbuilder.h
>>>>
>>>> // © 2023 and later: Unicode, Inc. and others.
>>>> // License & terms of use: http://www.unicode.org/copyright.html
>>>> #ifndef __ULOCBUILDER_H__
>>>> #define __ULOCBUILDER_H__
>>>>
>>>> #include "unicode/utypes.h"
>>>>
>>>>
>>>> /**
>>>>  * \file
>>>>  * \brief C API: Builder API for Locale
>>>>  */
>>>>
>>>> #ifndef U_HIDE_DRAFT_API
>>>>
>>>> /**
>>>>  * Opaque type for a Locale builder.
>>>>  * @draft ICU 74
>>>>  */
>>>> typedef void *ULocaleBuilder;
>>>>
>>>> /**
>>>>  * <code>ULocaleBuilder</code> is used to build a string of a valid <code>locale</code>
>>>>  * from values configured by the setters.
>>>>  * The <code>ULocaleBuilder</code> checks if a value configured by a
>>>>  * setter satisfies the syntax requirements defined by the <code>Locale</code>
>>>>  * class. A locale string created by a <code>ULocaleBuilder</code> is
>>>>  * well-formed and can be transformed to a well-formed IETF BCP 47 language tag
>>>>  * without losing information.
>>>>  *
>>>>  * <p>The following example shows how to create a <code>locale</code> string
>>>>  * with the <code>ULocaleBuilder</code>.
>>>>  * <blockquote>
>>>>  * <pre>
>>>>  *     UErrorCode err = U_ZERO_ERROR;
>>>>  *     char buffer[ULOC_FULLNAME_CAPACITY];
>>>>  *     ULocaleBuilder builder = ulb_open();
>>>>  *     ulb_setLanguage(builder, "sr");
>>>>  *     ulb_setScript(builder, "Latn");
>>>>  *     ulb_setRegion(builder, "RS");
>>>>  *     int32_t length = ulb_build(
>>>>  *         builder, buffer, ULOC_FULLNAME_CAPACITY, &err);
>>>>  *     ulb_close(builder);
>>>>  * </pre>
>>>>  * </blockquote>
>>>>  *
>>>>  * <p>ULocaleBuilders can be reused; <code>ulb_clear()</code> resets all
>>>>  * fields to their default values.
>>>>  *
>>>>  * <p>ULocaleBuilder tracks errors in an internal UErrorCode. All setters,
>>>>  * except ulb_setLanguageTag and ulb_setLocale, return immediately
>>>>  * if the internal UErrorCode is in an error state.
>>>>  * To reset the internal state and error code, call ulb_clear().
>>>>  * The ulb_setLanguageTag and ulb_setLocale functions first clear the internal
>>>>  * UErrorCode, then record any error from validating the input parameter
>>>>  * in the internal UErrorCode.
>>>>  *
>>>>  * @draft ICU 74
>>>>  */
>>>>
>>>> /**
>>>>  * Constructs an empty ULocaleBuilder. The default value of all
>>>>  * fields, extensions, and private use information is the
>>>>  * empty string. The created builder should be destroyed by calling
>>>>  * ulb_close();
>>>>  *
>>>>  * @draft ICU 74
>>>>  */
>>>> U_CAPI ULocaleBuilder U_EXPORT2
>>>> ulb_open(void);
>>>>
>>>> /**
>>>>  * Close the builder and destroy its internal state.
>>>>  * @param builder the builder
>>>>  * @draft ICU 74
>>>>  */
>>>> U_CAPI void U_EXPORT2
>>>> ulb_close(ULocaleBuilder builder);
>>>>
>>>> /**
>>>>  * Resets the <code>ULocaleBuilder</code> to match the provided
>>>>  * <code>locale</code>. Existing state is discarded.
>>>>  *
>>>>  * <p>All fields of the locale must be well-formed.
>>>>  * <p>This method clears the internal UErrorCode.
>>>>  *
>>>>  * @param builder the builder
>>>>  * @param locale the locale
>>>>  *
>>>>  * @draft ICU 74
>>>>  */
>>>> U_CAPI void U_EXPORT2
>>>> ulb_setLocale(ULocaleBuilder builder, const char* locale);
>>>>
>>>> /**
>>>>  * Resets the ULocaleBuilder to match the provided IETF BCP 47 language tag.
>>>>  * Discards the existing state.
>>>>  * The empty string causes the builder to be reset, like ulb_clear().
>>>>  * Legacy language tags (marked as “Type: grandfathered” in BCP 47)
>>>>  * are converted to their canonical form before being processed.
>>>>  * Otherwise, the <code>language tag</code> must be well-formed,
>>>>  * or else the ulb_build() method will later report an U_ILLEGAL_ARGUMENT_ERROR.
>>>>  *
>>>>  * <p>This method clears the internal UErrorCode.
>>>>  *
>>>>  * @param builder the builder
>>>>  * @param tag the language tag, defined as IETF BCP 47 language tag.
>>>>  * @draft ICU 74
>>>>  */
>>>> U_CAPI void U_EXPORT2
>>>> ulb_setLanguageTag(ULocaleBuilder builder, const char* tag);
>>>>
>>>> /**
>>>>  * Sets the language. If <code>language</code> is the empty string, the
>>>>  * language in this <code>ULocaleBuilder</code> is removed. Otherwise, the
>>>>  * <code>language</code> must be well-formed, or else the ulb_build() method will
>>>>  * later report an U_ILLEGAL_ARGUMENT_ERROR.
>>>>  *
>>>>  * <p>The syntax of language value is defined as
>>>>  * [unicode_language_subtag](http://www.unicode.org/reports/tr35/tr35.html#unicode_language_subtag).
>>>>  *
>>>>  * @param builder the builder
>>>>  * @param language the language
>>>>  * @draft ICU 74
>>>>  */
>>>> U_CAPI void U_EXPORT2
>>>> ulb_setLanguage(ULocaleBuilder builder, const char* language);
>>>>
>>>> /**
>>>>  * Sets the script. If <code>script</code> is the empty string, the script in
>>>>  * this <code>ULocaleBuilder</code> is removed.
>>>>  * Otherwise, the <code>script</code> must be well-formed, or else the ulb_build()
>>>>  * method will later report an U_ILLEGAL_ARGUMENT_ERROR.
>>>>  *
>>>>  * <p>The script value is a four-letter script code as
>>>>  * [unicode_script_subtag](http://www.unicode.org/reports/tr35/tr35.html#unicode_script_subtag)
>>>>  * defined by ISO 15924
>>>>  *
>>>>  * @param builder the builder
>>>>  * @param script the script
>>>>  * @draft ICU 74
>>>>  */
>>>> U_CAPI void U_EXPORT2
>>>> ulb_setScript(ULocaleBuilder builder, const char* script);
>>>>
>>>> /**
>>>>  * Sets the region. If region is the empty string, the region in this
>>>>  * <code>ULocaleBuilder</code> is removed. Otherwise, the <code>region</code>
>>>>  * must be well-formed, or else the ulb_build() method will later report an
>>>>  * U_ILLEGAL_ARGUMENT_ERROR.
>>>>  *
>>>>  * <p>The region value is defined by
>>>>  * [unicode_region_subtag](http://www.unicode.org/reports/tr35/tr35.html#unicode_region_subtag)
>>>>  * as a two-letter ISO 3166 code or a three-digit UN M.49 area code.
>>>>  *
>>>>  * <p>The region value in the <code>Locale</code> created by the
>>>>  * <code>ULocaleBuilder</code> is always normalized to upper case.
>>>>  *
>>>>  * @param builder the builder
>>>>  * @param region the region
>>>>  * @draft ICU 74
>>>>  */
>>>> U_CAPI void U_EXPORT2
>>>> ulb_setRegion(ULocaleBuilder builder, const char* region);
>>>>
>>>> /**
>>>>  * Sets the variant. If variant is the empty string, the variant in this
>>>>  * <code>ULocaleBuilder</code> is removed. Otherwise, the <code>variant</code>
>>>>  * must be well-formed, or else the ulb_build() method will later report an
>>>>  * U_ILLEGAL_ARGUMENT_ERROR.
>>>>  *
>>>>  * <p><b>Note:</b> This method checks if <code>variant</code>
>>>>  * satisfies the
>>>>  * [unicode_variant_subtag](http://www.unicode.org/reports/tr35/tr35.html#unicode_variant_subtag)
>>>>  * syntax requirements, and normalizes the value to lowercase letters. However,
>>>>  * the <code>Locale</code> class does not impose any syntactic
>>>>  * restriction on variant. To set an ill-formed variant, use a Locale constructor.
>>>>  * If there are multiple unicode_variant_subtag, the caller must concatenate
>>>>  * them with '-' as separator (ex: "foobar-fibar").
>>>>  *
>>>>  * @param builder the builder
>>>>  * @param variant the variant
>>>>  * @draft ICU 74
>>>>  */
>>>> U_CAPI void U_EXPORT2
>>>> ulb_setVariant(ULocaleBuilder builder, const char* variant);
>>>>
>>>> /**
>>>>  * Sets the extension for the given key. If the value is the empty string,
>>>>  * the extension is removed. Otherwise, the <code>key</code> and
>>>>  * <code>value</code> must be well-formed, or else the ulb_build() method will
>>>>  * later report an U_ILLEGAL_ARGUMENT_ERROR.
>>>>  *
>>>>  * <p><b>Note:</b> The key ('u') is used for the Unicode locale extension.
>>>>  * Setting a value for this key replaces any existing Unicode locale key/type
>>>>  * pairs with those defined in the extension.
>>>>  *
>>>>  * <p><b>Note:</b> The key ('x') is used for the private use code. To be
>>>>  * well-formed, the value for this key needs only to have subtags of one to
>>>>  * eight alphanumeric characters, not two to eight as in the general case.
>>>>  *
>>>>  * @param builder the builder
>>>>  * @param key the extension key
>>>>  * @param value the extension value
>>>>  * @draft ICU 74
>>>>  */
>>>> U_CAPI void U_EXPORT2
>>>> ulb_setExtension(ULocaleBuilder builder, char key, const char* value);
>>>>
>>>> /**
>>>>  * Sets the Unicode locale keyword type for the given key. If the type
>>>>  * is NULL, the keyword is removed.
>>>>  * If the type is the empty string, the keyword is set without type subtags.
>>>>  * Otherwise, the key and type must be well-formed, or else the ulb_build()
>>>>  * method will later report an U_ILLEGAL_ARGUMENT_ERROR.
>>>>  *
>>>>  * <p>Keys and types are converted to lower case.
>>>>  *
>>>>  * <p><b>Note</b>: Setting the 'u' extension via ulb_setExtension()
>>>>  * replaces all Unicode locale keywords with those defined in the
>>>>  * extension.
>>>>  *
>>>>  * @param builder the builder
>>>>  * @param key the Unicode locale key
>>>>  * @param type the Unicode locale type
>>>>  * @draft ICU 74
>>>>  */
>>>> U_CAPI void U_EXPORT2
>>>> ulb_setUnicodeLocaleKeyword(ULocaleBuilder builder,
>>>>                             const char* key, const char* type);
>>>>
>>>> /**
>>>>  * Adds a unicode locale attribute, if not already present, otherwise
>>>>  * has no effect. The attribute must not be empty string and must be
>>>>  * well-formed or U_ILLEGAL_ARGUMENT_ERROR will be set to status
>>>>  * during the ulb_build() call.
>>>>  *
>>>>  * @param builder the builder
>>>>  * @param attribute the attribute
>>>>  * @draft ICU 74
>>>>  */
>>>> U_CAPI void U_EXPORT2
>>>> ulb_addUnicodeLocaleAttribute(ULocaleBuilder builder, const char* attribute);
>>>>
>>>> /**
>>>>  * Removes a unicode locale attribute, if present, otherwise has no
>>>>  * effect. The attribute must not be empty string and must be well-formed
>>>>  * or U_ILLEGAL_ARGUMENT_ERROR will be set to status during the ulb_build() call.
>>>>  *
>>>>  * <p>Attribute comparison for removal is case-insensitive.
>>>>  *
>>>>  * @param builder the builder
>>>>  * @param attribute the attribute
>>>>  * @draft ICU 74
>>>>  */
>>>> U_CAPI void U_EXPORT2
>>>> ulb_removeUnicodeLocaleAttribute(ULocaleBuilder builder, const char* attribute);
>>>>
>>>> /**
>>>>  * Resets the builder to its initial, empty state.
>>>>  * <p>This method clears the internal UErrorCode.
>>>>  *
>>>>  * @param builder the builder
>>>>  * @draft ICU 74
>>>>  */
>>>> U_CAPI void U_EXPORT2
>>>> ulb_clear(ULocaleBuilder builder);
>>>>
>>>> /**
>>>>  * Resets the extensions to their initial, empty state.
>>>>  * Language, script, region and variant are unchanged.
>>>>  *
>>>>  * @param builder the builder
>>>>  * @draft ICU 74
>>>>  */
>>>> U_CAPI void U_EXPORT2
>>>> ulb_clearExtensions(ULocaleBuilder builder);
>>>>
>>>> /**
>>>>  * Build the locale string from the fields set
>>>>  * on this builder.
>>>>  * If any setter or the ulb_build() call requires a memory allocation
>>>>  * that fails, U_MEMORY_ALLOCATION_ERROR will be set to status.
>>>>  * If any of the fields set by the setters are not well-formed, the status
>>>>  * will be set to U_ILLEGAL_ARGUMENT_ERROR. The state of the builder will
>>>>  * not change after the ulb_build() call and the caller is free to keep using
>>>>  * the same builder to build more locales.
>>>>  *
>>>>  * @param builder the builder
>>>>  * @param buffer the buffer that receives the locale id
>>>>  * @param bufferCapacity the capacity of buffer
>>>>  * @param err the error code
>>>>  * @return the length of the locale id in buffer
>>>>  * @draft ICU 74
>>>>  */
>>>> U_CAPI int32_t U_EXPORT2
>>>> ulb_build(ULocaleBuilder builder, char* buffer, int32_t bufferCapacity,
>>>>           UErrorCode* err);
>>>>
>>>> /**
>>>>  * Sets the UErrorCode if an error occurred while recording sets.
>>>>  * Preserves older error codes in the outErrorCode.
>>>>  *
>>>>  * @param builder the builder
>>>>  * @param outErrorCode Set to an error code that occurred while setting subtags.
>>>>  *                     Unchanged if there is no such error or if outErrorCode
>>>>  *                     already contained an error.
>>>>  * @return true if U_FAILURE(*outErrorCode)
>>>>  * @draft ICU 74
>>>>  */
>>>> U_CAPI UBool U_EXPORT2
>>>> ulb_copyErrorTo(ULocaleBuilder builder, UErrorCode *outErrorCode);
>>>>
>>>> #endif /* U_HIDE_DRAFT_API */
>>>>
>>>> #endif // __ULOCBUILDER_H__
>>>> --
>>>> Frank Yung-Fong Tang
>>>> 譚永鋒 / 🌭🍊
>>>> Sr. Software Engineer
>>>> _______________________________________________
>>>> icu-design mailing list
>>>> icu...@li...
>>>> To Un/Subscribe:
>>>> https://lists.sourceforge.net/lists/listinfo/icu-design
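
The header comments above describe a builder that records an error internally, skips further setter calls while in the error state, and reports the error through ulb_copyErrorTo() without overwriting an older failure. A minimal sketch of that sticky-error idea follows, assuming a made-up `StickyBuilder` with a single field. The names `StickyBuilder`, `StickyError`, `sticky_clear`, `sticky_setRegion`, `sticky_copyErrorTo` and `sticky_demo` are hypothetical, the error values are stand-ins rather than real UErrorCode constants, and the validation rule (exactly two ASCII letters) is a toy simplification, not ICU's rule.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Toy stand-ins for UErrorCode values; NOT the real ICU constants. */
typedef enum {
    STICKY_OK = 0,
    STICKY_ILLEGAL_ARGUMENT = 1,
    STICKY_OTHER = 2
} StickyError;

/* Hypothetical builder: one field plus a sticky status. */
typedef struct {
    char region[3];
    StickyError status;
} StickyBuilder;

/* Resets both the field and the internal error. */
void sticky_clear(StickyBuilder *b) {
    memset(b, 0, sizeof *b);
}

/* Toy validation: a region is exactly two ASCII letters. */
void sticky_setRegion(StickyBuilder *b, const char *region) {
    if (b->status != STICKY_OK) return; /* in error state: do nothing */
    if (strlen(region) != 2 ||
        !isalpha((unsigned char)region[0]) ||
        !isalpha((unsigned char)region[1])) {
        b->status = STICKY_ILLEGAL_ARGUMENT; /* kept until sticky_clear */
        return;
    }
    b->region[0] = (char)toupper((unsigned char)region[0]);
    b->region[1] = (char)toupper((unsigned char)region[1]);
    b->region[2] = '\0';
}

/* Copies the internal error to *out unless *out already holds a failure.
 * Returns nonzero if *out is a failure afterwards. The builder is only
 * read here, so it is taken as a pointer to const. */
int sticky_copyErrorTo(const StickyBuilder *b, StickyError *out) {
    if (*out == STICKY_OK && b->status != STICKY_OK) {
        *out = b->status;
    }
    return *out != STICKY_OK;
}

/* Returns 1 if the sticky behaviour matches the description. */
int sticky_demo(void) {
    StickyBuilder b;
    StickyError err = STICKY_OK;
    StickyError older = STICKY_OTHER;
    sticky_clear(&b);
    sticky_setRegion(&b, "12345"); /* ill-formed: error is recorded */
    sticky_setRegion(&b, "de");    /* skipped: builder is in error state */
    assert(b.region[0] == '\0');
    if (!sticky_copyErrorTo(&b, &err) || err != STICKY_ILLEGAL_ARGUMENT) return 0;
    /* An older failure in the out parameter is preserved. */
    if (!sticky_copyErrorTo(&b, &older) || older != STICKY_OTHER) return 0;
    sticky_clear(&b);
    sticky_setRegion(&b, "de");    /* accepted again after the reset */
    return strcmp(b.region, "DE") == 0;
}
```

Because `sticky_copyErrorTo` only reads the builder, declaring its parameter as a pointer to const costs nothing and documents that the call has no side effects on the builder.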