Re: [icu-design] C API for LocaleBuilder

SourceForge Headquarters 1320 Columbia Street Suite 310 San Diego, CA 92101 +1 (858) 422-6466

A few quick comments. This C API tracks very closely to the C++
LocaleBuilder class, which makes good sense.

The definition of the ULocaleBuilder type,

> /**
>  * Opaque type for a Locale builder.
>  * @draft ICU 74
>  */
> typedef void *ULocaleBuilder;

doesn't follow the pattern used in similar ICU plain C APIs. For example,
from uldnames.h:45,

 /**
 * Opaque C service object type for the locale display names API
 * @stable ICU 4.4
 */
struct ULocaleDisplayNames;

/**
 * C typedef for struct ULocaleDisplayNames.
 * @stable ICU 4.4
 */

typedef struct ULocaleDisplayNames ULocaleDisplayNames;

With this changed, all of the uses of ULocaleBuilder would become
explicitly pointers

U_CAPI void U_EXPORT2
> ulb_close(ULocaleBuilder builder);

would be

U_CAPI void U_EXPORT2
ulb_close(ULocaleBuilder *builder);

 ====

We should define a LocalULocaleBuilderPointer type. See other existing C
APIs for how to define it, e.g. uldnames.h:84 - 101.

U_DEFINE_LOCAL_OPEN_POINTER(LocalULocaleBuilderPointer, ULocaleBuilder,
ulb_close);

====

ulb_open() will presumably return NULL on failure, for example, if creating
the underlying LocaleBuilder fails.

====

ulb_copyErrorTo(ULocaleBuilder builder, UErrorCode *outErrorCode);

The builder should be const

 ulb_copyErrorTo(const ULocaleBuilder *builder, UErrorCode *outErrorCode);

  -- Andy

On Fri, Jun 30, 2023 at 9:24 PM Rich Gillam via icu-design <
icu...@li...> wrote:

> Frank—
>
> I think those getter already exist in
> https://unicode-org.github.io/icu-docs/apidoc/dev/icu4c/uloc_8h.html#a490a02bfc1dd2c007911680a9cd87b73
>
> if (*uloc_getKeywordValue*(localeID, *uloc_toLegacyKey* ("rg"),
>          value, ULOC_FULLNAME_CAPACITY
> <https://source.chromium.org/chromium/chromium/src/+/main:third_party/icu/source/common/unicode/uloc.h;bpv=1;bpt=1;l=264?q=uloc.h&ss=chromium&gsn=ULOC_FULLNAME_CAPACITY&gs=KYTHE%3A%2F%2Fkythe%3A%2F%2Fchromium.googlesource.com%2Fcodesearch%2Fchromium%2Fsrc%2F%2Fmain%3Flang%3Dc%252B%252B%3Fpath%3Dthird_party%2Ficu%2Fsource%2Fcommon%2Funicode%2Fuloc.h%23ULOC_FULLNAME_CAPACITY%2523m%254010700>,
> &status) > 0 && U_SUCCESS(status) ) {
>
>
> Right.  My point is that those APIs are a pain in the butt to use because
> you have to allocate a static buffer to receive the characters, even if
> you’re just checking that the field is empty.  A call to one of those
> functions tends to take about three lines of code.  If we’re already going
> to have an object there that has internally allocated the storage for the
> characters and can keep us from writing a bunch of that boilerplate (like
> the Locale object in C++), it’d be nice to leverage that to make the code
> snippet you show below even simpler.
>
> —Rich
>
> On Jun 30, 2023, at 6:47 PM, Frank Tang (譚永鋒) <ft...@go...> wrote:
>
> Dear Rich
>
> Thanks for your reply
> Just also want to provide some additional historical context that I forget
> to mention
>
> The ICU C++ LocaleBuilder API is try to feature/signature match what was
> in Java for years
> See
> https://docs.oracle.com/javase/8/docs/api/java/util/Locale.Builder.html
> in C++
> What I propose here also tries to match that so we do not have yet another
> model only for C to build locales differently.
>
>
> As about your comment of "I think you might want an API that has getters,
> or an equivalent of the “Locale” class in C++"
>
> I think those getter already exist in
> https://unicode-org.github.io/icu-docs/apidoc/dev/icu4c/uloc_8h.html#a490a02bfc1dd2c007911680a9cd87b73
>
>
> “If this locale has the rg subtag, I want to get rid of it and move its
> value into the region field”
> The current uloc.h API could be used with locale builder to achieve that
>
> char localeID[ULOC_FULLNAME_CAPACITY
> <https://source.chromium.org/chromium/chromium/src/+/main:third_party/icu/source/common/unicode/uloc.h;bpv=1;bpt=1;l=264?q=uloc.h&ss=chromium&gsn=ULOC_FULLNAME_CAPACITY&gs=KYTHE%3A%2F%2Fkythe%3A%2F%2Fchromium.googlesource.com%2Fcodesearch%2Fchromium%2Fsrc%2F%2Fmain%3Flang%3Dc%252B%252B%3Fpath%3Dthird_party%2Ficu%2Fsource%2Fcommon%2Funicode%2Fuloc.h%23ULOC_FULLNAME_CAPACITY%2523m%254010700>
> ];
> // .... copy locale to localeID
> UErrorCode status= U_ZERO_ERROR;
> ULocaleBuilder builder = ulb_open();
> ulb_setLocale(builder, localeID);
> // If this locale has the rg subtag
> char value[ ULOC_FULLNAME_CAPACITY
> <https://source.chromium.org/chromium/chromium/src/+/main:third_party/icu/source/common/unicode/uloc.h;bpv=1;bpt=1;l=264?q=uloc.h&ss=chromium&gsn=ULOC_FULLNAME_CAPACITY&gs=KYTHE%3A%2F%2Fkythe%3A%2F%2Fchromium.googlesource.com%2Fcodesearch%2Fchromium%2Fsrc%2F%2Fmain%3Flang%3Dc%252B%252B%3Fpath%3Dthird_party%2Ficu%2Fsource%2Fcommon%2Funicode%2Fuloc.h%23ULOC_FULLNAME_CAPACITY%2523m%254010700>
> ];
> if (*uloc_getKeywordValue*(localeID, *uloc_toLegacyKey* ("rg"),
>          value, ULOC_FULLNAME_CAPACITY
> <https://source.chromium.org/chromium/chromium/src/+/main:third_party/icu/source/common/unicode/uloc.h;bpv=1;bpt=1;l=264?q=uloc.h&ss=chromium&gsn=ULOC_FULLNAME_CAPACITY&gs=KYTHE%3A%2F%2Fkythe%3A%2F%2Fchromium.googlesource.com%2Fcodesearch%2Fchromium%2Fsrc%2F%2Fmain%3Flang%3Dc%252B%252B%3Fpath%3Dthird_party%2Ficu%2Fsource%2Fcommon%2Funicode%2Fuloc.h%23ULOC_FULLNAME_CAPACITY%2523m%254010700>,
> &status) > 0 && U_SUCCESS(status) ) {
>     // get rid of it and
>   ulb_setUnicodeLocaleKeyword(builder, "rg", nullptr); // to remove "rg"
>   // move its value into the region field
>   ulb_setRegion(builder, value);
> }
> // “If this locale ID doesn’t have a script code,
> if (*uloc_getScript*(localeID,  value, ULOC_FULLNAME_CAPACITY
> <https://source.chromium.org/chromium/chromium/src/+/main:third_party/icu/source/common/unicode/uloc.h;bpv=1;bpt=1;l=264?q=uloc.h&ss=chromium&gsn=ULOC_FULLNAME_CAPACITY&gs=KYTHE%3A%2F%2Fkythe%3A%2F%2Fchromium.googlesource.com%2Fcodesearch%2Fchromium%2Fsrc%2F%2Fmain%3Flang%3Dc%252B%252B%3Fpath%3Dthird_party%2Ficu%2Fsource%2Fcommon%2Funicode%2Fuloc.h%23ULOC_FULLNAME_CAPACITY%2523m%254010700>,
> &status) == 0 &&
>     *uloc_getScript*(otherLocaleID,  value, ULOC_FULLNAME_CAPACITY
> <https://source.chromium.org/chromium/chromium/src/+/main:third_party/icu/source/common/unicode/uloc.h;bpv=1;bpt=1;l=264?q=uloc.h&ss=chromium&gsn=ULOC_FULLNAME_CAPACITY&gs=KYTHE%3A%2F%2Fkythe%3A%2F%2Fchromium.googlesource.com%2Fcodesearch%2Fchromium%2Fsrc%2F%2Fmain%3Flang%3Dc%252B%252B%3Fpath%3Dthird_party%2Ficu%2Fsource%2Fcommon%2Funicode%2Fuloc.h%23ULOC_FULLNAME_CAPACITY%2523m%254010700>,
> &status) > 0 && U_SUCCESS(status) ) {
>     // copy in the script code from this other locale ID."
>   ulb_setScript(builder, value);
>
> }
> int32_t length = ulb_build(builder, localeID, ULOC_FULLNAME_CAPACITY, &
> status);
> ulb_close(builder);
>
> The red code is already in the ICU for a long time
> The blue code is what I am proposing here.
>
>
>
> On Fri, 30 Jun 2023 at 17:08, Rich Gillam <ric...@ap...>
> wrote:
>
>> Frank—
>>
>> I think the basic issue here is that our APIs for manipulating locale IDs
>> in C can be really painful to use if you’re doing anything interesting, and
>> that we want get rid of as much of that pain as we can.  I think that’s
>> more important than whether we match the C++ interface or conform to some
>> particular definition of “builder.”  That said, I’m starting to come around
>> to the idea that what you’re proposing really is about as good as we’re
>> going to get.
>>
>> I’m going to withdraw my earlier suggestion.  I don’t think you’re quite
>> getting what I’m saying, but my suggestion (using a struct) really only
>> works well if you have language/script/region/variant— as soon you try to
>> do anything with key-value pairs, you’d need to use a hash table (or
>> something) instead of a struct, and not only does that mean you still have
>> the heap allocation I was trying to get rid of, but it actually becomes
>> harder to use than what you’re proposing.
>>
>> I still think there might be value in having a struct-based API that just
>> handled language/script/region/variant, but maybe that’s just me, and we
>> certainly couldn’t go with JUST that, so it’s probably not worth the
>> trouble.
>>
>> I still tend to like Steven’s suggestion to have one setter with a
>> selector code of some kind.  Your point that it would require a switch
>> statement is valid, but I don’t think it’s actually that much of a
>> performance hit in practice.  But if you’re trying to be as parallel as
>> possible to the C++ interface, this works against that, I think.  I’m not
>> going to argue this one either way.
>>
>> I do want to raise one other issue, though: I think you might want an API
>> that has getters, or an equivalent of the “Locale” class in C++.  A lot of
>> the use cases for a locale builder are “I want this locale ID, but I want
>> to change this one field value,” or “I want to change this one field value,
>> but only if this other field or attribute has such-and-such a value”.  For
>> example: “If this locale has the rg subtag, I want to get rid of it and
>> move its value into the region field” or “If this locale ID doesn’t have a
>> script code, copy in the script code from this other locale ID."  If we
>> still use things like uloc_getName() to do “if this field has this value”
>> part of the logic, we still have a problem.
>>
>> That could be a separate ticket and API proposal, of course, but I think
>> we want to think about it too.
>>
>> —Rich
>>
>> On Jun 30, 2023, at 3:53 PM, Frank Tang (譚永鋒) via icu-design <
>> icu...@li...> wrote:
>>
>>
>>
>> ---------- Forwarded message ---------
>> From: Frank Tang (譚永鋒) <ft...@go...>
>> Date: Fri, 30 Jun 2023 at 15:53
>> Subject: Re: [icu-design] C API for LocaleBuilder
>> To: Rich Gillam <ric...@ap...>
>>
>>
>>
>>
>> On Fri, 30 Jun 2023 at 11:00, Rich Gillam <ric...@ap...>
>> wrote:
>>
>>> Frank--
>>>
>>> That would be a different approach to construct a locale. However, I
>>> would like to point out several things here.
>>> 1. That approach won't be a "builder"
>>>
>>>
>>> But it would solve the problem we’re being asked to solve.
>>>
>>
>> no, it does not. What the bug asked for is a C builder API and whatever
>> the API does not follow  a "builder" design pattern does not solve the
>> problem it asked for. A builder needs to be step by step so the decision of
>> each field could be spread in different times and different places in the
>> code. A single function with a struct does not allow that, and therefore
>> not fulfilling the request and requirement.
>>
>> I am open to a concrete counter proposal. It is simple to think the issue
>> could be solved by a simple set() w/ a constant if you only consider the 1)
>> language, 2) script, 3) region, 4) variant, 5) languageTag, and 6) localeId
>> as the only set of input. But once you start to consider 7) attributes, 8)
>> unicode locale keyword/value, and 9) other extensions than "-u-" (for
>> example "-t-" or anything in other_extension) things become complicated.
>> The validation and override/replacement logic for each of them will be
>> different
>>
>>
>>
>>>
>>> but what you describe would require the caller to resolve that before
>>> putting it into the struct.
>>>
>>>
>>> No, it wouldn’t.  For building locales, there’s nothing special going on
>>> while you’re setting up the builder— it’s just a bag of fields.
>>>
>>
>> no, that is not true, it is NOT a bag of fields, it is an order list of
>> fields, not a bag => the order matter.
>> in bag sematic , the order does not matter
>>
>> Consider the following
>>
>> languageTag "en-Latn-GB-u-ca-roc"
>> language "ja"
>> script "Thai"
>> region "CN"
>> variant "hepburn-heploc"
>> unicode extension "ca"= "japanese"
>> language "sl"
>> variant "bohoric"
>> unicode extension "ca"= "coptic"
>>
>> which will produce sl-Thai-CN-bohoric-u-ca-coptic"
>>
>> but if you treat them as a bug and build that the order in
>> language "ja"
>> script "Thai"
>> region "CN"
>> variant "hepburn-heploc"
>> unicode extension "ca"= "japanese"
>> language "sl"
>> languageTag "en-Latn-GB-u-ca-roc"
>> variant "bohoric"
>> unicode extension "ca"= "coptic"
>>
>> you will get
>> which will produce en-Latn-GB-bohoric-u-ca-coptic"
>>
>> which result in different value
>>
>>
>>
>>
>>>  All the magic happens when you run the build() method.
>>>
>>
>> That should be implementation depend and should not be mandated by the
>> API, there are nothing wrong to progressively build when each setting
>> method got called.
>>
>>
>>>  I’m just saying that maybe instead of adding a new opaque class with a
>>> bunch of individual getters and setters for all those fields, we just use a
>>> simple struct (or a hash table, or something).
>>>
>>
>> As I mentioned, what you said only make sense when the underline problem
>> is *a bag of fields*, but the problem is this problem is not just a* bag
>> of fields.* The information pass into the builder have overlapping with
>> each other, in particular the information inside localeID and languageTag
>> is not multurelly exclusive from
>> language/script/region/variant/attribute/unicode locale keyword+value and
>> other extension. They over lap and part of them could be override by
>> another. If all the input are muturally exclusive from each other, then the
>> problem is much simpler and possible could be solved by "just use a simple
>> struct" . But that is NOT the case.(as the example I show above) now.
>>
>>
>>
>>> The approach you mentioned seems to require the caller to do a lot of
>>> work to fill that struct. and the caller also needs to learn how to program
>>> UHashTable while using that API, is that better?
>>>
>>>
>>> Good point.  UHashTable is a pain in the butt.  Maybe that’s enough
>>> right there to swing me back around to what you’ve already proposed…
>>>
>>> That said, I do kind of like Steven’s suggestion to just have one set()
>>> method and just have constants for the individual fields...
>>>
>>
>> If we look at the pre-existing locale C API in uloc.h we have different
>> function instead of a get w/ a constant
>> uloc_getLanguage(....
>> uloc_getScript(....
>> uloc_getCountry(....
>> uloc_getVariant(....
>> uloc_getName(....
>> uloc_getBaseName(...
>>
>> notice there is no API to get attributes.
>>
>> we could also use a uloc_get() with a constant, right? but we didn't
>> years ago. so... should we follow the same design pattern or to a set w/
>> constant
>>
>>
>>
>> We have the following different kind of operation to the builder
>>
>> 1. language (w/o script and region), region, script, variant, language
>> tag (w/ script or region) => set
>> 2. attribute => add and remove (there could be multiple attributes in a
>> locale
>> 3. unicode extension set on a key ("ca" , "japanese")
>> 4. other extension - set on a extension ('a', "abc-efg")
>> 5. clear extension (say you start from "en-u-ca-japanese-nu-Thai" and you
>> want to clear the calendar and change nu to "Arab"   to "en-u-nu-Arab")
>>
>> a simple set w/ constant method could simplify #1 above, but not powerful
>> enough to deal with #2 #3 #4 and #5
>> Also, I personally dislike the set + enum => internally swtich(enum)
>> approach because almost all the time the caller know what operation need to
>> perform in the compile time, but this approach mandate the function
>> internally switch based on a value which is fixed in the program but cannot
>> be easily optimized by the compiler.
>>
>> Consider the following case
>>
>>
>> setA(context, value) {
>>    context.a = value;
>> }
>>
>> setB(context, value) {
>>    context.b = value;
>> }
>> and the caller
>>
>> setA(context, 3);
>> setB(context, 5);
>>
>> vs
>> set(context, field, value) {
>>   switch(field) {
>>      case A: context.a = value; break;
>>      case B: context.b = value; break;
>>   }
>> }
>> and the caller
>> set(context, A, 3);
>> set(context, B, 5);
>>
>> The swtich / case inside the set() simply cannot be optimized by the
>> compiler
>>
>>  surely the first approach expose more functions but such exposure allow
>> the compiler to avoid unnecessary run time hit by the switch/case which
>> could be resovled in the compile time, not during the run time.
>>
>>
>>
>>> —Rich
>>>
>>> On Jun 30, 2023, at 12:44 AM, Frank Tang (譚永鋒) <ft...@go...> wrote:
>>>
>>> Dear Richard
>>>
>>> On Thu, 29 Jun 2023 at 18:31, Richard Gillam <ric...@ap...>
>>> wrote:
>>>
>>>> Frank—
>>>>
>>>> This looks fine, but also kind of heavy-weight.  I wonder if it’d make
>>>> sense to just have a struct with all the fields and utility methods to
>>>> convert between a locale ID string and that struct?
>>>>
>>>
>>> That would be a different approach to construct a locale. However, I
>>> would like to point out several things here.
>>> 1. That approach won't be a "builder"
>>> The definition of a Builder design pattern is
>>> "Builder is a creational design pattern that lets you construct complex
>>> objects step by step. "
>>> "Problem
>>> Imagine a complex object that requires laborious, step-by-step
>>> initialization of many fields and nested objects. Such initialization code
>>> is usually buried inside a monstrous constructor with lots of parameters.
>>> Or even worse: scattered all over the client code."
>>>
>>> In other words, the key concept of "builder" itself is "step by step"
>>> and that allows the builder to get passing to different parts of the system
>>> to allow different part of the software to add information into the
>>> builder, without a centralized place to gather all the information
>>> together.
>>> see https://refactoring.guru/design-patterns/builder
>>>
>>> 2.  The step by step processing also means the timing of information
>>> putting in matter and the later information overrides the previous
>>> information.
>>> for example, I could have a
>>> ULocaleBuilder builder = ulb_open();
>>> ulb_setLanguageTag(builder, "en-Latn-GB-u-ca-islamic");
>>> ulb_setRegion(builder, "FR");
>>> ulb_setUnicodeExtension(builder, 'x', "abcd");
>>> .....
>>> ulb_setUnicodeLocaleKeyword(builder, "ca", "japanese");
>>> ulb_setRegion(builder, "DE");
>>> ulb_setUnicodeLocaleKeyword(builder, "ca", "roc");
>>> ulb_build(builder...)
>>> ulb_close(builder)
>>>
>>> to build a locale "en-Latn-DE-u-ca-roc-x-abcd"
>>>
>>> but what you describe would require the caller to resolve that before
>>> putting it into the struct.
>>>
>>>  I guess the weak point in that approach would be the key-value pairs at
>>>> the end of the locale ID— an alternative approach might be to use a
>>>> UHashTable (or whatever it's called) as the bearer of fields (or to provide
>>>> both the struct and the hash table, depending on whether the caller cares
>>>> about the key-value fields).  The beauty of an approach like this is that
>>>> you save a few heap allocations and don’t have to add as many new API
>>>> functions.
>>>>
>>>
>>> The approach you mentioned seems to require the caller to do a lot of
>>> work to fill that struct. and the caller also needs to learn how to program
>>> UHashTable while using that API, is that better?
>>>
>>>
>>>> On the other hand, what you have is a thin wrapper around the existing
>>>> C++ LocaleBuilder class, so you’re not having to implement the
>>>> locale-mangling code all over again.
>>>>
>>>> I don’t know— what does everybody else think?
>>>>
>>>> —Rich
>>>>
>>>> On Jun 29, 2023, at 5:09 PM, Frank Tang (譚永鋒) via icu-design <
>>>> icu...@li...> wrote:
>>>>
>>>>
>>>>
>>>> Dear ICU team & users,
>>>>
>>>>
>>>> I would like to propose the following for: ICU 74
>>>>
>>>> Please provide feedback by: Next Wednesday, July 5, or any time
>>>> sufficiently in advance of the feature freeze
>>>>
>>>> Designated API review: Gary Wade
>>>>
>>>> Issue: https://unicode-org.atlassian.net/browse/ICU-22365
>>>> A draft PR and implementation (not including test yet) could be found at
>>>> https://github.com/unicode-org/icu/pull/2520 )
>>>>
>>>> This is to follow up the C++ API in ICU 64 and provide a C API
>>>> requested in
>>>>
>>>> Issue to be discussed
>>>>
>>>> 1. Should the prefix be ulb_ or something else?
>>>> 2. What should ulb_build build? the value as returned by
>>>> Locale::getName() or Locale::toLanguageTag()?
>>>> 3. is it to use file ulocbuilder.{h,cpp}? or should it be something
>>>> else?
>>>>
>>>> Add C API for ULocaleBuilder
>>>>
>>>> https://unicode-org.atlassian.net/browse/ICU-22365
>>>>
>>>> File : icu4c/source/common/ulocbuilder.h
>>>>
>>>> // © 2023 and later: Unicode, Inc. and others.
>>>> // License & terms of use: http://www.unicode.org/copyright.html
>>>> #ifndef __ULOCBUILDER_H__
>>>> #define __ULOCBUILDER_H__
>>>>
>>>> #include "unicode/utypes.h"
>>>>
>>>>
>>>> /**
>>>>  * \file
>>>>  * \brief C API: Builder API for Locale
>>>>  */
>>>>
>>>> #ifndef U_HIDE_DRAFT_API
>>>>
>>>> /**
>>>>  * Opaque type for a Locale builder.
>>>>  * @draft ICU 74
>>>>  */
>>>> typedef void *ULocaleBuilder;
>>>>
>>>> /**
>>>>  * <code>ULocaleBuilder</code> is used to build stirng of valid
>>>> <code>locale</code>
>>>>  * from values configured by the setters.
>>>>  * The <code>ULocaleBuilder</code> checks if a value configured by a
>>>>  * setter satisfies the syntax requirements defined by the
>>>> <code>Locale</code>
>>>>  * class.  A string of Locale created by a <code>ULocaleBuilder</code>
>>>> is
>>>>  * well-formed and can be transformed to a well-formed IETF BCP 47
>>>> language tag
>>>>  * without losing information.
>>>>  *
>>>>  * <p>The following example shows how to create a <code>locale</code>
>>>> string
>>>>  * with the <code>ULocaleBuilder</code>.
>>>>  * <blockquote>
>>>>  * <pre>
>>>>  *     UErrorCode err = U_ZERO_ERROR;
>>>>  *     char buffer[ULOC_FULLNAME_CAPACITY];
>>>>  *     ULocaleBuilder builder = ulb_open();
>>>>  *     ulb_setLanguage(builder, "sr");
>>>>  *     ulb_setScript(builder, "Latn");
>>>>  *     ulb_setRegion(builder, "RS");
>>>>  *     int32_t length = ulb_build(
>>>>  *         builder, buffer, ULOC_FULLNAME_CAPACITY, &error);
>>>>  *     ulb_close(builder);
>>>>  * </pre>
>>>>  * </blockquote>
>>>>  *
>>>>  * <p>ULocaleBuilders can be reused; <code>ulb_clear()</code> resets all
>>>>  * fields to their default values.
>>>>  *
>>>>  * <p>ULocaleBuilder tracks errors in an internal UErrorCode. For all
>>>> setters,
>>>>  * except ulb_setLanguageTag and ulb_setLocale, ULocaleBuilder will
>>>> return immediately
>>>>  * if the internal UErrorCode is in error state.
>>>>  * To reset internal state and error code, call clear method.
>>>>  * The ulb_setLanguageTag and setLocale method will first clear the
>>>> internal
>>>>  * UErrorCode, then track the error of the validation of the input
>>>> parameter
>>>>  * into the internal UErrorCode.
>>>>  *
>>>>  * @draft ICU 74
>>>>  */
>>>>
>>>> /**
>>>>  * Constructs an empty ULocaleBuilder. The default value of all
>>>>  * fields, extensions, and private use information is the
>>>>  * empty string. The created builder should be destoried by calling
>>>>  * ulb_close();
>>>>  *
>>>>  * @draft ICU 74
>>>>  */
>>>> U_CAPI ULocaleBuilder U_EXPORT2
>>>> ulb_open();
>>>>
>>>> /**
>>>>  * Close the builder and destroy it's internal states.
>>>>  * @param builder the builder
>>>>  * @draft ICU 74
>>>>  */
>>>> U_CAPI void U_EXPORT2
>>>> ulb_close(ULocaleBuilder builder);
>>>>
>>>> /**
>>>>  * Resets the <code>ULocaleBuilder</code> to match the provided
>>>>  * <code>locale</code>.  Existing state is discarded.
>>>>  *
>>>>  * <p>All fields of the locale must be well-formed.
>>>>  * <p>This method clears the internal UErrorCode.
>>>>  *
>>>>  * @param builder the builder
>>>>  * @param locale the locale
>>>>  *
>>>>  * @draft ICU 74
>>>>  */
>>>> U_CAPI void U_EXPORT2
>>>> ulb_setLocale(ULocaleBuilder builder, const char* locale);
>>>>
>>>> /**
>>>>  * Resets the ULocaleBuilder to match the provided IETF BCP 47 language
>>>> tag.
>>>>  * Discards the existing state.
>>>>  * The empty string causes the builder to be reset, like {@link #clear}.
>>>>  * Legacy language tags (marked as “Type: grandfathered” in BCP 47)
>>>>  * are converted to their canonical form before being processed.
>>>>  * Otherwise, the <code>language tag</code> must be well-formed,
>>>>  * or else the ulb_build() method will later report an
>>>> U_ILLEGAL_ARGUMENT_ERROR.
>>>>  *
>>>>  * <p>This method clears the internal UErrorCode.
>>>>  *
>>>>  * @param builder the builder
>>>>  * @param tag the language tag, defined as IETF BCP 47 language tag.
>>>>  * @draft ICU 74
>>>>  */
>>>> U_CAPI void U_EXPORT2
>>>> ulb_setLanguageTag(ULocaleBuilder builder, const char* tag);
>>>>
>>>> /**
>>>>  * Sets the language.  If <code>language</code> is the empty string, the
>>>>  * language in this <code>ULocaleBuilder</code> is removed. Otherwise,
>>>> the
>>>>  * <code>language</code> must be well-formed, or else the ulb_build()
>>>> method will
>>>>  * later report an U_ILLEGAL_ARGUMENT_ERROR.
>>>>  *
>>>>  * <p>The syntax of language value is defined as
>>>>  * [unicode_language_subtag](
>>>> http://www.unicode.org/reports/tr35/tr35.html#unicode_language_subtag).
>>>>  *
>>>>  * @param builder the builder
>>>>  * @param language the language
>>>>  * @draft ICU 74
>>>>  */
>>>> U_CAPI void U_EXPORT2
>>>> ulb_setLanguage(ULocaleBuilder builder, const char* language);
>>>>
>>>> /**
>>>>  * Sets the script. If <code>script</code> is the empty string, the
>>>> script in
>>>>  * this <code>ULocaleBuilder</code> is removed.
>>>>  * Otherwise, the <code>script</code> must be well-formed, or else the
>>>> ulb_build()
>>>>  * method will later report an U_ILLEGAL_ARGUMENT_ERROR.
>>>>  *
>>>>  * <p>The script value is a four-letter script code as
>>>>  * [unicode_script_subtag](
>>>> http://www.unicode.org/reports/tr35/tr35.html#unicode_script_subtag)
>>>>  * defined by ISO 15924
>>>>  *
>>>>  * @param builder the builder
>>>>  * @param script the script
>>>>  * @draft ICU 74
>>>>  */
>>>> U_CAPI void U_EXPORT2
>>>> ulb_setScript(ULocaleBuilder builder, const char* script);
>>>>
>>>> /**
>>>>  * Sets the region.  If region is the empty string, the region in this
>>>>  * <code>ULocaleBuilder</code> is removed. Otherwise, the
>>>> <code>region</code>
>>>>  * must be well-formed, or else the ulb_build() method will later
>>>> report an
>>>>  * U_ILLEGAL_ARGUMENT_ERROR.
>>>>  *
>>>>  * <p>The region value is defined by
>>>>  *  [unicode_region_subtag](
>>>> http://www.unicode.org/reports/tr35/tr35.html#unicode_region_subtag)
>>>>  * as a two-letter ISO 3166 code or a three-digit UN M.49 area code.
>>>>  *
>>>>  * <p>The region value in the <code>Locale</code> created by the
>>>>  * <code>ULocaleBuilder</code> is always normalized to upper case.
>>>>  *
>>>>  * @param builder the builder
>>>>  * @param region the region
>>>>  * @draft ICU 74
>>>>  */
>>>> U_CAPI void U_EXPORT2
>>>> ulb_setRegion(ULocaleBuilder builder, const char* region);
>>>>
>>>> /**
>>>>  * Sets the variant.  If variant is the empty string, the variant in
>>>> this
>>>>  * <code>LocaleBuilder</code> is removed.  Otherwise, the
>>>> <code>variant</code>
>>>>  * must be well-formed, or else the ulb_build() method will later
>>>> report an
>>>>  * U_ILLEGAL_ARGUMENT_ERROR.
>>>>  *
>>>>  * <p><b>Note:</b> This method checks if <code>variant</code>
>>>>  * satisfies the
>>>>  * [unicode_variant_subtag](
>>>> http://www.unicode.org/reports/tr35/tr35.html#unicode_variant_subtag)
>>>>  * syntax requirements, and normalizes the value to lowercase letters.
>>>> However,
>>>>  * the <code>Locale</code> class does not impose any syntactic
>>>>  * restriction on variant. To set an ill-formed variant, use a Locale
>>>> constructor.
>>>>  * If there are multiple unicode_variant_subtag, the caller must
>>>> concatenate
>>>>  * them with '-' as separator (ex: "foobar-fibar").
>>>>  *
>>>>  * @param builder the builder
>>>>  * @param variant the variant
>>>>  * @draft ICU 74
>>>>  */
>>>> U_CAPI void U_EXPORT2
>>>> ulb_setVariant(ULocaleBuilder builder, const char* variant);
>>>>
>>>> /**
>>>>  * Sets the extension for the given key. If the value is the empty
>>>> string,
>>>>  * the extension is removed.  Otherwise, the <code>key</code> and
>>>>  * <code>value</code> must be well-formed, or else the ulb_build()
>>>> method will
>>>>  * later report an U_ILLEGAL_ARGUMENT_ERROR.
>>>>  *
>>>>  * <p><b>Note:</b> The key ('u') is used for the Unicode locale
>>>> extension.
>>>>  * Setting a value for this key replaces any existing Unicode locale
>>>> key/type
>>>>  * pairs with those defined in the extension.
>>>>  *
>>>>  * <p><b>Note:</b> The key ('x') is used for the private use code. To be
>>>>  * well-formed, the value for this key needs only to have subtags of
>>>> one to
>>>>  * eight alphanumeric characters, not two to eight as in the general
>>>> case.
>>>>  *
>>>>  * @param builder the builder
>>>>  * @param key the extension key
>>>>  * @param value the extension value
>>>>  * @draft ICU 74
>>>>  */
>>>> U_CAPI void U_EXPORT2
>>>> ulb_setExtension(ULocaleBuilder builder, char key, const char* value);
>>>>
>>>> /**
>>>>  * Sets the Unicode locale keyword type for the given key. If the type
>>>>  * StringPiece is constructed with a nullptr, the keyword is removed.
>>>>  * If the type is the empty string, the keyword is set without type
>>>> subtags.
>>>>  * Otherwise, the key and type must be well-formed, or else the
>>>> ulb_build()
>>>>  * method will later report an U_ILLEGAL_ARGUMENT_ERROR.
>>>>  *
>>>>  * <p>Keys and types are converted to lower case.
>>>>  *
>>>>  * <p><b>Note</b>:Setting the 'u' extension via {@link #setExtension}
>>>>  * replaces all Unicode locale keywords with those defined in the
>>>>  * extension.
>>>>  *
>>>>  * @param builder the builder
>>>>  * @param key the Unicode locale key
>>>>  * @param type the Unicode locale type
>>>>  * @return This builder.
>>>>  * @draft ICU 74
>>>>  */
>>>> U_CAPI void U_EXPORT2
>>>> ulb_setUnicodeLocaleKeyword(ULocaleBuilder builder,
>>>>         const char* key, const char* type);
>>>>
>>>> /**
>>>>  * Adds a unicode locale attribute, if not already present, otherwise
>>>>  * has no effect.  The attribute must not be empty string and must be
>>>>  * well-formed or U_ILLEGAL_ARGUMENT_ERROR will be set to status
>>>>  * during the ulb_build() call.
>>>>  *
>>>>  * @param builder the builder
>>>>  * @param attribute the attribute
>>>>  * @draft ICU 74
>>>>  */
>>>> U_CAPI void U_EXPORT2
>>>> ulb_addUnicodeLocaleAttribute(ULocaleBuilder builder, const char*
>>>> attribute);
>>>>
>>>> /**
>>>>  * Removes a unicode locale attribute, if present, otherwise has no
>>>>  * effect.  The attribute must not be empty string and must be
>>>> well-formed
>>>>  * or U_ILLEGAL_ARGUMENT_ERROR will be set to status during the
>>>> ulb_build() call.
>>>>  *
>>>>  * <p>Attribute comparison for removal is case-insensitive.
>>>>  *
>>>>  * @param builder the builder
>>>>  * @param attribute the attribute
>>>>  * @draft ICU 74
>>>>  */
>>>> U_CAPI void U_EXPORT2
>>>> ulb_removeUnicodeLocaleAttribute(ULocaleBuilder builder, const char*
>>>> attribute);
>>>>
>>>> /**
>>>>  * Resets the builder to its initial, empty state.
>>>>  * <p>This method clears the internal UErrorCode.
>>>>  *
>>>>  * @param builder the builder
>>>>  * @draft ICU 74
>>>>  */
>>>> U_CAPI void U_EXPORT2
>>>> ulb_clear(ULocaleBuilder builder);
>>>>
>>>> /**
>>>>  * Resets the extensions to their initial, empty state.
>>>>  * Language, script, region and variant are unchanged.
>>>>  *
>>>>  * @param builder the builder
>>>>  * @draft ICU 74
>>>>  */
>>>> U_CAPI void U_EXPORT2
>>>> ulb_clearExtensions(ULocaleBuilder builder);
>>>>
>>>> /*
>>>>  * Build the Locale< stirng from the fields set
>>>>  * on this builder.
>>>>  * If any set methods or during the ulb_build() call require memory
>>>> allocation
>>>>  * but fail U_MEMORY_ALLOCATION_ERROR will be set to status.
>>>>  * If any of the fields set by the setters are not well-formed, the
>>>> status
>>>>  * will be set to U_ILLEGAL_ARGUMENT_ERROR. The state of the builder
>>>> will
>>>>  * not change after the ulb_build() call and the caller is free to keep
>>>> using
>>>>  * the same builder to build more locales.
>>>>  *
>>>>  * @param builder the builder
>>>>  * @param err the error code
>>>>  * @return the length of the locale id in buffer
>>>>  * @draft ICU 74
>>>>  */
>>>> U_CAPI int32_t U_EXPORT2
>>>> ulb_build(ULocaleBuilder builder, char* buffer, int32_t bufferCapacity,
>>>> UErrorCode* err);
>>>>
>>>> /**
>>>>  * Sets the UErrorCode if an error occurred while recording sets.
>>>>  * Preserves older error codes in the outErrorCode.
>>>>  *
>>>>  * @param builder the builder
>>>>  * @param outErrorCode Set to an error code that occurred while setting
>>>> subtags.
>>>>  *                  Unchanged if there is no such error or if
>>>> outErrorCode
>>>>  *                  already contained an error.
>>>>  * @return true if U_FAILURE(*outErrorCode)
>>>>  * @draft ICU 74
>>>>  */
>>>> U_CAPI UBool U_EXPORT2
>>>> ulb_copyErrorTo(ULocaleBuilder builder, UErrorCode *outErrorCode);
>>>>
>>>> #endif  /* U_HIDE_DRAFT_API */
>>>>
>>>> #endif  // __ULOCBUILDER_H__
>>>> --
>>>> Frank Yung-Fong Tang
>>>> 譚永鋒 / 🌭🍊
>>>> Sr. Software Engineer
>>>> _______________________________________________
>>>> icu-design mailing list
>>>> icu...@li...
>>>> To Un/Subscribe:
>>>> https://lists.sourceforge.net/lists/listinfo/icu-design
>>>>
>>>>
>>>>
>>>
>>> --
>>> Frank Yung-Fong Tang
>>> 譚永鋒 / 🌭🍊
>>> Sr. Software Engineer
>>>
>>>
>>>
>>
>> --
>> Frank Yung-Fong Tang
>> 譚永鋒 / 🌭🍊
>> Sr. Software Engineer
>>
>>
>> --
>> Frank Yung-Fong Tang
>> 譚永鋒 / 🌭🍊
>> Sr. Software Engineer
>>
>> _______________________________________________
>> icu-design mailing list
>> icu...@li...
>> To Un/Subscribe: https://lists.sourceforge.net/lists/listinfo/icu-design
>>
>>
>>
>
> --
> Frank Yung-Fong Tang
> 譚永鋒 / 🌭🍊
> Sr. Software Engineer
>
>
> _______________________________________________
> icu-design mailing list
> icu...@li...
> To Un/Subscribe: https://lists.sourceforge.net/lists/listinfo/icu-design

Re: [icu-design] C API for LocaleBuilder

Open Source C/C++/Java libraries from Unicode

Re: [icu-design] C API for LocaleBuilder