Re: [icu-design] C API for LocaleBuilder

Dear Rich,

Thanks for your reply.
I also want to provide some additional historical context that I forgot
to mention.

The ICU C++ LocaleBuilder API tries to match, in features and signatures,
what has been in Java for years. See
https://docs.oracle.com/javase/8/docs/api/java/util/Locale.Builder.html
What I propose here also tries to match that, so that we do not end up with
yet another model, only for C, that builds locales differently.

As for your comment, "I think you might want an API that has getters,
or an equivalent of the “Locale” class in C++":

I think those getters already exist in uloc.h:
https://unicode-org.github.io/icu-docs/apidoc/dev/icu4c/uloc_8h.html#a490a02bfc1dd2c007911680a9cd87b73

“If this locale has the rg subtag, I want to get rid of it and move its
value into the region field”
The current uloc.h API can be used together with the locale builder to
achieve that:

char localeID[ULOC_FULLNAME_CAPACITY];
// .... copy locale to localeID
UErrorCode status = U_ZERO_ERROR;
ULocaleBuilder builder = ulb_open();
ulb_setLocale(builder, localeID);
// If this locale has the rg subtag
char value[ULOC_FULLNAME_CAPACITY];
if (uloc_getKeywordValue(localeID, uloc_toLegacyKey("rg"),
        value, ULOC_FULLNAME_CAPACITY, &status) > 0 && U_SUCCESS(status)) {
  // get rid of it and
  ulb_setUnicodeLocaleKeyword(builder, "rg", nullptr); // to remove "rg"
  // move its value into the region field
  ulb_setRegion(builder, value);
}
// “If this locale ID doesn’t have a script code,
if (uloc_getScript(localeID, value, ULOC_FULLNAME_CAPACITY, &status) == 0 &&
    uloc_getScript(otherLocaleID, value, ULOC_FULLNAME_CAPACITY, &status) > 0 &&
    U_SUCCESS(status)) {
  // copy in the script code from this other locale ID."
  ulb_setScript(builder, value);
}
int32_t length = ulb_build(builder, localeID, ULOC_FULLNAME_CAPACITY, &status);
ulb_close(builder);

The uloc_* calls (red in the original message) have been in ICU for a long time.
The ulb_* calls (blue in the original message) are what I am proposing here.
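
One caveat for the rg case: as far as I know, the value stored under "rg" is
a subdivision-style code such as "uszzzz" rather than a bare region, so it
probably needs trimming before it is handed to ulb_setRegion(). A minimal
sketch of that step follows, in plain C with no ICU dependency.
regionFromRgValue is a hypothetical helper name, not an ICU function, and it
only handles the two-letter form followed by "zzzz".

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of ICU): derive a two-letter region code
 * from an "rg" keyword value such as "uszzzz".
 * Assumption: the value is two ASCII letters followed by "zzzz".
 * Writes the upper-cased region plus a terminating NUL into out, which
 * must have room for at least 3 bytes.
 * Returns 1 on success, 0 if the value does not have the assumed shape. */
static int regionFromRgValue(const char *rgValue, char *out, size_t outCapacity) {
    assert(rgValue != NULL && out != NULL);
    if (outCapacity < 3 || strlen(rgValue) != 6) {
        return 0;
    }
    if (!isalpha((unsigned char)rgValue[0]) ||
        !isalpha((unsigned char)rgValue[1])) {
        return 0;
    }
    /* The remaining four characters must all be 'z' (any case). */
    for (size_t i = 2; i < 6; ++i) {
        if (tolower((unsigned char)rgValue[i]) != 'z') {
            return 0;
        }
    }
    out[0] = (char)toupper((unsigned char)rgValue[0]);
    out[1] = (char)toupper((unsigned char)rgValue[1]);
    out[2] = '\0';
    return 1;
}
```

With such a helper, the rg branch would call regionFromRgValue(value, region,
sizeof region) first and pass region, not value, to ulb_setRegion().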

On Fri, 30 Jun 2023 at 17:08, Rich Gillam <ric...@ap...> wrote:

> Frank—
>
> I think the basic issue here is that our APIs for manipulating locale IDs
> in C can be really painful to use if you’re doing anything interesting, and
> that we want to get rid of as much of that pain as we can.  I think that’s
> more important than whether we match the C++ interface or conform to some
> particular definition of “builder.”  That said, I’m starting to come around
> to the idea that what you’re proposing really is about as good as we’re
> going to get.
>
> I’m going to withdraw my earlier suggestion.  I don’t think you’re quite
> getting what I’m saying, but my suggestion (using a struct) really only
> works well if you have language/script/region/variant; as soon as you try to
> do anything with key-value pairs, you’d need to use a hash table (or
> something) instead of a struct, and not only does that mean you still have
> the heap allocation I was trying to get rid of, but it actually becomes
> harder to use than what you’re proposing.
>
> I still think there might be value in having a struct-based API that just
> handled language/script/region/variant, but maybe that’s just me, and we
> certainly couldn’t go with JUST that, so it’s probably not worth the
> trouble.
>
> I still tend to like Steven’s suggestion to have one setter with a
> selector code of some kind.  Your point that it would require a switch
> statement is valid, but I don’t think it’s actually that much of a
> performance hit in practice.  But if you’re trying to be as parallel as
> possible to the C++ interface, this works against that, I think.  I’m not
> going to argue this one either way.
>
> I do want to raise one other issue, though: I think you might want an API
> that has getters, or an equivalent of the “Locale” class in C++.  A lot of
> the use cases for a locale builder are “I want this locale ID, but I want
> to change this one field value,” or “I want to change this one field value,
> but only if this other field or attribute has such-and-such a value”.  For
> example: “If this locale has the rg subtag, I want to get rid of it and
> move its value into the region field” or “If this locale ID doesn’t have a
> script code, copy in the script code from this other locale ID."  If we
> still use things like uloc_getName() to do “if this field has this value”
> part of the logic, we still have a problem.
>
> That could be a separate ticket and API proposal, of course, but I think
> we want to think about it too.
>
> —Rich
>
> On Jun 30, 2023, at 3:53 PM, Frank Tang (譚永鋒) via icu-design <
> icu...@li...> wrote:
>
>
>
> ---------- Forwarded message ---------
> From: Frank Tang (譚永鋒) <ft...@go...>
> Date: Fri, 30 Jun 2023 at 15:53
> Subject: Re: [icu-design] C API for LocaleBuilder
> To: Rich Gillam <ric...@ap...>
>
>
>
>
> On Fri, 30 Jun 2023 at 11:00, Rich Gillam <ric...@ap...>
> wrote:
>
>> Frank--
>>
>> That would be a different approach to construct a locale. However, I
>> would like to point out several things here.
>> 1. That approach won't be a "builder"
>>
>>
>> But it would solve the problem we’re being asked to solve.
>>
>
> No, it does not. What the bug asked for is a C builder API, and an API
> that does not follow the "builder" design pattern does not solve the
> problem it describes. A builder needs to work step by step, so that the
> decision for each field can be made at different times and in different
> places in the code. A single function taking a struct does not allow that,
> and therefore does not fulfill the request.
>
> I am open to a concrete counterproposal. It is tempting to think the issue
> could be solved by a simple set() with a constant if you consider only 1)
> language, 2) script, 3) region, 4) variant, 5) languageTag, and 6) localeId
> as the set of inputs. But once you start to consider 7) attributes, 8)
> Unicode locale keywords/values, and 9) extensions other than "-u-" (for
> example "-t-" or anything in other_extension), things become complicated.
> The validation and override/replacement logic for each of them is
> different.
>
>
>
>>
>> but what you describe would require the caller to resolve that before
>> putting it into the struct.
>>
>>
>> No, it wouldn’t.  For building locales, there’s nothing special going on
>> while you’re setting up the builder— it’s just a bag of fields.
>>
>
> No, that is not true. It is NOT a bag of fields; it is an ordered list of
> fields, so the order matters.
> With bag semantics, the order would not matter.
>
> Consider the following
>
> languageTag "en-Latn-GB-u-ca-roc"
> language "ja"
> script "Thai"
> region "CN"
> variant "hepburn-heploc"
> unicode extension "ca"= "japanese"
> language "sl"
> variant "bohoric"
> unicode extension "ca"= "coptic"
>
> which will produce "sl-Thai-CN-bohoric-u-ca-coptic"
>
> but if you treat them as a bag and apply them in this order instead:
> language "ja"
> script "Thai"
> region "CN"
> variant "hepburn-heploc"
> unicode extension "ca"= "japanese"
> language "sl"
> languageTag "en-Latn-GB-u-ca-roc"
> variant "bohoric"
> unicode extension "ca"= "coptic"
>
> you will get "en-Latn-GB-bohoric-u-ca-coptic",
> which is a different result.
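
The two sequences above can be replayed with a toy model in plain C. This is
only a sketch of the "later call overrides earlier call" behaviour: ToyBuilder,
toySet, toySetTag and toyBuild are hypothetical names, there is no validation
or real tag parsing, and the languageTag setter is modelled as a call that
takes pre-split parts and discards all earlier state.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Toy model only: fixed-size fields, no validation, no real tag parsing. */
typedef struct {
    char language[16];
    char script[16];
    char region[16];
    char variant[32];
    char calendar[16];  /* stands in for the "-u-ca-" keyword value */
} ToyBuilder;

/* Overwrite one field; the last write wins. */
static void toySet(char *field, size_t capacity, const char *value) {
    snprintf(field, capacity, "%s", value);
}
#define TOY_SET(b, f, v) toySet((b)->f, sizeof((b)->f), (v))

/* Stand-in for a languageTag setter: discards ALL earlier state first. */
static void toySetTag(ToyBuilder *b, const char *language, const char *script,
                      const char *region, const char *calendar) {
    memset(b, 0, sizeof *b);
    TOY_SET(b, language, language);
    TOY_SET(b, script, script);
    TOY_SET(b, region, region);
    TOY_SET(b, calendar, calendar);
}

/* Assumes every field is non-empty, which holds for both sequences below. */
static void toyBuild(const ToyBuilder *b, char *out, size_t capacity) {
    assert(capacity > 0);
    snprintf(out, capacity, "%s-%s-%s-%s-u-ca-%s",
             b->language, b->script, b->region, b->variant, b->calendar);
}

/* First sequence: the language tag is applied before everything else. */
static void buildTagFirst(char *out, size_t capacity) {
    ToyBuilder b;
    toySetTag(&b, "en", "Latn", "GB", "roc");
    TOY_SET(&b, language, "ja");
    TOY_SET(&b, script, "Thai");
    TOY_SET(&b, region, "CN");
    TOY_SET(&b, variant, "hepburn-heploc");
    TOY_SET(&b, calendar, "japanese");
    TOY_SET(&b, language, "sl");
    TOY_SET(&b, variant, "bohoric");
    TOY_SET(&b, calendar, "coptic");
    toyBuild(&b, out, capacity);
}

/* Second sequence: same calls, but the language tag is applied in the middle. */
static void buildTagInMiddle(char *out, size_t capacity) {
    ToyBuilder b;
    memset(&b, 0, sizeof b);
    TOY_SET(&b, language, "ja");
    TOY_SET(&b, script, "Thai");
    TOY_SET(&b, region, "CN");
    TOY_SET(&b, variant, "hepburn-heploc");
    TOY_SET(&b, calendar, "japanese");
    TOY_SET(&b, language, "sl");
    toySetTag(&b, "en", "Latn", "GB", "roc");
    TOY_SET(&b, variant, "bohoric");
    TOY_SET(&b, calendar, "coptic");
    toyBuild(&b, out, capacity);
}
```

The same nine inputs go in both times; only the position of the tag call
differs, and that alone changes the language, script and region of the result.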
>
>
>
>
>>  All the magic happens when you run the build() method.
>>
>
> That should be implementation-dependent and should not be mandated by the
> API. There is nothing wrong with building progressively as each setter
> is called.
>
>
>>  I’m just saying that maybe instead of adding a new opaque class with a
>> bunch of individual getters and setters for all those fields, we just use a
>> simple struct (or a hash table, or something).
>>
>
> As I mentioned, what you said only makes sense when the underlying problem
> is *a bag of fields*, but this problem is not just *a bag of fields*. The
> pieces of information passed into the builder overlap with each other. In
> particular, the information inside localeID and languageTag is not
> mutually exclusive from
> language/script/region/variant/attribute/Unicode locale keyword+value and
> other extensions. They overlap, and parts of one can be overridden by
> another. If all the inputs were mutually exclusive, the problem would be
> much simpler and could possibly be solved by "just use a simple struct".
> But that is NOT the case, as the example above shows.
>
>
>
>> The approach you mentioned seems to require the caller to do a lot of
>> work to fill that struct, and the caller also needs to learn how to program
>> UHashTable while using that API. Is that better?
>>
>>
>> Good point.  UHashTable is a pain in the butt.  Maybe that’s enough right
>> there to swing me back around to what you’ve already proposed…
>>
>> That said, I do kind of like Steven’s suggestion to just have one set()
>> method and just have constants for the individual fields...
>>
>
> If we look at the pre-existing locale C API in uloc.h, we have separate
> functions instead of a single get with a constant:
> uloc_getLanguage(....
> uloc_getScript(....
> uloc_getCountry(....
> uloc_getVariant(....
> uloc_getName(....
> uloc_getBaseName(...
>
> Notice there is no API to get attributes.
>
> We could also have used a uloc_get() with a constant, right? But we did
> not, years ago. So should we follow the same design pattern, or switch to
> a set with a constant?
>
>
> We have the following different kinds of operations on the builder:
>
> 1. language (w/o script and region), region, script, variant, language tag
> (w/ script or region) => set
> 2. attribute => add and remove (there can be multiple attributes in a
> locale)
> 3. Unicode extension => set on a key ("ca", "japanese")
> 4. other extension => set on an extension ('a', "abc-efg")
> 5. clear extensions (say you start from "en-u-ca-japanese-nu-Thai" and you
> want to clear the calendar and change nu to "Arab", giving "en-u-nu-Arab")
>
> A simple set-with-constant method could simplify #1 above, but it is not
> powerful enough to deal with #2, #3, #4 and #5.
> Also, I personally dislike the set + enum => internal switch(enum)
> approach, because almost all the time the caller knows at compile time
> which operation it needs to perform, yet this approach forces the function
> to switch internally on a value that is fixed in the program but cannot
> easily be optimized away by the compiler.
>
> Consider the following case
>
>
> setA(context, value) {
>    context.a = value;
> }
>
> setB(context, value) {
>    context.b = value;
> }
> and the caller
>
> setA(context, 3);
> setB(context, 5);
>
> vs
> set(context, field, value) {
>   switch(field) {
>      case A: context.a = value; break;
>      case B: context.b = value; break;
>   }
> }
> and the caller
> set(context, A, 3);
> set(context, B, 5);
>
> The switch/case inside set() cannot easily be optimized away by the
> compiler.
>
> Certainly the first approach exposes more functions, but that exposure lets
> the compiler avoid an unnecessary run-time hit from a switch/case that
> could be resolved at compile time rather than at run time.
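
For reference, the two styles compared above can be written as compilable C.
This is a sketch only: Context, Field, FIELD_A and FIELD_B are illustrative
names, not part of any proposed API. Both styles leave the context in the same
state; they differ only in whether the field is selected by the function name
or by a run-time argument. How much the switch costs in practice depends on
the compiler and on whether set() is visible at the call site, and nothing
here measures it.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    int a;
    int b;
} Context;

typedef enum { FIELD_A, FIELD_B } Field;

/* Style 1: one function per field; the field is fixed by the function name. */
static void setA(Context *context, int value) {
    assert(context != NULL);
    context->a = value;
}

static void setB(Context *context, int value) {
    assert(context != NULL);
    context->b = value;
}

/* Style 2: one function plus a selector; the field is picked at run time. */
static void set(Context *context, Field field, int value) {
    assert(context != NULL);
    switch (field) {
        case FIELD_A: context->a = value; break;
        case FIELD_B: context->b = value; break;
    }
}
```

A caller would write setA(&ctx, 3); setB(&ctx, 5); in the first style and
set(&ctx, FIELD_A, 3); set(&ctx, FIELD_B, 5); in the second.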
>
>
>
>> —Rich
>>
>> On Jun 30, 2023, at 12:44 AM, Frank Tang (譚永鋒) <ft...@go...> wrote:
>>
>> Dear Richard
>>
>> On Thu, 29 Jun 2023 at 18:31, Richard Gillam <ric...@ap...>
>> wrote:
>>
>>> Frank—
>>>
>>> This looks fine, but also kind of heavy-weight.  I wonder if it’d make
>>> sense to just have a struct with all the fields and utility methods to
>>> convert between a locale ID string and that struct?
>>>
>>
>> That would be a different approach to construct a locale. However, I
>> would like to point out several things here.
>> 1. That approach won't be a "builder"
>> The definition of a Builder design pattern is
>> "Builder is a creational design pattern that lets you construct complex
>> objects step by step. "
>> "Problem
>> Imagine a complex object that requires laborious, step-by-step
>> initialization of many fields and nested objects. Such initialization code
>> is usually buried inside a monstrous constructor with lots of parameters.
>> Or even worse: scattered all over the client code."
>>
>> In other words, the key concept of a "builder" is "step by step": the
>> builder can be passed to different parts of the system so that each part
>> of the software can add information to it, without needing a centralized
>> place to gather all the information together.
>> See https://refactoring.guru/design-patterns/builder
>>
>> 2.  The step-by-step processing also means that the order in which
>> information is put in matters, and later information overrides earlier
>> information.
>> For example, I could have:
>> ULocaleBuilder builder = ulb_open();
>> ulb_setLanguageTag(builder, "en-Latn-GB-u-ca-islamic");
>> ulb_setRegion(builder, "FR");
>> ulb_setExtension(builder, 'x', "abcd");
>> .....
>> ulb_setUnicodeLocaleKeyword(builder, "ca", "japanese");
>> ulb_setRegion(builder, "DE");
>> ulb_setUnicodeLocaleKeyword(builder, "ca", "roc");
>> ulb_build(builder...)
>> ulb_close(builder)
>>
>> to build a locale "en-Latn-DE-u-ca-roc-x-abcd"
>>
>> but what you describe would require the caller to resolve that before
>> putting it into the struct.
>>
>>  I guess the weak point in that approach would be the key-value pairs at
>>> the end of the locale ID— an alternative approach might be to use a
>>> UHashTable (or whatever it's called) as the bearer of fields (or to provide
>>> both the struct and the hash table, depending on whether the caller cares
>>> about the key-value fields).  The beauty of an approach like this is that
>>> you save a few heap allocations and don’t have to add as many new API
>>> functions.
>>>
>>
>> The approach you mentioned seems to require the caller to do a lot of
>> work to fill that struct, and the caller also needs to learn how to program
>> UHashTable while using that API. Is that better?
>>
>>
>>> On the other hand, what you have is a thin wrapper around the existing
>>> C++ LocaleBuilder class, so you’re not having to implement the
>>> locale-mangling code all over again.
>>>
>>> I don’t know— what does everybody else think?
>>>
>>> —Rich
>>>
>>> On Jun 29, 2023, at 5:09 PM, Frank Tang (譚永鋒) via icu-design <
>>> icu...@li...> wrote:
>>>
>>>
>>>
>>> Dear ICU team & users,
>>>
>>>
>>> I would like to propose the following for: ICU 74
>>>
>>> Please provide feedback by: Next Wednesday, July 5, or any time
>>> sufficiently in advance of the feature freeze
>>>
>>> Designated API review: Gary Wade
>>>
>>> Issue: https://unicode-org.atlassian.net/browse/ICU-22365
>>> A draft PR and implementation (not including tests yet) can be found at
>>> https://github.com/unicode-org/icu/pull/2520
>>>
>>> This is to follow up the C++ API in ICU 64 and provide a C API requested
>>> in
>>>
>>> Issues to be discussed
>>>
>>> 1. Should the prefix be ulb_ or something else?
>>> 2. What should ulb_build build? The value as returned by
>>> Locale::getName() or Locale::toLanguageTag()?
>>> 3. Is it OK to use the files ulocbuilder.{h,cpp}, or should it be
>>> something else?
>>>
>>> Add C API for ULocaleBuilder
>>>
>>> https://unicode-org.atlassian.net/browse/ICU-22365
>>>
>>> File : icu4c/source/common/ulocbuilder.h
>>>
>>> // © 2023 and later: Unicode, Inc. and others.
>>> // License & terms of use: http://www.unicode.org/copyright.html
>>> #ifndef __ULOCBUILDER_H__
>>> #define __ULOCBUILDER_H__
>>>
>>> #include "unicode/utypes.h"
>>>
>>>
>>> /**
>>>  * \file
>>>  * \brief C API: Builder API for Locale
>>>  */
>>>
>>> #ifndef U_HIDE_DRAFT_API
>>>
>>> /**
>>>  * Opaque type for a Locale builder.
>>>  * @draft ICU 74
>>>  */
>>> typedef void *ULocaleBuilder;
>>>
>>> /**
>>>  * <code>ULocaleBuilder</code> is used to build a valid
>>>  * <code>locale</code> string
>>>  * from values configured by the setters.
>>>  * The <code>ULocaleBuilder</code> checks if a value configured by a
>>>  * setter satisfies the syntax requirements defined by the
>>> <code>Locale</code>
>>>  * class.  A string of Locale created by a <code>ULocaleBuilder</code> is
>>>  * well-formed and can be transformed to a well-formed IETF BCP 47
>>> language tag
>>>  * without losing information.
>>>  *
>>>  * <p>The following example shows how to create a <code>locale</code>
>>> string
>>>  * with the <code>ULocaleBuilder</code>.
>>>  * <blockquote>
>>>  * <pre>
>>>  *     UErrorCode err = U_ZERO_ERROR;
>>>  *     char buffer[ULOC_FULLNAME_CAPACITY];
>>>  *     ULocaleBuilder builder = ulb_open();
>>>  *     ulb_setLanguage(builder, "sr");
>>>  *     ulb_setScript(builder, "Latn");
>>>  *     ulb_setRegion(builder, "RS");
>>>  *     int32_t length = ulb_build(
>>>  *         builder, buffer, ULOC_FULLNAME_CAPACITY, &err);
>>>  *     ulb_close(builder);
>>>  * </pre>
>>>  * </blockquote>
>>>  *
>>>  * <p>ULocaleBuilders can be reused; <code>ulb_clear()</code> resets all
>>>  * fields to their default values.
>>>  *
>>>  * <p>ULocaleBuilder tracks errors in an internal UErrorCode. All
>>>  * setters except ulb_setLanguageTag() and ulb_setLocale() return
>>>  * immediately if the internal UErrorCode is in an error state.
>>>  * To reset the internal state and error code, call ulb_clear().
>>>  * ulb_setLanguageTag() and ulb_setLocale() first clear the internal
>>>  * UErrorCode, then record any error from validating the input
>>>  * parameter in the internal UErrorCode.
>>>  *
>>>  * @draft ICU 74
>>>  */
>>>
>>> /**
>>>  * Constructs an empty ULocaleBuilder. The default value of all
>>>  * fields, extensions, and private use information is the
>>>  * empty string. The created builder should be destroyed by calling
>>>  * ulb_close();
>>>  *
>>>  * @draft ICU 74
>>>  */
>>> U_CAPI ULocaleBuilder U_EXPORT2
>>> ulb_open();
>>>
>>> /**
>>>  * Close the builder and destroy its internal state.
>>>  * @param builder the builder
>>>  * @draft ICU 74
>>>  */
>>> U_CAPI void U_EXPORT2
>>> ulb_close(ULocaleBuilder builder);
>>>
>>> /**
>>>  * Resets the <code>ULocaleBuilder</code> to match the provided
>>>  * <code>locale</code>.  Existing state is discarded.
>>>  *
>>>  * <p>All fields of the locale must be well-formed.
>>>  * <p>This method clears the internal UErrorCode.
>>>  *
>>>  * @param builder the builder
>>>  * @param locale the locale
>>>  *
>>>  * @draft ICU 74
>>>  */
>>> U_CAPI void U_EXPORT2
>>> ulb_setLocale(ULocaleBuilder builder, const char* locale);
>>>
>>> /**
>>>  * Resets the ULocaleBuilder to match the provided IETF BCP 47 language
>>> tag.
>>>  * Discards the existing state.
>>>  * The empty string causes the builder to be reset, like ulb_clear().
>>>  * Legacy language tags (marked as “Type: grandfathered” in BCP 47)
>>>  * are converted to their canonical form before being processed.
>>>  * Otherwise, the <code>language tag</code> must be well-formed,
>>>  * or else the ulb_build() method will later report an
>>> U_ILLEGAL_ARGUMENT_ERROR.
>>>  *
>>>  * <p>This method clears the internal UErrorCode.
>>>  *
>>>  * @param builder the builder
>>>  * @param tag the language tag, defined as IETF BCP 47 language tag.
>>>  * @draft ICU 74
>>>  */
>>> U_CAPI void U_EXPORT2
>>> ulb_setLanguageTag(ULocaleBuilder builder, const char* tag);
>>>
>>> /**
>>>  * Sets the language.  If <code>language</code> is the empty string, the
>>>  * language in this <code>ULocaleBuilder</code> is removed. Otherwise,
>>> the
>>>  * <code>language</code> must be well-formed, or else the ulb_build()
>>> method will
>>>  * later report an U_ILLEGAL_ARGUMENT_ERROR.
>>>  *
>>>  * <p>The syntax of language value is defined as
>>>  * [unicode_language_subtag](
>>> http://www.unicode.org/reports/tr35/tr35.html#unicode_language_subtag).
>>>  *
>>>  * @param builder the builder
>>>  * @param language the language
>>>  * @draft ICU 74
>>>  */
>>> U_CAPI void U_EXPORT2
>>> ulb_setLanguage(ULocaleBuilder builder, const char* language);
>>>
>>> /**
>>>  * Sets the script. If <code>script</code> is the empty string, the
>>> script in
>>>  * this <code>ULocaleBuilder</code> is removed.
>>>  * Otherwise, the <code>script</code> must be well-formed, or else the
>>> ulb_build()
>>>  * method will later report an U_ILLEGAL_ARGUMENT_ERROR.
>>>  *
>>>  * <p>The script value is a four-letter script code as
>>>  * [unicode_script_subtag](
>>> http://www.unicode.org/reports/tr35/tr35.html#unicode_script_subtag)
>>>  * defined by ISO 15924
>>>  *
>>>  * @param builder the builder
>>>  * @param script the script
>>>  * @draft ICU 74
>>>  */
>>> U_CAPI void U_EXPORT2
>>> ulb_setScript(ULocaleBuilder builder, const char* script);
>>>
>>> /**
>>>  * Sets the region.  If region is the empty string, the region in this
>>>  * <code>ULocaleBuilder</code> is removed. Otherwise, the
>>> <code>region</code>
>>>  * must be well-formed, or else the ulb_build() method will later report
>>> an
>>>  * U_ILLEGAL_ARGUMENT_ERROR.
>>>  *
>>>  * <p>The region value is defined by
>>>  *  [unicode_region_subtag](
>>> http://www.unicode.org/reports/tr35/tr35.html#unicode_region_subtag)
>>>  * as a two-letter ISO 3166 code or a three-digit UN M.49 area code.
>>>  *
>>>  * <p>The region value in the <code>Locale</code> created by the
>>>  * <code>ULocaleBuilder</code> is always normalized to upper case.
>>>  *
>>>  * @param builder the builder
>>>  * @param region the region
>>>  * @draft ICU 74
>>>  */
>>> U_CAPI void U_EXPORT2
>>> ulb_setRegion(ULocaleBuilder builder, const char* region);
>>>
>>> /**
>>>  * Sets the variant.  If variant is the empty string, the variant in this
>>>  * <code>ULocaleBuilder</code> is removed.  Otherwise, the
>>> <code>variant</code>
>>>  * must be well-formed, or else the ulb_build() method will later report
>>> an
>>>  * U_ILLEGAL_ARGUMENT_ERROR.
>>>  *
>>>  * <p><b>Note:</b> This method checks if <code>variant</code>
>>>  * satisfies the
>>>  * [unicode_variant_subtag](
>>> http://www.unicode.org/reports/tr35/tr35.html#unicode_variant_subtag)
>>>  * syntax requirements, and normalizes the value to lowercase letters.
>>> However,
>>>  * the <code>Locale</code> class does not impose any syntactic
>>>  * restriction on variant. To set an ill-formed variant, use a Locale
>>> constructor.
>>>  * If there are multiple unicode_variant_subtag, the caller must
>>> concatenate
>>>  * them with '-' as separator (ex: "foobar-fibar").
>>>  *
>>>  * @param builder the builder
>>>  * @param variant the variant
>>>  * @draft ICU 74
>>>  */
>>> U_CAPI void U_EXPORT2
>>> ulb_setVariant(ULocaleBuilder builder, const char* variant);
>>>
>>> /**
>>>  * Sets the extension for the given key. If the value is the empty
>>> string,
>>>  * the extension is removed.  Otherwise, the <code>key</code> and
>>>  * <code>value</code> must be well-formed, or else the ulb_build()
>>> method will
>>>  * later report an U_ILLEGAL_ARGUMENT_ERROR.
>>>  *
>>>  * <p><b>Note:</b> The key ('u') is used for the Unicode locale
>>> extension.
>>>  * Setting a value for this key replaces any existing Unicode locale
>>> key/type
>>>  * pairs with those defined in the extension.
>>>  *
>>>  * <p><b>Note:</b> The key ('x') is used for the private use code. To be
>>>  * well-formed, the value for this key needs only to have subtags of one
>>> to
>>>  * eight alphanumeric characters, not two to eight as in the general
>>> case.
>>>  *
>>>  * @param builder the builder
>>>  * @param key the extension key
>>>  * @param value the extension value
>>>  * @draft ICU 74
>>>  */
>>> U_CAPI void U_EXPORT2
>>> ulb_setExtension(ULocaleBuilder builder, char key, const char* value);
>>>
>>> /**
>>>  * Sets the Unicode locale keyword type for the given key. If the type
>>>  * is NULL, the keyword is removed.
>>>  * If the type is the empty string, the keyword is set without type
>>>  * subtags.
>>>  * Otherwise, the key and type must be well-formed, or else the
>>> ulb_build()
>>>  * method will later report an U_ILLEGAL_ARGUMENT_ERROR.
>>>  *
>>>  * <p>Keys and types are converted to lower case.
>>>  *
>>>  * <p><b>Note</b>: Setting the 'u' extension via ulb_setExtension()
>>>  * replaces all Unicode locale keywords with those defined in the
>>>  * extension.
>>>  *
>>>  * @param builder the builder
>>>  * @param key the Unicode locale key
>>>  * @param type the Unicode locale type
>>>  * @draft ICU 74
>>>  */
>>> U_CAPI void U_EXPORT2
>>> ulb_setUnicodeLocaleKeyword(ULocaleBuilder builder,
>>>         const char* key, const char* type);
>>>
>>> /**
>>>  * Adds a unicode locale attribute, if not already present, otherwise
>>>  * has no effect.  The attribute must not be empty string and must be
>>>  * well-formed or U_ILLEGAL_ARGUMENT_ERROR will be set to status
>>>  * during the ulb_build() call.
>>>  *
>>>  * @param builder the builder
>>>  * @param attribute the attribute
>>>  * @draft ICU 74
>>>  */
>>> U_CAPI void U_EXPORT2
>>> ulb_addUnicodeLocaleAttribute(ULocaleBuilder builder, const char*
>>> attribute);
>>>
>>> /**
>>>  * Removes a unicode locale attribute, if present, otherwise has no
>>>  * effect.  The attribute must not be empty string and must be
>>> well-formed
>>>  * or U_ILLEGAL_ARGUMENT_ERROR will be set to status during the
>>> ulb_build() call.
>>>  *
>>>  * <p>Attribute comparison for removal is case-insensitive.
>>>  *
>>>  * @param builder the builder
>>>  * @param attribute the attribute
>>>  * @draft ICU 74
>>>  */
>>> U_CAPI void U_EXPORT2
>>> ulb_removeUnicodeLocaleAttribute(ULocaleBuilder builder, const char*
>>> attribute);
>>>
>>> /**
>>>  * Resets the builder to its initial, empty state.
>>>  * <p>This method clears the internal UErrorCode.
>>>  *
>>>  * @param builder the builder
>>>  * @draft ICU 74
>>>  */
>>> U_CAPI void U_EXPORT2
>>> ulb_clear(ULocaleBuilder builder);
>>>
>>> /**
>>>  * Resets the extensions to their initial, empty state.
>>>  * Language, script, region and variant are unchanged.
>>>  *
>>>  * @param builder the builder
>>>  * @draft ICU 74
>>>  */
>>> U_CAPI void U_EXPORT2
>>> ulb_clearExtensions(ULocaleBuilder builder);
>>>
>>> /**
>>>  * Builds the locale string from the fields set
>>>  * on this builder.
>>>  * If a memory allocation fails in any setter or during the ulb_build()
>>>  * call, status will be set to U_MEMORY_ALLOCATION_ERROR.
>>>  * If any of the fields set by the setters are not well-formed, the
>>> status
>>>  * will be set to U_ILLEGAL_ARGUMENT_ERROR. The state of the builder will
>>>  * not change after the ulb_build() call and the caller is free to keep
>>> using
>>>  * the same builder to build more locales.
>>>  *
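>>>  * @param builder the builder
>>>  * @param buffer the output buffer that receives the locale string
>>>  * @param bufferCapacity the capacity of buffer
>>>  * @param err the error code
>>>  * @return the length of the locale id in buffer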
>>>  * @draft ICU 74
>>>  */
>>> U_CAPI int32_t U_EXPORT2
>>> ulb_build(ULocaleBuilder builder, char* buffer, int32_t bufferCapacity,
>>> UErrorCode* err);
>>>
>>> /**
>>>  * Sets the UErrorCode if an error occurred while recording sets.
>>>  * Preserves older error codes in the outErrorCode.
>>>  *
>>>  * @param builder the builder
>>>  * @param outErrorCode Set to an error code that occurred while setting
>>> subtags.
>>>  *                  Unchanged if there is no such error or if
>>> outErrorCode
>>>  *                  already contained an error.
>>>  * @return true if U_FAILURE(*outErrorCode)
>>>  * @draft ICU 74
>>>  */
>>> U_CAPI UBool U_EXPORT2
>>> ulb_copyErrorTo(ULocaleBuilder builder, UErrorCode *outErrorCode);
>>>
>>> #endif  /* U_HIDE_DRAFT_API */
>>>
>>> #endif  // __ULOCBUILDER_H__
>>> --
>>> Frank Yung-Fong Tang
>>> 譚永鋒 / 🌭🍊
>>> Sr. Software Engineer
>>> _______________________________________________
>>> icu-design mailing list
>>> icu...@li...
>>> To Un/Subscribe: https://lists.sourceforge.net/lists/listinfo/icu-design
>>>
>>>
>>>
>>
>> --
>> Frank Yung-Fong Tang
>> 譚永鋒 / 🌭🍊
>> Sr. Software Engineer
>>
>>
>>
>
> --
> Frank Yung-Fong Tang
> 譚永鋒 / 🌭🍊
> Sr. Software Engineer
>
>
>
>
>

-- 
Frank Yung-Fong Tang
譚永鋒 / 🌭🍊
Sr. Software Engineer
