From: Frank T. (譚. <ft...@go...> - 2023-06-30 00:10:04
Dear ICU team & users,

I would like to propose the following for: ICU 74
Please provide feedback by: Next Wednesday, July 5, or any time sufficiently in advance of the feature freeze
Designated API review: Gary Wade
Issue: https://unicode-org.atlassian.net/browse/ICU-22365

A draft PR and implementation (not including tests yet) can be found at
https://github.com/unicode-org/icu/pull/2520

This follows up on the C++ API in ICU 64 and provides the C API requested in the issue above.

Issues to be discussed:
1. Should the prefix be ulb_ or something else?
2. What should ulb_build build: the value returned by Locale::getName() or by Locale::toLanguageTag()?
3. Should the files be named ulocbuilder.{h,cpp}, or something else?

Add C API for ULocaleBuilder
https://unicode-org.atlassian.net/browse/ICU-22365

File : icu4c/source/common/ulocbuilder.h

// © 2023 and later: Unicode, Inc. and others.
// License & terms of use: http://www.unicode.org/copyright.html
#ifndef __ULOCBUILDER_H__
#define __ULOCBUILDER_H__

#include "unicode/utypes.h"

/**
 * \file
 * \brief C API: Builder API for Locale
 */

#ifndef U_HIDE_DRAFT_API

/**
 * Opaque type for a Locale builder.
 * @draft ICU 74
 */
typedef void *ULocaleBuilder;

/**
 * <code>ULocaleBuilder</code> is used to build a valid <code>locale</code> string
 * from values configured by the setters.
 * The <code>ULocaleBuilder</code> checks if a value configured by a
 * setter satisfies the syntax requirements defined by the <code>Locale</code>
 * class. A locale string created by a <code>ULocaleBuilder</code> is
 * well-formed and can be transformed to a well-formed IETF BCP 47 language tag
 * without losing information.
 *
 * <p>The following example shows how to create a <code>locale</code> string
 * with the <code>ULocaleBuilder</code>.
 * <blockquote>
 * <pre>
 *     UErrorCode err = U_ZERO_ERROR;
 *     char buffer[ULOC_FULLNAME_CAPACITY];
 *     ULocaleBuilder builder = ulb_open();
 *     ulb_setLanguage(builder, "sr");
 *     ulb_setScript(builder, "Latn");
 *     ulb_setRegion(builder, "RS");
 *     int32_t length = ulb_build(
 *         builder, buffer, ULOC_FULLNAME_CAPACITY, &err);
 *     ulb_close(builder);
 * </pre>
 * </blockquote>
 *
 * <p>ULocaleBuilders can be reused; <code>ulb_clear()</code> resets all
 * fields to their default values.
 *
 * <p>ULocaleBuilder tracks errors in an internal UErrorCode. All setters,
 * except ulb_setLanguageTag() and ulb_setLocale(), return immediately
 * if the internal UErrorCode is in an error state.
 * To reset the internal state and error code, call ulb_clear().
 * ulb_setLanguageTag() and ulb_setLocale() first clear the internal
 * UErrorCode, then record any error from validating the input parameter
 * in the internal UErrorCode.
 *
 * @draft ICU 74
 */

/**
 * Constructs an empty ULocaleBuilder. The default value of all
 * fields, extensions, and private use information is the
 * empty string. The created builder should be destroyed by calling
 * ulb_close().
 *
 * @draft ICU 74
 */
U_CAPI ULocaleBuilder U_EXPORT2
ulb_open();

/**
 * Closes the builder and destroys its internal state.
 * @param builder the builder
 * @draft ICU 74
 */
U_CAPI void U_EXPORT2
ulb_close(ULocaleBuilder builder);

/**
 * Resets the <code>ULocaleBuilder</code> to match the provided
 * <code>locale</code>. Existing state is discarded.
 *
 * <p>All fields of the locale must be well-formed.
 * <p>This method clears the internal UErrorCode.
 *
 * @param builder the builder
 * @param locale the locale
 *
 * @draft ICU 74
 */
U_CAPI void U_EXPORT2
ulb_setLocale(ULocaleBuilder builder, const char* locale);

/**
 * Resets the ULocaleBuilder to match the provided IETF BCP 47 language tag.
 * Discards the existing state.
 * The empty string causes the builder to be reset, like ulb_clear().
 * Legacy language tags (marked as “Type: grandfathered” in BCP 47)
 * are converted to their canonical form before being processed.
 * Otherwise, the <code>language tag</code> must be well-formed,
 * or else the ulb_build() method will later report a U_ILLEGAL_ARGUMENT_ERROR.
 *
 * <p>This method clears the internal UErrorCode.
 *
 * @param builder the builder
 * @param tag the language tag, defined as IETF BCP 47 language tag.
 * @draft ICU 74
 */
U_CAPI void U_EXPORT2
ulb_setLanguageTag(ULocaleBuilder builder, const char* tag);

/**
 * Sets the language. If <code>language</code> is the empty string, the
 * language in this <code>ULocaleBuilder</code> is removed. Otherwise, the
 * <code>language</code> must be well-formed, or else the ulb_build() method will
 * later report a U_ILLEGAL_ARGUMENT_ERROR.
 *
 * <p>The syntax of the language value is defined as
 * [unicode_language_subtag](http://www.unicode.org/reports/tr35/tr35.html#unicode_language_subtag).
 *
 * @param builder the builder
 * @param language the language
 * @draft ICU 74
 */
U_CAPI void U_EXPORT2
ulb_setLanguage(ULocaleBuilder builder, const char* language);

/**
 * Sets the script. If <code>script</code> is the empty string, the script in
 * this <code>ULocaleBuilder</code> is removed.
 * Otherwise, the <code>script</code> must be well-formed, or else the ulb_build()
 * method will later report a U_ILLEGAL_ARGUMENT_ERROR.
 *
 * <p>The script value is a four-letter script code as
 * [unicode_script_subtag](http://www.unicode.org/reports/tr35/tr35.html#unicode_script_subtag)
 * defined by ISO 15924.
 *
 * @param builder the builder
 * @param script the script
 * @draft ICU 74
 */
U_CAPI void U_EXPORT2
ulb_setScript(ULocaleBuilder builder, const char* script);

/**
 * Sets the region. If <code>region</code> is the empty string, the region in this
 * <code>ULocaleBuilder</code> is removed. Otherwise, the <code>region</code>
 * must be well-formed, or else the ulb_build() method will later report a
 * U_ILLEGAL_ARGUMENT_ERROR.
 *
 * <p>The region value is defined by
 * [unicode_region_subtag](http://www.unicode.org/reports/tr35/tr35.html#unicode_region_subtag)
 * as a two-letter ISO 3166 code or a three-digit UN M.49 area code.
 *
 * <p>The region value in the <code>Locale</code> created by the
 * <code>ULocaleBuilder</code> is always normalized to upper case.
 *
 * @param builder the builder
 * @param region the region
 * @draft ICU 74
 */
U_CAPI void U_EXPORT2
ulb_setRegion(ULocaleBuilder builder, const char* region);

/**
 * Sets the variant. If <code>variant</code> is the empty string, the variant in this
 * <code>ULocaleBuilder</code> is removed. Otherwise, the <code>variant</code>
 * must be well-formed, or else the ulb_build() method will later report a
 * U_ILLEGAL_ARGUMENT_ERROR.
 *
 * <p><b>Note:</b> This method checks if <code>variant</code>
 * satisfies the
 * [unicode_variant_subtag](http://www.unicode.org/reports/tr35/tr35.html#unicode_variant_subtag)
 * syntax requirements, and normalizes the value to lowercase letters. However,
 * the <code>Locale</code> class does not impose any syntactic
 * restriction on variant. To set an ill-formed variant, use a Locale constructor.
 * If there are multiple unicode_variant_subtag values, the caller must concatenate
 * them with '-' as separator (ex: "foobar-fibar").
 *
 * @param builder the builder
 * @param variant the variant
 * @draft ICU 74
 */
U_CAPI void U_EXPORT2
ulb_setVariant(ULocaleBuilder builder, const char* variant);

/**
 * Sets the extension for the given key. If the value is the empty string,
 * the extension is removed. Otherwise, the <code>key</code> and
 * <code>value</code> must be well-formed, or else the ulb_build() method will
 * later report a U_ILLEGAL_ARGUMENT_ERROR.
 *
 * <p><b>Note:</b> The key ('u') is used for the Unicode locale extension.
 * Setting a value for this key replaces any existing Unicode locale key/type
 * pairs with those defined in the extension.
 *
 * <p><b>Note:</b> The key ('x') is used for the private use code.
 * To be well-formed, the value for this key needs only to have subtags of one to
 * eight alphanumeric characters, not two to eight as in the general case.
 *
 * @param builder the builder
 * @param key the extension key
 * @param value the extension value
 * @draft ICU 74
 */
U_CAPI void U_EXPORT2
ulb_setExtension(ULocaleBuilder builder, char key, const char* value);

/**
 * Sets the Unicode locale keyword type for the given key. If the type
 * is NULL, the keyword is removed.
 * If the type is the empty string, the keyword is set without type subtags.
 * Otherwise, the key and type must be well-formed, or else the ulb_build()
 * method will later report a U_ILLEGAL_ARGUMENT_ERROR.
 *
 * <p>Keys and types are converted to lower case.
 *
 * <p><b>Note:</b> Setting the 'u' extension via ulb_setExtension()
 * replaces all Unicode locale keywords with those defined in the
 * extension.
 *
 * @param builder the builder
 * @param key the Unicode locale key
 * @param type the Unicode locale type
 * @draft ICU 74
 */
U_CAPI void U_EXPORT2
ulb_setUnicodeLocaleKeyword(ULocaleBuilder builder, const char* key, const char* type);

/**
 * Adds a Unicode locale attribute if it is not already present; otherwise
 * has no effect. The attribute must not be the empty string and must be
 * well-formed, or else U_ILLEGAL_ARGUMENT_ERROR will be set in the status
 * during the ulb_build() call.
 *
 * @param builder the builder
 * @param attribute the attribute
 * @draft ICU 74
 */
U_CAPI void U_EXPORT2
ulb_addUnicodeLocaleAttribute(ULocaleBuilder builder, const char* attribute);

/**
 * Removes a Unicode locale attribute if it is present; otherwise has no
 * effect. The attribute must not be the empty string and must be well-formed,
 * or else U_ILLEGAL_ARGUMENT_ERROR will be set in the status during the
 * ulb_build() call.
 *
 * <p>Attribute comparison for removal is case-insensitive.
 *
 * @param builder the builder
 * @param attribute the attribute
 * @draft ICU 74
 */
U_CAPI void U_EXPORT2
ulb_removeUnicodeLocaleAttribute(ULocaleBuilder builder, const char* attribute);

/**
 * Resets the builder to its initial, empty state.
 * <p>This method clears the internal UErrorCode.
 *
 * @param builder the builder
 * @draft ICU 74
 */
U_CAPI void U_EXPORT2
ulb_clear(ULocaleBuilder builder);

/**
 * Resets the extensions to their initial, empty state.
 * Language, script, region and variant are unchanged.
 *
 * @param builder the builder
 * @draft ICU 74
 */
U_CAPI void U_EXPORT2
ulb_clearExtensions(ULocaleBuilder builder);

/**
 * Builds the locale string from the fields set on this builder.
 * If any setter, or the ulb_build() call itself, requires a memory allocation
 * that fails, the status will be set to U_MEMORY_ALLOCATION_ERROR.
 * If any of the fields set by the setters are not well-formed, the status
 * will be set to U_ILLEGAL_ARGUMENT_ERROR. The state of the builder does
 * not change after the ulb_build() call, and the caller is free to keep using
 * the same builder to build more locales.
 *
 * @param builder the builder
 * @param buffer the output buffer that receives the locale id
 * @param bufferCapacity the capacity of the output buffer
 * @param err the error code
 * @return the length of the locale id in buffer
 * @draft ICU 74
 */
U_CAPI int32_t U_EXPORT2
ulb_build(ULocaleBuilder builder, char* buffer, int32_t bufferCapacity, UErrorCode* err);

/**
 * Sets the UErrorCode if an error occurred while recording sets.
 * Preserves older error codes in the outErrorCode.
 *
 * @param builder the builder
 * @param outErrorCode Set to an error code that occurred while setting subtags.
 *                     Unchanged if there is no such error or if outErrorCode
 *                     already contained an error.
 * @return true if U_FAILURE(*outErrorCode)
 * @draft ICU 74
 */
U_CAPI UBool U_EXPORT2
ulb_copyErrorTo(ULocaleBuilder builder, UErrorCode *outErrorCode);

#endif  /* U_HIDE_DRAFT_API */

#endif  // __ULOCBUILDER_H__

--
Frank Yung-Fong Tang
譚永鋒 / 🌭🍊
Sr. Software Engineer
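
The header comments describe a deferred-error model: a setter that receives an ill-formed value records the failure in an internal error code, later setters return immediately, the build call reports the failure, and clearing the builder resets it. The following is a minimal, self-contained sketch of that pattern only. Every name in it (DemoBuilder, DemoStatus, demo_*) is hypothetical, and the validation is deliberately simplified (a 2-3 letter language and a 2-letter region); it does not call ICU and should not be read as the proposed implementation.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; these are NOT ICU types. */
typedef enum { DEMO_OK = 0, DEMO_ILLEGAL_ARGUMENT = 1 } DemoStatus;

typedef struct {
    char language[4];   /* simplified: 2-3 letters */
    char region[3];     /* simplified: exactly 2 letters */
    DemoStatus status;  /* internal "sticky" error code */
} DemoBuilder;

/* Resets all fields and clears the internal error. */
static void demo_clear(DemoBuilder *b) {
    memset(b, 0, sizeof *b);
}

/* Returns 1 if s consists of minLen..maxLen letters, else 0. */
static int demo_isAlpha(const char *s, size_t minLen, size_t maxLen) {
    size_t n = strlen(s);
    if (n < minLen || n > maxLen) return 0;
    for (size_t i = 0; i < n; i++) {
        if (!isalpha((unsigned char)s[i])) return 0;
    }
    return 1;
}

/* Shared setter logic: skip if already failed; an empty string removes the field. */
static void demo_setField(DemoBuilder *b, char *field, const char *value,
                          size_t minLen, size_t maxLen, int upper) {
    if (b->status != DEMO_OK) return;        /* sticky error: ignore the call */
    if (value[0] != '\0' && !demo_isAlpha(value, minLen, maxLen)) {
        b->status = DEMO_ILLEGAL_ARGUMENT;   /* remember the failure for build */
        return;
    }
    size_t n = strlen(value);
    for (size_t i = 0; i <= n; i++) {        /* copies the terminating NUL too */
        unsigned char c = (unsigned char)value[i];
        field[i] = (char)(upper ? toupper(c) : tolower(c));
    }
}

static void demo_setLanguage(DemoBuilder *b, const char *language) {
    demo_setField(b, b->language, language, 2, 3, 0);
}

static void demo_setRegion(DemoBuilder *b, const char *region) {
    demo_setField(b, b->region, region, 2, 2, 1);
}

/* Writes "ll" or "ll_RR"; returns the length, or 0 with *err set on failure. */
static int demo_build(const DemoBuilder *b, char *buffer, size_t capacity,
                      DemoStatus *err) {
    if (*err != DEMO_OK) return 0;           /* caller already has an error */
    if (b->status != DEMO_OK) { *err = b->status; return 0; }
    if (b->region[0] != '\0') {
        return snprintf(buffer, capacity, "%s_%s", b->language, b->region);
    }
    return snprintf(buffer, capacity, "%s", b->language);
}

/* Walks through success, sticky failure, and recovery; returns 0 when done. */
static int demo_example(void) {
    DemoBuilder b;
    char buf[16];
    DemoStatus err = DEMO_OK;

    demo_clear(&b);
    demo_setLanguage(&b, "SR");              /* normalized to lower case */
    demo_setRegion(&b, "rs");                /* normalized to upper case */
    assert(demo_build(&b, buf, sizeof buf, &err) == 5);
    assert(strcmp(buf, "sr_RS") == 0);

    demo_setLanguage(&b, "not-a-language");  /* ill-formed: error is recorded */
    demo_setRegion(&b, "US");                /* ignored while in error state */
    assert(demo_build(&b, buf, sizeof buf, &err) == 0);
    assert(err == DEMO_ILLEGAL_ARGUMENT);

    demo_clear(&b);                          /* clearing resets the error */
    err = DEMO_OK;
    demo_setLanguage(&b, "en");
    assert(demo_build(&b, buf, sizeof buf, &err) == 2);
    assert(strcmp(buf, "en") == 0);
    return 0;
}
```

Calling demo_example() from a main() exercises all three cases. One consequence of this design is that callers can chain many setter calls without checking a status after each one, at the cost of learning about an ill-formed value only at build time.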