From: Markus S. <mar...@gm...> - 2009-02-17 19:17:07
On Tue, Feb 17, 2009 at 10:18 AM, Andy Heninger <and...@gm...> wrote:

> On Mon, Feb 16, 2009 at 5:27 PM, Markus Scherer <mar...@gm...> wrote:
> > NB:
> > I could just as well add such functions outside the UnicodeString class, as
> >   UnicodeString UnicodeStringFromUTF32(const UChar32 *utf32, length);
> > and
> >   int32_t UnicodeStringToUTF32(
> >       const UnicodeString &s,
> >       UChar32 *utf32, int32_t capacity, UErrorCode *pErrorCode);
> > Would that be better?
>
> I think we should follow the conventions and style of the existing
> UnicodeString class as closely as possible, which would suggest
> constructors and/or member functions. Seems like it would be less
> confusing overall.

Note: Sometime soon I plan to propose additional functions for creating a UnicodeString from an STL string (UTF-8) and vice versa. It seems cleaner to add all of these as non-member functions.

It might also be nice to have dedicated functions for UTF-8 char* (not taking a charset name parameter), and those would not work well as constructor/setTo overloads. We already ran into that problem with the dedicated from-invariant-characters constructor/setTo, for which we had to invent a weird signature with a special enum type.

In terms of performance, I don't think there is much of a difference. Whichever way the API is done, a conversion from UTF-32 to UTF-16 has to be done. It's fast, but not as fast as the existing setTo() functions, which either just do a memcpy() or alias the UnicodeString's internal pointer to a buffer.

In fact, by not providing constructor/setTo functions for "expensive" operations, those operations stand out better to someone looking at code. But you are right that we didn't follow this model with our existing constructors (only with the setTo() functions). Ok, there is setTo(UChar32) -- but that's not much of a conversion, it's just a U16_APPEND() macro wrapper :-)

markus
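
For illustration only, here is a rough sketch of the kind of per-code-point work a UTF-32 to UTF-16 conversion involves, which is why it cannot be as cheap as a memcpy() or a pointer alias. It is standalone C++ and does not use ICU; the function name utf16FromUTF32 is hypothetical, std::u16string stands in for UnicodeString, and replacing invalid input with U+FFFD is an assumed policy, not necessarily what the proposed API would do.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper, NOT an ICU API: converts UTF-32 code points to UTF-16.
// Assumed policy: surrogate code points and values above U+10FFFF
// are replaced with U+FFFD.
std::u16string utf16FromUTF32(const char32_t *utf32, std::size_t length) {
    std::u16string result;
    result.reserve(length);  // at least one code unit per code point
    for (std::size_t i = 0; i < length; ++i) {
        char32_t c = utf32[i];
        if ((c >= 0xD800 && c <= 0xDFFF) || c > 0x10FFFF) {
            // Not a valid Unicode scalar value.
            result.push_back(static_cast<char16_t>(0xFFFD));
        } else if (c <= 0xFFFF) {
            // BMP code point: one code unit.
            result.push_back(static_cast<char16_t>(c));
        } else {
            // Supplementary code point: lead and trail surrogate.
            c -= 0x10000;
            result.push_back(static_cast<char16_t>(0xD800 + (c >> 10)));
            result.push_back(static_cast<char16_t>(0xDC00 + (c & 0x3FF)));
        }
    }
    return result;
}
```

Every code point has to be inspected and possibly split into two code units, so the cost is linear with a branch per element. Keeping such a conversion in a visibly named non-member function, rather than in a constructor overload, makes that cost easier to spot at the call site.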