From: Stefan S. <se...@sy...> - 2003-02-24 19:56:32
Ole Laursen wrote:

> But I really don't see the big problem here. std::string is really
> just a fancy way of saying 'char *', right?

No.

> And any decent
> Unicode-aware string library will have a convenient conversion from
> std::string, right?

No.

> So if you need to process the individual characters (in my experience
> with gtkmm/glibmm, this is seldomly needed) you can simply treat the
> input/output from the library as raw data which you feed to your
> string library. Why is this a problem?

I don't fully get your point. Are you advocating that libxml++ continue
to use std::string? That's really a bad idea IMO: 'char *' is, besides
being used for strings in C, a data type used for generic memory, i.e.
there are no semantics associated with it (such as 'null terminated
string'). std::string represents *text*, and as such it carries a lot
more meaning. You can iterate over the elements, expecting to get at
individual characters, just to name one example.

While it may be true that you can (technically) use std::string to
contain UTF-8 data, the std::string *interface* would be completely
inappropriate (besides the 'data()' and 'length()' methods :-).
Please don't abuse std::string in such a horrible way.

But to go along the line you seem to suggest: libxml++ may use a
'data container' that is agnostic of the encoding or any related
interpretation of the content. That may actually not even be such a bad
idea, since it could just be a smart pointer taking over the memory from
libxml2, freeing the data in its destructor using xmlFree().

That would make it possible to abstract the Unicode library away, as in
my suggestion, and would replace my suggested compile-time polymorphism
with runtime polymorphism (assuming appropriate conversion functions
doing the 'convert from/to libxml2' work). It wouldn't incur much of a
performance penalty, as there is no additional copying involved.

Regards,
Stefan
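
To make the 'data container' idea above concrete, here is a minimal
sketch, not anything libxml++ actually provides. The names RawData and
FreeDeleter are hypothetical, and std::free() stands in for xmlFree()
so that the example compiles without libxml2; it also uses
std::unique_ptr, which is newer than this thread. The point is that the
container owns the buffer, frees it in its destructor, and exposes only
raw bytes and a length, with no character-level interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>

// Assumption: stand-in for libxml2's xmlFree(). With libxml2 available,
// the deleter body would call xmlFree(p) instead of std::free(p).
struct FreeDeleter {
    void operator()(unsigned char* p) const noexcept { std::free(p); }
};

// Hypothetical encoding-agnostic container. It takes over a buffer that
// was allocated elsewhere and releases it when it goes out of scope.
class RawData {
public:
    // Takes ownership of buf; no copy of the bytes is made.
    RawData(unsigned char* buf, std::size_t len) noexcept
        : buf_(buf), len_(len) {}

    // Only the two operations that make sense without knowing the
    // encoding: a pointer to the bytes and the number of bytes.
    const unsigned char* data() const noexcept { return buf_.get(); }
    std::size_t length() const noexcept { return len_; }

private:
    std::unique_ptr<unsigned char, FreeDeleter> buf_;
    std::size_t len_;
};
```

A caller would wrap a buffer returned by the parser in RawData and hand
data() and length() to whatever Unicode library it prefers; the
conversion functions themselves are left out of this sketch. Note that
length() counts bytes, so a single character such as U+00E9 encoded as
UTF-8 reports a length of 2.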