Re: [Cppunit-devel] Unicode Support (was: 1.6.0 is released!)
From: Baptiste L. <bl...@cl...> - 2001-09-25 20:27:45
----- Original Message -----
From: "Michael Arnoldus" <ch...@mu...>
To: "Cpp Unit Develpment Mailing List" <cpp...@li...>
Sent: Monday, September 24, 2001 3:00 PM
Subject: Re: [Cppunit-devel] Unicode Support (was: 1.6.0 is released!)

> Let me see if I understand you correctly. When not changing anything in CppUnit, you are degrading the strings in CppUnit to be simple containers which contain "something the TestRunner knows what to do with" - and currently it just happens to be ASCII strings.

Actually, I was rather suggesting MBCS only as a workaround for having only std::string (which should be able to contain an MBCS string, right?). My point was rather that unless you are writing an application whose essence is to manipulate Unicode objects (dictionary, translator, ...), you can probably get away with restricting your Unicode strings to latin1 for ASSERT_EQUAL and ASSERT_MESSAGE.

> This might work, but I think it should be reflected in the design then - f.ex. by letting the TestRunner pass the strings as a pointer to a subclass of a specific class (or maybe a template parameter class).

Yes, but let's not consider it for now. That solution strikes me as wrong (who knows what they would put in a string ;-) ).

> I personally do not like a design where a string is something we use to stuff something else into and then decode it again - this way doom lies!

I don't know much about MBCS, just that it is a way to encode a character set depending on your code page (meaning you need to know the code page to decode).

> You are right __FILE__ are not unicode.
> Can you use unicode strings in the assert macros the way you have done it?

No. I convert the Unicode string to latin1 in my implementation of toString(), which is not really a loss since these are just object dumps and I control their content.

> Yes, unicode use a different runtime library.

Managing all those configurations will become a source of headaches:

  2 DLL configs
  2 static configs (not yet, but requested)
  *2 for Unicode
  *2: cppunit and testrunner

=> if cppunit remains the same for Unicode and ANSI strings, there are half as many...

> A while ago you asked me about Unicode programming on unix. I have found an article about unicode on linux: http://www-106.ibm.com/developerworks/linux/library/l-linuni.html

I'll try to give it a look. I'm using Qt (http://www.trolltech.com) for now, which implements its own Unicode support (a lot better than MFC, if I might say).

> I know the first Unicode port for windows I did, was not a very good job. I'll be willing to do it right, but I need somebody who understands CppUnit to talk to about the design.

Nor was it a bad job. The solution could be acceptable (the definition of the generic string/stream would need to be centralized, but that's it). What bothers me is that it has a global impact on CppUnit, and having a string type that changes that way is a source of headaches. What should the user of CppUnit do? Use CppUnit::String, or CString, or Tools::String (typedef to std::string or std::wstring like in CppUnit)...

Note that from this point on, I'm just throwing ideas around, hoping to discover some interesting stuff...

I think it should be possible to design something a lot cleaner. Let's put the facts down:

1) User-defined strings that may be Unicode are:
   - the result of assertion_traits::toString(), used by ASSERT_EQUAL
   - the parameter of ASSERT_MESSAGE
2) Those strings are just a way to convey additional information with the test failure.

Let's ignore the current implementation of ASSERTxxx for now. Let's imagine that we have:

  ASSERT_UNICODE_MESSAGE( unicodeMessage, condition );

How do we get the Unicode string to the TestRunner? The obvious answer is using Exception, the class used to report the failure details to the TestRunner. Once we've got that part down, the remaining elements can be tackled.
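To make the question concrete, here is a minimal sketch of how such a macro could hand a wide message to a runner through an exception. Everything in it is an assumption made for illustration: WideFailure, runAndCollect and the two sample tests are hypothetical names, not CppUnit classes, and the macro body is only one possible implementation of the imagined ASSERT_UNICODE_MESSAGE.

```cpp
#include <string>

// Hypothetical failure type (not CppUnit's Exception): it carries the
// message as a wide string and the location as narrow data.
class WideFailure
{
public:
  WideFailure( const std::wstring &message, const char *file, long line )
    : m_message( message ), m_file( file ? file : "" ), m_line( line ) {}

  const std::wstring &message() const { return m_message; }
  const std::string &file() const { return m_file; }
  long line() const { return m_line; }

private:
  std::wstring m_message;
  std::string m_file;   // __FILE__ is always a narrow string
  long m_line;
};

// Hypothetical macro with the signature imagined in the text.
#define ASSERT_UNICODE_MESSAGE( unicodeMessage, condition )          \
  do {                                                               \
    if ( !(condition) )                                              \
      throw WideFailure( (unicodeMessage), __FILE__, __LINE__ );     \
  } while ( false )

// Stand-in for a TestRunner: runs one test and returns the failure
// message, or an empty string when the test passed.
template<typename TestFunction>
std::wstring runAndCollect( TestFunction test )
{
  try
  {
    test();
  }
  catch ( const WideFailure &failure )
  {
    return failure.message();
  }
  return std::wstring();
}

// Two sample tests, one failing and one passing.
void failingTest() { ASSERT_UNICODE_MESSAGE( L"values differ", 1 + 1 == 3 ); }
void passingTest() { ASSERT_UNICODE_MESSAGE( L"never reported", 1 + 1 == 2 ); }
```

In this sketch, runAndCollect( failingTest ) returns the wide message untouched, so a Unicode-aware runner would lose nothing; a runner that only understands narrow strings would still need some conversion step.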
So my take would be that the first step toward adding Unicode support to CppUnit would be to add Unicode support to Exception. I believe Exception should still provide a std::string interface to the failure message (not all TestRunners are written to support Unicode), but should also provide a std::wstring interface. We would have:

  std::string what() const;
  std::wstring unicodeWhat() const;

The constructor would need to be changed to accept std::string or std::wstring for the message. That would also impact Exception subclasses: NotEqualException takes two strings at construction, so two constructors should also be provided.

(PS: can we do std::wstring( L"\u306b\u307b" )? How do you go about inputting a hardcoded Unicode string?)

My guess would be that Exception would store everything as std::wstring (the wider format). We should also have utility functions to convert from and to Unicode (Unicode to ANSI could be a dummy conversion: if the character code is not in the range 0-255, replace it with '?'). For the user, everything is transparent. TestRunners could use either the ANSI or the Unicode version to retrieve information.

And while I'm at it, I can think of a way to deal with the ansi/const char * conversion:

  template<typename StringType>
  struct convert_to_wstring
  {
    // Generic case: StringType is already convertible to std::wstring.
    static std::wstring toWString( StringType str )
    {
      return str;
    }
  };

  template<typename StringType>
  std::wstring convertToWString( StringType str )
  {
    return convert_to_wstring<StringType>::toWString( str );
  }

And you specialize for const char*, std::string, and possibly other user-defined strings...

That template function could be used anywhere to uniformize string to wstring (for example, for an implementation of assertion_traits that returns a std::string instead of a std::wstring).

The only dark point would be:

  static std::string toString( const T& x )
  {
    OStringStream ost;
    ost << x;
    return ost.str();
  }

That uses an ANSI stream. What would be the impact of changing that to a wide stream?

What do you think?

Baptiste.
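The pieces above could fit together roughly as follows. This is only a sketch under stated assumptions: SketchException stands in for CppUnit's Exception, toNarrow and toWide are hypothetical helper names, ANSI strings are assumed to be latin1, and toWString is a static member so that the generic function can call it without an instance.

```cpp
#include <cassert>
#include <string>

// Unicode to ANSI "dummy" conversion: codes outside 0-255 become '?'.
inline std::string toNarrow( const std::wstring &wide )
{
  std::string narrow;
  narrow.reserve( wide.size() );
  for ( std::wstring::size_type i = 0; i < wide.size(); ++i )
  {
    const unsigned long code = static_cast<unsigned long>( wide[i] );
    narrow += code <= 255
                ? static_cast<char>( static_cast<unsigned char>( code ) )
                : '?';
  }
  return narrow;
}

// ANSI to Unicode, assuming the narrow string is latin1 (an assumption).
inline std::wstring toWide( const std::string &narrow )
{
  std::wstring wide;
  wide.reserve( narrow.size() );
  for ( std::string::size_type i = 0; i < narrow.size(); ++i )
    wide += static_cast<wchar_t>( static_cast<unsigned char>( narrow[i] ) );
  return wide;
}

// Generic case: StringType is already convertible to std::wstring.
template<typename StringType>
struct convert_to_wstring
{
  static std::wstring toWString( StringType str ) { return str; }
};

// Specializations for the narrow string types.
template<>
struct convert_to_wstring<std::string>
{
  static std::wstring toWString( std::string str ) { return toWide( str ); }
};

template<>
struct convert_to_wstring<const char *>
{
  static std::wstring toWString( const char *str )
  {
    assert( str != 0 );  // a null pointer cannot be converted
    return toWide( str );
  }
};

template<typename StringType>
std::wstring convertToWString( StringType str )
{
  return convert_to_wstring<StringType>::toWString( str );
}

// Hypothetical stand-in for Exception: stores the wider format and
// exposes both interfaces.
class SketchException
{
public:
  explicit SketchException( const std::wstring &message )
    : m_message( message ) {}
  explicit SketchException( const std::string &message )
    : m_message( convertToWString( message ) ) {}

  std::string what() const { return toNarrow( m_message ); }
  std::wstring unicodeWhat() const { return m_message; }

private:
  std::wstring m_message;
};
```

With these definitions, SketchException( L"\u306b\u307b" ).what() yields "??", while unicodeWhat() returns the original wide string. A runner that only handles narrow strings keeps working, at the cost of losing the characters outside latin1.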
---
Baptiste Lepilleur <gai...@fr...>
http://gaiacrtn.free.fr/index.html
Author of The Text Reformatter, a tool for fanfiction readers and writers.
Language: English, French (Well, I'm French).