From: Немос <nem...@ya...> - 2008-04-16 14:16:35
I have a problem with setlocale(): when I try to change the output code page, it does not work. I use this example to test setlocale():

    #include <stdio.h>
    #include <locale.h>

    int main(int argc, char *argv[])
    {
        try
        {
            char text[] = "Привет";
            setlocale(LC_CTYPE, ".866");
            puts(text);
            setlocale(LC_CTYPE, ".1251");
            puts(text);
        }
        catch (...)
        {
            printf("catch exception");
        }
        return 0;
    }

Built with the command:

    g++ main.cpp -o 1.exe

The program prints two identical strings, but in different code pages the text should differ.

I tested it with gcc-*-3.4.5-20060117-1 and gcc-*-3.4.5-20060117-2.

In VS this code works fine. What am I doing wrong?

P.S. Sorry for my English.
From: Brian D. <br...@de...> - 2008-04-16 14:35:13
Немос wrote:
> I have a problem with setlocale(): when I try to change the
> output code page, it does not work. I use this example to
> test setlocale():

As I understand it, setlocale() only changes the behavior of the string functions in the C runtime. If you want to change the output code page of the current console, you need to use SetConsoleOutputCP(), and likewise SetConsoleCP() for the input code page.

> In VS this code works fine.

Visual Studio uses a different, newer version of the C runtime, whereas MinGW uses the old MSVCRT.DLL, because that is the only one that is part of the operating system and present on every machine without worrying about redistributables. You can try linking against the newer versions and see whether the behavior is any different.

Brian
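A minimal sketch of the approach described in this reply, assuming the goal is one call site shared by Windows and Linux builds. The helper name `set_console_output_code_page` is hypothetical; `SetConsoleOutputCP()` is the Win32 call named above, and on other platforms the helper is a no-op that reports success.

```cpp
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper (not part of any library): switch the console output
// code page on Windows and do nothing elsewhere.
// Returns true on success. Outside Windows it always returns true, because
// there is no per-console code page to switch.
bool set_console_output_code_page(unsigned code_page)
{
#ifdef _WIN32
    return SetConsoleOutputCP(code_page) != 0;
#else
    (void)code_page;  // unused outside Windows
    return true;
#endif
}
```

Calling `set_console_output_code_page(1251)` before `puts(text)` would keep the platform `#ifdef` in a single place rather than at every output call.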
From: Немос <nem...@ya...> - 2008-04-17 14:56:36
Brian Dessent wrote:
> Немос wrote:
>> I have a problem with setlocale(): when I try to change the
>> output code page, it does not work. I use this example to
>> test setlocale():
>
> As I understand it, setlocale() only changes the behavior of the string
> functions in the C runtime. If you want to change the output code page
> of the current console, you need to use SetConsoleOutputCP(), and
> likewise SetConsoleCP() for the input code page.

I need code that works on Linux and Windows with a minimum of #ifdef. So I think it is better to use a newer version of msvcrt.dll.

>> In VS this code works fine.
>
> Visual Studio uses a different, newer version of the C runtime, whereas
> MinGW uses the old MSVCRT.DLL, because that is the only one that is part
> of the operating system and present on every machine without worrying
> about redistributables. You can try linking against the newer versions
> and see whether the behavior is any different.
>
> Brian

As I understand it, to use a newer version of msvcrt.dll I must add the option -lmsvcr80 to the build command. But in this case, when I try to launch the program, I get an error message from the Microsoft Visual C++ Runtime Library:

    R6034 An application has made an attempt to load the C runtime library incorrectly.
From: Greg C. <gch...@sb...> - 2008-04-17 15:03:53
On 2008-04-17 14:52Z, Немос wrote:
>
> As I understand it, to use a newer version of msvcrt.dll I must add
> the option -lmsvcr80 to the build command. But in this case, when I
> try to launch the program, I get an error message from the Microsoft
> Visual C++ Runtime Library:

http://cygwin.com/ml/cygwin/2006-11/msg00710.html
From: Немос <nem...@ya...> - 2008-04-21 11:41:37
Greg Chicares wrote:
> On 2008-04-17 14:52Z, Немос wrote:
>> As I understand it, to use a newer version of msvcrt.dll I must add
>> the option -lmsvcr80 to the build command. But in this case, when I
>> try to launch the program, I get an error message from the Microsoft
>> Visual C++ Runtime Library:
>
> http://cygwin.com/ml/cygwin/2006-11/msg00710.html

Thanks, but it is not enough. As I understand from Google, msvcr90 (or msvcr80) must be initialized correctly. I have decided to use SetConsoleOutputCP() and #ifdef. When I have more free time, I will look for a way to use msvcr80.dll.
From: Roumen P. <bug...@ro...> - 2008-04-22 18:45:11
Немос wrote:
> Немос wrote:
>> Greg Chicares wrote:
>>> On 2008-04-17 14:52Z, Немос wrote:
>>>> As I understand it, to use a newer version of msvcrt.dll I must add
>>>> the option -lmsvcr80 to the build command. But in this case, when I
>>>> try to launch the program, I get an error message from the Microsoft
>>>> Visual C++ Runtime Library:
>>> http://cygwin.com/ml/cygwin/2006-11/msg00710.html
>>
>> Thanks, but it is not enough. As I understand from Google,
>> msvcr90 (or msvcr80) must be initialized correctly. I have decided
>> to use SetConsoleOutputCP() and #ifdef. When I have more free time,
>> I will look for a way to use msvcr80.dll.
>
> I found the problem. To use msvcr80.dll or msvcr90.dll, one needs
> to create a manifest.
> There is no need to do what
> http://cygwin.com/ml/cygwin/2006-11/msg00710.html
> says; it is enough to add -lmsvcr80.
>
> I also found that mingw32-make (and maybe other utilities too)
> has a problem with setlocale().
> When I use it and get an error, the message is in Russian, but
> it is encoded in code page 1251 while the console works in code
> page 866. If I switch the code page (with the chcp command) before
> running mingw32-make, I can read the message.
> It uses setlocale() to switch to the default locale, but that
> does not switch the output code page.
>
> Should I start a new thread for this or file a bug?

Maybe console output changed in NT 5.1 (XP). On systems before NT 5.1 (9x/ME, NT4, W2K), I think the console output codeset (code page) is not synchronized with the system code page (1251). So on old systems it could be IBM866 or IBM855 (depending on the OS version). Sorry, I cannot remember it well.

Roumen
From: Немос <nem...@ya...> - 2008-04-23 05:46:04
Roumen Petrov wrote:
[skip]
>> I also found that mingw32-make (and maybe other utilities too)
>> has a problem with setlocale().
>> When I use it and get an error, the message is in Russian, but
>> it is encoded in code page 1251 while the console works in code
>> page 866. If I switch the code page (with the chcp command) before
>> running mingw32-make, I can read the message.
>> It uses setlocale() to switch to the default locale, but that
>> does not switch the output code page.
>> Should I start a new thread for this or file a bug?
>
> Maybe console output changed in NT 5.1 (XP).
> On systems before NT 5.1 (9x/ME, NT4, W2K), I think the console output
> codeset (code page) is not synchronized with the system code page (1251).
> So on old systems it could be IBM866 or IBM855 (depending on the OS version).
> Sorry, I cannot remember it well.
>
> Roumen

I have Windows XP SP2, and setlocale() does not change the output codeset when msvcrt.dll is used (with msvcr80 or higher it does). That problem is why I started this thread.

Now the question is what I should do about the problem I found in mingw32-make (and maybe other utilities): start a new thread or file a bug?
From: Earnie B. <ea...@us...> - 2008-04-23 15:17:43
Quoting Немос <nem...@ya...>:

> Roumen Petrov wrote:
> [skip]
>>> I also found that mingw32-make (and maybe other utilities too)
>>> has a problem with setlocale().
>>> When I use it and get an error, the message is in Russian, but
>>> it is encoded in code page 1251 while the console works in code
>>> page 866. If I switch the code page (with the chcp command) before
>>> running mingw32-make, I can read the message.
>>> It uses setlocale() to switch to the default locale, but that
>>> does not switch the output code page.
>>> Should I start a new thread for this or file a bug?
>>
>> Maybe console output changed in NT 5.1 (XP).
>> On systems before NT 5.1 (9x/ME, NT4, W2K), I think the console output
>> codeset (code page) is not synchronized with the system code page (1251).
>> So on old systems it could be IBM866 or IBM855 (depending on the OS version).
>> Sorry, I cannot remember it well.
>>
>> Roumen
>
> I have Windows XP SP2, and setlocale() does not change the output
> codeset when msvcrt.dll is used (with msvcr80 or higher it does).
> That problem is why I started this thread.
> Now the question is what I should do about the problem I found in
> mingw32-make (and maybe other utilities): start a new thread or
> file a bug?

Perhaps the tools need UNICODE support. That is, someone, *you*, needs to add support for the wide characters in the code pages you want to use. You might need to go to the specific tool lists to get help. For instance, mak...@gn... is the list for the make source we use to produce mingw32-make. Are there configure switches for UNICODE support already, and if so, is the Windows port of that software able to perform correctly? If the answer to both is yes, then we should be able to host a distribution of the tool that works correctly.

Earnie
From: Немос <nem...@ya...> - 2008-04-24 15:09:32
Earnie Boyd wrote:
[skip]
> Perhaps the tools need UNICODE support. That is, someone, *you*, needs
> to add support for the wide characters in the code pages you want to use.
> You might need to go to the specific tool lists to get help. For
> instance, mak...@gn... is the list for the make source we use to
> produce mingw32-make. Are there configure switches for UNICODE support
> already, and if so, is the Windows port of that software able to perform
> correctly? If the answer to both is yes, then we should be able to host
> a distribution of the tool that works correctly.
>
> Earnie

The problem is not Unicode support. As I found, mingw32-make uses strerror() from msvcrt.dll. It returns the message for an error code in the default language (of the system or of the process), encoded in the default charset; in my case that is Russian and code page 1251. I think what is needed is to extract the code page number from the string returned by setlocale(LC_ALL, "") and switch the console output (and maybe input) code page to it.
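The extraction step proposed here could look roughly like the sketch below. It assumes the string returned by `setlocale(LC_ALL, "")` has the form `Language_Country.codepage` (for example `Russian_Russia.1251`); the helper name `code_page_from_locale_name` is hypothetical and not part of any library.

```cpp
#include <cctype>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: extract the numeric code page from a locale name such
// as "Russian_Russia.1251". Returns 0 when there is no purely numeric suffix
// after the last '.', e.g. for "C" or "ru_RU.UTF-8".
int code_page_from_locale_name(const char* name)
{
    if (name == nullptr) return 0;

    const char* dot = std::strrchr(name, '.');
    if (dot == nullptr) return 0;

    const char* digits = dot + 1;
    if (!std::isdigit(static_cast<unsigned char>(*digits))) return 0;

    char* end = nullptr;
    const long value = std::strtol(digits, &end, 10);

    // Reject trailing garbage and values outside the 16-bit range that
    // Windows code page identifiers normally fit in (an assumption here).
    if (*end != '\0' || value <= 0 || value > 65535) return 0;
    return static_cast<int>(value);
}
```

On Windows the result, when nonzero, could then be passed to `SetConsoleOutputCP()`; a return value of 0 means the console should be left alone.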
From: Yongwei W. <wuy...@gm...> - 2008-04-17 04:19:04
2008/4/16 Немос <nem...@ya...>:
> I have a problem with setlocale(): when I try to change the
> output code page, it does not work. I use this example to
> test setlocale():
>
>     #include <stdio.h>
>     #include <locale.h>
>
>     int main(int argc, char *argv[])
>     {
>         try
>         {
>             char text[] = "Привет";
>             setlocale(LC_CTYPE, ".866");
>             puts(text);
>             setlocale(LC_CTYPE, ".1251");
>             puts(text);
>         }
>         catch (...)
>         {
>             printf("catch exception");
>         }
>         return 0;
>     }
>
> Built with the command:
>     g++ main.cpp -o 1.exe
>
> The program prints two identical strings, but in different code
> pages the text should differ.
>
> I tested it with gcc-*-3.4.5-20060117-1 and
> gcc-*-3.4.5-20060117-2.
>
> In VS this code works fine.

I do not understand why you said "In VS this code works fine". On the contrary, my test with Visual Studio 2003/2005 shows that the output is exactly the same.

I know one way to output different strings in VS. If you use wchar_t and _putws, things will be different, since _putws converts the encoding from UTF-16 to the console code page. In that case you should save the source file in UTF-16; otherwise the result is unreliable, since you cannot tell MSVC the source encoding, and the system legacy encoding is always used, which may or may not be what you want.

AFAIK, MinGW always interprets the input literals as Latin-1, so that does not work in MinGW. I think you will need to use \u sequences in this case. The following code works in both MSVC 7.1/8.0 and MinGW, and probably achieves what you want.

    wchar_t text[] = L"\u041f\u0440\u0438\u0432\u0435\u0442";
    setlocale(LC_CTYPE, ".866");
    _putws(text);
    setlocale(LC_CTYPE, ".1251");
    _putws(text);

Best regards,

Yongwei

-- 
Wu Yongwei
URL: http://wyw.dcweb.cn/
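As a small illustration of why the `\u` sequences sidestep the source-encoding question, the sketch below stores the same greeting with universal character names and exposes its length. The names `kGreeting` and `greeting_length` are made up for this example; only standard C++ is used, so it does not depend on `_putws` or on any particular console.

```cpp
#include <cstddef>
#include <cwchar>

// The greeting from the thread, spelled with universal character names so
// that the compiler never has to guess the encoding of the source file.
const wchar_t kGreeting[] = L"\u041f\u0440\u0438\u0432\u0435\u0442";

// Hypothetical helper: number of wide characters in the greeting.
std::size_t greeting_length()
{
    return std::wcslen(kGreeting);
}
```

Each element of `kGreeting` holds the Unicode code point of one letter regardless of how the file was saved, which is the property the wide-character output functions rely on when they convert to the console code page.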
From: Немос <nem...@ya...> - 2008-04-17 14:45:30
Yongwei Wu wrote:
[skip]
>> In VS this code works fine.
>
> I do not understand why you said "In VS this code works fine". On the
> contrary, my test with Visual Studio 2003/2005 shows that the output
> is exactly the same.

Maybe the fonts needed for Russian are not installed. I tested the code with VS 2008, with the source file in cp1251.

> I know one way to output different strings in VS. If you use
> wchar_t and _putws, things will be different, since _putws
> converts the encoding from UTF-16 to the console code page. In that
> case you should save the source file in UTF-16; otherwise the result
> is unreliable, since you cannot tell MSVC the source encoding, and the
> system legacy encoding is always used, which may or may not be what
> you want.

I need the code to work on Linux and Windows, with the source in UTF-8. But the VS compiler does not understand UTF-8 source without a BOM, and gcc does not like the BOM. So I use MinGW for the win32 build.

> AFAIK, MinGW always interprets the input literals as Latin-1, so
> that does not work in MinGW. I think you will need to use \u
> sequences in this case. The following code works in both MSVC 7.1/8.0
> and MinGW, and probably achieves what you want.
>
>     wchar_t text[] = L"\u041f\u0440\u0438\u0432\u0435\u0442";
>     setlocale(LC_CTYPE, ".866");
>     _putws(text);
>     setlocale(LC_CTYPE, ".1251");
>     _putws(text);
From: Yongwei W. <wuy...@gm...> - 2008-04-18 04:57:42
2008/4/17 Немос <nem...@ya...>:
> Yongwei Wu wrote:
> [skip]
> >> In VS this code works fine.
> >
> > I do not understand why you said "In VS this code works fine". On the
> > contrary, my test with Visual Studio 2003/2005 shows that the output
> > is exactly the same.
>
> Maybe the fonts needed for Russian are not installed. I tested the
> code with VS 2008, with the source file in cp1251.

I saw the expected output, in Cyrillic (with GCC, MSVC 7.1/8.0). I am not sure about 2008. It is a surprising result. I do not understand why setlocale should change the output. ANSI strings should not change when the locale changes.

> > I know one way to output different strings in VS. If you use
> > wchar_t and _putws, things will be different, since _putws
> > converts the encoding from UTF-16 to the console code page. In that
> > case you should save the source file in UTF-16; otherwise the result
> > is unreliable, since you cannot tell MSVC the source encoding, and the
> > system legacy encoding is always used, which may or may not be what
> > you want.
>
> I need the code to work on Linux and Windows, with the source in
> UTF-8. But the VS compiler does not understand UTF-8 source
> without a BOM, and gcc does not like the BOM.
> So I use MinGW for the win32 build.

Are you sure GCC works with UTF-8? My test shows that GCC always interprets the input as Latin-1. Of course a BOM is invalid in this case.

> > AFAIK, MinGW always interprets the input literals as Latin-1, so
> > that does not work in MinGW. I think you will need to use \u
> > sequences in this case. The following code works in both MSVC 7.1/8.0
> > and MinGW, and probably achieves what you want.
> >
> >     wchar_t text[] = L"\u041f\u0440\u0438\u0432\u0435\u0442";
> >     setlocale(LC_CTYPE, ".866");
> >     _putws(text);
> >     setlocale(LC_CTYPE, ".1251");
> >     _putws(text);

I really wonder what your real problem is. Why do you want to have different outputs? (Notice that setlocale will not change the real console encoding, as other people have mentioned.)

I guess you *should* use Unicode, something similar to the above.

Best regards,

Yongwei

-- 
Wu Yongwei
URL: http://wyw.dcweb.cn/
From: Giel v. S. <me...@mo...> - 2008-04-18 13:23:34
Yongwei Wu wrote:
> 2008/4/17 Немос <nem...@ya...>:
>> Yongwei Wu wrote:
>>>> In VS this code works fine.
>>> I do not understand why you said "In VS this code works fine". On the
>>> contrary, my test with Visual Studio 2003/2005 shows that the output
>>> is exactly the same.
>> Maybe the fonts needed for Russian are not installed. I tested the
>> code with VS 2008, with the source file in cp1251.
>
> I saw the expected output, in Cyrillic (with GCC, MSVC 7.1/8.0). I am
> not sure about 2008. It is a surprising result. I do not understand
> why setlocale should change the output. ANSI strings should not
> change when the locale changes.

AFAIK the strings themselves should indeed not change under different locale settings, but the way they are interpreted may validly change. To my knowledge this _should_ affect how the standard I/O functions work on non-binary output.

I haven't bothered to check the ISO C89 or C99 standards, but according to "man fputws" (a manpage for "putws" was unavailable, though they are essentially the same function, except that putws implicitly uses stdout):

> int fputws(const wchar_t *ws, FILE *stream);
> ...
> The behavior of fputws() depends on the LC_CTYPE category of the current locale.

This leads me to believe that it is actually quite valid for the behaviour of putws() to change when LC_CTYPE changes.

>>> I know one way to output different strings in VS. If you use
>>> wchar_t and _putws, things will be different, since _putws
>>> converts the encoding from UTF-16 to the console code page. In that
>>> case you should save the source file in UTF-16; otherwise the result
>>> is unreliable, since you cannot tell MSVC the source encoding, and the
>>> system legacy encoding is always used, which may or may not be what
>>> you want.
>> I need the code to work on Linux and Windows, with the source in
>> UTF-8. But the VS compiler does not understand UTF-8 source
>> without a BOM, and gcc does not like the BOM.
>> So I use MinGW for the win32 build.
>
> Are you sure GCC works with UTF-8? My test shows that GCC always
> interprets the input as Latin-1. Of course a BOM is invalid in this
> case.

AFAIK GCC doesn't concern itself with encodings at all. To my knowledge a C (or C++) compiler shouldn't (and GCC doesn't) attempt to interpret the bytes in a string, and as such it shouldn't attempt to work with any specific encoding of the string. This basically means that you can use whatever encoding you want, as long as the functions that work on that string can work with that encoding.

So AFAIK this isn't an issue of the compiler but an issue of the specific libraries you use (in this case the standard C library).

-- 
Giel
From: Yongwei W. <wuy...@gm...> - 2008-04-19 05:26:42
2008/4/18 Giel van Schijndel <me...@mo...>:
> >>> I know one way to output different strings in VS. If you use
> >>> wchar_t and _putws, things will be different, since _putws
> >>> converts the encoding from UTF-16 to the console code page. In that
> >>> case you should save the source file in UTF-16; otherwise the result
> >>> is unreliable, since you cannot tell MSVC the source encoding, and the
> >>> system legacy encoding is always used, which may or may not be what
> >>> you want.
> >> I need the code to work on Linux and Windows, with the source in
> >> UTF-8. But the VS compiler does not understand UTF-8 source
> >> without a BOM, and gcc does not like the BOM.
> >> So I use MinGW for the win32 build.
> >
> > Are you sure GCC works with UTF-8? My test shows that GCC always
> > interprets the input as Latin-1. Of course a BOM is invalid in this
> > case.
>
> AFAIK GCC doesn't concern itself with encodings at all. To my knowledge
> a C (or C++) compiler shouldn't (and GCC doesn't) attempt to interpret
> the bytes in a string, and as such it shouldn't attempt to work with any
> specific encoding of the string. This basically means that you can use
> whatever encoding you want, as long as the functions that work on that
> string can work with that encoding.
>
> So AFAIK this isn't an issue of the compiler but an issue of the specific
> libraries you use (in this case the standard C library).

I beg to disagree. This does not matter when dealing with a char string, but it matters when dealing with wchar_t strings. How do you expect the system to deal with this line:

    wchar_t text[] = L"Привет";

However, I must make a correction here. GCC 2.95.3 and 3.2.3 always interpret string literals as Latin-1 (i.e., they convert each byte "xy" to "xy 00" on Windows, which is little-endian with a 2-byte wchar_t), whereas GCC 3.4.5 interprets the input literals as UTF-8 (a good thing, really!). So, yes, the source code should be in UTF-8 for GCC.

Regretfully, MSVC works differently and always interprets string literals in the current legacy system encoding, except when the BOM character is present. I guess the best thing one can do is make GCC interpret the BOM (or, at least, check for and ignore the UTF-8 BOM), if that is not already in the 4.x tree.

Best regards,

Yongwei

-- 
Wu Yongwei
URL: http://wyw.dcweb.cn/
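The difference between the two literal interpretations described in this message can be mimicked at run time. The sketch below is only an illustration of the two conversions, not the compilers' actual code; `widen_latin1` and `decode_utf8` are hypothetical names, and the UTF-8 decoder is deliberately simplified (1- to 3-byte sequences, no validation of continuation bytes).

```cpp
#include <cstddef>
#include <string>

// Latin-1 style widening: every source byte becomes one wide character with
// the same numeric value (the behaviour reported for GCC 2.95.3 and 3.2.3).
std::wstring widen_latin1(const std::string& bytes)
{
    std::wstring out;
    for (std::size_t i = 0; i < bytes.size(); ++i)
        out.push_back(static_cast<wchar_t>(static_cast<unsigned char>(bytes[i])));
    return out;
}

// Simplified UTF-8 decoding (the behaviour reported for GCC 3.4.5).
// Unsupported or truncated sequences are replaced by L'?'.
std::wstring decode_utf8(const std::string& bytes)
{
    std::wstring out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned b0 = static_cast<unsigned char>(bytes[i]);
        if (b0 < 0x80) {
            out.push_back(static_cast<wchar_t>(b0));
            i += 1;
        } else if ((b0 & 0xE0) == 0xC0 && i + 1 < bytes.size()) {
            const unsigned b1 = static_cast<unsigned char>(bytes[i + 1]);
            out.push_back(static_cast<wchar_t>(((b0 & 0x1F) << 6) | (b1 & 0x3F)));
            i += 2;
        } else if ((b0 & 0xF0) == 0xE0 && i + 2 < bytes.size()) {
            const unsigned b1 = static_cast<unsigned char>(bytes[i + 1]);
            const unsigned b2 = static_cast<unsigned char>(bytes[i + 2]);
            out.push_back(static_cast<wchar_t>(
                ((b0 & 0x0F) << 12) | ((b1 & 0x3F) << 6) | (b2 & 0x3F)));
            i += 3;
        } else {
            out.push_back(L'?');
            i += 1;
        }
    }
    return out;
}
```

For the two UTF-8 bytes `D0 9F`, `widen_latin1` yields two wide characters (0x00D0 and 0x009F), while `decode_utf8` yields the single character U+041F, which is why the same UTF-8 source file gives different wide strings under the two interpretations.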
From: Charles W. <cwi...@us...> - 2008-04-19 17:22:17
Yongwei Wu wrote:
> I guess the best thing one can do is make GCC
> interpret the BOM (or, at least, check for and ignore the UTF-8 BOM),
> if that is not already in the 4.x tree.

See:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33415
http://gcc.gnu.org/ml/gcc-patches/2008-04/msg01371.html

-- 
Chuck
From: Yongwei W. <wuy...@gm...> - 2008-04-19 11:23:06
2008/4/17 Yongwei Wu <wuy...@gm...>:
> 2008/4/16 Немос <nem...@ya...>:
> > I have a problem with setlocale(): when I try to change the
> > output code page, it does not work. I use this example to
> > test setlocale():
> >
> >     #include <stdio.h>
> >     #include <locale.h>
> >
> >     int main(int argc, char *argv[])
> >     {
> >         try
> >         {
> >             char text[] = "Привет";
> >             setlocale(LC_CTYPE, ".866");
> >             puts(text);
> >             setlocale(LC_CTYPE, ".1251");
> >             puts(text);
> >         }
> >         catch (...)
> >         {
> >             printf("catch exception");
> >         }
> >         return 0;
> >     }
> >
> > Built with the command:
> >     g++ main.cpp -o 1.exe
> >
> > The program prints two identical strings, but in different code
> > pages the text should differ.
> >
> > I tested it with gcc-*-3.4.5-20060117-1 and
> > gcc-*-3.4.5-20060117-2.
> >
> > In VS this code works fine.
>
> I do not understand why you said "In VS this code works fine". On the
> contrary, my test with Visual Studio 2003/2005 shows that the output
> is exactly the same.

It seems I messed something up last time. After playing with Visual C++ 9.0 Express Edition for a while, I realized that _putws and puts behave differently in 8.0/9.0 than in 6.0/7.1.

In MSVC up to 7.1, the puts functions are basically equivalent to the corresponding fputs functions. I.e., puts(str) is equivalent to:

    fputs(str, stdout);
    fputs("\n", stdout);

However, in 8.0 and later versions they behave differently. Their output depends on the current console code page under modern Windows versions. According to the VS2005 MSDN information on _putws:

    Under Windows NT, _putwch writes Unicode characters using the current CONSOLE LOCALE setting.

What is not stated explicitly is that even puts is affected. It converts the string from the runtime locale (set by setlocale) to the console locale (set by SetConsoleOutputCP or the CHCP command). That is why Nemos saw different outputs in his original test.

I do not think this behaviour is portable, and I would suggest that Nemos not rely on it.

Best regards,

Yongwei

-- 
Wu Yongwei
URL: http://wyw.dcweb.cn/
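To make the conversion described above concrete, here is a hand-written sketch of recoding Russian text from code page 1251 to code page 866. It is not how the C runtime does it internally; the function names are hypothetical, and the mapping covers only ASCII and the Russian letters (including Ё/ё), replacing every other byte with '?'.

```cpp
#include <cstddef>
#include <string>

// Map one CP1251 byte to its CP866 equivalent (ASCII and Russian letters only).
unsigned char cp1251_to_cp866(unsigned char c)
{
    if (c < 0x80) return c;                   // ASCII is identical in both
    if (c >= 0xC0 && c <= 0xEF)               // capitals, and the first 16 small letters
        return static_cast<unsigned char>(c - 0x40);
    if (c >= 0xF0)                            // the remaining 16 small letters
        return static_cast<unsigned char>(c - 0x10);
    if (c == 0xA8) return 0xF0;               // capital Yo
    if (c == 0xB8) return 0xF1;               // small yo
    return '?';                               // not covered by this sketch
}

// Recode a whole CP1251 byte string to CP866.
std::string recode_cp1251_to_cp866(const std::string& in)
{
    std::string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i)
        out.push_back(static_cast<char>(
            cp1251_to_cp866(static_cast<unsigned char>(in[i]))));
    return out;
}
```

The greeting from the original test is the bytes `CF F0 E8 E2 E5 F2` in code page 1251 and `8F E0 A8 A2 A5 E2` in code page 866, so printing the unconverted bytes on a console set to the other code page shows different (wrong) letters.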
From: Немос <nem...@ya...> - 2008-04-22 16:01:45
Немос wrote:
> Greg Chicares wrote:
>> On 2008-04-17 14:52Z, Немос wrote:
>>> As I understand it, to use a newer version of msvcrt.dll I must add
>>> the option -lmsvcr80 to the build command. But in this case, when I
>>> try to launch the program, I get an error message from the Microsoft
>>> Visual C++ Runtime Library:
>> http://cygwin.com/ml/cygwin/2006-11/msg00710.html
>
> Thanks, but it is not enough. As I understand from Google,
> msvcr90 (or msvcr80) must be initialized correctly. I have decided
> to use SetConsoleOutputCP() and #ifdef. When I have more free time,
> I will look for a way to use msvcr80.dll.

I found the problem. To use msvcr80.dll or msvcr90.dll, one needs to create a manifest. There is no need to do what http://cygwin.com/ml/cygwin/2006-11/msg00710.html says; it is enough to add -lmsvcr80.

I also found that mingw32-make (and maybe other utilities too) has a problem with setlocale(). When I use it and get an error, the message is in Russian, but it is encoded in code page 1251 while the console works in code page 866. If I switch the code page (with the chcp command) before running mingw32-make, I can read the message. It uses setlocale() to switch to the default locale, but that does not switch the output code page.

Should I start a new thread for this or file a bug?