On 6/24/2011 4:17 AM, Yongwei Wu wrote:
> On 24 June 2011 00:54, Charles Wilson <cwilso11-Rn4VEauK+AKRv+LV9MX5uipxlwaOVQ5f@...> wrote:
>> Anyway, modifying our tools to always 'setlocale(LC_CTYPE, "C")' just
>> seems counterproductive, a step backwards, to me.
> It might be the way to go, if the existing code does not conform to
> the way Microsoft expects it, and it is too big a trouble to change
> it. BTW, the consequences of calling setlocale (with anything other
> than "C") are often not well documented.
> *Microsoft assumes that if the locale is not "C", it should do some
> conversion for you.* This may not be what programmers want, though I
> think Microsoft has its reasons for doing that.
How..."nice"...of Microsoft. And typical -- to assume they know best.
> This is at least the third time I was bitten by Microsoft surprises in
> dealing with non-ASCII characters:
> 1) Microsoft does not allow passing a string that is "invalid" in the
> current locale to the format string in *printf. One Vim crash was
> caused by that and was later fixed by calling setlocale(LC_CTYPE,
OK, this affects MinGW (*) programs only, since MSYS applications have
their own implementation of all of the *printf functions.
(*) But...if you build the MinGW application with
-D__USE_MINGW_ANSI_STDIO, then you get an independent implementation of
the printf functions. Dunno if that would *help* in this case -- I
doubt our libmingwex.a implementation is multibyte aware. It probably
just treats every string as a sequence of single bytes.
What happens if you compile your test program with -D__USE_MINGW_ANSI_STDIO?
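
For that experiment, something like the following might do. It is only a
sketch: the helper name format_in_c_locale is made up for illustration, and
it passes the text as a "%s" argument instead of as the format string itself.
It pins LC_CTYPE to "C" around one formatting call and then restores
whatever locale was active before.

```c
#include <locale.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of MinGW or msvcrt): format `text` through
   "%s" while LC_CTYPE is temporarily "C", then restore the previous
   LC_CTYPE setting. Returns whatever snprintf returns. */
int format_in_c_locale(char *out, size_t out_size, const char *text)
{
    /* setlocale(..., NULL) only queries; copy the name because a later
       setlocale call may overwrite the returned buffer. */
    const char *current = setlocale(LC_CTYPE, NULL);
    char *saved = NULL;
    if (current != NULL) {
        saved = malloc(strlen(current) + 1);
        if (saved != NULL)
            strcpy(saved, current);
    }

    setlocale(LC_CTYPE, "C");
    int n = snprintf(out, out_size, "%s", text);

    if (saved != NULL) {
        setlocale(LC_CTYPE, saved);
        free(saved);
    }
    return n;
}
```

Building a small driver around this once as-is and once with
-D__USE_MINGW_ANSI_STDIO, and feeding it a string that is invalid in the
active locale, should show whether the two printf implementations behave
differently.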
> 2) Beginning in Visual Studio 2005, std::fstream has a dependency on
> the locale when opening files that have non-ASCII characters in the
> name. I resorted to the (non-standard) fstream::open(wchar_t)
> interface to overcome that. MinGW does not support this form.
But stuff compiled using mingw does not use the msvcr80 runtime (**)
that is part of VS2005. MinGW-compiled stuff uses plain old msvcrt.dll.
(**) Well, it doesn't unless you do a lot of work to make that happen.
> 3) Now I see that incomplete characters cannot be output to the
> console (but still OK for the files). And Microsoft converts the
> characters from the locale to the console code page for you.
But the problem, as I see it, is a mismatch between two *different*
non-C locales, right? Plus the fact that the msvcrt setlocale(*, "")
function does not respect environment variables.
There is a gnulib module that is intended to fix this, specifically for
mingw, just introduced earlier this year.
Perhaps we should look into importing this setlocale replacement into
*all* i18n-capable mingw apps we provide. (We can't add it to libmingwex
because it's LGPL, not public domain. For simplicity, we could ship a
simple LGPL .a in a mingw-libsetlocale package that provides this
module, and then add it to the LDFLAGS of other mingw packages...)
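
To make the missing behaviour concrete: on POSIX systems,
setlocale(LC_CTYPE, "") consults LC_ALL first, then LC_CTYPE, then LANG.
Here is a minimal sketch of just that lookup order. It is NOT the gnulib
code, and the function name pick_ctype_locale is hypothetical. A real
replacement would presumably also have to translate POSIX-style locale names
into names that msvcrt's setlocale accepts.

```c
#include <stddef.h>

/* Hypothetical helper: return the first non-empty value among lc_all,
   lc_ctype and lang (the POSIX precedence for the LC_CTYPE category),
   or "C" if none of them is set. */
const char *pick_ctype_locale(const char *lc_all,
                              const char *lc_ctype,
                              const char *lang)
{
    if (lc_all != NULL && lc_all[0] != '\0')
        return lc_all;
    if (lc_ctype != NULL && lc_ctype[0] != '\0')
        return lc_ctype;
    if (lang != NULL && lang[0] != '\0')
        return lang;
    return "C";
}
```

A caller would pass in getenv("LC_ALL"), getenv("LC_CTYPE") and
getenv("LANG"), and hand the result to setlocale(LC_CTYPE, ...) instead of "".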
>> (*) Even cygwin didn't add really GOOD support for i18n until 1.7; msys
>> is based on 1.3...However, there's got to be a reasonable fix that still
>> allows i18n, because nobody reported problems with cygwin-1.3 and the
>> cygwin gcc of that era -- which /was/ NLS-enabled. OTOH, Win7 didn't
>> exist back then, either.
> It has nothing to do with MSYS, since I am not using it.
I see that now.
>> Yongwei Wu -- what /terminal/ are you using? There are several
>> alternatives that may affect your experience:
>> * invoking gcc from a bash shell, running in a cmd.exe window?
> The same as using the Command Prompt without bash (Cygwin).
>> * invoking gcc from a bash shell, running in rxvt
> As expected, characters are correct. As far as MSVCRT is concerned,
> output to rxvt (Cygwin) is redirected I/O, not a console.
>> * invoking gcc from a bash shell, running in mintty
> No experience. I assume it is the same as rxvt.
Actually, no. rxvt is ancient, dead, and buried, and has ZERO support
for codepages, charsets, or i18n of any kind. mintty has a lot of
modern win32-native support for such things.
>> * invoking gcc directly from within a cmd.exe window (no bash)
> This is what I use, and has the described behaviour.
Well, that's certainly the simplest case.
However, the effect of using cygwin-bash is probably much different than
using msys-bash, so I would not consider them interchangeable.
For now, tho, we should concentrate on fixing the simple problem: mingw
exe's running in plain old cmd.exe with no bash at all.