From: Hisham S. <sue...@ya...> - 2016-09-08 05:07:09
I totally agree and the following simple experiment proves it:
#include <stdio.h>

int main()
{
    char s[] = "\xEC\xAC\x9F\xEA\n";
    char *t = s;
    while (*t != '\0') {
        putchar(*t);
        t++;
    }
    return 0;
}
In this program, s holds my first name in Arabic, encoded in code page 720. According to the command window properties, this is the current code page on my computer, yet I still get garbage. When I redirect the output to a file and open that file in Notepad++, I also get garbage at first, but when I change the encoding to "OEM 720" it displays correctly.

hisham
On Thursday, September 8, 2016 5:02 AM, Yongwei Wu <wuy...@gm...> wrote:
On 8 September 2016 at 01:42, Eli Zaretskii <el...@gn...> wrote:
>> Date: Wed, 7 Sep 2016 17:07:27 +0000 (UTC)
>> From: Hisham Sueyllam <sue...@ya...>
>>
>> I tried both suggestions (the one from Wu and the one from Eli). Nothing worked, at least for Arabic,
>> including piping through perl. If I redirect the Arabic output to a file, I can open it and see it displayed correctly in any
>> editor (I used Emacs), but piping through perl, or using puts or putchar or anything else, gives garbage. I have
>> Windows 10 64-bit and gcc 5.3.0 (the latest).
>> By the way, if anyone is curious, the Arabic string is my first name:
>>
>> #include <stdio.h>
>>
>> int main() {
>> puts(u8"هشام\n");
>> return 0;
>> }
>
> Your email is encoded in UTF-8, and you use Emacs, so I strongly
> suspect the Arabic string in your source file was also encoded in
> UTF-8. That will never work; you need to encode it in your system's
> ANSI codepage, I think. Windows has only very sporadic support
> for UTF-8, and in particular cannot use it in multibyte strings.
The problem is that GCC treats the source file as UTF-8, but the puts
function accepts only an ‘ANSI’ string. The conclusion is the same
as yours: removing ‘u8’ and saving the file in the ‘ANSI’ code page
should work.
As discussed earlier, putws does not work; otherwise, putws(L"هشام\n")
would be the logical solution on Windows.
--
Yongwei Wu
URL: http://wyw.dcweb.cn/