#36 Does aria2c support Unicode?

Milestone: 1.18.2
Status: closed
Owner: tujikawa
Labels: None
Priority: 5
Updated: 2013-11-10
Created: 2013-10-12
Creator: Nisto
Private: No

Simple question: does it support Unicode? I have tried my best to get it to save to Unicode paths/filenames, but aria2c doesn't seem to be able to open a handle for anything containing Unicode characters (all logs show "?" instead of the actual characters), even with alternative command-line apps (e.g. ConEmu) that properly show Unicode characters. I am on Windows, if that makes any difference...

Discussion

  • tujikawa
    2013-10-20

    Currently aria2 uses UTF-8 internally. On Windows, we convert from the ANSI code page to UTF-8. I have no problem with Japanese characters on Windows 7, either in the -o option or in percent-encoded URIs. aria2 1.18.0 has a bug in percent-encoding, which may prevent a file from being saved with the correct file name.

     
  • Nisto
    2013-10-30

    I'm on Windows XP though. I just tried the latest version, 1.18.1, and it still doesn't work for me. It was never an issue with URL-encoded file names. It fails when opening a handle for the filename(s), because the characters are passed as "?", which is an invalid character in paths on Windows NTFS:

    -> [AbstractDiskWriter.cc:206] errNum=123 errorCode=16 Failed to open the file C:/aria2-1.18.1-win-32bit-build1/????, cause: The filename, directory name, or volume label syntax is incorrect.

    When you tried it, were you using Japanese locale by any chance? I tried opening cmd.exe via AppLocale with Japanese locale, and sending arguments to aria2c in that manner (I was using only Japanese characters, with the --out option by the way), but that didn't help - even if I set CHCP to 65001 (Unicode) as well. I might try changing the actual SYSTEM locale eventually to see if that helps, but not unless I HAVE to...

     
    Last edit: Nisto 2013-10-30
  • tujikawa
    2013-10-31

    It turns out that the Japanese code page is 932 (ANSI/OEM), which is not Unicode at all. This is why Japanese characters work on my PC.
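
The effect described here can be reproduced outside Windows. The following is a minimal Python sketch, assuming Python's `cp932` and `cp1252` codecs behave like the corresponding Windows ANSI code pages for these characters. The file name is made up for illustration, `cp1252` stands in for a typical Western European system locale, and `to_ansi` is a hypothetical helper, not part of aria2.

```python
# Hypothetical file name made of four Japanese characters.
name = "日本語名"


def to_ansi(text: str, codepage: str) -> bytes:
    """Encode text in an ANSI code page, substituting '?' for any
    character the code page cannot represent."""
    return text.encode(codepage, errors="replace")


# Code page 932 (Japanese) can represent the name, so it round-trips.
japanese = to_ansi(name, "cp932")
print(japanese.decode("cp932"))

# Code page 1252 cannot, so every character degrades to '?'.
western = to_ansi(name, "cp1252")
print(western)
```

On a system whose ANSI code page is 932 the name survives the conversion, while on other locales it collapses to question marks, which matches the `????` path in the log above.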

     
  • tujikawa
    2013-10-31

    aria2 currently converts command-line arguments to UTF-8, assuming that the input is in the system default ANSI code page. If we can detect the console code page, we can use it as a conversion hint.
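
To illustrate why the assumed code page matters, here is a rough Python sketch of that conversion step. It is not aria2's actual code: `argument_to_utf8` is a hypothetical helper, and the sample bytes simply stand for an argument typed on a system that uses code page 932.

```python
def argument_to_utf8(raw: bytes, assumed_codepage: str) -> bytes:
    """Decode raw argument bytes using the assumed code page, then
    re-encode them as UTF-8. Undecodable bytes become U+FFFD."""
    text = raw.decode(assumed_codepage, errors="replace")
    return text.encode("utf-8")


# Bytes as they would arrive if the argument was typed in code page 932.
raw = "日本語".encode("cp932")

# A correct hint recovers the original text.
right = argument_to_utf8(raw, "cp932")
print(right.decode("utf-8"))

# A wrong hint yields valid UTF-8, but for the wrong characters.
wrong = argument_to_utf8(raw, "cp1252")
print(wrong.decode("utf-8"))
```

The conversion itself never fails with a wrong hint; it silently produces different text, which is why a reliable hint such as the console code page would help.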

     
  • tujikawa
    2013-10-31

    It looks like Windows requires special functions to read Unicode command-line arguments. We now use them and can successfully handle Unicode characters with code page 65001. See commit 3a8e8f8 in master.
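
The comment does not name the functions involved. As an assumption, the Win32 calls commonly used for this purpose are `GetCommandLineW` and `CommandLineToArgvW`; whether commit 3a8e8f8 uses exactly these is not confirmed here. A hedged Python sketch of the idea, using `ctypes` and a hypothetical helper `unicode_argv` (Python 3 already provides Unicode arguments in `sys.argv`, so this is for illustration only):

```python
import ctypes
import sys
from typing import List


def unicode_argv() -> List[str]:
    """Return the command-line arguments as Unicode strings."""
    if sys.platform != "win32":
        # Outside Windows, fall back to what the interpreter decoded.
        return list(sys.argv)

    from ctypes import wintypes

    kernel32 = ctypes.windll.kernel32
    shell32 = ctypes.windll.shell32

    kernel32.GetCommandLineW.restype = wintypes.LPCWSTR
    shell32.CommandLineToArgvW.argtypes = [
        wintypes.LPCWSTR,
        ctypes.POINTER(ctypes.c_int),
    ]
    shell32.CommandLineToArgvW.restype = ctypes.POINTER(wintypes.LPWSTR)
    kernel32.LocalFree.argtypes = [wintypes.HLOCAL]

    argc = ctypes.c_int(0)
    # The wide (UTF-16) command line bypasses the ANSI code page entirely.
    argv = shell32.CommandLineToArgvW(
        kernel32.GetCommandLineW(), ctypes.byref(argc)
    )
    if not argv:
        raise ctypes.WinError()
    try:
        # Under python.exe this list also contains the interpreter path.
        return [argv[i] for i in range(argc.value)]
    finally:
        kernel32.LocalFree(argv)


# Each argument can then be stored internally as UTF-8.
utf8_args = [arg.encode("utf-8") for arg in unicode_argv()]
```

Because the arguments are read as UTF-16, no character is lost to an ANSI code page before the UTF-8 conversion happens.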

     
  • Nisto
    2013-10-31

    Sounds like great news. Thanks, Tsujikawa! Will check it out when you guys release the next version (and there is no rush).

     
  • tujikawa
    2013-11-10

    • status: open --> closed
    • Group: Undecided --> 1.18.2