From: Allin C. <cot...@wf...> - 2020-11-01 20:24:37
On Sun, 1 Nov 2020, Ethan A Merritt wrote:

> On Saturday, 31 October 2020 17:36:03 PST Allin Cottrell wrote:
>> On Sun, 25 Oct 2020, Allin Cottrell wrote:
>>
>>> I tried googling this but didn't find an answer -- sorry if I should have
>>> just tried harder! My question is: can gnuplot on Windows handle a unicode
>>> filename argument passed in UTF-16? As in
>>>
>>> path/to/wgnuplot.exe <UTF-16 input filename>
>>
>> OK, that question was under-researched, but now I've done my
>> homework. Sorry, this is a bit long but I hope I can arouse some
>> interest in the topic.
>
> I don't have any direct insight into this issue other than to note
> that the filesystem itself may be an issue.

In some contexts, no doubt. But if we set aside exotica such as
surrogate pairs, NTFS filenames are UTF-16 to a very good
approximation. As such they are easily converted to UTF-8 to permit
handling with good old C char * APIs, and easily converted back to
UTF-16 for _wfopen() if required.

> The following entry from the R developer blog is of interest
>
> https://developer.r-project.org/Blog/public/2020/05/02/utf-8-support-on-windows/
>
> I gather from the discussion there that Windows-10 can be made to
> support UTF-8 as a native encoding, calling it "extended ASCII".

Interesting, yes, but at this point kinda science fiction. The
practical issue at present is whether gnuplot wants to support
out-of-codepage UTF-16 filenames on the Windows command line. It's
not terribly difficult, as I tried to show.

Sorry if I'm being repetitive, but right now if I create, say, a
Russian-language filename on Windows and pass it as a command-line
argument to gnuplot, gnuplot will not be able to open the file
because its name cannot be represented in my "system codepage". A
program that reads the command line as UTF-16, however, will have no
problem opening the file.

Allin Cottrell
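
To illustrate the UTF-16 to UTF-8 conversion mentioned above, here is a minimal portable sketch in C. It is not gnuplot code, and the function name utf16_to_utf8 is hypothetical. On Windows itself one would more likely get the wide command line from GetCommandLineW()/CommandLineToArgvW() and convert with WideCharToMultiByte() using CP_UTF8; the hand-rolled loop below just shows that the conversion is mechanical, surrogate pairs included.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: convert a NUL-terminated UTF-16 string to UTF-8.
 * Returns the number of bytes written (excluding the terminating NUL),
 * or (size_t)-1 on an unpaired surrogate or if the output buffer is
 * too small. */
size_t utf16_to_utf8(const uint16_t *in, char *out, size_t outsize)
{
    size_t n = 0;

    assert(in != NULL && out != NULL);
    if (outsize == 0)
        return (size_t)-1;

    while (*in) {
        uint32_t cp = *in++;

        if (cp >= 0xD800 && cp <= 0xDBFF) {
            /* High surrogate: must be followed by a low surrogate. */
            if (*in < 0xDC00 || *in > 0xDFFF)
                return (size_t)-1;
            cp = 0x10000 + ((cp - 0xD800) << 10) + (uint32_t)(*in++ - 0xDC00);
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            /* Low surrogate with no preceding high surrogate. */
            return (size_t)-1;
        }

        size_t need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (n + need >= outsize)        /* keep one byte for the NUL */
            return (size_t)-1;

        if (need == 1) {
            out[n++] = (char)cp;
        } else if (need == 2) {
            out[n++] = (char)(0xC0 | (cp >> 6));
            out[n++] = (char)(0x80 | (cp & 0x3F));
        } else if (need == 3) {
            out[n++] = (char)(0xE0 | (cp >> 12));
            out[n++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            out[n++] = (char)(0x80 | (cp & 0x3F));
        } else {
            out[n++] = (char)(0xF0 | (cp >> 18));
            out[n++] = (char)(0x80 | ((cp >> 12) & 0x3F));
            out[n++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            out[n++] = (char)(0x80 | (cp & 0x3F));
        }
    }
    out[n] = '\0';
    return n;
}
```

The resulting char * string can then be passed through existing code paths unchanged, and converted back to UTF-16 only at the point where the file is actually opened.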