From: sfeam <sf...@us...> - 2017-03-24 03:10:55
On Thursday, 23 March 2017 08:09:14 PM Allin Cottrell wrote:
> Sorry, this is quite ticklish but I'll try to explain it as best I
> can.
>
> I'm not sure, from reading the gnuplot help on "encoding", of the
> exact scope and effect of giving a "set encoding XXX" command in a
> plot file.
>
> Here's the context: my program writes a gnuplot command file,
> designed to produce PNG output via the pngcairo "terminal", and
> among the users of the program are people working on Windows in
> Russian. There are two possible non-ASCII elements in the plot file:
>
> 1) the name of the output file (as in "set output 'OOO'"), which for
> MS Windows in Russian will be encoded in CP1251; and
>
> 2) strings occurring in titles, labels or whatever in the body of
> the plot: by default these will be in UTF-8, which is what pngcairo
> expects.
>
> At present I'm sticking a line into the plot file:
>
> set encoding utf8
>
> which I hope is going to tell gnuplot, "Whatever you might think
> based on the fact that you're working on Windows in Russian, please
> interpret titles/labels as being in UTF-8."

That much is fine. It also has the effect, for the png terminal and
some others, that when you specify a font by name it will try to find
a version of it that uses your specified encoding.

> So here's the question: given that the output filename is in CP1251,
> is my "set encoding" line liable to interfere with gnuplot's output
> routine (for example, such that output cannot be written because
> some non-ASCII component of the path is non-existent, if the bytes
> are interpreted as UTF-8), or is gnuplot's I/O mechanism separate
> and insulated from "set encoding"?

Gnuplot does not care what is in the string used as a file name.
Linux/unix also does not care what is in the string used as a file
name. Any sequence of bytes is a legal filename even if it is not
printable. Windows - I'm not so sure.

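To make the byte-level point concrete, here is a minimal Python sketch
(an illustration only, not something gnuplot does internally). The file
name is hypothetical, chosen just because it contains Cyrillic letters.
It shows that the same name is a different byte sequence in CP1251 and
in UTF-8, and that the CP1251 bytes are not valid UTF-8, which is why
reinterpreting them under the wrong encoding could point at a path that
does not exist.

```python
# Hypothetical Cyrillic file name ("grafik.png" in Russian), written
# with escapes so the snippet itself is pure ASCII.
name = "\u0433\u0440\u0430\u0444\u0438\u043a.png"

cp1251_bytes = name.encode("cp1251")  # one byte per Cyrillic letter
utf8_bytes = name.encode("utf-8")     # two bytes per Cyrillic letter


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


print(len(cp1251_bytes), cp1251_bytes.hex())
print(len(utf8_bytes), utf8_bytes.hex())
print("CP1251 bytes valid as UTF-8:", is_valid_utf8(cp1251_bytes))
print("UTF-8 bytes valid as UTF-8:", is_valid_utf8(utf8_bytes))
```

A check like is_valid_utf8() could also be run line by line over a
generated script to spot where the two encodings are mixed.
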
There are two ways that it might go wrong on Windows that I have heard
of, and I suppose they might interact badly. Caveat: I don't use
Windows myself, so I'm only repeating what I have seen mentioned
elsewhere.

(1) Windows filesystems only allow certain encodings for file names,
and UTF-8 is not one of the allowed encodings.
https://msdn.microsoft.com/en-us/library/windows/desktop/dd317748(v=vs.85).aspx

(2) At least some incarnations of Windows used a magic byte sequence
known as a BOM (byte order mark) to indicate the encoding used by a
text file. If your gnuplot script file contains any UTF-8 text, some
Windows machines are unhappy if it does not start with a BOM. On the
other hand, if it _does_ start with a BOM then strings in the script
file that are really CP1251 rather than UTF-8 might (I am guessing) be
converted inappropriately.

So I think your question is actually a Windows + script file format
question rather than anything specific to gnuplot. I doubt that
"set encoding" matters, but mixing UTF-8 and CP1251 in the same script
file may be intrinsically problematic on Windows.

> As you might expect, this is not merely hypothetical: I'm getting an
> error report from a Russian Windows user, and I wonder if the fact
> that wgnuplot.exe is exiting with a non-zero code when trying to
> process a command file written by my program might have something to
> do with a text encoding issue.

Does the same script work if the file names it refers to are strictly
ASCII?

Ethan