From: Allin C. <cot...@wf...> - 2017-03-24 21:09:32
On Fri, 24 Mar 2017, Ethan A Merritt wrote:

> On Friday, 24 March, 2017 14:57:11 Allin Cottrell wrote:
>>
>> I ran an experiment to try to assess this. Booted Windows 8 (ugh) and
>> created a directory named Beauté (that's with an e-acute) on my
>> Desktop. I then created two copies of a simple gnuplot script to
>> produce a PNG file. Each included the line
>>
>> set output 'c:/users/cottrell/desktop/Beauté/test.png'
>>
>> (encoded in cp1251). The two files were identical except that one of
>> them included the line
>>
>> set encoding utf8
>>
>> before the "set output" line. (And the accented character in the
>> output filename was the only non-ASCII character in the files.)
>>
>> I then called wgnuplot.exe on the two scripts from the command line in
>> a cmd.exe window. The one without "set encoding utf8" worked to
>> produce the PNG, the other didn't. To see what was happening I then
>> tried opening wgnuplot interactively and using the "load" command to
>> run the scripts. The variant without "set encoding" again worked fine;
>> the other one gave:
>>
>> set output 'c:/users/cottrell/desktop/Beaut?/test.png'
>> cannot open file; output not changed
>>
>> (note that in gnuplot's error message echoing the "set output" line
>> the e-acute has been changed to a question mark, actually not an
>> ASCII question mark but an "unrecognized glyph" symbol).
>>
>> It therefore seems that "set encoding" has somehow altered gnuplot's
>> reading of the bytes in the output filename.
>
> No, I don't think that is what is happening.
>
>> (Once again, those bytes
>> are identical in the two files.) If gnuplot had simply passed the
>> incoming cp1251 bytes to the OS, surely the output file would have
>> been opened OK in both cases.
>
> What seems to be happening is that in syscfg.h on Windows it says
> /* The unicode/encoding support requires translation of file names */
> #define fopen win_fopen
>
> and wmain.c:win_fopen() indeed tries to translate the name from the
> current gnuplot encoding into Windows Unicode text.
> I think the comment is wrong. File names should *not* be translated,
> as you are finding out. The current gnuplot encoding is a separate
> thing from the encoding used in the sourcecode of the script.
>
> I only see this code in the development version, not in the source
> for 5.0.5 or 5.0.6. So I guess your bug report is specifically for
> the development version?
>
> I'll defer to the Windows crowd here, but my tentative diagnosis
> is that addition of a win_fopen() wrapper for fopen() in 5.1 should
> be reverted.

Aha, this is very interesting! Yes, I'm using the development version
on Windows so your diagnosis seems very plausible.

But actually, now that I (think I) understand what's going on, I _like_
the idea behind win_fopen. If I've got this right, it would let me
standardize on consistently UTF-8 gnuplot script files (including
representing Windows paths in UTF-8), and let gnuplot take care of
recoding paths on the fly as needed for interaction with the OS.

It's ugly and error-prone to mix text encodings in a single file, but
I guess that's what you have to do with gnuplot 5.0 if you want (a) to
represent titles, labels and so on in UTF-8, but (b) to include
Windows filenames that contain non-ASCII characters. It sounds like
gnuplot 5.1 could improve on that.

I can now try the experiment of keeping "set encoding utf8" but
recoding Windows paths to UTF-8 when writing them into a gnuplot
script. If that works, I'm happy! (But of course if the win_fopen
wrapper is preserved the backward incompatibility needs to be made
clear -- though it probably affects rather few people.)

Allin Cottrell
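
The byte-level mismatch discussed above can be illustrated outside gnuplot. The following is only a sketch in Python, not gnuplot's actual win_fopen() code. It assumes the script was saved in a single-byte Windows code page in which e-acute is the byte 0xE9 (cp1252 is used here as a stand-in), and that with "set encoding utf8" in effect the file name is interpreted as UTF-8 before being converted to Windows wide-character text. The variable names are made up for the illustration.

```python
# Sketch only: mimics the encoding mismatch, not gnuplot's win_fopen().
path = "c:/users/cottrell/desktop/Beaut\u00e9/test.png"  # \u00e9 is e-acute

# Assumption: the script file uses a single-byte Windows code page
# (cp1252 here), so the e-acute is stored as the lone byte 0xE9.
ansi_bytes = path.encode("cp1252")

# Reading those bytes as UTF-8 fails: 0xE9 announces a multi-byte
# sequence, but the next byte is "/" rather than a continuation byte.
try:
    ansi_bytes.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# A lenient decoder substitutes U+FFFD, the "unrecognized glyph" symbol,
# which would match the Beaut?/ seen in the error message.
lossy = ansi_bytes.decode("utf-8", errors="replace")

# If the path is written into the script as UTF-8 instead, the same
# interpretation succeeds and the name can be converted to UTF-16,
# the form used by the Windows wide-character file APIs.
utf8_bytes = path.encode("utf-8")
round_trip = utf8_bytes.decode("utf-8")
wide_name = round_trip.encode("utf-16-le")

print(strict_ok, repr(lossy), round_trip == path)
```

If the assumptions hold, this is consistent with both observations: the code-page script breaks once UTF-8 is declared, while a script whose paths are themselves UTF-8 should survive the translation.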