#1270 "set encoding locale" not working on Windows

5.0
closed-fixed
nobody
5
16 hours ago
2013-07-31
Vladimir
No

Hello.
I use gnuplot 4.6 patchlevel 3 on Windows XP x86 SP3.

C:\>gnuplot

        G N U P L O T
        Version 4.6 patchlevel 3    last modified April 2013
        Build System: MS-Windows 32 bit

        Copyright (C) 1986-1993, 1998, 2004, 2007-2013
        Thomas Williams, Colin Kelley and many others

        gnuplot home:     http://www.gnuplot.info
        faq, bugs, etc:   type "help FAQ"
        immediate help:   type "help"  (plot window: hit 'h')

gnuplot changed the codepage of this console from 866 to 1251 to
match the graph window. Some characters might only display correctly
if you change the font to a non-raster type.

Terminal type set to 'wxt'
gnuplot> show locale

        gnuplot LC_CTYPE   Russian_Russia.1251
        gnuplot encoding   default
        gnuplot LC_TIME    Russian_Russia.1251
        gnuplot LC_NUMERIC C

gnuplot> set encoding locale
gnuplot> show locale

        gnuplot LC_CTYPE   Russian_Russia.1251
        gnuplot encoding   default
        gnuplot LC_TIME    Russian_Russia.1251
        gnuplot LC_NUMERIC C

gnuplot>

As you can see, my local encoding is cp1251, but after "set encoding locale" the encoding does not change to the locale's encoding!
This may cause problems because, according to the gnuplot manual:

====================
...
The set encoding command selects a character encoding. Syntax:

  set encoding {<value>}
  set encoding locale
  show encoding

Valid values are

default - tells a terminal to use its default encoding
...
The command set encoding locale is different from the other options. It attempts to determine the current locale from the runtime environment.
...
====================

I take this to mean that "default" = "default for the terminal" and "locale" = "default for the runtime environment".

After that I tried to run this script (see "svg-script-in-cp1251.gpt" in the attachments):

set encoding locale
set terminal svg
set output 'svg-script-in-cp1251.gpt.svg'
set ylabel 'Синус'
plot sin(x)

The output SVG is invalid (see "svg-script-in-cp1251.gpt.svg" in the attachments).
The SVG parser in Firefox fails on the first character of the word "Синус". Inkscape just shows the message "Failed to load the requested file".

Basically, the problem is that the first line of the SVG declares utf-8 encoding:

<?xml version="1.0" encoding="utf-8"  standalone="no"?>

But the actual encoding of the file is cp1251.

If you open the output SVG in a text editor and change the first line to...

<?xml version="1.0" encoding="windows-1251"  standalone="no"?>

Firefox and Inkscape open the file without problems (just compare "svg-script-in-cp1251.gpt.svg" and "svg-script-in-cp1251.gpt(corrected-manualy).svg" from the attachments).
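
The manual edit described above can also be scripted. The following is only a sketch, not part of gnuplot: the function names are hypothetical, and it assumes the file really is in cp1251 as in this report. It checks whether the bytes decode under the declared encoding, and then either rewrites the declaration or converts the bytes to UTF-8 so that the existing declaration becomes true.

```python
import re

# Matches the encoding attribute inside the XML declaration.
_DECL = re.compile(rb'(<\?xml[^>]*?encoding=")([^"]*)(")')

def matches_declaration(data):
    """Return True if the bytes decode cleanly under the declared encoding."""
    m = _DECL.search(data)
    declared = m.group(2).decode("ascii") if m else "utf-8"
    try:
        data.decode(declared)
        return True
    except UnicodeDecodeError:
        return False

def redeclare(data, actual="windows-1251"):
    """Rewrite the declaration to name the encoding the bytes really use."""
    return _DECL.sub(
        lambda m: m.group(1) + actual.encode("ascii") + m.group(3),
        data,
        count=1,
    )

def transcode_to_utf8(data, actual="cp1251"):
    """Alternative: keep the utf-8 declaration and convert the bytes instead."""
    return data.decode(actual).encode("utf-8")
```

Either function can be applied to the bytes read from the SVG file in binary mode. Transcoding is probably the safer choice if the file will be opened by tools that only handle UTF-8.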

After that I tried some other terminals (cairo, emf and eps) and they work correctly (see "emf-script-in-cp1251.gpt", "emf-script-in-cp1251.gpt.emf" and "pdfcairo-script-in-cp1251.gpt", "pdfcairo-script-in-cp1251.gpt.pdf").

So I think we have two potential issues:
1. The encoding does not change from "default" to "locale", and that causes the problem. The SVG terminal thinks: "Hmm, default encoding... For me the default encoding is UTF-8." For the other terminals the default encoding is cp1251, so they work correctly.
2. Maybe "set encoding" works correctly (but then why does "show locale" print "gnuplot encoding default" after "set encoding locale"?), and the SVG terminal fails to detect the encoding that was set.

7 Attachments

Discussion

  • Vladimir
    2013-07-31

    • labels: svg encoding locale --> svg
     
  • Vladimir
    2013-07-31

    • labels: svg --> svg, encoding
     
  • Vladimir
    2013-07-31

    • labels: svg, encoding --> svg, encoding, locale
     
  • Ethan Merritt
    2013-08-06

    I am not sure I follow all that.
    Does it work to say:
    set encoding cp1251
    set term svg
    set output 'cp1251.svg'
    set ylabel 'Синус'
    plot sin(x)

    If that works, which is what I would expect, then the problem is that "set encoding locale" does not work on your Windows system. I don't know enough about Windows support for locales to fix this in general, but the work-around is to explicitly provide the encoding as above.
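
For anyone generating such scripts from a program, the work-around amounts to keeping two things in agreement: the name given to "set encoding" and the byte encoding of the script file itself. A minimal sketch under that assumption follows. The function and parameter names are hypothetical, and labels containing a single quote are not escaped.

```python
def build_script(label, gp_encoding="cp1251", codec="cp1251", output="cp1251.svg"):
    """Return gnuplot script bytes whose encoding matches `set encoding`.

    gp_encoding is the name gnuplot expects; codec is the Python codec name
    for the same code page. Both parameter names are hypothetical.
    """
    lines = [
        f"set encoding {gp_encoding}",
        "set term svg",
        f"set output '{output}'",
        f"set ylabel '{label}'",
        "plot sin(x)",
    ]
    # Encode the whole script with the same code page that gnuplot is told to use.
    return ("\n".join(lines) + "\n").encode(codec)
```

The result can be written to a file opened in binary mode and passed to gnuplot as usual.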

     
  • Ethan Merritt
    2013-08-12

    • summary: Corrupted SVG with 'locale' encoding in Windows --> "set encoding locale" not working on Windows
     
  • Vladimir
    2013-08-17

    In fact, the Gnuplot User Manual doesn't contain any tips about locales on Windows, and actually I'm not sure how locales work on Windows myself. I set the variable LANG=en for gnuwin32. I use 'en' to avoid Russian characters in the console, because they cause problems there: Windows uses the cp1251 encoding, but the Windows console (cmd.exe) uses cp866 by default, like MS-DOS with Russian localization. What idiot at Microsoft came up with this, and why? (I mean, everyone was fine with the 866 encoding. Why invent a new one?) As you can see, the LANG variable has no effect on gnuplot on my system, and gnuplot detects the Russian language correctly.
    In previous gnuplot versions I used 'set encoding cp1251' more often, but with every new release more and more terminals pick the correct encoding automatically. That makes me happy.
    After updating (4.6 patchlevel 3), 'show encoding' prints the following by default:

    nominal character encoding is default
    however LC_CTYPE in current locale is Russian_Russia.1251
    

    I did a little research. Some terminals still require 'set encoding', while others produce correct plots without it. I tested only the terminals that are useful to me. (I'm no specialist in BeOS picture formats for X11...)

    So on Windows with a Russian locale you need 'set encoding' if you use:

    png
    jpeg
    postscript (I like it so much! It produces the most beautiful output, IMHO)
    svg
    

    You don't need 'set encoding' if you use:

    pdf (in the new release pdf == pdfcairo)
    emf
    eps ( == epscairo)
    pngcairo  ( != png)
    

    I understand that terminal behavior may vary because the terminals were written by different people, some of them long ago. But there is one thing I still don't understand: if a terminal supports several encodings, why doesn't it simply ask gnuplot for the encoding and other settings? As you can see, on my system gnuplot knows the encoding and locale correctly. Maybe Windows doesn't support locales properly, but who cares? Gnuplot knows everything anyway! (Look at the first listing in the first post.) Some terminals just don't use it.

    That's fine if you make plots by hand, because you can set the encoding manually (if you know the 'set encoding' command). But it gets on your nerves when you write an application that creates graphs with gnuplot and try to make it portable. (Converting something like "win-1251", as returned by the framework, into "cp1251" for 'set encoding', and so on, isn't much fun.) I was hoping that I could simply detect the locale with something like

    String locale_string = System.GetLocale().GetLocaleName();
    

    and then just ask gnuplot to use this locale_string on any system with any locale. But as I see now, that doesn't work, and the problem is more complex. :-)
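
The name conversion mentioned above ("win-1251" to "cp1251") could be sketched roughly as follows. This is an illustration only: the function name is hypothetical, and the set of encoding names is an assumed subset that should be checked against "help encoding" in the gnuplot build in use. Anything unrecognized falls back to "default", which includes console code pages such as cp866 that are not in the subset.

```python
import re

# Assumed subset of gnuplot encoding names; verify with `help encoding`.
GNUPLOT_ENCODINGS = {"cp1250", "cp1251", "cp1254", "koi8r", "koi8u", "utf8"}

def gnuplot_encoding(name):
    """Map a charset or locale name to a gnuplot encoding name, or 'default'."""
    # Keep only the part after the last dot: "Russian_Russia.1251" -> "1251".
    tail = name.rsplit(".", 1)[-1]
    # Lowercase and drop punctuation: "win-1251" -> "win1251", "UTF-8" -> "utf8".
    key = re.sub(r"[^a-z0-9]", "", tail.lower())
    # Normalize Windows code page spellings to the "cpNNNN" form.
    key = re.sub(r"^(windows|win|cp)?(\d{3,4})$", r"cp\2", key)
    return key if key in GNUPLOT_ENCODINGS else "default"
```

An application could then emit "set encoding " followed by the returned name at the top of every generated script, instead of relying on "set encoding locale".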

    As far as I know, this is a common question from Russian newcomers ("I tried to set a label, but got gibberish. What am I doing wrong?"). Maybe in the future the developers will be able to smooth out some of the rough edges with encodings in the various terminals on Windows. I can already see progress in this direction over the last few years with the *cairo terminals.

    Thanks for the answer. And thanks for gnuplot. It is really very useful to me.

     
  • As it turns out, "set encoding locale" only works for utf8 and sjis on any platform. On Windows this is now fixed in the 4.6 and version 5 branches. Windows might well be the only platform where this is still relevant.

     
    • status: open --> pending-fixed
     
    • status: pending-fixed --> closed-fixed