Console should support UTF-8
Brought to you by:
fabioz
The PyDev console window should support UTF-8 output.
To find out whether it works, do
print u"Martin v. L\xF6wis"
in a script. This currently gives the error
print u"Martin v. L\xF6wis"
UnicodeEncodeError: 'ascii' codec can't encode
character u'\xf6' in position 11: ordinal not in range(128)
This, in turn, is due to sys.stdout.encoding being
None. It should be set to UTF-8 (IMO, or to the Eclipse
default encoding, wherever that comes from), and the
console should decode all incoming bytes using that
encoding. Likewise for sys.stdin.encoding.
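The failure can be reproduced without a console at all. The sketch below assumes that printing a Unicode string amounts to encoding it with the stream's encoding, falling back to ASCII when that encoding is None; the helper name encode_for_stream is made up for illustration and is not part of Python or PyDev.

```python
# -*- coding: utf-8 -*-
# Sketch: mimic what happens to a Unicode string on its way to stdout.
text = u"Martin v. L\xF6wis"


def encode_for_stream(text, stream_encoding):
    # Hypothetical helper: with no stream encoding, fall back to ASCII,
    # which is the assumed default in the situation reported above.
    return text.encode(stream_encoding or "ascii")


failed_at = None
try:
    encode_for_stream(text, None)  # same situation as encoding being None
except UnicodeEncodeError as exc:
    failed_at = exc.start  # index of the first character that cannot be encoded

# With an explicit encoding the same string encodes without error.
utf8_bytes = encode_for_stream(text, "utf-8")
```

Here failed_at ends up pointing at the "ö", matching the position given in the traceback.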
Logged In: YES
user_id=617340
Humm... strange... if you do:
print u"Martin v. L\xF6wis".encode('cp1252')
or if you do directly:
print "Martin v. Löwis"
it works OK (I believe that the buffer uses the default
encoding for the platform)... this seems more like a
Python issue to me than an Eclipse issue (don't you think so?)
Cheers,
Fabio
Logged In: YES
user_id=21627
Please try this in a console/terminal window, on Unix or Linux, or in IDLE.
Python prints Unicode strings by looking at sys.stdout.encoding; the encoding found there is then used to encode the Unicode string.
Printing a byte string just transmits it literally to the terminal, so it's no surprise that this "works".
The precise procedure that Python uses depends on the operating system.
On Unix, Python checks whether stdout is a terminal (through isatty); if it is, it then uses the locale's charset (by invoking nl_langinfo(CODESET)) to find out the terminal's encoding. On Windows, it uses GetConsoleOutputCP to determine the encoding of the console window. In IDLE, IDLE replaces sys.stdout with something else (so output ends up in IDLE's shell window), and arranges to set the encoding on this "something else" explicitly.
I'm not sure which of these strategies should work best for PyDev. However, it's clearly not Python's issue *alone* to figure out the encoding of sys.stdout when running in PyDev: Python would need some mechanism to find out that it is indeed running in PyDev, or PyDev should arrange to set up sys.stdout.encoding explicitly.
In any case, I can't follow your "Works for Me" interpretation: I very much doubt that the original example I've given actually works for you.
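The Unix strategy described above can be approximated from Python itself. This is only a sketch of the idea, not the interpreter's actual startup code: the function name terminal_encoding is hypothetical, and locale.getpreferredencoding(False) stands in for the direct nl_langinfo call.

```python
import locale


def terminal_encoding(stream):
    """Return the locale's charset if `stream` is a terminal, else None."""
    isatty = getattr(stream, "isatty", None)
    if isatty is None or not isatty():
        # Pipes and files get no encoding. This is the case PyDev hits,
        # because its console is not a real terminal.
        return None
    # Ask the locale machinery without changing the current locale.
    return locale.getpreferredencoding(False)


class FakeStream(object):
    """Stand-in for a stream so both branches can be tried without a terminal."""

    def __init__(self, is_terminal):
        self._is_terminal = is_terminal

    def isatty(self):
        return self._is_terminal


pipe_result = terminal_encoding(FakeStream(False))
tty_result = terminal_encoding(FakeStream(True))
```

For a process started by Eclipse, stdout behaves like the first FakeStream, so no encoding is detected.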
Logged In: YES
user_id=617340
Hummm... yeah, the problem is that the console in Eclipse is
not actually a "real" console... I'll have to take a better
look at the Eclipse API to see if it actually has some way
of setting it... (I've already taken a quick look without
any success).
Logged In: YES
user_id=21627
I see... it seems Java doesn't support creating processes in
a pseudo-terminal at all.
In that case, I think it would be possible to manually set
the encoding of sys.stdout, through PYTHONSTARTUP. The
startup code could be generated on the fly, to match the
console's encoding (which is given through the
DebugPlugin.ATTR_CONSOLE_ENCODING configuration AFAICT). It
would have to create a wrapper for sys.std{in|out|err},
since their encoding attribute is read-only, and any
original PYTHONSTARTUP file would need to be execfile'd.
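A wrapper of the kind suggested here might look roughly like the following. It is a sketch under assumptions: the class name EncodingWriter is invented, the wrapped object is taken to be a byte stream, and the encoding is passed in directly rather than read from the Eclipse launch configuration.

```python
import io


class EncodingWriter(object):
    """Wrap a byte stream and expose a settable `encoding` attribute."""

    def __init__(self, raw, encoding):
        self._raw = raw
        self.encoding = encoding

    def write(self, text):
        if isinstance(text, bytes):
            # Byte strings are passed through untouched.
            self._raw.write(text)
        else:
            # Unicode strings are encoded with the configured encoding.
            self._raw.write(text.encode(self.encoding))

    def __getattr__(self, name):
        # Delegate everything else (flush, close, ...) to the wrapped stream.
        return getattr(self._raw, name)


# Demonstration against an in-memory buffer instead of the real sys.stdout.
buffer = io.BytesIO()
writer = EncodingWriter(buffer, "utf-8")
writer.write(u"Martin v. L\xF6wis")
written = buffer.getvalue()
```

Generated startup code would then assign such a wrapper to sys.stdout and sys.stderr; how the encoding reaches that code is left open here.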
Logged In: YES
user_id=617340
Originator: NO
Duplicate closed: https://sourceforge.net/tracker/index.php?func=detail&aid=1601848&group_id=85796&atid=577329
Logged In: YES
user_id=617340
Originator: NO
Actually, PYTHONSTARTUP appears to work only in the interactive console, and I'm not really sure this is the best option... isn't there any way to pass this to the interpreter (like python -u)?
-- That would be much better than making this kind of workaround, or than Python trying to discover which encoding it should use (as sys.stdout.encoding is read-only, there should be an option to set it... or not?)
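For readers on later interpreters: Python 2.6 and 3.x added the PYTHONIOENCODING environment variable, which did not exist when this comment was written and is not the fix that was adopted here. The sketch below shows how a launcher could use it; the function name child_stdout_encoding is made up for illustration.

```python
import os
import subprocess
import sys


def child_stdout_encoding(encoding):
    """Start a child interpreter with PYTHONIOENCODING set and report
    the value the child sees in sys.stdout.encoding."""
    env = dict(os.environ)
    env["PYTHONIOENCODING"] = encoding
    output = subprocess.check_output(
        [sys.executable, "-c", "import sys; print(sys.stdout.encoding)"],
        env=env,
    )
    return output.decode("ascii").strip()


reported = child_stdout_encoding("utf-8")
```

The child's stdout is a pipe, not a terminal, which is the same situation as a script launched from the Eclipse console.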
Logged In: YES
user_id=617340
Originator: NO
Duplicate reference: http://sourceforge.net/tracker/index.php?func=detail&aid=1649056&group_id=85796&atid=577329
Logged In: YES
user_id=617340
Originator: NO
Changing to bug...
Logged In: YES
user_id=617340
Originator: NO
Fixed for 1.3.15
The final solution was to create a 'sitecustomize.py' which is always added to the PYTHONPATH as the first entry (and then removed, so that a 'sitecustomize.py' that may be defined by the user can still be executed).
In this module, 'sys.setdefaultencoding' can be used, as the module is imported just before that method is deleted.
It can be seen at: http://pydev.cvs.sourceforge.net/pydev/org.python.pydev/PySrc/pydev_sitecustomize/
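As a rough illustration of the idea (not the code in the repository above), a sitecustomize-style module could guard the call like this. The function name apply_default_encoding is hypothetical, and the guard is needed because sys.setdefaultencoding only exists in Python 2 and only while site initialization is still running.

```python
import sys


def apply_default_encoding(encoding):
    """Set the default encoding if the interpreter still allows it.

    Returns True when sys.setdefaultencoding was available and called,
    False otherwise (for example after site.py has deleted it, or on
    Python 3, where it does not exist).
    """
    setter = getattr(sys, "setdefaultencoding", None)
    if setter is None:
        return False
    setter(encoding)
    return True


# Inside a real sitecustomize.py this call would run during startup;
# run from an ordinary script it normally just reports False.
available = hasattr(sys, "setdefaultencoding")
changed = apply_default_encoding("utf-8")
```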