From: Alex T. <al...@tw...> - 2006-02-02 03:10:27
|
If I start up the Python interpreter (i.e. open a DOS shell box, and type "python") I get my Python interpreter. I then type in the following two lines, and get an error:

> C:\Documents and Settings\Eleane>python
> Python 2.4.1 (#65, Mar 30 2005, 09:13:57) [MSC v.1310 32 bit (Intel)] on win32
> Type "help", "copyright", "credits" or "license" for more information.
> >>> s = u'a\u2019s'
> >>> print s
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
>   File "C:\Python24\lib\encodings\cp850.py", line 18, in encode
>     return codecs.charmap_encode(input,errors,encoding_map)
> UnicodeEncodeError: 'charmap' codec can't encode character u'\u2019' in position 1: character maps to <undefined>
> >>>

If instead I start the PythonCard codeEditor, and start a Shell (F5), and type the same two lines, it works properly.

I tried the basic python interpreter adding the imports that are visible within the codeEditor shell, but still get the same problem.

> C:\Documents and Settings\Eleane>python
> Python 2.4.1 (#65, Mar 30 2005, 09:13:57) [MSC v.1310 32 bit (Intel)] on win32
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import os
> >>> import sys
> >>> import wx
> >>> from PythonCard import dialog, util
> >>> s = u'a\u2019s'
> >>> print s
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
>   File "C:\Python24\lib\encodings\cp850.py", line 18, in encode
>     return codecs.charmap_encode(input,errors,encoding_map)
> UnicodeEncodeError: 'charmap' codec can't encode character u'\u2019' in position 1: character maps to <undefined>
> >>>

Anyone got any clues?
Does the code editor do something non-obvious that makes this all work right when it opens a shell?
Or is there something additional I could try?

[I don't really care about what the code editor does - just about being able to get my app working, perhaps by doing the same as the codeEditor.]
(For now, I'm working around it by doing

    s = s.encode('ascii', 'replace')

which simply replaces all the odd characters by '?'s - OK for the short term, but I do need to figure out a better answer.)

--
Alex Tweedly http://www.tweedly.net |
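A minimal sketch of what the workaround does, and of two other standard error handlers that lose less information than 'replace'. It is written in Python 3 syntax, which is an assumption on my part; in the Python 2.4 session above the literal would be u'a\u2019s' and the results would be plain str objects.

```python
# Python 3 syntax (assumption); in Python 2.4 the literal would be u'a\u2019s'.
s = 'a\u2019s'  # 'a', RIGHT SINGLE QUOTATION MARK, 's'

# 'replace' substitutes '?' for anything the target codec cannot represent.
replaced = s.encode('ascii', 'replace')

# 'backslashreplace' keeps the code point visible as an escape sequence.
escaped = s.encode('ascii', 'backslashreplace')

# 'xmlcharrefreplace' emits a numeric character reference instead.
xml_ref = s.encode('ascii', 'xmlcharrefreplace')

# The results are bytes objects, so printing them is safe on any console.
print(replaced)
print(escaped)
print(xml_ref)
```

Which handler is appropriate depends on where the text ends up; the character reference form is probably only useful if the output is HTML or XML.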
From: Kenneth P. <ken...@ce...> - 2006-02-02 03:41:24
|
On Thu, Feb 02, 2006 at 03:10:23AM +0000, Alex Tweedly wrote:
> If I start up the Python interpreter (i.e. open a DOS shell box, and
> type "python") I get my Python interpreter. I then type in the following
> two lines, and get an error :
>
> >C:\Documents and Settings\Eleane>python
> >Python 2.4.1 (#65, Mar 30 2005, 09:13:57) [MSC v.1310 32 bit (Intel)]
> >on win32
> >Type "help", "copyright", "credits" or "license" for more information.
> >>>> s = u'a\u2019s'
> >>>> print s
> >Traceback (most recent call last):
> >  File "<stdin>", line 1, in ?
> >  File "C:\Python24\lib\encodings\cp850.py", line 18, in encode
> >    return codecs.charmap_encode(input,errors,encoding_map)
> >UnicodeEncodeError: 'charmap' codec can't encode character u'\u2019'
> >in position
> > 1: character maps to <undefined>
> >>>>
>
> If instead I start the PythonCard codeEditor, and start a Shell (F5),
> and type the same two lines, it works properly.
>
> I tried the basic python interpreter adding the imports that are visible
> within the codeEditor shell, but still get the same problem.

Unicode in Python drives me batty. Every time I think I understand what's going on, something else strange happens, and I begin to doubt my comprehension. So, it's not just you. :)

I get this same behavior on Linux (python 2.3.5). On that platform, I suspect (but haven't had time to prove) that the problem is locale-related. However, even if I figure it out there, I'm not sure how that solution would translate to other platforms like Windows.

I suggest writing python-devel with "unicode" in the subject somewhere. Posts with that subject often get quite excellent replies from Martin v. Löwis, who has in fact helped me out several times.

KEN

--
Kenneth J. Pronovici <pro...@ie...>
http://www.cedar-solutions.com/ |
From: Kenneth P. <ken...@ce...> - 2006-02-02 03:46:27
|
On Wed, Feb 01, 2006 at 09:41:18PM -0600, Kenneth Pronovici wrote:
> On Thu, Feb 02, 2006 at 03:10:23AM +0000, Alex Tweedly wrote:
> > If I start up the Python interpreter (i.e. open a DOS shell box, and
> > type "python") I get my Python interpreter. I then type in the following
> > two lines, and get an error :
> >
> > >C:\Documents and Settings\Eleane>python
> > >Python 2.4.1 (#65, Mar 30 2005, 09:13:57) [MSC v.1310 32 bit (Intel)]
> > >on win32
> > >Type "help", "copyright", "credits" or "license" for more information.
> > >>>> s = u'a\u2019s'
> > >>>> print s
> > >Traceback (most recent call last):
> > >  File "<stdin>", line 1, in ?
> > >  File "C:\Python24\lib\encodings\cp850.py", line 18, in encode
> > >    return codecs.charmap_encode(input,errors,encoding_map)
> > >UnicodeEncodeError: 'charmap' codec can't encode character u'\u2019'
> > >in position
> > > 1: character maps to <undefined>
> > >>>>
> >
> > If instead I start the PythonCard codeEditor, and start a Shell (F5),
> > and type the same two lines, it works properly.
> >
> > I tried the basic python interpreter adding the imports that are visible
> > within the codeEditor shell, but still get the same problem.
>
> Unicode in Python drives me batty. Every time I think I understand
> what's going on, something else strange happens, and I begin to doubt my
> comprehension. So, it's not just you. :)
>
> I get this same behavior on Linux (python 2.3.5). On that platform,
> I suspect (but haven't had time to prove) that the problem is
> locale-related. However, even if I figure it out there, I'm not sure
> how that solution would translate to other platforms like Windows.
>
> I suggest writing python-devel with "unicode" in the subject somewhere.
> Posts with that subject often get quite excellent replies from Martin v.
> Löwis, who has in fact helped me out several times.

Ugh, I have Debian on the brain. I meant <pyt...@py...>, not python-devel.

KEN

--
Kenneth J. Pronovici <pro...@ie...>
http://www.cedar-solutions.com/ |
From: bartek w. <ba...@re...> - 2006-02-02 10:07:11
|
Citing Alex Tweedly <al...@tw...>:

> If I start up the Python interpreter (i.e. open a DOS shell box, and
> type "python") I get my Python interpreter. I then type in the following
> two lines, and get an error :
>
> > C:\Documents and Settings\Eleane>python
> > Python 2.4.1 (#65, Mar 30 2005, 09:13:57) [MSC v.1310 32 bit (Intel)]
> > on win32
> > Type "help", "copyright", "credits" or "license" for more information.
> > >>> s = u'a\u2019s'
> > >>> print s
> > Traceback (most recent call last):
> >   File "<stdin>", line 1, in ?
> >   File "C:\Python24\lib\encodings\cp850.py", line 18, in encode
> >     return codecs.charmap_encode(input,errors,encoding_map)
> > UnicodeEncodeError: 'charmap' codec can't encode character u'\u2019'
> > in position
> > 1: character maps to <undefined>
> > >>>
>
> If instead I start the PythonCard codeEditor, and start a Shell (F5),
> and type the same two lines, it works properly.
>
> I tried the basic python interpreter adding the imports that are visible
> within the codeEditor shell, but still get the same problem.
>
> > C:\Documents and Settings\Eleane>python
> > Python 2.4.1 (#65, Mar 30 2005, 09:13:57) [MSC v.1310 32 bit (Intel)]
> > on win32
> > Type "help", "copyright", "credits" or "license" for more information.
> > >>> import os
> > >>> import sys
> > >>> import wx
> > >>> from PythonCard import dialog, util
> > >>> s = u'a\u2019s'
> > >>> print s
> > Traceback (most recent call last):
> >   File "<stdin>", line 1, in ?
> >   File "C:\Python24\lib\encodings\cp850.py", line 18, in encode
> >     return codecs.charmap_encode(input,errors,encoding_map)
> > UnicodeEncodeError: 'charmap' codec can't encode character u'\u2019'
> > in position
> > 1: character maps to <undefined>
> > >>>
>
> Anyone got any clues ?
> Does the code editor do something non-obvious that makes this all work
> right when it opens a shell ?
> Or is there something additional I could try ?
>
> [I don't really care about what the code editor does - just about being
> able to get my app working, perhaps by doing the same as the codeEditor.]
>
> (for now, I'm working around it by doing
>     s = s.encode('ascii', 'replace')
> which simply replaces all the odd characters by '?'s - ok for the short
> term, but I do need to figure out a better answer).

This happens because, when you "print" a unicode string u, Python encodes it implicitly. PEP 100 describes this as u.encode(sys.getdefaultencoding()); when stdout is a console, Python 2.3 and later use sys.stdout.encoding instead, which is why cp850 shows up in your traceback.

The "recommended way" is to always use u.encode("something") when you print unicode, but you can look at these posts:

http://faassen.n--tree.net/blog/view/weblog/2005/08/02/0
http://www.pycs.net/users/0000323/stories/14.html

to find out how to change the default encoding and why it is considered harmful.

Hope that helps
Bartek |
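The advice to encode explicitly before printing could be sketched like this. The helper name encode_for_stream and its fallback and errors defaults are hypothetical, chosen only for illustration, and the code uses Python 3 syntax (an assumption; under Python 2.4 the literal would be u'a\u2019s').

```python
import sys


def encode_for_stream(text, stream, fallback='ascii', errors='replace'):
    """Encode text explicitly for a stream instead of relying on print.

    Hypothetical helper: it uses the stream's declared encoding when there
    is one, otherwise the fallback, and the chosen error handler means it
    does not raise on characters the codec cannot represent.
    """
    encoding = getattr(stream, 'encoding', None) or fallback
    return text.encode(encoding, errors)


s = 'a\u2019s'
data = encode_for_stream(s, sys.stdout)

# repr() of a bytes object is plain ASCII, so this line cannot itself fail.
print(repr(data))
```

The point of the design is that the choice of encoding and of what happens to unencodable characters is made in one visible place, rather than changing the interpreter-wide default.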
From: Alex T. <al...@tw...> - 2006-02-02 12:09:48
|
|
From: Thomas H. <th...@py...> - 2006-02-02 12:44:19
|
Alex Tweedly <al...@tw...> writes:

> If I start up the Python interpreter (i.e. open a DOS shell box, and
> type "python") I get my Python interpreter. I then type in the
> following two lines, and get an error :
>
>> C:\Documents and Settings\Eleane>python
>> Python 2.4.1 (#65, Mar 30 2005, 09:13:57) [MSC v.1310 32 bit
>> (Intel)] on win32
>> Type "help", "copyright", "credits" or "license" for more information.
>> >>> s = u'a\u2019s'
>> >>> print s
>> Traceback (most recent call last):
>>   File "<stdin>", line 1, in ?
>>   File "C:\Python24\lib\encodings\cp850.py", line 18, in encode
>>     return codecs.charmap_encode(input,errors,encoding_map)
>> UnicodeEncodeError: 'charmap' codec can't encode character u'\u2019'
>> in position
>> 1: character maps to <undefined>
>> >>>

There are unicode characters (maybe the correct term is code point, but I'm far from an expert) that cannot be converted to code page 850, which the console uses in your case (*). It is not that obvious for the apostrophe-like character you have above, but obvious for, say, Chinese or Japanese characters if you are using a Western Windows.

> If instead I start the PythonCard codeEditor, and start a Shell (F5),
> and type the same two lines, it works properly.

Is that a graphical shell? If so, it probably understands unicode (obviously).

(*) The Windows XP console, at least, is able to handle unicode, but afaik Python uses the ANSI APIs to write to stdout.

Thomas |
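To see which characters a given code page can represent, a small check along these lines may help. The helper can_encode and the sample characters are illustrative choices of mine, not something from the thread, and the code uses Python 3 syntax (an assumption).

```python
def can_encode(ch, encoding):
    """Return True if the character is representable in the given codec."""
    try:
        ch.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# Illustrative samples: a Latin letter, the character from the traceback,
# and one CJK ideograph.
samples = {
    'e-acute': '\u00e9',
    'right single quote': '\u2019',
    'CJK ideograph': '\u4e2d',
}

for name, ch in sorted(samples.items()):
    # Only ASCII names and booleans are printed, so this is console-safe.
    print(name, can_encode(ch, 'cp850'), can_encode(ch, 'cp1252'))
```

The right single quotation mark is missing from cp850 but present in cp1252, which matches the traceback above: the string itself is fine, only the console's code page cannot represent it.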
From: Brian M. <mrb...@gm...> - 2006-02-02 16:36:41
|
You are successfully loading the unicode u'a\u2019s' into the variable s, with no problem other than displaying it in an old-fashioned shell. The best I am able to do in my Emacs Python shell is

    >>> s.encode('cp1252')
    'a\x92s'
    >>> print s.encode('cp1252')
    a\222s

So, I don't get any error with CP-1252. I think if you want the quote mark to appear as a quote mark, you will have to work in a more modern GUI window, such as the Code Editor Shell.

The article "Unicode in Python" by Jason Orendorff helped me understand a similar DOS window Python unicode problem.

http://www.jorendorff.com/articles/unicode/python.html

Search the article for the word "unfortunately" and read from that point. He concludes with "So in general it is not possible to determine what encoding to use with print. It is therefore better to send Unicode output to files or Unicode-aware GUIs, not to sys.stdout." I have not found anything to contradict this conclusion.

I found that article through the links collected in another good one, "Unicode Secrets" by Uche Ogbuji, May 18, 2005:

http://www.xml.com/pub/a/2005/05/18/unicode.html

Good luck |
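Sending the text to a file with an explicit encoding, as the quoted conclusion suggests, might look like the sketch below. The temporary path and the choice of UTF-8 are assumptions made for illustration; io.open is used here, whereas on Python 2.4 codecs.open would be the closest equivalent.

```python
import io
import os
import tempfile

s = 'a\u2019s'  # Python 3 syntax (assumption); u'a\u2019s' in Python 2

# Hypothetical output location; any writable path would do.
path = os.path.join(tempfile.mkdtemp(), 'out.txt')

# The file object encodes on write, so no console code page is involved.
with io.open(path, 'w', encoding='utf-8') as f:
    f.write(s)

# Read the raw bytes back to see what was actually stored.
with io.open(path, 'rb') as f:
    raw = f.read()

# Decoding with the same encoding recovers the original string.
with io.open(path, 'r', encoding='utf-8') as f:
    round_tripped = f.read()

print(repr(raw))
print(round_tripped == s)
```

UTF-8 can represent every Unicode character, so unlike a console code page it never forces a choice between an error and a '?'.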