From: Stefan M. <sm...@oe...> - 2011-04-16 09:47:41
Hi Günter!

Thanks for caring about this.

3 days ago Guenter Milde wrote:
> On 2011-04-04, Stefan Merten wrote:
>
>> The command ::
>
>>   rst2xml.py --debug --traceback --input-encoding=utf-8 --output-encoding=utf-8 --error-encoding=utf-8 umlaut.rst /dev/null
>
>> on input file `umlaut.rst` ::
>
>>   äöüÄÖÜß
>
>> crashes with a misleading error message::
>
> Actually, this is a Python bug. It should be fine with Python >= 2.6

I don't think so - at least not the last part::

  $ python -V
  Python 2.6.5
  $ svn info
  URL: svn+ssh://sm...@sv.../svnroot/repos/docutils/trunk
  ...
  Revision: 6993
  ...
  Last Changed Rev: 6993
  Last Changed Date: 2011-03-20 18:20:36 +0100 (Sun, 20 Mar 2011)

crashes. This is before your patch but with Python 2.6.5.

> and with the workaround I commited yesterday:

Trial::

  $ svn update
  ...
  Updated to revision 7012.

Also crashes::

    File "/home/stefan/lib/python/lib/docutils/docutils/statemachine.py", line 212, in run
      % (self.line_offset, '\n| '.join(self.input_lines)))
  UnicodeEncodeError: 'ascii' codec can't encode characters in position 51-57: ordinal not in range(128)

> Can you please test?

Done.

I think the problem is not in the Python problem you mentioned but in
the code at `docutils/docutils/statemachine.py:212`::

  print >>sys.stderr, (
      '\nStateMachine.run: input_lines (line_offset=%s):\n| %s'
      % (self.line_offset, '\n| '.join(self.input_lines)))

IMHO the print statement causes the problem::

  $ python
  Python 2.6.5 (r265:79063, Apr 16 2010, 13:09:56)
  [GCC 4.4.3] on linux2
  Type "help", "copyright", "credits" or "license" for more information.
  >>> import sys
  >>> print >>sys.stderr, "%s" % ( '\n| '.join(u'\xe4'), )
  Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
  UnicodeEncodeError: 'ascii' codec can't encode character u'\xe4' in position 0: ordinal not in range(128)
  >>> sys.stderr.encoding
  'ANSI_X3.4-1968'

Yes, I still have LANG=C...
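One possible way around the failing print is to escape unencodable characters before they reach the stream. The following is only a minimal sketch under stated assumptions: `safe_write` is a hypothetical helper, not an existing Docutils function, and the ASCII stream is simulated with `io.TextIOWrapper` instead of a real `sys.stderr` under LANG=C. It is written so that it also runs on Python 3, whereas the session above uses the Python 2 print statement.

```python
import io


def safe_write(stream, text, fallback_encoding="ascii"):
    """Write `text` to `stream` without raising UnicodeEncodeError.

    Hypothetical helper for illustration; not part of Docutils.
    """
    # A stream such as sys.stderr under LANG=C reports an ASCII encoding;
    # some streams report no encoding at all, hence the fallback.
    encoding = getattr(stream, "encoding", None) or fallback_encoding
    # Turn unencodable characters into backslash escapes, then decode
    # again so the stream receives text it is able to encode.
    safe = text.encode(encoding, "backslashreplace").decode(encoding)
    stream.write(safe)


# Simulate a stderr whose encoding is ASCII (assumption: stands in for LANG=C).
raw = io.BytesIO()
ascii_stream = io.TextIOWrapper(raw, encoding="ascii", newline="\n")

line = u"\xe4\xf6\xfc"

# Writing the umlauts directly fails, as in the session above.
try:
    ascii_stream.write(line)
    ascii_stream.flush()
except UnicodeEncodeError:
    direct_write_failed = True
else:
    direct_write_failed = False

# The escaped write goes through.
safe_write(ascii_stream, u"| " + line + u"\n")
ascii_stream.flush()
result = raw.getvalue()
```

The `backslashreplace` error handler keeps the debug output lossless in the sense that the original code points can still be read from the escapes; `replace` would be an alternative if plain question marks are acceptable.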
With all the logging facilities Docutils has I guess it would be
feasible to use them instead of simply printing things out to
`sys.stderr`. However, this idiom is quite common. Here are the hits
for `sys.stderr` in some core sources::

  ./docutils/core.py:238: print >>sys.stderr, '\n::: Runtime settings:'
  ./docutils/core.py:239: print >>sys.stderr, pprint.pformat(self.settings.__dict__)
  ./docutils/core.py:241: print >>sys.stderr, '\n::: Document internals:'
  ./docutils/core.py:242: print >>sys.stderr, pprint.pformat(self.document.__dict__)
  ./docutils/core.py:244: print >>sys.stderr, '\n::: Transforms applied:'
  ./docutils/core.py:245: print >>sys.stderr, (' (priority, transform class, '
  ./docutils/core.py:247: print >>sys.stderr, pprint.pformat(
  ./docutils/core.py:253: print >>sys.stderr, '\n::: Pseudo-XML:'
  ./docutils/core.py:254: print >>sys.stderr, self.document.pformat().encode(
  ./docutils/core.py:263: print >>sys.stderr, '%s: %s' % (error.__class__.__name__, error)
  ./docutils/core.py:264: print >>sys.stderr, ("""\
  ./docutils/core.py:273: print >>sys.stderr, ('Exiting due to level-%s (%s) system message.'
  ./docutils/core.py:279: sys.stderr.write(
  ./docutils/frontend.py:323: default_error_encoding = sys.stderr.encoding or 'ascii'
  ./docutils/frontend.py:713: sys.stderr.write(self.not_utf8_error % (filename, filename))
  ./docutils/utils.py:85: `None` (implies `sys.stderr`; default).
  ./docutils/utils.py:108: stream = sys.stderr
  ./docutils/utils.py:143: stream = sys.stderr
  ./docutils/statemachine.py:210: print >>sys.stderr, (
  ./docutils/statemachine.py:218: print >>sys.stderr, ('\nStateMachine.run: bof transition')
  ./docutils/statemachine.py:228: print >>sys.stderr, (
  ./docutils/statemachine.py:236: print >>sys.stderr, (
  ./docutils/statemachine.py:248: print >>sys.stderr, (
  ./docutils/statemachine.py:261: print >>sys.stderr, (
  ./docutils/statemachine.py:285: print >>sys.stderr, \
  ./docutils/statemachine.py:442: print >>sys.stderr, (
  ./docutils/statemachine.py:450: print >>sys.stderr, (
  ./docutils/statemachine.py:457: print >>sys.stderr, (
  ./docutils/statemachine.py:491: print >>sys.stderr, '%s: %s' % (type, value)
  ./docutils/statemachine.py:492: print >>sys.stderr, 'input line %s' % (self.abs_line_number())
  ./docutils/statemachine.py:493: print >>sys.stderr, ('module %s, line %s, function %s'
  ./docutils/io.py:238: print >>sys.stderr, '%s: %s' % (error.__class__.__name__,
  ./docutils/io.py:240: print >>sys.stderr, ('Unable to open source file for '
  ./docutils/io.py:330: print >>sys.stderr, '%s: %s' % (error.__class__.__name__,
  ./docutils/io.py:332: print >>sys.stderr, ('Unable to open destination file for writing'
  ./docutils/io.py:370: print >>sys.stderr, '%s: %s' % (error.__class__.__name__,
  ./docutils/io.py:372: print >>sys.stderr, ('Unable to open destination file for writing '

I guess all these places need to be fixed :-( .

Regards

Stefan