Bugs item #3481980, was opened at 2012-01-30 20:34
Message generated for change (Comment added) made by milde
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=422030&aid=3481980&group_id=38414
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Toshio Kuratomi (abadger1999)
Assigned to: Nobody/Anonymous (nobody)
Summary: docutils 0.8.1 throws unicode error on non-ascii cwd
Initial Comment:
I received the following bug report against our docutils package: https://bugzilla.redhat.com/show_bug.cgi?id=785622
Reproduced by doing the following:
mkdir /var/tmp/café
cd /var/tmp/café
touch test.rst default.css
rst2html --stylesheet-path=default.css test.rst
The problem is that on Python 2 we combine a unicode string (from command-line parsing) with a byte str (from the current working directory). I'll attach a patch that addresses this in a minimally invasive manner.
However, I think there are still latent bugs in the code: on Unix systems, filenames are byte strings in which only a few byte values are illegal. That means a user could have a current working directory where part of the path is encoded in latin-1 and part in utf-8, for instance. Converting such a path to unicode may throw an exception, or it may return a unicode string that doesn't actually represent the bytes on disk (so docutils will fail to find the file when it attempts to read it). Fixing that would require re-architecting the file handling in all of docutils, though.
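To illustrate the mixed-encoding case, here is a minimal sketch. It is written for Python 3 so that it can be run today; the path and the helper name `try_decode` are made up for illustration and are not part of docutils.

```python
# Hypothetical path: the directory name "café" is encoded as latin-1,
# the file name "naïve.rst" as UTF-8.
raw_path = b"/var/tmp/caf\xe9/na\xc3\xafve.rst"


def try_decode(path_bytes, encoding):
    """Return the decoded text, or None if the bytes are invalid in `encoding`."""
    try:
        return path_bytes.decode(encoding)
    except UnicodeDecodeError:
        return None


# Strict UTF-8 decoding fails: 0xE9 is not followed by continuation bytes.
as_utf8 = try_decode(raw_path, "utf-8")

# latin-1 decoding never fails, but it mangles the UTF-8 encoded part.
as_latin1 = try_decode(raw_path, "latin-1")

# Re-encoding the mangled text as UTF-8 no longer matches the bytes on disk.
matches_disk = as_latin1.encode("utf-8") == raw_path
```

Neither single encoding gives a string that is both decodable and faithful to the on-disk name, which is the latent problem described above.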
----------------------------------------------------------------------
>Comment By: Günter Milde (milde)
Date: 2012-02-01 15:57
Message:
The patch download link returns an error page (for me) instead of a file,
so I am pasting my version of a patch here.
The change in frontend.py fixes the reported case by using unicode strings
for the cwd.
The change in utils/__init__.py lets utils.relative_path() work for the
case source=None, target=u'unicode'.
Users with mixed encodings should try Python 3 (>= 3.1), which introduces
the "surrogateescape" error handler to deal with undecodable
bytes in paths.
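As a rough sketch of what the "surrogateescape" handler does (Python 3 only; the path is a made-up example and UTF-8 is assumed as the filesystem encoding):

```python
# Hypothetical path with a latin-1 encoded "é" (0xE9), which is invalid UTF-8.
raw_path = b"/var/tmp/caf\xe9/test.rst"

# The undecodable byte is mapped to a lone surrogate (U+DCE9)
# instead of raising UnicodeDecodeError.
text = raw_path.decode("utf-8", errors="surrogateescape")

# Encoding with the same handler restores the original bytes exactly.
restored = text.encode("utf-8", errors="surrogateescape")
```

The decoded string is not meant for display, but it round-trips to the original bytes, so the file can still be opened.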
Index: utils/__init__.py
===================================================================
--- utils/__init__.py (Revision 7326)
+++ utils/__init__.py (Arbeitskopie)
@@ -457,7 +457,8 @@
     If there is no common prefix, return the absolute path to `target`.
     """
-    source_parts = os.path.abspath(source or 'dummy_file').split(os.sep)
+    source_parts = os.path.abspath(source or type(target)('dummy_file')
+                                  ).split(os.sep)
     target_parts = os.path.abspath(target).split(os.sep)
     # Check first 2 parts because '/dir'.split('/') == ['', 'dir']:
     if source_parts[:2] != target_parts[:2]:
Index: frontend.py
===================================================================
--- frontend.py (Revision 7326)
+++ frontend.py (Arbeitskopie)
@@ -184,7 +184,7 @@
     `OptionParser.relative_path_settings`.
     """
     if base_path is None:
-        base_path = os.getcwd()
+        base_path = os.getcwdu()
     for key in keys:
         if key in pathdict:
             value = pathdict[key]
@@ -619,7 +619,7 @@
         """Store positional arguments as runtime settings."""
         values._source, values._destination = self.check_args(args)
         make_paths_absolute(values.__dict__, self.relative_path_settings,
-                            os.getcwd())
+                            os.getcwdu())
         values._config_files = self.config_files
         return values
----------------------------------------------------------------------