From: <mi...@us...> - 2022-07-04 21:06:32
Revision: 9098
http://sourceforge.net/p/docutils/code/9098
Author: milde
Date: 2022-07-04 21:06:30 +0000 (Mon, 04 Jul 2022)
Log Message:
-----------
Don't use Python's default encoding if the "input_encoding" setting is None.

If the "input_encoding" setting is None, the encoding of an
input source should be auto-detected with a heuristic.
Under Python 3, auto-detection was only used if reading a file
with Python's default encoding failed.

Resulting problems:

* EncodingWarning (PEP 597) in Python >= 3.10.
* Optional BOM not dropped from UTF-8 encoded files.
* If the user's locale sets an 8-bit encoding,
  files in other 8-bit encodings, UTF-8, and UTF-16 show
  character mix-up (mojibake). The self-declared encoding is ignored
  because reading in an 8-bit encoding does not lead to errors.
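
The BOM and mojibake problems can be reproduced without Docutils. The following is a minimal sketch using only the standard `bytes.decode()` method; the sample bytes are made up for illustration and stand in for the content of a UTF-8 file that starts with the optional BOM.

```python
# Bytes of a UTF-8 encoded text ("Grüße") preceded by the optional BOM.
# The sample content is an assumption chosen for illustration.
data = b'\xef\xbb\xbfGr\xc3\xbc\xc3\x9fe'

# Plain "utf-8" keeps the BOM as a leading U+FEFF character.
as_utf8 = data.decode('utf-8')

# "utf-8-sig" drops a leading BOM if there is one.
as_utf8_sig = data.decode('utf-8-sig')

# An 8-bit codec such as Latin-1 accepts every byte value, so decoding
# never fails and the result is silently wrong (mojibake).
as_latin1 = data.decode('latin-1')

print(repr(as_utf8))
print(repr(as_utf8_sig))
print(repr(as_latin1))
```

Because the 8-bit decode raises no error, a "try the default encoding first, auto-detect on failure" strategy never reaches the auto-detection step in that case.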
Modified Paths:
--------------
trunk/docutils/HISTORY.txt
trunk/docutils/docutils/io.py
Modified: trunk/docutils/HISTORY.txt
===================================================================
--- trunk/docutils/HISTORY.txt 2022-07-04 21:06:22 UTC (rev 9097)
+++ trunk/docutils/HISTORY.txt 2022-07-04 21:06:30 UTC (rev 9098)
@@ -44,6 +44,7 @@
- New function `error_string()`
obsoletes `utils.error_reporting.ErrorString`.
- Class `ErrorOutput` moved here from `utils/error_reporting`.
+ - Don't use on Pythons default encoding if "input_encoding" setting is None.
* docutils/parsers/__init__.py
Modified: trunk/docutils/docutils/io.py
===================================================================
--- trunk/docutils/docutils/io.py 2022-07-04 21:06:22 UTC (rev 9097)
+++ trunk/docutils/docutils/io.py 2022-07-04 21:06:30 UTC (rev 9098)
@@ -358,7 +358,7 @@
if source_path:
try:
self.source = open(source_path, mode,
- encoding=self.encoding,
+ encoding=self.encoding or 'utf-8-sig',
errors=self.error_handler)
except OSError as error:
raise InputError(error.errno, error.strerror, source_path)
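
The `encoding or 'utf-8-sig'` fallback in the hunk above can be tried in isolation. This is only a sketch: `open_source()` and `read_demo()` are hypothetical helpers written for this example, not part of `docutils.io`, and the temporary file stands in for a real input source.

```python
import os
import tempfile


def open_source(source_path, encoding=None, errors='strict'):
    """Hypothetical helper mirroring the fallback shown in the diff.

    If no encoding is given, open with "utf-8-sig" instead of
    letting Python pick the locale's default encoding.
    """
    return open(source_path, 'r',
                encoding=encoding or 'utf-8-sig',
                errors=errors)


def read_demo():
    """Write a BOM-prefixed UTF-8 file and read it back two ways."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, 'bom.txt')
        with open(path, 'wb') as f:
            f.write(b'\xef\xbb\xbftitle\n')
        # No encoding given: falls back to "utf-8-sig", BOM is dropped.
        with open_source(path) as f:
            default = f.read()
        # An explicit encoding is passed through unchanged.
        with open_source(path, encoding='latin-1') as f:
            explicit = f.read()
    return default, explicit


default_text, explicit_text = read_demo()
print(repr(default_text))
print(repr(explicit_text))
```

Passing an explicit encoding to `open()` also avoids the EncodingWarning from PEP 597, which is only emitted when the `encoding` argument is omitted or None.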