From: <mi...@us...> - 2022-07-04 21:06:41
Revision: 9099
http://sourceforge.net/p/docutils/code/9099
Author: milde
Date: 2022-07-04 21:06:38 +0000 (Mon, 04 Jul 2022)
Log Message:
-----------
Fix handling of UTF-16 encoded source without trailing newline.
Decoding a UTF-16 encoded source with BOM failed after auto-detection
of the encoding.
The newline normalization in `docutils.io.FileInput.read()`
produced invalid UTF-16 because it appended a single byte
(a binary ASCII newline).
Postponing the newline normalization until after the decoding step
solves this problem.
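
The failure described above can be reproduced outside Docutils. The sketch
below is only an illustration, not the Docutils code: the helper names
`normalize_bytes`, `normalize_text` and `decodes_as_utf16` are hypothetical,
and the sample text and the little-endian byte order are assumptions made
for the demonstration.

```python
import codecs

# Sample source without a trailing newline, stored as UTF-16 (LE) with BOM.
text = "Title\n=====\nlast line"
raw = codecs.BOM_UTF16_LE + text.encode("utf-16-le")


def normalize_bytes(data: bytes) -> bytes:
    """Old approach: normalize newlines on the still-encoded bytes."""
    return b"\n".join(data.splitlines() + [b""])


def normalize_text(data: str) -> str:
    """New approach: normalize newlines on the decoded string."""
    return "\n".join(data.splitlines() + [""])


def decodes_as_utf16(data: bytes) -> bool:
    """Return True if `data` is decodable as UTF-16 (BOM-aware)."""
    try:
        data.decode("utf-16")
    except UnicodeDecodeError:
        return False
    return True


# The byte-level join appends one lone b"\n", so the data no longer
# consists of complete 2-byte code units and decoding fails.
broken = normalize_bytes(raw)
print(len(raw) % 2, len(broken) % 2)
print(decodes_as_utf16(raw), decodes_as_utf16(broken))

# Decoding first and normalizing afterwards works as intended.
print(repr(normalize_text(raw.decode("utf-16"))))
```

Normalizing after decoding is independent of the source encoding, because
the newline is appended as a character instead of a raw byte.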
Modified Paths:
--------------
trunk/docutils/HISTORY.txt
trunk/docutils/docutils/io.py
Modified: trunk/docutils/HISTORY.txt
===================================================================
--- trunk/docutils/HISTORY.txt 2022-07-04 21:06:30 UTC (rev 9098)
+++ trunk/docutils/HISTORY.txt 2022-07-04 21:06:38 UTC (rev 9099)
@@ -45,6 +45,7 @@
obsoletes `utils.error_reporting.ErrorString`.
- Class `ErrorOutput` moved here from `utils/error_reporting`.
- Don't use on Pythons default encoding if "input_encoding" setting is None.
+ - Fix error when reading of UTF-16 encoded source without trailing newline.
* docutils/parsers/__init__.py
Modified: trunk/docutils/docutils/io.py
===================================================================
--- trunk/docutils/docutils/io.py 2022-07-04 21:06:30 UTC (rev 9098)
+++ trunk/docutils/docutils/io.py 2022-07-04 21:06:38 UTC (rev 9099)
@@ -383,8 +383,6 @@
             if self.source is sys.stdin:
                 # read as binary data to circumvent auto-decoding
                 data = self.source.buffer.read()
-                # normalize newlines
-                data = b'\n'.join(data.splitlines()+[b''])
             else:
                 data = self.source.read()
         except (UnicodeError, LookupError):
@@ -393,14 +391,14 @@
                 b_source = open(self.source_path, 'rb')
                 data = b_source.read()
                 b_source.close()
-                # normalize newlines
-                data = b'\n'.join(data.splitlines()+[b''])
             else:
                 raise
         finally:
             if self.autoclose:
                 self.close()
-        return self.decode(data)
+        data = self.decode(data)
+        # normalise newlines
+        return '\n'.join(data.splitlines()+[''])
 
     def readlines(self):
         """