Hello,
in Fedora we have started rebuilding Python packages with pre-releases of Python 3.11; currently that is the 2nd alpha.
Docutils doesn't build because 3.11 adds support for null characters in the csv module, which breaks a test. See the reproducer below.
import csv
from docutils.parsers.rst.directives import tables
with open('utf-16.csv', 'rb') as f: csv_data = f.read()
...
csv_data = str(csv_data, 'latin1').splitlines()
reader = csv.reader([tables.CSVTable.encode_for_csv(line + '\n') for line in csv_data])
next(reader)
Python 3.11:
['þÿ\x00"\x00T\x00r\x00e\x00a\x00t\x00"\x00', '\x00 \x00"\x00Q\x00u\x00a\x00n\x00t\x00i\x00t\x00y\x00"\x00', '\x00 \x00"\x00D\x00e\x00s\x00c\x00r\x00i\x00p\x00t\x00i\x00o\x00n\x00"\x00']
Python 3.10:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
_csv.Error: line contains NUL
The change was introduced in this commit: https://github.com/python/cpython/commit/b454e8e4df73bc73bc1a6f597431f171bfae8abd
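For anyone who wants to check the behaviour without a Docutils checkout, here is a minimal sketch that isolates the csv change. The helper name try_parse is hypothetical and this is not the code used in the Docutils test suite; it only reports whether the running interpreter accepts or rejects a NUL character in a CSV line.

```python
import csv


def try_parse(line):
    """Parse one CSV line; return ("row", fields) or ("error", message)."""
    try:
        return ("row", next(csv.reader([line])))
    except csv.Error as exc:
        # Interpreters that reject NUL characters (3.10 in the report above)
        # end up here.
        return ("error", str(exc))


# A line with an embedded NUL character, as produced when UTF-16 data
# is decoded as latin1.
kind, value = try_parse('a\x00b,c\n')
print(kind, repr(value))
```

Based on the outputs shown above, this should print an "error" result on 3.10 and a "row" result on 3.11.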
Thank you for the report.
Could you attach a full test log?
Mine breaks because reading utf-16.csv with null bytes no longer raises an error,
so there is no exception data.
Last edit: engelbert gruber 2021-11-27
Could you re-try with r8909?
I made it pass in r8910.
Reading a UTF-16 file as latin1 gives funny results.
Maybe aborting processing of the file would be better than producing garbled output.
Last edit: engelbert gruber 2021-11-27
In the test, this is intentional. In practice, UTF-16 is recognized by its BOM and decoded correctly if the "input_encoding" setting is left at its default (None).
See [feature-requests:#92].
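To illustrate the difference between the two decodings, here is a rough sketch. The function decode_with_bom is a hypothetical helper, not Docutils' actual input code, and the 'utf-8' fallback is an assumption; it only shows the general idea of choosing a codec from the byte order mark.

```python
import codecs


def decode_with_bom(data, fallback='utf-8'):
    """Decode bytes, picking the codec from a leading BOM if one is present.

    Hypothetical helper for illustration; UTF-32 BOMs are not handled.
    """
    boms = (
        (codecs.BOM_UTF8, 'utf-8-sig'),
        (codecs.BOM_UTF16_BE, 'utf-16'),
        (codecs.BOM_UTF16_LE, 'utf-16'),
    )
    for bom, encoding in boms:
        if data.startswith(bom):
            # These codecs consume the BOM and use it to pick the byte order.
            return data.decode(encoding)
    return data.decode(fallback)


# Big-endian UTF-16 sample with a BOM, similar to the start of utf-16.csv.
raw = codecs.BOM_UTF16_BE + '"Treat", "Quantity"'.encode('utf-16-be')

garbled = raw.decode('latin1')   # every byte becomes one character
clean = decode_with_bom(raw)     # BOM recognized, text decoded correctly

print(repr(garbled))
print(repr(clean))
```

The latin1 result starts with 'þÿ' followed by NUL-separated characters, like the Python 3.11 output in the report, while the BOM-based result is the plain text.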
Related
Feature Requests:
#92
Fixed in [r8910]
Related
Commit: [r8910]
Fixed in release 0.19.
Thank you for reporting and testing.