From: Günter M. <mi...@us...> - 2024-08-07 15:57:44
> The only writer that makes runtime use of the output encoding setting is LaTeX.
>
> **Output** encoding has defaulted to "utf-8" for all writers for several years. Most writers honour the "output-encoding" setting and encode the output file accordingly. So, you may use "rst2html5 --output-encoding=ASCII:xmlcharrefreplace" to get a pure ASCII file. The HTML, XML, and LaTeX writers also declare the encoding in the output file; the LaTeX writer additionally provides replacements for Unicode characters that cannot be encoded when a legacy output encoding is selected. There should be no cases of `output_encoding == None`.
>
> **Input** encoding was "auto-select", with fallbacks to utf-8 and the "locale encoding", until 0.21. After the discussion last year, the transition to utf-8 started: 0.22 uses "utf-8" as the input encoding default, and we will remove the input encoding auto-detection code in Docutils 1.0.
>
> The offending test case, "test_fallback_no_utf8()", is more trouble than help and was removed in [r9864].

---

**[bugs:#490] EncodingWarnings in io module**

**Status:** open-fixed
**Created:** Fri Jun 28, 2024 03:34 PM UTC by Jason R. Coombs
**Last Updated:** Thu Aug 01, 2024 08:24 PM UTC
**Owner:** nobody

When running the [distutils](https://github.com/pypa/distutils) tests with `PYTHONWARNDEFAULTENCODING=1`, two warnings are emitted:

```
distutils/tests/test_check.py::TestCheck::test_check_restructuredtext
  /Users/jaraco/code/pypa/distutils/.tox/py/lib/python3.12/site-packages/docutils/io.py:381: EncodingWarning: 'encoding' argument not specified
    self.source = open(source_path, mode,

distutils/tests/test_check.py::TestCheck::test_check_restructuredtext
  /Users/jaraco/code/pypa/distutils/.tox/py/lib/python3.12/site-packages/docutils/io.py:151: EncodingWarning: UTF-8 Mode affects locale.getpreferredencoding(). Consider locale.getencoding() instead.
    fallback = locale.getpreferredencoding(do_setlocale=False)
```

Docutils should honor [PEP 597](https://peps.python.org/pep-0597/) and address these warnings (and possibly others).

In my experience, adding `encoding='utf-8'` to any io operation is the best approach: it is directly compatible with the default on non-Windows systems, and honoring the Unix convention is usually suitable, if not preferable, on Windows. Not only that, but that behavior will become the default in Python 3.15 or so.
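
For illustration only, here is a minimal sketch of the two ideas discussed in this thread: passing an explicit `encoding` to `open()` as the ticket suggests, and using the standard `xmlcharrefreplace` error handler that the `ASCII:xmlcharrefreplace` output setting refers to. The helper names (`read_text_utf8`, `locale_fallback_encoding`, `to_ascii_with_charrefs`) are hypothetical and are not part of Docutils; this is not the change made in the Docutils code base.

```python
import locale
import sys


def read_text_utf8(path):
    """Read a text file with an explicit encoding.

    Passing encoding= means open() never falls back to the locale
    default, so PYTHONWARNDEFAULTENCODING=1 has nothing to warn about here.
    """
    with open(path, encoding="utf-8") as f:
        return f.read()


def locale_fallback_encoding():
    """Return the locale encoding (hypothetical helper).

    locale.getencoding() exists since Python 3.11; older versions
    only offer locale.getpreferredencoding().
    """
    if sys.version_info >= (3, 11):
        return locale.getencoding()
    return locale.getpreferredencoding(False)


def to_ascii_with_charrefs(text):
    """Encode text as pure ASCII, replacing other characters
    with XML numeric character references."""
    return text.encode("ascii", errors="xmlcharrefreplace")


if __name__ == "__main__":
    print(to_ascii_with_charrefs("Grüße"))  # b'Gr&#252;&#223;e'
    print(locale_fallback_encoding())
```

The character-reference form is only meaningful for XML or HTML output, which is why it pairs with `rst2html5` in the example above.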