From: Guenter M. <mi...@us...> - 2022-06-09 22:46:00
On 2022-06-09, Adam Turner wrote:

> Using Python 3.10's ``-X warn_default_encoding`` argument to Python, we
> can see a large number of places where the default encoding is used. On
> posix systems this is now UTF-8 following PEP 538 [1], but on Windows a
> non-unicode codepage can be used.

> The attached patch fixes the majority of these instances.

Thank you for the patch.

After reading PEP 597, I agree that we should specify the intended encoding
where appropriate.

This means that for every instance of open() without an explicit encoding, we
have to decide whether to use "ascii", "utf-8", or `io.locale_encoding`
(the latter is equivalent to the value "locale" introduced in Py 3.10).

Unfortunately, in many cases the patch mixes added "encoding" arguments with
the change of "utf8" to "utf-8".

* Is there a reason to prefer 'utf-8'?
  We currently have 36 instances of 'utf8' vs. 19 instances of 'utf-8' in the
  library code and tests.
  The "codecs" documentation names "utf8" and "utf-8" as aliases for "utf_8".

* Separating the encoding name normalization from the new arguments would
  make it easier to check whether each newly specified encoding is correct.

Günter
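
For illustration, the two points above (the codec aliases and the per-call choice of encoding) can be sketched with the standard library alone. This is a minimal sketch, not code from the patch: the helpers `canonical_codec_name` and `roundtrip` are hypothetical names, and `locale.getpreferredencoding(False)` is used here only as an assumed stand-in for the locale encoding on Python versions that predate `encoding="locale"`.

```python
import codecs
import locale
import os
import tempfile


def canonical_codec_name(name):
    """Return the codec name that Python resolves an alias to (hypothetical helper)."""
    return codecs.lookup(name).name


# "utf8", "utf-8" and "utf_8" all resolve to one and the same codec.
names = {canonical_codec_name(alias) for alias in ("utf8", "utf-8", "utf_8")}


def roundtrip(text, encoding):
    """Write `text` to a temporary file and read it back, always passing
    an explicit encoding to open() (hypothetical helper)."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        with open(path, "w", encoding=encoding) as f:
            f.write(text)
        with open(path, encoding=encoding) as f:
            return f.read()
    finally:
        os.remove(path)


# Assumption: stand-in for the locale encoding; on Python 3.10+ the
# string "locale" can be passed to open() instead.
locale_encoding = locale.getpreferredencoding(False)
```

Since the aliases resolve to the same codec, the choice between 'utf8' and 'utf-8' affects only the spelling in the source, while the choice between "ascii", "utf-8" and the locale encoding changes behaviour and has to be checked call by call.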