From: Guenter M. <mi...@us...> - 2022-11-03 23:34:11
|
is still too fast... On 2022-11-02, Adam Turner wrote: >> Generally, if a file is created or maintained by an active member of the >> Docutils developer team, I'd prefer an "ask before change" (except for >> emergencies and small typos). > Sounds good! ... >> (In the long run, I'd like to replace the "homegrown" docutils.io classes >> with standard objects, but this is a separate topic.) r9208 ... r9211: "In the long run" is not next day! Please ask before such changes. Replacing the "FileInput/StringInput/FileOutput" classes can currently only be done in cases, where we are sure not to introduce backwards incompatibilities. For the "include" and "csv-table" directives, the changes definitely change the behaviour for the "input_encoding" default value! For the other changes, I am not sure. >>>> r9167 >> Introduces a backwards incompatible API change! (see below) > Partially reverted in [r9202] (see below) So we are half way. Please no new patches on top before this is sorted out. ... >> The naming of the `core.publish_string()` API function pre-dates Python3 >> and uses the term "string" as a superordinate for "Unicode strings" >> (`str`/`unicode`) and "encoded strings" (`bytes`/`str`). >> The documentation is explicit about the return value >> (defaulting to 'utf-8' encoded `bytes` instances). >> The function is part of the "first level API": intended for use by >> 3rd-party code. Changing the return value has the potential of many >> breaks. >> Please revert the changes to core.py and io.py. ... > I have reverted the change (thank you) but introduced a deprecation for > returning non-(unicode) strings from "publish_string". I feel this is a > reasonable step towards having a more intuitive API. IMV, the name `publish_string` uses the generic notation of a "string" as "sequence of alphanumeric text or other characters, either as a literal constant or as some kind of variable", cf. https://en.wikipedia.org/wiki/String_(computer_science). > Naming of "publish_bytes" and "BytesOutput" I am not that set on, we > can change to "BinaryXXX" if preferred. I don't think we need a new custom "BytesOutput" class when in the long run we will use standard io classes. Regarding the "core.publish_string()" function, I see three possibilities: 1. Just keep as is. This is core API behaviour, so we need a very strong indication that the advantages outweigh the hassle. 2. Add a new boolean argument: "encode". With ``publish_string(encode=False)``, the "output_encoding" setting would be ignored and the return value is a `bytes` object. With ``publish_string(encode=True)``, the behaviour is as currently, encoding with "output_encoding" or returning `bytes` or `str` object (the latter for ``output_encoding == "unicode"``). The default value could change from True to False in Docutils 2.0. 3. Deprecate¹ "publish_string()" in favour of new, separate "publish_unicode_str()" and "publish_bytes()" functions. ¹PendingDeprecationWarning now, DeprecationWarning in Docutils 1.0. >>>> r9178 >> 8 lines of doctest code seems overly verbose in the context of an >> "overview of the Docutils architecture". > I tend to agree, "core.publish_string" would be nicer, but there is no > way in Docutils to disable applying the default transforms. >> It can still be useful for development: testing the reST parser (including >> the option to generate input/output pairs for parser test scripts). >> I'd move it to tools/dev and remove the EasyDialogs part. > Done in [r9201] So now the "hacking" guide can again refer to "quicktest.py". BTW: the fact that it "has not been substantively modified since 2006" is no argument for the move (the `cat` shell command, say, has not been substantially modified for ages, too but moving it to a developers only path would be bad.) Rather, "quicktest.py" has never been installed into the "PATH" in setup.py and is a tool for developers rather than end-users. r9207 There is an established format in the HISTORY file: General changes, followed by changes listed in alphabetical order of affected file (or directory for related changes in one directory) Name patch authors only for "external" patches (where the name differs from the committer in the SVN log). This is what I found at first glance, still not convinced this is all. Günter |