Re: [Docutils-develop] Release 0.20

Dear Adam,

On 2023-04-06, Adam Turner wrote:

...

> ---------------------------------------------------------------------

>>> I think we should keep the ``publish_bytes()`` function in either case.

>> I don't see a convincing use case for ``publish_bytes()`` and would
>> prefer to keep the "core" interface as small as sensible.

Rationale: 

``publish_bytes()`` makes sense alongside ``publish_str()``.

However, ``publish_str()`` and the existing ``publish_string()`` are so
close that confusion is to be expected.

> The main use(s) here would be for publishing binary formats (e.g. ODT)
> to a ``bytes`` object in memory rather than writing to disk, or for
> when call-sites use a non-unicode ``output_encoding`` setting. 

IMV, the extended ``publish_string()`` providing

    publish_string(..., auto_encode=False) --> OutString
    publish_string(..., auto_encode=True) --> bytes

with `OutString` being 100% compatible with `str` and easily convertible
to bytes via ``bytes(result)`` can cater for such needs.

(Also, it should not be too surprising that `publish_string` returns a
`bytes` instance if the user tells it to "auto_encode".)
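
To illustrate the idea, here is a minimal sketch of how such a `str`
subclass could work. It is not the code from the patch: the class body,
the ``encoding`` and ``errors`` attributes and their defaults are
assumptions made only for this example.

```python
class OutString(str):
    """Hypothetical sketch: a str that remembers how it should be encoded."""

    def __new__(cls, text, encoding="utf-8", errors="strict"):
        # str is immutable, so the value is set in __new__;
        # the extra attributes live in the instance __dict__.
        self = super().__new__(cls, text)
        self.encoding = encoding
        self.errors = errors
        return self

    def __bytes__(self):
        # bytes(instance) looks up __bytes__ first, so no explicit
        # encoding argument is needed at the call site.
        return self.encode(self.encoding, self.errors)


result = OutString("Grüße", encoding="latin-1")
as_text = result + "!"     # behaves like an ordinary str
as_bytes = bytes(result)   # encoded with the stored encoding
```

A plain `str` would raise a `TypeError` for ``bytes(result)`` without an
encoding argument; the ``__bytes__`` method is what makes the conversion
a one-liner in this sketch.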

> If it is to be removed, perhaps we could provide a recipe in the
> documentation for how to manage publishing to an in-memory byte
> sequence.

This is part of the `publish_string()` docstring in my patch::

    If `auto_encode` is True, the output is encoded according to the
    `output_encoding`_ setting; the return value is a `bytes` instance
    (unless `output_encoding`_ is "unicode",
    cf. `docutils.io.StringOutput.write()`).
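
The return-type rule described in that docstring can be sketched in a few
lines of standalone Python. The helper name ``finish_output`` and its
signature are hypothetical and exist only for this illustration; this is
not how Docutils implements it.

```python
def finish_output(text, output_encoding="utf-8", auto_encode=True):
    """Hypothetical helper mirroring the docstring's return-type rule."""
    # "unicode" is the pseudo-encoding that means "do not encode".
    if not auto_encode or output_encoding.lower() == "unicode":
        return text                       # stays a str
    return text.encode(output_encoding)   # becomes bytes


encoded = finish_output("café", output_encoding="latin-1")
unencoded = finish_output("café", output_encoding="unicode")
left_alone = finish_output("café", auto_encode=False)
```

With ``auto_encode=True`` the caller gets `bytes` unless the encoding is
"unicode"; with ``auto_encode=False`` the text is returned unchanged.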

Thanks,
Günter