As noted at https://sourceforge.net/p/docutils/patches/186/?page=1#897a/547e/ef2d by @milde,
we should open a new ticket for the command line tool review.
This is a tracker issue for this, and to allow discussion.
I'll briefly re-outline my argument to (eventually) drop the rst*
front-end tools, and only export docutils-cli (or python -m docutils).
I think a single front-end tool significantly simplifies a lot of things -- the docutils-cli wrapper is not complex, which gives it significant points in favour in my book.
Most usage of Docutils today is programmatic, and not via the command line tools (see the table at the bottom of this post - it shows all the projects that have a full dependency on Docutils with over 500k downloads in the last month. Of those 8, none use the command line tools)
I also suspect (although the data does not exist) that most command line uses of the Docutils tools will be rst2html(5). This is already the default in docutils-cli, so it is a drop-in replacement.
...
My proposal isn't to remove them [the rst2 front-end tools] with no recourse, but to deprecate over a period of time, clearly marking identical drop-in commands at runtime to affected users. ... We cannot know how many people would be affected with local random scripts, but it is a two-second change.
Many users will also run with old or pinned versions of Docutils, and part of updating is seeing the changelog. If Debian or other redistributors already make changes, they could decide to keep shell aliases from rst2* to the new docutils-cli based invocations.
(quotes taken from https://sourceforge.net/p/docutils/patches/186/#897a and https://sourceforge.net/p/docutils/patches/186/?page=1#897a/547e )
A
working on the commandline means typing less by using completion
if i type rstTABTAB the list of all writers shows up
if there is only docutils-cli i have to read the documentation.
if there happens to be a new writer with rst.... i will be notified by the completion result,
if there is only docutils-cli i have to read the documentation.
of course new readers won't show up for rstTABTAB
I think this partly speaks to the issue -- the tab completion functionality only works "by accident", and doesn't support readers/parsers.
It seems it might be possible to add custom bash autocomplete rules (https://caliban.org/bash/#completion) -- would this be an acceptable workaround?
A
While I am in favour of revising and updating Docutils' command line
entry points, I don't think we should drop the number down to one.
Can you elaborate a bit on what would become a lot simpler?
The generic front-end is one order of magnitude more complex because of
the two-stage command line parsing with the set of valid tags depending
on the components selected.
Even help output depends on the "component" tags. Due to the open nature
(allowing for plug-in components), a man page will always need to refer
to external documentation, while, e.g.,
man rst2html lists all available command line options (at least on Debian).
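To make the two-stage parsing mentioned above concrete, here is a minimal sketch using only the standard library's argparse. It is not how docutils-cli is implemented: the WRITER_OPTIONS table, the option names in it and the programme name "frontend-sketch" are assumptions made up for illustration. The first stage only looks for the component selection, the second stage builds the full parser, so the help output depends on the selected component.

```python
import argparse

# Hypothetical per-writer options, invented for this sketch.  Real Docutils
# components declare their options through ``settings_spec`` instead.
WRITER_OPTIONS = {
    "html5": [("--stylesheet", "CSS file to link or embed")],
    "latex": [("--documentclass", "LaTeX document class")],
}


def build_parser(argv):
    """Return a parser whose options depend on the ``--writer`` in *argv*."""
    # Stage 1: look only at the component selection, ignore everything else
    # (including --help, which is why add_help is switched off here).
    stage1 = argparse.ArgumentParser(add_help=False)
    stage1.add_argument("--writer", choices=sorted(WRITER_OPTIONS),
                        default="html5")
    selected, _unparsed = stage1.parse_known_args(argv)

    # Stage 2: the full parser inherits --writer and adds the options of the
    # selected component, so the --help text differs per writer.
    stage2 = argparse.ArgumentParser(prog="frontend-sketch", parents=[stage1])
    for flag, help_text in WRITER_OPTIONS[selected.writer]:
        stage2.add_argument(flag, help=help_text)
    return stage2


def parse(argv):
    """Parse *argv* with the parser matching the selected writer."""
    return build_parser(argv).parse_args(argv)
```

With this sketch, parse(["--writer", "latex", "--documentclass", "article"]) is accepted, while the same option is rejected when the default writer is in effect, because its parser never learned about it.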
We need to care for "command line users" if their number is
non-negligible -- independent of the number of users depending on the
programmatic interface.
The number of users/projects using Docutils via the command line interface
cannot be estimated by looking at Python projects.
Unfortunately, it is rather hard to find out how many non-Python projects
use "rst2html.py" in their Makefile or another form of build tool chain.
The first answer to "Explain Python entry points?" even cites Docutils as
... a great example of entry-point use: it will install something like a
half-dozen useful commands for converting Python documentation to other
formats.
(even if Docutils currently does not use the "console-scripts mechanism" to
provide cli entry points).
...
While the actual re-typing (or drag-and-drop) of the command may be that
fast, this is not the case for the complete task of finding out and
approaching the right spot where to apply the change in a complex build
chain.
A hard learned lesson from Docutils releases is to never underestimate
the number of users/project managers that don't read the changelog (nor
the announcements in the RELEASE-NOTES) yet depend on a stable Docutils
for a stable system.
I am pro change for instances where the current
naming is unfortunate or may stand in the way.
buildhtml.py is too generic, it may stand in the way.
Debian calls it rst-buildhtml. I could imagine docutils-buildhtml
or leaving it in the tools for individual installing.
docutils-cli.py is too long. This name was selected because naming
the file for the generic front end tool "docutils.py" is misleading.
With "entry points" it is possible to use docutils as front-end command
without the need for a file "docutils.py".
python3 -m docutils currently results in the error:
'docutils' is a package and cannot be directly executed
It could be made more helpful: we know that a user typing python -m ... wants
to execute a command line tool (or just wants to know more about docutils).
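For readers unfamiliar with the mechanism: python -m <package> only works if the package contains a __main__ module. The following sketch reproduces the error and the fix with a throw-away package. The name demo_pkg and the usage text it prints are placeholders, not anything Docutils ships.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def run_as_module(search_dir, name):
    """Run ``python -m name`` with *search_dir* on the module search path."""
    env = dict(os.environ, PYTHONPATH=str(search_dir))
    return subprocess.run(
        [sys.executable, "-m", name],
        env=env, capture_output=True, text=True,
    )


def demo():
    """Return the results before and after adding ``__main__.py``."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / "demo_pkg"  # hypothetical stand-in package
        pkg.mkdir()
        (pkg / "__init__.py").write_text("")
        # Without __main__.py the interpreter refuses to execute the package.
        before = run_as_module(tmp, "demo_pkg")

        # A package becomes runnable once it contains a __main__ module.
        (pkg / "__main__.py").write_text(
            "print('usage: demo_pkg [options] [source [destination]]')\n"
        )
        after = run_as_module(tmp, "demo_pkg")
    return before, after
```

In the first run the interpreter exits with an error on stderr, in the second run the text printed by __main__.py appears on stdout. A real __main__.py would of course call the generic front end instead of printing a fixed string.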
rst2 is established as the start of Docutils' front-end names for
conversion from reStructuredText to something. I would like to keep this
prefix as "ours". (After all, Docutils is the reference implementation of the
rST format.)
Ease of discovery is important. TAB completion is a powerful means here.
Additional parser or readers may add their own entry points, cf.
https://github.com/executablebooks/MyST-Parser/issues/347#issuecomment-1003717830
Rarely used and diagnostic tools may not need automatic installation into
the binary PATH. Here, it may help to diagnose which tools are installed by
pip docutils vs. OS-specific package managers.
Debian installs the following 13:
rst2html
rst2html4
rst2html5
rst2latex
rst2man
rst2odt
rst2odt_prepstyles
rst2pseudoxml
rst2s5
rst2xetex
rst2xml
rst-buildhtml
rstpep2html
Dropping the .py from rst2*.py commands may be considered.
+1 shorter and more command-like names
-1 backwards incompatible, an unknown number of users need to change their scripts.
Currently, code deep in Docutils' internals (everywhere that takes a settings_spec or uses self.settings) sort of assumes it is working as a command line programme. However, a lot of usage (programmatic, through Sphinx or other methods) relies entirely on the default values for things. By moving to a single front end I would argue it is not only a cleaner user story, but it might enable refactoring to move the CLI usages of Docutils to a higher level.
Currently we need to do awful things to subclass and patch either optparse.OptionParser or argparse.ArgumentParser. This is really unusual, and for developers coming from a more "normal" command line application, it can take a while to understand this part of the internals of Docutils.
I didn't go into detail intentionally so as not to spark a debate about these parts, but I do think (eventually) simplifying these interactions can lead to a cleaner codebase.
I don't think you can get away from this though without a combinatorial explosion of readers, writers, and parsers. Say we have two useful CLI readers (standalone/pep), three parsers (rst/recommonmark/myst), and 6 useful writers (html5/html4/latex/xetex/man/xml) that is 36 distinct front-end tools we should be providing.
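The count can be checked by enumerating the combinations. The component names are the ones listed above; the tool naming scheme is made up purely to make the combinations visible and is not a proposal.

```python
from itertools import product

READERS = ["standalone", "pep"]
PARSERS = ["rst", "recommonmark", "myst"]
WRITERS = ["html5", "html4", "latex", "xetex", "man", "xml"]

# Hypothetical naming scheme, only used to label each combination.
tools = [
    f"{parser}2{writer}-{reader}"
    for reader, parser, writer in product(READERS, PARSERS, WRITERS)
]

print(len(tools))  # 2 * 3 * 6 = 36
```

Every additional parser adds another 12 tools under this scheme, which is the growth a single generic front end avoids.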
I will admit ignorance on how man pages work. docutils-cli --writer xetex --help, though, will always give the correct help output. This is also the version we should be promoting, not least as it works cross platform (if my patch with entrypoints is merged!).
Of course -- sorry if my post came across as callous in any way towards frontend tool users. I suppose what I don't want is to be in a situation where we are not making real improvements based on hypothetical situations. It might be useful to find ways of proxying for CLI usage -- bugs filed recently with us/redistributors, usages in public archives ( https://grep.app or similar ), etc.
True. However by the above methods we can get an estimate, surely? There are a lot of people who commit random things to GitHub / GitLab / whatever!
This is why I proposed to go about it by emitting warnings during deprecation, before total removal. We also need to consider the support that this project offers -- if a downstream user has integrated Docutils into a complex tool chain and cannot maintain it, we shouldn't be responsible for that.
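As a rough illustration of what "emitting warnings during deprecation" could look like, here is a sketch of a wrapper that names the drop-in command at runtime. The function make_deprecated_front_end and its arguments are hypothetical, and the run callable stands in for whatever actually performs the conversion. FutureWarning is used because Python hides DeprecationWarning by default unless it is triggered from code in __main__, and FutureWarning is the category intended for end users of applications.

```python
import warnings


def make_deprecated_front_end(old_name, replacement, run):
    """Wrap *run* so that calling it warns and names the drop-in command.

    *old_name* and *replacement* are the command strings shown to the user;
    *run* is the callable doing the actual work (a stand-in in this sketch).
    """
    def front_end(argv=None):
        warnings.warn(
            f"{old_name} is deprecated; use `{replacement}` instead.",
            FutureWarning,
            stacklevel=2,
        )
        # The behaviour itself is unchanged, only the notice is added.
        return run(argv)
    return front_end
```

A legacy script would then be a one-liner such as make_deprecated_front_end("rst2xetex.py", "docutils-cli --writer xetex", run), which keeps working for as long as we want while telling users what to type instead.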
Fair enough -- though perhaps another route we could go down in the deprecation notices is to say "pin version XX". There is no best solution here -- all change will break someone's workflow (XKCD 1172!), but we should be working to make the upgrade path as easy as possible.
Ahh, I was under the impression that buildhtml was an internal tool for building the website. Would it be reasonable to formally retire it from public use, and recommend Sphinx as an alternative?
+1
Did you see my suggestion on using custom shell autocompletion functions? I believe that this would allow for tab completion with the reader/parser/writer flags.
If we use what I proposed in one of my changesets to reimplement the rst2 commands in terms of docutils-cli, it would be entirely possible to deprecate the rst2 commands but just keep them forever. This would also mean that the simplifications I proposed at the top of this message wouldn't be blocked (I think).
Concrete proposal:
* [...] docutils or python -m docutils where we currently reference rst2
* [...] rst2* commands in terms of docutils-cli
* [...] docutils-cli [...] .py aliases for a while)
* [...] rst2* commands, but with no removal date
A
Even with Sphinx, some features can only be customised from a
docutils.conf configuration file.
The settings_spec and document.settings are Docutils' abstraction from
the different ways of configuration (config files/command line/programmatic).
Using document.settings should be possible without too much thinking
about the actual source of the setting value.
An overview for programmatic use of the "settings" framework is given in
https://docutils.sourceforge.io/docs/api/runtime-settings.html#runtime-settings-processing-from-applications
(best read alongside pydoc3 -b output for the mentioned functions/classes).
We already have the docutils.core.publish_* "convenience functions" as
a high-level API for custom front-ends (both command-line and programmatic).
"docutils-cli" is more complex because here we want the components to be
configurable from the command line or config file. I am working on moving
the complexity to a library function that can be re-used by other
"script" entry-points in need of configurable components. This will become
an extension or addition to docutils.core.publish_cmdline().
(It may also become simpler once "optparse" is replaced with "argparse".)
Yes, indeed. Docutils has an elaborate configuration framework which
actually predates the "optparse" module. Later development of "Optik"
into "optparse" and then "argparse" implemented some of the abstractions and
enhancements offered by Docutils in a different way.
But (in contrast to developers working on the optparse->argparse
transition :) "normal" developers using the "docutils" package don't need
to care about the details here. They can use the high-level API offered
by SettingsSpec / settings and get the command line and config file
processing for free (docutils.frontend is only the "workhorse",
docutils.core is the high-level interface).
I agree that there is room for improvement in this API, but I don't think
getting rid of the simple front-ends in favour of one complex front end
will be of much help in this quest.
I suggest moving this thread of the discussion over to [bugs:#441].
However, only some of the combinations will be of common interest.
We should try to find the right balance -- IMO, both extremes are
sub-optimal.
Docutils will not include dedicated front-end tools for 3rd-party
parsers/writers/... ("pycmark2..." shall be provided by "pycmark" etc).
One idea is to have two packages at pypi: "docutils-core", say,
without dedicated front-end tools (but supporting python -m docutils)
and "docutils" providing a sensible set of front-end tools.
Another idea is to auto-install a small default set (rst2html,
rst2latex, ...) and keep a rich set in /tools so that every user may
install (copy, symlink or write alias commands in ~/.bashrc or
~/.profile) the tools they want "by hand".
rst2odt_prepstyles.py is a rarely used auxiliary script.
I propose to move it to docutils/writers/odtwriter/
(alongside the stylefile(s) it prepares).
...
That is a possibility. However, it only works with some shells (bash) so it
is not for all users.
I am just working on a way to disentangle frontend.OptionParser and
frontend.ConfigParser but this is a topic for [bugs:#441].
Related
Feature Requests: #110
Last edit: Günter Milde 2023-06-26
Note I'm not proposing getting rid of the config, just loosening the direct relationship between the CLI-parsing part of Docutils and the settings/config part of Docutils.
Hmm, perhaps we are talking at cross purposes. I'm talking about utility functions such as "take some RST and turn it into docutils nodes" (from halfway down https://github.com/sphinx-doc/sphinx/issues/8039, ignore the emotive language).
What the user probably wanted is docutils.core.publish_doctree(user_input_text).children, but it is pretty hard to know this without knowing the internals of Docutils. A function named get_nodes_from_rst (or suchlike) would be a useful helper.
There is currently a great degree of usage of random internal bits of Docutils, I think partially because these "medium level" helpers don't exist (sorry if I wasn't clear in what I meant here in the post above).
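To show how thin such a helper could be, here is a sketch built on the existing docutils.core.publish_doctree function. The name get_nodes_from_rst is only the suggestion from this thread, not an existing Docutils function, and the sketch assumes Docutils is installed.

```python
def get_nodes_from_rst(text):
    """Parse reStructuredText *text* and return the top-level doctree nodes.

    Hypothetical helper: a thin wrapper around the existing
    ``docutils.core.publish_doctree`` convenience function.
    """
    # Imported lazily so that merely defining the helper does not require
    # Docutils to be importable.
    from docutils.core import publish_doctree

    document = publish_doctree(text)
    return document.children
```

For a one-paragraph input such as "Hello *world*." this returns a list with a single paragraph node, which an extension author can then insert into another tree.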
I would challenge this, I would find this very surprising behaviour if a config file (in one of at least three places, or controlled by an environment file) populated defaults to the components being used. Given it also adds a lot of complexity, I'm not sure it is worth keeping?
The main challenge I had here was that subclasses can filter settings_spec (through filter_settings_spec). I've never seen this implemented in the way Docutils does it before -- if settings_spec tuples were treated as immutable, then it would be much easier to e.g. construct the parser object first and then use parser.add_argument as "intended".
as "intended".I'll try another analogy (why not!) . When I'm using
ffmpeg
, it is "simple" to me as the end user to know that if I want to use different input or output encodings, I just pass the relevant flag. All I need to learn is the name of the base command, and that I pass the codec I want to-c:a
and-c:v
. In this way it is "simpler" to remember and use as the number of commands goes up (and allows using aliases, which the per-format tools don't).The implementation might be somewhat more complex (although I would argue not much), but end-user simplicity is what counts.
If you're not convinced I'll drop the issue for now, but I do think it would be good to at least unify the back-end implementations of the front-end tools.
I don't think this is a good idea -- it increases confusion as there are two packages, but the "core" maintains all the complexity of needing to parse CLI stuff. Maybe later, if the core (or CLI) become more distinct.
+1
Will reply on 441 for the 441 things.
A
I agree that it would be an improvement to implement config-file
processing without dependency on "optparse" or "argparse" (cf. [bugs:#441]).
Well documented utility functions are helpful, too.
However, getting rid of the rst* front-end tools does not simplify this task:
it does not matter if docutils.core.publish_cmdline() is called by one
or several command line front-end scripts.
For end-user convenience, I see benefits in both, a generic, flexible CLI
and simple scripts for the common tasks (rst2html, rst2latex, ...).
Proposal:
* Keep the "*.py" scripts in tools/ for backwards compatibility and as
  examples for users wanting to create their own front-ends.
* Use "entry points" [patches:#186] to install front-end scripts in
  the binary PATH:
  * docutils: generic front end
    (as "docutils-cli.py" is not installed in 0.18 [bugs:#447],
    we can change to a shorter name already in 0.19).
  * rst2*: drop the .py suffix (after a transition period).
* Eventually stop installing rarely used tools.
Related
Bugs: #447
Feature Requests: #110
Patches: #186
Last edit: Günter Milde 2022-12-01
I agree with the sentiment here.
Sounds good, I believe that setting entry_points and scripts will install both, allowing for the transition period.
A
For the user, the component settings (reader, parser, writer) are
handled similarly to all other settings: the "factory default" can be
customized either in a configuration file or on the command line:
Pro
Simple command call with user preferences set in a config file,
no need to type --writer=myfavourite with every call.
Consistent handling of settings.
Con
Requires 2-stage parsing of the config files.
(The command line must be parsed twice either way.)
Last edit: Günter Milde 2022-05-12
I don't think two stage parsing is that much of a downside, and the implementation of this on patches#186 seems reasonable, so I withdraw my objection.
A
Proposal
* [...] docutils (done in 0.19).
* [...] rst2*.py scripts to rst2* entry points for 0.21 or later.
* [...] docutils --writer=* as stable alternative to rst2*.py
  (documented in RELEASE-NOTES since Docutils 0.20).
* [...] tools/ directory of the repository and source package.
Rationale:
* [...] rst2html and rst2html.py in the binary path would complicate command line use,
The attached patch provides a set of functions that are required for the rst2* entry points.
It could go into Docutils 0.20.
Commit [r9408] implemented the switch from installing rst2*.py scripts into the binary PATH to "console-scripts" entry point definitions.
"Console-scripts" entry points are now:
* docutils: the generic front end
* rst2* (without extension .py): specific end-user applications.
The corresponding scripts in tools/ are kept in the repository and source distribution as examples for custom front-ends and possible use by distribution packagers.
This should also clear the way to replace "setup.py" with a TOML config file [patches:#186].
Related
Commit: [r9408]
Patches:
#186