From: Schimon J. <sc...@fe...> - 2025-07-25 08:29:36
|
Good day.

Is it feasible to set the routine (i.e. default) element to "span", or any other element, instead of element "p"?

>>> from docutils.core import publish_parts
>>> publish_parts(source="This an XHTML text with a `link <https://movim.eu>`_ for XMPP PubSub", writer_name="xhtml")["fragment"]
'<p>This an XHTML text with a <a class="reference external" href="https://movim.eu">link</a> for XMPP PubSub</p>\n'

Kind regards,
Schimon |
From: Guenter M. <mi...@us...> - 2025-07-26 08:07:02
|
On 2025-07-25, Schimon Jehudah via Docutils-users wrote:

> Good day.

> Is it feasible to set the routine (i.e. default) element to "span", or any
> other element, instead of element "p"?

Isn't this what XSLT is made for?

> >>> from docutils.core import publish_parts
> >>> publish_parts(source="This an XHTML text with a `link <https://movim.eu>`_ for XMPP PubSub", writer_name="xhtml")["fragment"]
> '<p>This an XHTML text with a <a class="reference external" href="https://movim.eu">link</a> for XMPP PubSub</p>\n'

When processing Docutils output with XSLT, you may consider using Docutils
native XML format as starting point. HTML is missing several features of
Docutils documents (e.g. footnotes) that must be emulated. Starting from
Docutils XML saves you from reverse engineering.

See https://docutils.sourceforge.io/docs/ref/docutils.dtd
and https://docutils.sourceforge.io/docs/ref/doctree.html.

::

  from docutils.core import publish_string
  publish_string(source="Text with link: https://example.org",
                 writer_name="xml",
                 settings_overrides={"indents": True,
                                     "output_encoding": "unicode"})

With the upcoming Docutils 0.22, you will be able to re-read the processed
Docutils XML with the "xml" parser and export to all supported formats.

If you want HTML output from Docutils, you may consider the more modern
"html5" writer. Both, the "xhtml" writer and the "html5" writer emit HTML
that is also valid XML. However the "xhtml" writer (an alias for "html4css1")
is less semantic in its output.
(On the other hand, the "xhtml" writer is more stable, because it is mainly
kept for backwards compatibility.)

Günter |
From: Schimon J. <sc...@fe...> - 2025-07-26 18:23:15
|
On Sat, 26 Jul 2025 08:06:50 -0000 (UTC)
Guenter Milde via Docutils-users <doc...@li...> wrote:

> On 2025-07-25, Schimon Jehudah via Docutils-users wrote:
> > Good day.
>
> > Is it feasible to set the routine (i.e. default) element to "span",
> > or any other element, instead of element "p"?
>
> Isn't this what XSLT is made for?
>

I meant, that the produced output of docutils utilizes the tag "p".

> > >>> from docutils.core import publish_parts
> > >>> publish_parts(source="This an XHTML text with a `link <https://movim.eu>`_ for XMPP PubSub", writer_name="xhtml")["fragment"]
> > '<p>This an XHTML text with a <a class="reference external" href="https://movim.eu">link</a> for XMPP PubSub</p>\n'
>
> When processing Docutils output with XSLT, you may consider using
> Docutils native XML format as starting point. HTML is missing several
> features of Docutils documents (e.g. footnotes) that must be
> emulated. Starting from Docutils XML saves you from reverse
> engineering.
>
> See https://docutils.sourceforge.io/docs/ref/docutils.dtd
> and https://docutils.sourceforge.io/docs/ref/doctree.html.
>
> ::
>
>   from docutils.core import publish_string
>   publish_string(source="Text with link: https://example.org",
>                  writer_name="xml",
>                  settings_overrides={"indents": True,
>                                      "output_encoding": "unicode"})
>
> With the upcoming Docutils 0.22, you will be able to re-read the
> processed Docutils XML with the "xml" parser and export to all
> supported formats.
>

This is done separately.

Docutils is utilized to convert reStructuredText to XHTML, and then
LXML is utilized to incorporate that output to XSLT.

https://git.xmpp-it.net/sch/Rivista/src/branch/main/rivista/interface/http.py
https://git.xmpp-it.net/sch/Rivista/src/branch/main/rivista/utility/rst.py
https://git.xmpp-it.net/sch/Rivista/src/branch/main/rivista/parser/xml.py

Should XML output be utilized, would links be realized with the tag "a"?

> If you want HTML output from Docutils, you may consider the more
> modern "html5" writer. Both, the "xhtml" writer and the "html5"
> writer emit HTML that is also valid XML. However the "xhtml" writer
> (an alias for "html4css1") is less semantic in its output.
> (On the other hand, the "xhtml" writer is more stable, because it is
> mainly kept for backwards compatibility.)
>
> Günter
>

I will try the option of XHTML.

Thank you,
Schimon. |
From: Guenter M. <mi...@us...> - 2025-07-27 18:58:14
|
Dear Schimon,

On 2025-07-26, Schimon Jehudah via Docutils-users wrote:
> On Sat, 26 Jul 2025 08:06:50 -0000 (UTC)
> Guenter Milde wrote:
>> On 2025-07-25, Schimon Jehudah via Docutils-users wrote:

>> > Is it feasible to set the routine (i.e. default) element to "span",
>> > or any other element, instead of element "p"?

>> Isn't this what XSLT is made for?

> I meant, that the produced output of docutils utilizes the tag "p".

The Docutils HTML writers use the HTML tag <p> for native <paragraph>
elements, because both are the basic block-level elements.
https://docutils.sourceforge.io/docs/ref/doctree.html#paragraph

Using an HTML <span> instead would mean using an HTML inline element for a
native block element and prevent paragraph separation.
This is something you may do in a custom HTML writer or as a post-processing
step with an XSLT stylesheet rule, but it seems wrong for standard HTML output.

What is your problem with paragraphs? Is it about <p> in table cells, list
items etc.?

>> When processing Docutils output with XSLT, you may consider using
>> Docutils native XML format as starting point. HTML is missing several
>> features of Docutils documents (e.g. footnotes) that must be
>> emulated. Starting from Docutils XML saves you from reverse
>> engineering.

>> See https://docutils.sourceforge.io/docs/ref/docutils.dtd
>> and https://docutils.sourceforge.io/docs/ref/doctree.html.

>> ::

>>   from docutils.core import publish_string
>>   publish_string(source="Text with link: https://example.org",
>>                  writer_name="xml",
>>                  settings_overrides={"indents": True,
>>                                      "output_encoding": "unicode"})

>> With the upcoming Docutils 0.22, you will be able to re-read the
>> processed Docutils XML with the "xml" parser and export to all
>> supported formats.

> This is done separately.

> Docutils is utilized to convert reStructuredText to XHTML, and then
> LXML is utilized to incorporate that output to XSLT.

Does this mean you don't transform docutils output *with* some XSLT
stylesheet but transform it *into* an XSLT template?

>> If you want HTML output from Docutils, you may consider the more
>> modern "html5" writer. Both, the "xhtml" writer and the "html5"
>> writer emit HTML that is also valid XML. However the "xhtml" writer
>> (an alias for "html4css1") is less semantic in its output.
>> (On the other hand, the "xhtml" writer is more stable, because it is
>> mainly kept for backwards compatibility.)

> I will try the option of XHTML.

Mind that "xhtml" and "html5" are both aliases for the "html5_polyglot"
writer. It produces HTML5 that is also valid XML.
https://docutils.sourceforge.io/docs/user/html.html#html5

For clarity and brevity, I prefer to call this writer "html5" and its output
"HTML" or "HTML5".

The term "XHTML" is a bit ambiguous, because the legacy `"html4css1" writer`__
(with aliases "html", "html4" and "xhtml10") produces `XHTML 1 Transitional`__.

__ https://docutils.sourceforge.io/docs/user/html.html#html4css1
__ https://www.w3.org/TR/xhtml1/

Regards,
Günter |
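[Editorial sketch] The post-processing route mentioned in this message (renaming <p> after Docutils has produced its output) might look roughly like the following. It is only an illustration using the Python standard library's xml.etree.ElementTree, not a Docutils feature; `rename_paragraphs` is a hypothetical helper name, and the input is the fragment quoted at the start of the thread. The caveat given above still applies: a <span> is an inline element, so paragraph separation is lost.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def rename_paragraphs(fragment: str, new_tag: str = "span") -> str:
    """Return `fragment` with every <p> element renamed to `new_tag`.

    Hypothetical helper for illustration; it is not part of Docutils.
    """
    # A fragment may hold several top-level elements, so parse it
    # inside a temporary wrapper element.
    root = ET.fromstring(f"<root>{fragment}</root>")
    for element in list(root.iter("p")):
        element.tag = new_tag
    # Serialize only the wrapper's content, dropping the wrapper itself.
    return escape(root.text or "") + "".join(
        ET.tostring(child, encoding="unicode") for child in root
    )


fragment = (
    '<p>This an XHTML text with a <a class="reference external" '
    'href="https://movim.eu">link</a> for XMPP PubSub</p>\n'
)
print(rename_paragraphs(fragment))
```

This relies on the writer output being well-formed XML, which the thread states for both HTML writers.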
From: Schimon J. <sc...@fe...> - 2025-07-28 01:15:04
|
Günter. Good day.

Thank you for your elaboration on this concern.

I have responded further.

On Sun, 27 Jul 2025 18:57:56 -0000 (UTC)
Guenter Milde via Docutils-users <doc...@li...> wrote:

> Dear Schimon,
>
> On 2025-07-26, Schimon Jehudah via Docutils-users wrote:
> > On Sat, 26 Jul 2025 08:06:50 -0000 (UTC)
> > Guenter Milde wrote:
> >> On 2025-07-25, Schimon Jehudah via Docutils-users wrote:
>
> >> > Is it feasible to set the routine (i.e. default) element to
> >> > "span", or any other element, instead of element "p"?
>
> >> Isn't this what XSLT is made for?
>
> > I meant, that the produced output of docutils utilizes the tag "p".
>
> The Docutils HTML writers use the HTML tag <p> for native
> <paragraph> elements, because both are the basic block-level
> elements.
> https://docutils.sourceforge.io/docs/ref/doctree.html#paragraph
>
> Using a HTML <span> instead would mean using a HTML inline element
> for a native block element and prevent paragraph separation.
> This is something you may do in a custom HTML writer or as a
> post-processing with an XSLT stylesheet rule but it seems wrong for
> standard HTML output.
>

Yes. I can do that. I can process it again afterwards.

> What is your problem with paragraphs? Is it about <p> in table cells,
> list items etc.?
>

Yes. I suppose so, because I attach the output to an XSLT stylesheet, as a
part of customizing XSLT stylesheets and making them as usable as so-called
"template engines" (e.g. Jinja2).

This is the raw code of the XSLT stylesheet.

  <nav id="xslt-navigation-bottom"/>

This is the processed code of the XSLT stylesheet.

  <nav id="xslt-navigation-bottom">
    <p>
      <a class="reference external" href="/about" rel="noopener noreferrer">About</a>
      <a class="reference external" href="/about/rss" rel="noopener noreferrer">Atom</a>
      <a class="reference external" href="/v" rel="noopener noreferrer">V</a>
      <a class="reference external" href="/about/xmpp" rel="noopener noreferrer">XMPP</a>
      <a class="reference external" href="/help" rel="noopener noreferrer">Help</a>
    </p>
  </nav>

The element "p" is not intended to be included.

https://journal.woodpeckersnest.eu/

So, I resorted to creating a CSS rule.

  nav p {
    all: unset;
  }

https://journal.woodpeckersnest.eu/css/stylesheet.css

As you suggested, post-processing would be a good solution.

> >> When processing Docutils output with XSLT, you may consider using
> >> Docutils native XML format as starting point. HTML is missing
> >> several features of Docutils documents (e.g. footnotes) that must
> >> be emulated. Starting from Docutils XML saves you from reverse
> >> engineering.
>
> >> See https://docutils.sourceforge.io/docs/ref/docutils.dtd
> >> and https://docutils.sourceforge.io/docs/ref/doctree.html.
>
> >> ::
>
> >>   from docutils.core import publish_string
> >>   publish_string(source="Text with link: https://example.org",
> >>                  writer_name="xml",
> >>                  settings_overrides={"indents": True,
> >>                                      "output_encoding": "unicode"})
>
> >> With the upcoming Docutils 0.22, you will be able to re-read the
> >> processed Docutils XML with the "xml" parser and export to all
> >> supported formats.
>
> > This is done separately.
>
> > Docutils is utilized to convert reStructuredText to XHTML, and then
> > LXML is utilized to incorporate that output to XSLT.
>
> Does this mean you don't transform docutils output *with* some XSLT
> stylesheet but transform it *into* an XSLT template?
>

Yes. Precisely, as I have detailed.

> >> If you want HTML output from Docutils, you may consider the more
> >> modern "html5" writer. Both, the "xhtml" writer and the "html5"
> >> writer emit HTML that is also valid XML. However the "xhtml" writer
> >> (an alias for "html4css1") is less semantic in its output.
> >> (On the other hand, the "xhtml" writer is more stable, because it
> >> is mainly kept for backwards compatibility.)
>
> > I will try the option of XHTML.
>
> Mind, that "xhtml" and "html5" are both aliases for the
> "html5_polyglot" writer. It produces HTML5 that is also valid XML.
> https://docutils.sourceforge.io/docs/user/html.html#html5
>
> For clarity and brevity, I prefer to call this writer "html5" and its
> output "HTML" or "HTML5".
>
> The term "XHTML" is a bit ambiguous, because the legacy `"html4css1"
> writer`__ (with aliases "html", "html4" and "xhtml10") produces
> `XHTML 1 Transitional`__.
>
> __ https://docutils.sourceforge.io/docs/user/html.html#html4css1
> __ https://www.w3.org/TR/xhtml1/
>

Until recently, I have utilized "html".

I will try "xhtml10" also.

I only need a valid XML output.

> Regards,
>
> Günter
>

Thank you,
Schimon |
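[Editorial sketch] As an alternative to the CSS reset described in this message, the <p> wrapper could be dropped while the fragment is attached to the target element. The following is a minimal sketch under stated assumptions: it uses the standard library's xml.etree.ElementTree rather than the LXML setup of the project (whether it ports unchanged to LXML is untested), `embed_without_paragraphs` and `_append_text` are hypothetical names, and the sample fragment is a shortened version of the navigation links shown above.

```python
import xml.etree.ElementTree as ET


def _append_text(parent: ET.Element, text) -> None:
    """Append character data at the current end of `parent`."""
    if not text:
        return
    if len(parent):
        # Text following the last child is stored in that child's tail.
        parent[-1].tail = (parent[-1].tail or "") + text
    else:
        parent.text = (parent.text or "") + text


def embed_without_paragraphs(target: ET.Element, fragment: str) -> None:
    """Attach `fragment` to `target`, hoisting the content of top-level <p>.

    Hypothetical helper for illustration; other block elements are
    appended unchanged.
    """
    wrapper = ET.fromstring(f"<root>{fragment}</root>")
    for block in wrapper:
        if block.tag != "p":
            target.append(block)
            continue
        # Move the paragraph's text and children into the target and
        # drop the <p> element itself.
        _append_text(target, block.text)
        for child in list(block):
            target.append(child)


nav = ET.fromstring('<nav id="xslt-navigation-bottom"/>')
fragment = (
    '<p><a class="reference external" href="/about">About</a> '
    '<a class="reference external" href="/help">Help</a></p>\n'
)
embed_without_paragraphs(nav, fragment)
print(ET.tostring(nav, encoding="unicode"))
```

With several paragraphs in the fragment, their contents would run together inside the target, which is the loss of paragraph separation pointed out earlier in the thread.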
From: Guenter M. <mi...@us...> - 2025-07-28 09:46:16
|
On 2025-07-28, Schimon Jehudah wrote:
> On Sun, 27 Jul 2025 Guenter Milde wrote:
>> On 2025-07-26, Schimon Jehudah wrote:
>> > On Sat, 26 Jul 2025 Guenter Milde wrote:
>> >> On 2025-07-25, Schimon Jehudah wrote:

>> >> > Is it feasible to set the routine (i.e. default) element to
>> >> > "span", or any other element, instead of element "p"?

...

>> This is something you may do in a custom HTML writer or as a
>> post-processing with an XSLT stylesheet rule but it seems wrong for
>> standard HTML output.

> Yes. I can do that. I can process it again afterwards.

>> What is your problem with paragraphs? Is it about <p> in table cells,
>> list items etc.?

> Yes. I suppose, because I attach the output to an XSLT stylesheet, as
> a part of customizing XSLT stylesheets and make them as usable as so
> called "template engines" (e.g. Jinja2).

> This is the raw code of the XSLT stylesheet.

> <nav id="xslt-navigation-bottom"/>

> This is the processed code of the XSLT stylesheet.

> <nav id="xslt-navigation-bottom">
>   <p>
>     <a class="reference external" href="/about" rel="noopener noreferrer">About</a>
>     <a class="reference external" href="/about/rss" rel="noopener noreferrer">Atom</a>
>     <a class="reference external" href="/v" rel="noopener noreferrer">V</a>
>     <a class="reference external" href="/about/xmpp" rel="noopener noreferrer">XMPP</a>
>     <a class="reference external" href="/help" rel="noopener noreferrer">Help</a>
>   </p>
> </nav>

> The element "p" is not intended to be included.

> https://journal.woodpeckersnest.eu/

> So, I resorted to create a CSS rule.

> nav p {
>   all: unset;
> }

> https://journal.woodpeckersnest.eu/css/stylesheet.css

> As you suggested, post-processing would be a good solution.

This could be merged with the post processing to wrap text spans.
After all, the output of Docutils HTML writers is valid XML but no XSLT ;)

...

>> Mind, that "xhtml" and "html5" are both aliases for the
>> "html5_polyglot" writer. It produces HTML5 that is also valid XML.
>> https://docutils.sourceforge.io/docs/user/html.html#html5

>> For clarity and brevity, I prefer to call this writer "html5" and its
>> output "HTML" or "HTML5".

>> The term "XHTML" is a bit ambiguous, because the legacy `"html4css1"
>> writer`__ (with aliases "html", "html4" and "xhtml10") produces
>> `XHTML 1 Transitional`__.

>> __ https://docutils.sourceforge.io/docs/user/html.html#html4css1
>> __ https://www.w3.org/TR/xhtml1/

> Until recently, I have utilized "html".

> I will try "xhtml10" also.

Using the Docutils writer name "html" or "xhtml10" will make no difference;
both refer to the legacy "html4css1" writer.

> I only need a valid XML output.

Using the writer names "html5" or "xhtml" will also result in valid XML
output. The difference is a more semantic output (more CSS styling instead
of hard-coded layout) and better support for advanced features like SVG
images, math formulas, and video images.

Regards,
Günter |
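[Editorial sketch] The writer names discussed in this thread can be summarized as a small lookup table. The mapping below is copied from the statements made in these messages, not read from Docutils itself, and aliases may change between Docutils releases; `WRITER_ALIASES` and `resolve_writer_name` are hypothetical names used only for this illustration.

```python
# Alias table as described in this thread (an assumption about the
# Docutils version under discussion, not queried from the library).
WRITER_ALIASES = {
    "html": "html4css1",
    "html4": "html4css1",
    "xhtml10": "html4css1",
    "html5": "html5_polyglot",
    "xhtml": "html5_polyglot",
}


def resolve_writer_name(name: str) -> str:
    """Return the writer an alias refers to, or the name itself."""
    key = name.lower()
    return WRITER_ALIASES.get(key, key)


for alias in ("html", "xhtml10", "xhtml", "html5"):
    print(f"{alias!r} -> {resolve_writer_name(alias)!r}")
```

Read this way, switching from "html" to "xhtml10" changes nothing, while "xhtml" selects the same writer as "html5".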
From: Schimon J. <sc...@fe...> - 2025-07-28 12:45:00
|
Günter. Good afternoon.

On Mon, 28 Jul 2025 09:45:57 -0000 (UTC)
Guenter Milde via Docutils-users <doc...@li...> wrote:

> On 2025-07-28, Schimon Jehudah wrote:
> > On Sun, 27 Jul 2025 Guenter Milde wrote:
> >> On 2025-07-26, Schimon Jehudah wrote:
> >> > On Sat, 26 Jul 2025 Guenter Milde wrote:
> >> >> On 2025-07-25, Schimon Jehudah wrote:
>
> >> >> > Is it feasible to set the routine (i.e. default) element to
> >> >> > "span", or any other element, instead of element "p"?
>
> ...
>
> >> This is something you may do in a custom HTML writer or as a
> >> post-processing with an XSLT stylesheet rule but it seems wrong for
> >> standard HTML output.
>
> > Yes. I can do that. I can process it again afterwards.
>
> >> What is your problem with paragraphs? Is it about <p> in table
> >> cells, list items etc.?
>
> > Yes. I suppose, because I attach the output to an XSLT stylesheet,
> > as a part of customizing XSLT stylesheets and make them as usable
> > as so called "template engines" (e.g. Jinja2).
>
> > This is the raw code of the XSLT stylesheet.
>
> > <nav id="xslt-navigation-bottom"/>
>
> > This is the processed code of the XSLT stylesheet.
>
> > <nav id="xslt-navigation-bottom">
> >   <p>
> >     <a class="reference external" href="/about" rel="noopener noreferrer">About</a>
> >     <a class="reference external" href="/about/rss" rel="noopener noreferrer">Atom</a>
> >     <a class="reference external" href="/v" rel="noopener noreferrer">V</a>
> >     <a class="reference external" href="/about/xmpp" rel="noopener noreferrer">XMPP</a>
> >     <a class="reference external" href="/help" rel="noopener noreferrer">Help</a>
> >   </p>
> > </nav>
>
> > The element "p" is not intended to be included.
>
> > https://journal.woodpeckersnest.eu/
>
> > So, I resorted to create a CSS rule.
>
> > nav p {
> >   all: unset;
> > }
>
> > https://journal.woodpeckersnest.eu/css/stylesheet.css
>
> > As you suggested, post-processing would be a good solution.
>
> This could be merged with the post processing to wrap text spans.
> After all, the output of Docutils HTML writers is valid XML but no
> XSLT ;)
>

Of course. I only add that code as static code for XSLT, as a part of
its role as a templating engine.

> ...
>
> >> Mind, that "xhtml" and "html5" are both aliases for the
> >> "html5_polyglot" writer. It produces HTML5 that is also valid XML.
> >> https://docutils.sourceforge.io/docs/user/html.html#html5
>
> >> For clarity and brevity, I prefer to call this writer "html5" and
> >> its output "HTML" or "HTML5".
>
> >> The term "XHTML" is a bit ambiguous, because the legacy
> >> `"html4css1" writer`__ (with aliases "html", "html4" and
> >> "xhtml10") produces `XHTML 1 Transitional`__.
>
> >> __ https://docutils.sourceforge.io/docs/user/html.html#html4css1
> >> __ https://www.w3.org/TR/xhtml1/
>
> > Until recently, I have utilized "html".
>
> > I will try "xhtml10" also.
>
> Using the Docutils writer name "html" or "xhtml10" will make no
> difference, both refer to the legacy "html4css1" writer.
>
> > I only need a valid XML output.
>
> Using the writer names "html5" or "xhtml" will also result in valid
> XML output. The difference is a more semantic output (more CSS styling
> instead of hard-coded layout) and better support for advanced features
> like SVG images, math formulas, and video images.
>

I only need the HTML code, which is then embedded into XSLT
stylesheets.

https://git.xmpp-it.net/sch/Rivista/src/branch/main/rivista/utility/rst.py

  from docutils.core import publish_parts
  html_content_str = publish_parts(source=rst_content_str, writer_name="xhtml")
  html_content_str["fragment"]

> Regards,
>
> Günter
>

Best,
Schimon |