From: Sidney C. <si...@ji...> - 2021-05-26 19:55:42

Hello all,

I am using docutils through Sphinx, and I am running into an issue there which I think would best be solved in the RsT parsing stage. Ideally, this could be addressed by a directive, or perhaps there is some other way that any of you can point me to. Anyway, here's the issue:

When parsing RsT, nested sections are created by the RsT parser. It adds (or changes) hierarchical levels when it encounters heading lines, and a hierarchical level stays valid up to the next heading line.

(If my superficial understanding of the RsT source is correct, this is mostly handled in the docutils.parsers.rst.states.RSTState.check_subsection() method.)

Okay. While this model is easy to understand and use, it does leave some types of reasonable document structures out of reach. In particular, if I want to "pop" a heading level to the next higher (or any higher) level without introducing a new heading, I am stuck.

As an example, suppose I have this text:

    section level 1
    ===============

    First lorem ipsum etc etc.

    Section level 2
    ---------------

    Second lorem ipsum etc etc.

    Alea iacta est!

Now in the current parser, the "Alea iacta est" paragraph will sit at section level #2. In pseudo-XML, the parse tree will be:

    <section level="1">
      <p>First lorem ipsum etc. etc.</p>
      <section level="2">
        <p>Second lorem ipsum etc. etc.</p>
        <p>Alea iacta est!</p>
      </section>
    </section>

What I want to do, however, is end the "section level 2" before "Alea iacta est", and resume parsing section level 1.

So I want my parse tree to be (again in pseudo-XML):

    <section level="1">
      <p>First lorem ipsum etc. etc.</p>
      <section level="2">
        <p>Second lorem ipsum etc. etc.</p>
      </section>
      <p>Alea iacta est!</p>
    </section>

I have a few questions....

First, is this already somehow possible? Is there some syntax that directs the parser to produce the second parse tree?

Second, if not: is there a fundamental reason why the second parse tree would be incompatible with the document model of RsT? (If that's the case, why?)

Third, if it is possible in principle, but not yet in practice, is there a way to implement this without touching the parser; say, by adding a docutils Directive? E.g. something that would work like this?

    section level 1
    ===============

    First lorem ipsum etc etc.

    Section level 2
    ---------------

    Second lorem ipsum etc etc.

    .. pop-heading-level::
       :levels: 1

    Alea iacta est!

Any insight into this will be much appreciated.

Kind regards,
Sidney
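
The nesting behaviour described in the message above can be sketched with a small toy model. This is not docutils code: the Section class, the build_tree function and the ("heading", ...) / ("para", ...) item tuples are all hypothetical names made up for illustration. It only assumes what the message states, namely that a section stays open until the next heading line.

```python
from dataclasses import dataclass, field


@dataclass
class Section:
    """One node of a toy section tree (hypothetical, not a docutils node)."""
    level: int
    title: str
    children: list = field(default_factory=list)  # paragraphs (str) and Sections


def build_tree(items):
    """Nest items the way heading-driven parsing does.

    items: ("heading", level, title) or ("para", text) tuples.
    A paragraph always lands in the most recently opened section,
    because only a heading can close a section.
    """
    root = Section(0, "document")
    stack = [root]
    for item in items:
        if item[0] == "heading":
            _, level, title = item
            # A heading closes every open section at its level or deeper.
            while stack[-1].level >= level:
                stack.pop()
            section = Section(level, title)
            stack[-1].children.append(section)
            stack.append(section)
        else:
            stack[-1].children.append(item[1])
    return root


doc = build_tree([
    ("heading", 1, "section level 1"),
    ("para", "First lorem ipsum etc etc."),
    ("heading", 2, "Section level 2"),
    ("para", "Second lorem ipsum etc etc."),
    ("para", "Alea iacta est!"),
])
level1 = doc.children[0]
level2 = level1.children[1]
print(level2.children)
```

In this model the last paragraph ends up in level2.children, and nothing short of another heading can move it back into level1, which is the limitation the question is about.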
From: David G. <go...@py...> - 2021-05-27 01:53:50

I suspect that you may need a sidebar or topic in place of your section level 2:
https://docutils.sourceforge.io/docs/ref/rst/directives.html#topic
Or maybe a rubric:
https://docutils.sourceforge.io/docs/ref/rst/directives.html#rubric

IMHO, a "popped" section is meaningless and you don't need it.

Some questions come to mind:

Why do this? What is the purpose?

What would this "popped" section mean? What does it represent?

How would this be represented on the screen or page? How would the reader *know* that the section level had been "popped"?

Have you ever seen this in the real world? Please show us an example or three.

If you absolutely insist, if you must "pop" the section level, here's an ugly workaround:

    Level 1
    =======

    Level 2
    -------

    \ 
    ====

That last title is a backslash followed by a space. It produces a level-1 section with no title. You could make it explicit with a title of "title omitted", apply a class to it, add some CSS, and have it disappear completely.
Use at your own risk.

David Goodger
<https://david.goodger.org>
From: Sidney C. <si...@ji...> - 2021-05-27 08:30:40

Hi David,

> Why do this? What is the purpose?
> What would this "popped" section mean? What does it represent?

It is a continuation of a previous section that was interrupted by a subsection.

This is not something that is usually encountered in rendered text (if at all). The reason it would be useful is that it allows tree transform operations to have a better understanding of the structure of a document, and to adapt their behavior to the "level" they reside at.

The specific use case for me is the toctree directive in Sphinx. It is not strange to have it as the last directive on an RsT page:

    Main heading
    ============

    Bla bla

    Subheading
    ----------

    Bla bla

    Subsubheading
    ~~~~~~~~~~~~~

    Bla bla bla

    .. toctree::

In this example, the toctree directive is processed at the lowest-level subtree in the parse tree:

    <sec main><sec sub><sec subsub>TOCTREE</sec></sec></sec>

Whereas I would want to be able to achieve this:

    <sec main><sec sub><sec subsub></sec></sec>TOCTREE</sec>

So I want the toctree directive to be executed at a level of my choosing, rather than at the level induced by its incidental place after an unknown number of sub-heading levels.

There is quite a bit of discussion about this on the Sphinx side; the current recommended practice is to have either subheadings in a page or a toctree, but not both. Which, frankly, is silly. People are trying to solve this by patching up the 'toctree' semantics. However, I feel the core of the problem is that the toctree directive (especially at the end of the page) defaults to the last-active heading level; I cannot currently have it there and associate it with a higher (most often: the topmost) heading level.

> How would this be represented on the screen or page? How would the reader *know* that the section level had been "popped"?

That's a presentation matter. It is conceivable to make a presentation where the body texts at different levels have a different color, or different indentation. In that case, having this ability would matter. But that is pretty far-fetched of course. It is not something that happens (often, if at all) in typesetting.

Presentation isn't my main concern though. Having the ability for tree transforms and directives to correctly know the sub-heading nesting structure of a document seems useful enough, and that is hard to do now, especially at the end of a document where the nesting level is just whatever is active at the time.

> If you absolutely insist, if you must "pop" the section level, here's an ugly workaround:

The workaround does not do what I ask though; it introduces a new section; it doesn't continue the previously open section. Also, it introduces another heading (even if its text is empty) which I don't want. I may be able to patch around that in HTML with CSS, but not in other back-ends.

Now I may not be able to convince you that what I try to do is useful; but I am still interested to find out if it is *possible*, e.g. with a custom directive, and how I would go about that. Any pointers by you or anyone else would be appreciated.

Best,
Sidney
From: Guenter M. <mi...@us...> - 2021-05-27 11:26:08

On 2021-05-26, Sidney Cadot wrote:

> I am using docutils through Sphinx, and I am running into an issue there,
> which I think should best be solved in the RsT parsing stage.

...

> ... if I want to "pop" a heading level to the next higher (or any
> higher) level without introducing a new header, I am stuck.

...

> So I want my parse tree to be (again in pseudo-xml):

> <section level="1">
>   <p>First lorem ipsum etc. etc.</p>
>   <section level="2">
>     <p>Second lorem ipsum etc. etc.</p>
>   </section>
>   <p>Alea iacta est!</p>
> </section>

> I have a few questions....

> First, is this already somehow possible? Is there some syntax that directs
> the parser to produce the second parse tree?

This is not only impossible (with standard rST syntax), it is also an invalid Docutils document tree.
https://docutils.sourceforge.io/docs/ref/docutils.dtd
https://docutils.sourceforge.io/docs/ref/doctree.html#element-hierarchy

> Second, if not: is there a fundamental reason why the second parse-tree
> would be incompatible with the document model of RsT? (If that's the case,
> why?)

See the link below.

> Third, if it is possible in principle, but not yet in practice, is there a
> way to implement this without touching the parser; say, by adding a
> docutils Directive? E.g. something that would work like this?

A similar request was filed to the issue tracker.
https://sourceforge.net/p/docutils/feature-requests/74
It turned out to be about an intermediate structure (adding sections by a directive). In this case, the resulting doctree would be valid, but telling the section-adding directive where to add these sections seems a better solution than changing rST syntax or adding a section-closing directive in Docutils.

Günter
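
The validity point can be illustrated with a deliberately simplified check. The sketch below assumes one rule only, that inside a section all body elements must come before any nested sections; the real DTD linked above covers many more element types. The function name body_after_subsection and the <p>/level="..." pseudo-XML are taken from the thread's examples or invented here, they are not Docutils API.

```python
import xml.etree.ElementTree as ET


def body_after_subsection(section):
    """Return True if a non-section child follows a nested <section>.

    Simplified stand-in for the section content model: body elements
    first, nested sections last. Titles, transitions, topics and so on
    are ignored in this toy version.
    """
    seen_subsection = False
    for child in section:
        if child.tag == "section":
            if body_after_subsection(child):
                return True
            seen_subsection = True
        elif seen_subsection:
            # A paragraph (or other body element) after a subsection.
            return True
    return False


# Tree produced by the current parser (first pseudo-XML example).
PARSED = (
    '<section level="1"><p>First</p>'
    '<section level="2"><p>Second</p><p>Alea iacta est!</p></section>'
    '</section>'
)
# Tree the original poster asked for (second pseudo-XML example).
WANTED = (
    '<section level="1"><p>First</p>'
    '<section level="2"><p>Second</p></section>'
    '<p>Alea iacta est!</p>'
    '</section>'
)

parsed_violates = body_after_subsection(ET.fromstring(PARSED))
wanted_violates = body_after_subsection(ET.fromstring(WANTED))
print(parsed_violates, wanted_violates)
```

Under this simplified rule only the second tree is flagged, because its last paragraph follows a closed subsection.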
From: Sidney C. <si...@ji...> - 2021-05-27 12:12:56

Hi Guenter,

> This is not only impossible (with standard rST syntax), it is also an
> invalid Docutils document tree.
> https://docutils.sourceforge.io/docs/ref/docutils.dtd
> https://docutils.sourceforge.io/docs/ref/doctree.html#element-hierarchy

Ok, that settles it. The restriction seems strange and somewhat arbitrary to my programmer's eye, but it is there alright.

> A similar request was filed to the issue tracker.
> https://sourceforge.net/p/docutils/feature-requests/74
> It turned out to be about an intermediate structure (adding sections by a
> directive). In this case, the resulting doctree would be valid but telling
> the section-adding-directive where to add these sections seems a better
> solution than changing rST syntax or adding a section-closing directive in
> Docutils.

Yes, that seems very much related to what I was thinking about.

As it turns out, handling this stuff fully at the Sphinx level has its own set of challenges. Especially the toctree stuff and how it interacts with the section headers seems quite badly designed, I am sorry to say, to the point that the general recommendation seems to be: don't use them in the same document -- which is annoying (and hard to defend from a usability perspective). I had hoped that some alternative solution with help from the docutils level could be useful; but I guess this will need to be fully fixed at the Sphinx level after all.

Thanks,
Sidney
From: Guenter M. <mi...@us...> - 2021-06-18 09:40:56

Dear Sidney,

On 2021-05-27, Sidney Cadot wrote:

>> This is not only impossible (with standard rST syntax), it is also an
>> invalid Docutils document tree.
>> https://docutils.sourceforge.io/docs/ref/docutils.dtd
>> https://docutils.sourceforge.io/docs/ref/doctree.html#element-hierarchy

> Ok, that settles it. The restriction seems strange and somewhat arbitrary
> to my programmer's eye, but it is there alright.

With more input and searching in the Docutils sources, I have to correct myself on both counts. Details below.

>>> Second, if not: is there a fundamental reason why the second
>>> parse-tree would be incompatible with the document model of RsT?
>>> (If that's the case, why?)

While your second example would be an invalid Docutils document, there is no need to have a valid document tree before the parsing and transformations are completed.

In other words, a transient document tree state like

    <document source="/tmp/simple.rst">
        <section ids="first-section" names="first\ section">
            <title>First section</title>
            <paragraph>First lorem ipsum</paragraph>
        </section>
        ← place next element here!

is not invalid per se. It depends on what the next element would be:

* if the next element is <section level="1"> or <section level="2">, fine.

* if the next element is *not* a <section>, or is a <section> with an incompatible level (outside 1, ..., level_of_the_closed_section + 1), the final document tree is invalid.

>>> Third, if it is possible in principle, but not yet in practice, is
>>> there a way to implement this without touching the parser; say, by
>>> adding a docutils Directive?

This should be possible -- directives can do "anything" that is possible in Python: as a new section can close the preceding section, there must be a way to do this programmatically.

The solution is described in the docstring for docutils.parsers.rst.states.RSTState.check_subsection():

    Check for a valid subsection header. Return 1 (true) or None (false).

    When a new section is reached that isn't a subsection of the current
    section, back up the line count (use ``previous_line(-x)``), then
    ``raise EOFError``. The current StateMachine will finish, then the
    calling StateMachine can re-examine the title. This will work its way
    back up the calling chain until the correct section level is reached.

    @@@ Alternative: Evaluate the title, store the title info & level, and
    back up the chain until that level is reached. Store in memo? Or
    return in results?

    :Exception: `EOFError` when a sibling or supersection encountered.

It should be possible to "close" a section by a directive raising `EOFError`, and the following rST examples would generate valid documents::

    A section
    ---------

    .. close-section::

    Another section
    ---------------

as well as ::

    A section
    ---------

    .. close-section::

    .. directive-that-generates-a-section-with-compatible-level::

OTOH, such a directive will certainly not become part of Docutils, because input like ::

    A section
    ---------

    Section content

    .. close-section::

    Anything other than the preceding examples.

would generate an invalid document.

Therefore, my suggestion would be to incorporate the closing into the "directive-that-generates-a-section-with-compatible-level", so that you may write, e.g., ::

    A section
    ---------

    A subsection
    ~~~~~~~~~~~~

    Subsection content.

    .. directive-that-generates-a-section-with-compatible-level::
       :section-level: 1

Caveats:

* This is not part of the API but an implementation detail that may change in future.
* I did not test.
* I don't know about side-effects.

>> A similar request was filed to the issue tracker.
>> https://sourceforge.net/p/docutils/feature-requests/74
>> It turned out to be about an intermediate structure (adding sections by a
>> directive). In this case, the resulting doctree would be valid but telling
>> the section-adding-directive where to add these sections seems a better
>> solution than changing rST syntax or adding a section-closing directive in
>> Docutils.

> Yes, that seems very much related to what I was thinking about.

> As it turns out handling this stuff fully at the sphinx level has its own
> set of challenges. Especially the toctree stuff and how it interacts with
> the section headers seems quite badly designed I am sorry to say, to the
> point that the general recommendation seems to be: don't use them in the
> same document -- which is annoying (and hard to defend from a usability
> perspective). I had hoped that some alternative solution with help from the
> docutils level could be useful; but I guess this will need to be fully
> fixed at the sphinx level after all.

I am not familiar with the toctree and Sphinx extensions, so I cannot recommend anything here besides the general advice to balance the gain in usability against the added complexity.

I hope this helps a bit,

Günter
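
The unwinding that the quoted docstring describes can be mimicked in a toy recursive parser. This is a sketch of the idea only and does not touch the Docutils state machine: parse_into, parse, the item tuples and the ("close",) marker standing in for a hypothetical close-section directive are all assumptions made for illustration. Whether raising EOFError from a real directive behaves this way is untested, as the caveats in the message say.

```python
def parse_into(node, items, cursor, level):
    """Append content for a section at `level` to node["children"].

    `cursor` is a one-element list holding the index of the next item.
    Raising EOFError ends this nested parse, loosely mirroring the
    nested state machine finishing in the docstring's description.
    """
    while cursor[0] < len(items):
        item = items[cursor[0]]
        kind = item[0]
        if kind == "heading" and item[1] <= level:
            # Sibling or supersection: leave the heading unread and
            # unwind one level so the caller can re-examine it.
            raise EOFError
        cursor[0] += 1
        if kind == "para":
            node["children"].append(item[1])
        elif kind == "close" and level > 0:
            # Hypothetical close-section marker: unwind without a heading.
            raise EOFError
        elif kind == "heading":
            child = {"title": item[2], "children": []}
            node["children"].append(child)
            try:
                parse_into(child, items, cursor, item[1])
            except EOFError:
                pass  # nested parse finished; the loop looks at the next item


def parse(items):
    """Parse a flat item list into a nested dict tree (toy model)."""
    root = {"title": "document", "children": []}
    parse_into(root, items, [0], 0)
    return root


tree = parse([
    ("heading", 1, "A section"),
    ("para", "Section content"),
    ("heading", 2, "A subsection"),
    ("para", "Subsection content."),
    ("close",),
    ("para", "Alea iacta est!"),
])
section = tree["children"][0]
subsection = section["children"][1]
print(section["children"][2])
```

Here the paragraph after the close marker lands in the outer section, after the closed subsection. That is exactly the shape the message warns would be an invalid final document, which is why folding the closing step into a section-generating directive is the safer design.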