Re: [Docutils-develop] Re: Notation for continued paragraphs?

(Replying to multiple messages from David A.)

[David Goodger]
 >> Practically though, mixed text and block-level elements are *far*
 >> more difficult to implement.  At some point, the compound structure
 >> has to be flattened.

[David Abrahams]
 > Without some explanation, it's hard to see why that's true.
 > Especially since some output formats *don't* require that
 > flattening.

They all do, at some point in the processing.  Document
models are at different conceptual levels, where the lowest level is
that of typography (glyphs and page/screen coordinates).  I'd say that
the Docutils model is at a higher level than HTML's model, higher than
TeX's model, but (in some ways) it is lower than DocBook's model.
Very high-level models may support compound paragraphs, but
lower-level ones don't.  In the end, a document becomes a sequence of
blocks of text, with structure suggested by typeface, size, style,
indentation, borders, etc.

 >>> Recursion is difficult for you?
 >>
 >> Please drop the sarcasm.
 >
 > I don't mean to be sarcastic; sorry if it came off that way.

I guess I'd assumed you'd seen enough Docutils code to realize that
recursion is an essential technique used ubiquitously in Docutils'
code and document model.  I shouldn't have assumed or barked, sorry.

 > Recursion really is difficult for some people, and you haven't (yet)
 > given me enough information to imagine another reason nested blocks
 > should be so painful.

I'll try.  Imagine walking a document tree.  You're in a paragraph.
OK, paragraphs start a new text block, with space above and below,
with no extra left/right margins.  Paragraphs directly contain text,
which flows from one line to the next.  Next you encounter an element
within the paragraph.  In the current model, it can only be an inline
element, which doesn't disturb the flow or require any block-level
changes.  If we allowed literal blocks embedded within the text of
paragraphs, we'd have to check each element's context to see what to
do.
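
To make the context check concrete, here is a minimal sketch of such a
tree walk.  The Node class and the walk() function are hypothetical
stand-ins invented for illustration; they are not the docutils.nodes
API or the real writer code.  The point is only that the literal block
handler has to ask who its parent is:

```python
class Node:
    """Toy element; a stand-in, not the docutils.nodes API."""

    def __init__(self, tag, *children):
        self.tag = tag
        self.children = list(children)  # each child is a Node or a str
        self.parent = None
        for child in self.children:
            if isinstance(child, Node):
                child.parent = self


def walk(node, events):
    """Depth-first walk that appends output events to `events`."""
    if node.tag == "literal_block":
        # The context check: is this literal block inside a paragraph?
        embedded = node.parent is not None and node.parent.tag == "paragraph"
        if embedded:
            events.append("close paragraph block")
        events.append("literal: " + "".join(node.children))
        if embedded:
            events.append("reopen paragraph block")
        return events
    if node.tag == "paragraph":
        events.append("open paragraph block")
    for child in node.children:
        if isinstance(child, Node):
            walk(child, events)  # inline elements just flow with the text
        else:
            events.append("text: " + child)
    if node.tag == "paragraph":
        events.append("close paragraph block")
    return events
```

With only inline children, the paragraph opens and closes exactly one
block.  With an embedded literal block, the handler has to interrupt
and resume a block that it does not own.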

Look at it from the literal block processing code's point of view.  Is
it a block-level literal block (inside a "section", say) or an
"inline" literal block (inside a "paragraph")?  If the latter, the
paragraph containing the literal block would essentially have to be
split in two: an initial paragraph fragment before the literal block,
and a continuation fragment after.

Depending on the output we're generating, we might have to split
the paragraph.  But by the time we reach the literal block, we're no
longer processing the paragraph directly.  So the paragraph would have
to carefully examine its children up front and transform itself
accordingly.
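
A rough sketch of that transformation, assuming a paragraph's children
are given as (kind, content) tuples.  The function name, the tuple
representation and the "continuation" label are made up for this
example and are not part of Docutils:

```python
def split_paragraph(children):
    """Flatten mixed paragraph children into a sequence of flat blocks.

    `children` is a list of ("text", str) and ("literal_block", str)
    tuples.  Text before the first embedded block becomes a "paragraph"
    fragment; text after one becomes a "continuation" fragment.
    """
    blocks = []
    pending = []  # text pieces collected since the last block

    def flush():
        if pending:
            kind = "continuation" if blocks else "paragraph"
            blocks.append((kind, " ".join(pending)))
            pending.clear()

    for kind, content in children:
        if kind == "text":
            pending.append(content)
        else:
            flush()  # close the fragment in progress
            blocks.append((kind, content))
    flush()
    return blocks
```

A paragraph with no embedded block comes back as a single "paragraph"
entry, so the extra work is only needed for the mixed case.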

 > It seems easy enough to define "paragraph" to be a structural
 > element rather than a text element.  Just like a section, a chapter,
 > or whatever.

Then, in Docutils' document model, a "paragraph" couldn't directly
contain text.  The Docutils document model is simple in that way,
intentionally and by design.  Either an element may contain other
block-level elements, or it may directly contain text, but never both.
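
The "never both" rule can be written down as a small check.  This is
only an illustration: the BLOCK_LEVEL set is a partial, assumed list of
tags, and check_content() is a hypothetical helper, not something
Docutils provides:

```python
# Assumed, partial list of block-level tags, for illustration only.
BLOCK_LEVEL = {"section", "paragraph", "literal_block", "block_quote"}


def check_content(tag, children):
    """Return "text" or "blocks"; raise ValueError on a mixture.

    Each child is either a str (text) or a (tag, children) tuple.
    Inline elements such as ("emphasis", [...]) count as text content.
    An element with no children is reported as "text".
    """
    has_blocks = any(
        not isinstance(c, str) and c[0] in BLOCK_LEVEL for c in children
    )
    has_text = any(
        isinstance(c, str) or c[0] not in BLOCK_LEVEL for c in children
    )
    if has_text and has_blocks:
        raise ValueError(f"<{tag}> mixes text with block-level elements")
    return "blocks" if has_blocks else "text"
```

Under this rule a paragraph holding text plus a literal block is
rejected outright, which is why the proposal needs a new element level
in one place or another.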

Easy in theory, not so easy in practice.

If you wanted a paragraph model like you describe, we'd have to have a
lower level element ("text", say) to directly contain the text:

<paragraph>
     <text>
         some text
     <literal_block>
         in the middle of the paragraph
     <text>
         more text

After I wrote this, your reply to Felix's message arrived:

 > I think I've been suggesting something like:
 >
 > <paragraph>
 >     <text>
 >         foo bar
 >     <literal_block>
 >         mumble
 >     <text>
 >         baz

That's exactly the model Felix proposed, but with
<paragraph>/<text> instead of <compound>/<paragraph>.  I prefer
Felix's model (<compound>/<paragraph>), because the vast majority of
paragraphs are not compound.
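
To illustrate the difference, here is a sketch that builds both shapes
from the same input.  The function names and the nested-tuple trees are
assumptions made for this example; neither function is Docutils code:

```python
def text_model(pieces):
    """<paragraph>/<text>: every paragraph gains a <text> level."""
    return ("paragraph", [(kind, content) for kind, content in pieces])


def compound_model(pieces):
    """<compound>/<paragraph>: wrap only when a block is embedded."""
    if all(kind == "text" for kind, _ in pieces):
        # The common case: a plain paragraph, no extra level at all.
        return ("paragraph", " ".join(content for _, content in pieces))
    return (
        "compound",
        [
            ("paragraph", content) if kind == "text" else (kind, content)
            for kind, content in pieces
        ],
    )
```

For a plain paragraph, compound_model() returns the same one-level
structure as today, while text_model() adds a wrapper every time.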

 > Is that very different in spirit?

Yes, because <text> wouldn't be needed in the vast majority of cases.

 > Of course, my suggestion would be a more abrupt transition for
 > existing docutils source.

And we try to avoid abrupt transitions like this.

 > Certainly there are lots of cases where this sort of thing comes up.
 > Not nesting paragraphs within paragraphs (conceptually), but nesting
 > textual blocks within paragraphs: equations, quotations, code
 > examples... So it seems like a reasonable way to capture existing
 > practice.

I'd be willing to add the <compound>/<paragraph> model.

 > I understand.  Just to be clear, I don't think compound paragraphs
 > are needed, or necessarily realistic.  I *do* think textual block
 > elements within a paragraph are probably needed.

That's what *I* mean by "compound paragraphs": paragraphs containing
block-level elements.  That won't happen in Docutils.

-- 
David Goodger <http://python.net/~goodger>