From: David Abrahams <dave@bo...> - 2004-03-01 21:04:08
I can't be at the sprint because I have to be on the other side of the
world (literally) at another conference. That's why I sprinted by
myself, several weeks ago, on the problem of supporting nested inline
markup. This is just a small and general appeal that someone hold a
sprint session on that code (in the "nesting" branch of CVS). All of
the changes are in docutils/docutils/parsers/rst/states.py and in the
corresponding tests. Dave G. has been understandably busy, and I'd
hate to see my work relegated to the dustbin of history...
From: David Goodger <goodger@py...> - 2004-03-02 06:47:45
David Abrahams wrote:
> I'd hate to see my work relegated to the dustbin of history...
It won't be, don't worry. It may take a while, but eventually (if
nobody beats me to it) I **will** get to it. The topic is already
listed for potential sprinting, and I've added a note about the branch.
I would gladly support anyone who wants to sprint on this topic.
-- David Goodger
From: David Abrahams <dave@bo...> - 2004-03-03 17:07:01
David Goodger <goodger@...> writes:
> David Abrahams wrote:
>> I'd hate to see my work relegated to the dustbin of history...
> It won't be, don't worry. It may take a while, but eventually (if
> nobody beats me to it) I **will** get to it. The topic is already
> listed for potential sprinting, and I've added a note about the branch.
OK. Do you understand the basic principle behind what I did?
Wow, I now wish I had commented better. Here goes a small attempt at
an explanation.
When searching for a nested inline markup end-string, you also have to
be willing to find a new start-string, so the end-string pattern will
be based on the one used to find start strings. We break the current
pattern up with comments '(?##)' so we can find the places to make
Because of the way you want to respond to malformed input:
a *b **c d* e** f
a *(b **c d)* e** f
^^----------------- no match
a *b **(c d* e)** f
^----------------- no match
it's necessary at point 'c' to be willing to find an end-string for
*any* of the inline markup nesting levels that are currently open, so
that d* can close off the outer level of nesting.
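
A minimal sketch of that idea, not the code in states.py: it assumes the
open nesting levels are kept as a list of start-strings (outermost first)
and uses a simplified end-string rule (preceded by non-whitespace, followed
by whitespace or end of text). The name find_closing_level is hypothetical.
It handles only a single end-string at a time.

```python
import re


def find_closing_level(open_markers, text, pos):
    """Find the first end-string for ANY currently open nesting level.

    open_markers -- start-strings of the open levels, outermost first
                    (hypothetical representation, e.g. ["*", "**"])
    Returns (level_index, match), or None if nothing closes.
    """
    # Try longer markers first so '**' wins over '*' at the same position.
    alternatives = sorted(set(open_markers), key=len, reverse=True)
    pattern = re.compile(
        r"(?<=\S)(?:%s)(?=\s|$)" % "|".join(re.escape(m) for m in alternatives)
    )
    match = pattern.search(text, pos)
    if match is None:
        return None
    # If the same marker is open more than once, close the innermost one.
    level = len(open_markers) - 1 - open_markers[::-1].index(match.group())
    return level, match


# Malformed input from the example: searching from 'c' with '*' and '**' open.
text = "a *b **c d* e** f"
result = find_closing_level(["*", "**"], text, text.index("c"))
if result is not None:
    level, match = result
    print("level", level, "closed by", repr(match.group()), "at", match.start())
```

Here the `*` after `d` is found before the `**` after `e`, so the outer
level is the one that gets closed.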
That's why you'll see allends being built as an expression like
y z? | z
x (yz?|z)? | (yz?|z)
w ( x(yz?|z)?|(yz?|z) )? | ( x(yz?|z)?|(yz?|z) )
etc. Unfortunately very complicated, but this is the only way to say:
match 'w?x?y?z?' but don't match the empty string.
using Python regular expressions, AFAICT.
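
The expansion above can be sketched as a small builder. This is an
illustration under assumptions, not the allends code from the branch: the
function name nonempty_optional_sequence is hypothetical, the parts are
treated as literal strings, and at least one part is required.

```python
import re
from itertools import combinations


def nonempty_optional_sequence(literals):
    """Build a pattern equivalent to 'a?b?c?...' that never matches ''.

    Works from the last part outward, following the expansion
        y z? | z
        x (yz?|z)? | (yz?|z)
    so every alternative consumes at least one part.
    """
    parts = [re.escape(s) for s in literals]
    pattern = parts[-1]
    for part in reversed(parts[:-1]):
        pattern = "%s(?:%s)?|(?:%s)" % (part, pattern, pattern)
    return pattern


regex = re.compile(nonempty_optional_sequence(["w", "x", "y", "z"]))

# Every in-order selection of parts should match, except the empty one.
for size in range(5):
    for combo in combinations("wxyz", size):
        candidate = "".join(combo)
        assert bool(regex.fullmatch(candidate)) == (candidate != "")
```

Note that the pattern text roughly doubles with each added part, which is
part of why the expression gets complicated so quickly.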
Unfortunately, I can't remember right now why the endpattern is being
added to the pattern for "whole" constructs (between part2 and
Probably the best way to understand the reason for any given decision
I made is to reverse the decision, run the test suite, and see what
breaks. I think the answers will become clear immediately. As a
first step I would run the suite, satisfy myself that all the failures
I saw should be expected change-of-behavior in a system supporting
nested markup, and fix the tests so that they pass.
I also recommend liberal use of the _debug_match method for
understanding what's going on. It makes things so much easier.