Re: [Python-markdown-discuss] Limitations of inlinePatterns

Yuri Takhteyev wrote:
>>  Of course, both should work, so we may need a new approach to
>>  the inlinePatterns. Any ideas?
> What I tried at the time
> was storing a string which uses a special Unicode character to mark the
> positions where the nodes are supposed to be included.  I.e., if "⊙"
> is the special character, we could store something like:
>     ["A **⊙**  currently does not work.", <link>]

If your Unicode "character" were instead "%s", you could put the doms in a list, 
and repeatedly string-interpolate them...

i.e. you would end up with (using pretend dom syntax):
values = ("A %s currently does not work",
   (dom("b", "%s"),
    dom("a", {'href': 'index.html'}, "foo"),
   ))

and then you could loop through, doing something like:
template = values[0]
substitutions = values[1]
for subs in substitutions:
     template %= subs

and, as long as your %s were escaped the correct number of times, you should be 
good to go.  If you went down that road, I might suggest using a dictionary, so 
that it is easier to see what is going on.  The data in that case (if you just 
used strings) would look more like:
values = ("A %(bold)s currently does not work",
   ({'bold': "<b>%(code)s</b>"},
    {'code': "<a href='index.html'>foo</a>"}
   ))

Where each processor got to choose its own namespace.
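
To make the dictionary variant concrete, here is a minimal runnable sketch using plain strings. The helper names `expand` and `escape_percent` are made up for illustration and are not part of Python-Markdown. It also shows what "escaped the correct number of times" means: a literal percent sign has to be doubled once per interpolation round it must survive.

```python
def escape_percent(text):
    """Double every percent sign so it survives one round of % formatting."""
    return text.replace("%", "%%")


def expand(template, substitutions):
    """Apply each substitution dictionary to the template, in order."""
    for subs in substitutions:
        template %= subs
    return template


values = (
    "A %(bold)s currently does not work",
    (
        {"bold": "<b>%(code)s</b>"},
        {"code": "<a href='index.html'>foo</a>"},
    ),
)
result = expand(*values)

# Literal text that passes through two rounds must be escaped twice.
literal = escape_percent(escape_percent("100% sure: "))
escaped_result = expand(literal + "%(bold)s", values[1])
```

Note that a value substituted in the last round is never interpolated again, so percent signs inside it need no escaping at all.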

> This would allow us to run REs (if we are careful) and still get the
> dom tree in the end.

Hmmm...  Thinking about that a little started me wondering...  If you end up 
with stuff in the wrong order it still wouldn't work.  Unless you ran the inline 
parsers on the data of each substitution, which is probably a good idea, come to 
think of it.  (And then override that method in the CodeProcessor to not call 
Markdown on its internal data.)
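
A rough sketch of that idea, assuming a made-up `InlineProcessor` base class, a placeholder built from control characters that never appear in the input, and simplified bold and code patterns (none of this is the actual Python-Markdown API). Each processor stashes its output behind a placeholder so later REs cannot see it, and runs the inline processors on its captured data; `CodeProcessor` overrides that step to leave its data alone. No HTML escaping is done here.

```python
import re

# Hypothetical marker; assumed never to occur in the source text.
PLACEHOLDER = "\x02%d\x03"
PLACEHOLDER_RE = re.compile("\x02(\\d+)\x03")


class InlineProcessor:
    def __init__(self, pattern, tag):
        self.pattern = re.compile(pattern)
        self.tag = tag

    def render_inner(self, data, processors, stash):
        # Default: run the inline processors on the captured data too.
        return run_inline(data, processors, stash)

    def apply(self, text, processors, stash):
        def replace(match):
            inner = self.render_inner(match.group(1), processors, stash)
            stash.append("<%s>%s</%s>" % (self.tag, inner, self.tag))
            return PLACEHOLDER % (len(stash) - 1)
        return self.pattern.sub(replace, text)


class CodeProcessor(InlineProcessor):
    def render_inner(self, data, processors, stash):
        # Code spans keep their contents verbatim.
        return data


def run_inline(text, processors, stash):
    for processor in processors:
        text = processor.apply(text, processors, stash)
    return text


def restore(text, stash):
    # Stashed fragments may contain placeholders themselves, so loop.
    while PLACEHOLDER_RE.search(text):
        text = PLACEHOLDER_RE.sub(lambda m: stash[int(m.group(1))], text)
    return text


def convert(text):
    processors = [CodeProcessor(r"`([^`]+)`", "code"),
                  InlineProcessor(r"\*\*(.+?)\*\*", "b")]
    stash = []
    return restore(run_inline(text, processors, stash), stash)


html = convert("**bold with `code`** and `**not bold**`")
```

Running the code processor first means the asterisks inside the second code span are already hidden by the time the bold pattern runs, while the code span nested inside the bold text still gets converted.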

> Another possibility is to only use dom trees for high-level elements
> (lists, code blocks, quotes, etc), and reduce inline patterns to
> simple REs (each run on one element of the larger tree at a time).

The nice property you lose here is the guarantee that you'll always generate 
valid html/xml.  Of course, you might not care about that, since Markdown will 
include any old stuff from the user, but if you do care, using dom trees gives 
you that guarantee.
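
As a small illustration of that guarantee, here is a sketch using the standard library's `xml.etree.ElementTree` as a stand-in for whatever dom implementation would actually be used (the `is_well_formed` helper is invented for this example). Text and attribute values are escaped on serialization, so the tree always comes out well-formed, whereas gluing strings together does not.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if the markup parses as XML."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


# Build <p>A <b><a href=...>...</a></b> currently does not work</p> as a tree.
paragraph = ET.Element("p")
paragraph.text = "A "
bold = ET.SubElement(paragraph, "b")
link = ET.SubElement(bold, "a", {"href": "index.html?a=1&b=2"})
link.text = "foo < bar"
bold.tail = " currently does not work"

# Serializing escapes the "<" in the text and the "&" in the attribute.
tree_markup = ET.tostring(paragraph, encoding="unicode")

# The same content assembled from raw strings is not well-formed.
string_markup = "<p>A <b><a href='index.html?a=1&b=2'>foo < bar</a></b></p>"
```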

> I don't have time at the moment for such a major overhaul (this would
> basically be Python-Markdown 2.0), but if someone else does then I
> think this is the way to go.  I am also pretty sure that this would
> give us a sizeable performance boost.

You've got to love performance boosts.  :)

Later,
Blake.