Re: [Python-markdown-discuss] GSoC progress

Yuri Takhteyev wrote:
>
> The profit is from switching to something simpler, faster and more
> accurate, while maintaining the extensibility.  My guess is that
> changing inline patterns to be text-in-text-out will make them simpler
> (some will just become regular expressions), easier to understand,
> will likely be faster, and will allow us to handle them in the same
> way as all other implementations do.  (We might just borrow their
> regular expressions in some cases.)
>
>   

But there is one big minus: we won't get a valid DOM document. And I don't
think this will give a big performance boost, since the new implementation
already processes strings (not lists).

> The current implementation (using DOM) creates a few serious issues
> for inline patterns, which I don't think can be solved in any easy
> way.  We are relying on regular expressions to match patterns, but
> those cannot span multiple nodes.  This means, for instance, that we
> can either make **[foo](/foo.html)** work, or [**foo**](foo.html), but
> not both.  If we run the link pattern first, then the first string
> turns into ("**", <a dom node>, "**").  We now cannot run a regular
> expression across this.  If we apply the **...** pattern first, then
> the second expression becomes ("[", <a dom node>, "](foo.html)") and
> now we cannot match the link pattern.  Which is why my suggestion
> (which I haven't had time to implement) has been to switch to simple
> text-in-text-out implementation of the patterns:
>   

I have already solved this issue; all of these examples now work:

   [*test*](http://example.com)
   *[test](http://example.com)*
   **[*test*](http://example.com)**
   __*[test](http://example.com)*__

And we still have a valid DOM tree. You can try it from the repository.
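
For anyone following the thread, here is a minimal sketch of one way nested
inline patterns can be matched with plain regular expressions while still
ending up with a tree. It is not necessarily what the repository does. The
idea is to replace each match with a placeholder in the string, so later
patterns can still run across it, and to build the elements only at the end.
The three patterns, the \x02...\x03 placeholder format, and the names
stash_matches, build and to_html are all assumptions made for this example;
the patterns are much simpler than real Markdown rules.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified rules: (pattern, match -> (tag, attrib, inner)).
# Strong must run before emphasis so that "**" is not consumed as two "*".
RULES = [
    (re.compile(r'\[([^\]]+)\]\(([^)]+)\)'),
     lambda m: ('a', {'href': m.group(2)}, m.group(1))),
    (re.compile(r'(\*\*|__)(.+?)\1'),
     lambda m: ('strong', {}, m.group(2))),
    (re.compile(r'\*(.+?)\*'),
     lambda m: ('em', {}, m.group(1))),
]

# Assumed placeholder format: \x02<index into stash>\x03
PLACEHOLDER = re.compile('\x02(\\d+)\x03')


def stash_matches(text, stash):
    """Replace every pattern match in text with a placeholder.

    The inner text of each match is processed recursively, so patterns
    nested in either direction end up as placeholders inside placeholders.
    """
    for pattern, make in RULES:
        def repl(m, make=make):
            tag, attrib, inner = make(m)
            inner = stash_matches(inner, stash)
            stash.append((tag, attrib, inner))
            return '\x02%d\x03' % (len(stash) - 1)
        text = pattern.sub(repl, text)
    return text


def append_text(parent, last_child, s):
    """Attach plain text at the right place in an ElementTree node."""
    if not s:
        return
    if last_child is None:
        parent.text = (parent.text or '') + s
    else:
        last_child.tail = (last_child.tail or '') + s


def build(parent, text, stash):
    """Expand placeholders in text into child elements of parent."""
    last, pos = None, 0
    for m in PLACEHOLDER.finditer(text):
        append_text(parent, last, text[pos:m.start()])
        tag, attrib, inner = stash[int(m.group(1))]
        last = ET.SubElement(parent, tag, attrib)
        build(last, inner, stash)
        pos = m.end()
    append_text(parent, last, text[pos:])


def to_html(source):
    """Convert one line of inline markup into a serialized <p> element."""
    stash = []
    root = ET.Element('p')
    build(root, stash_matches(source, stash), stash)
    return ET.tostring(root, encoding='unicode')


if __name__ == '__main__':
    for line in ['[*test*](http://example.com)',
                 '*[test](http://example.com)*',
                 '**[*test*](http://example.com)**',
                 '__*[test](http://example.com)*__']:
        print(to_html(line))
```

Because a placeholder is ordinary text, the **...** pattern can match around
an already-stashed link, and a link's label can be scanned for *...* before
the link itself is stashed, which is exactly the case that breaks when
matches are turned into DOM nodes immediately.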