From: Artem <ne...@gm...> - 2008-07-02 00:12:37
Yuri Takhteyev wrote:
>
> The profit is from switching to something simpler, faster and more
> accurate, while maintaining the extensibility. My guess is that
> changing inline patterns to be text-in-text-out will make them simpler
> (some will just become regular expressions), easier to understand,
> will likely be faster, and will allow us to handle them in the same
> way as all other implementations do. (We might just borrow their
> regular expressions in some cases.)
>

But there is one big minus: we won't get a valid DOM document. And I
don't think this will give a big performance boost, since we already
have (in the new implementation) a string (and not list) processing
mechanism.

> The current implementation (using DOM) creates a few serious issues
> for inline patterns, which I don't think can be solved in any easy
> way. We are relying on regular expressions to match patterns, but
> those cannot span multiple nodes. This means, for instance, that we
> can either make **[foo](/foo.html)** work, or [**foo**](foo.html), but
> not both. If we run the link pattern first, then the first string
> turns into ("**", <a dom node>, "**"). We now cannot run a regular
> expression across this. If we apply the **...** pattern first, then
> the second expression becomes ("[", <a dom node>, "](foo.html)") and
> now we cannot match the link pattern. Which is why my suggestion
> (which I haven't had time to implement) has been to switch to simple
> text-in-text-out implementation of the patterns:
>

I have already solved this issue; all of these examples now work:

[*test*](http://example.com)
*[test](http://example.com)*
**[*test*](http://example.com)**
__*[test](http://example.com)*__

And we still have a valid DOM tree. You can try it from the repository.
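
For anyone following the thread, the text-in-text-out idea quoted above can be sketched roughly as follows. This is only an illustration under my own assumptions: the two regular expressions and the render_inline helper are hypothetical simplifications, not code taken from the project's repository or from either poster. It shows why both nesting orders from the quoted example work when every pattern maps a string to a string.

```python
import re

# Hypothetical, heavily simplified inline patterns (assumptions for
# illustration only; real Markdown patterns handle many more cases).
LINK = re.compile(r"\[([^\]]+)\]\(([^)\s]+)\)")
STRONG = re.compile(r"\*\*(.+?)\*\*")


def render_inline(text):
    """Apply each pattern as a plain string-to-string substitution.

    Because the result of every step is still a string, a later
    pattern can match across markup produced by an earlier one.
    """
    text = LINK.sub(r'<a href="\2">\1</a>', text)
    text = STRONG.sub(r"<strong>\1</strong>", text)
    return text


print(render_inline("**[foo](/foo.html)**"))
print(render_inline("[**foo**](foo.html)"))
```

Note that the result is a plain string rather than a DOM tree, which is exactly the trade-off raised in the reply above; the sketch also does no HTML escaping.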