From: Artem Y. <ne...@gm...> - 2008-07-19 13:28:19
I deleted the Pattern.contentGroup attribute that I previously added,
because I solved that problem using regexps instead. I also created a new
regexp for links. It partly solves ticket #4: it works fine until you
try to put nested parentheses in a link. The Perl implementation
solves this with a recursive regexp, but I don't see any way of doing that
in Python using regexps.
Perl code:
$g_nested_parens = qr{
    (?>                               # Atomic matching
       [^()\s]+                       # Anything other than parens or whitespace
     |
       \(
          (??{ $g_nested_parens })    # Recursive set of nested brackets
       \)
    )*
}x;
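One possible way around the missing recursion in Python's re module is to scan the link destination by hand and count parenthesis depth. The following is only a minimal sketch of that idea; the function name scan_link_destination and the sample link are hypothetical and not part of the markdown code.

```python
def scan_link_destination(text, start=0):
    """Return the index just past a URL-like run beginning at `start`.

    The scan stops at whitespace, or at a ')' that is not balanced by an
    earlier '(' inside the run (that ')' would close the link itself).
    """
    depth = 0
    i = start
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            break
        if ch == "(":
            depth += 1
        elif ch == ")":
            if depth == 0:
                break  # unbalanced ')': end of the destination
            depth -= 1
        i += 1
    return i


# Hypothetical input: a link whose URL contains nested parentheses.
source = "[wiki](http://example.com/Foo_(bar)) rest"
begin = source.index("(") + 1
end = scan_link_destination(source, begin)
url = source[begin:end]
```

Here url ends up holding the full address including the inner "(bar)", which is the case the non-recursive regexp cannot handle.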
The link regexp now works for angled links too, so I deleted
LINK_ANGLED_PATTERN from the patterns list. The link regexp is now quite
complicated, so how about using the re.VERBOSE flag? I tried it without
making any changes to the regexps, but it doesn't work: Python seems to go
into an infinite loop after adding this flag.
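One thing worth keeping in mind here: re.VERBOSE changes the meaning of an unmodified pattern, because unescaped whitespace is ignored and an unescaped '#' starts a comment. Whether that explains the hang described above is not known; the sketch below only illustrates the general pitfall, using made-up patterns rather than the real ones from the patterns list.

```python
import re


def matches(pattern, text, flags=0):
    """Return True if `pattern` is found anywhere in `text`."""
    return re.search(pattern, text, flags) is not None


# Made-up pattern containing a literal space and a literal '#'.
plain = r"foo bar#baz"

# Without re.VERBOSE both characters are matched literally.
literal_ok = matches(plain, "foo bar#baz")

# With re.VERBOSE the space is dropped and '#baz' becomes a comment,
# so the pattern effectively turns into 'foobar'.
verbose_same_text = matches(plain, "foo bar#baz", re.VERBOSE)
verbose_foobar = matches(plain, "foobar", re.VERBOSE)

# A version rewritten for verbose mode: the space goes into a character
# class and the '#' is escaped.
verbose_safe = r"""
    foo[ ]bar   # literal space kept inside a character class
    \#baz       # literal '#' escaped
"""
verbose_safe_ok = matches(verbose_safe, "foo bar#baz", re.VERBOSE)
```

So switching the flag on would likely require going through each regexp and protecting literal spaces and '#' characters first.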
I also created one aggregated regexp for STRONG_RE and STRONG_2_RE, and
another for STRONG_EM_RE and STRONG_EM_2_RE, so STRONG_2_RE
and STRONG_EM_2_RE can be deleted from the patterns list.
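To illustrate the kind of aggregation meant here, the asterisk and underscore variants can be folded into one pattern by capturing the opening delimiter and requiring the same one to close via a backreference. This is a sketch under that assumption; the names STRONG_SKETCH and STRONG_EM_SKETCH are hypothetical, and the actual regexps in the code may differ.

```python
import re

# Group 1 captures the opening delimiter ('**' or '__'); \1 requires the
# same delimiter to close, so '**text__' is not treated as strong.
STRONG_SKETCH = re.compile(r"(\*{2}|_{2})(.+?)\1")

# Same idea for strong + emphasis with three delimiter characters.
STRONG_EM_SKETCH = re.compile(r"(\*{3}|_{3})(.+?)\1")


def inner_text(pattern, text):
    """Return the text between the delimiters, or None if there is no match."""
    m = pattern.search(text)
    return m.group(2) if m else None
```

For example, inner_text(STRONG_SKETCH, "__bold__") and inner_text(STRONG_SKETCH, "**bold**") both give "bold", while mixed delimiters do not match.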
I'm now gathering various issues and bugs. I think I'll post them on Monday
so we can discuss which of them we want to fix.
Another thing I plan to do is port the extensions to ElementTree.
Maybe some refactoring too. For instance, the CorePatterns class is
scheduled for refactoring, but I don't yet have an idea of what would be a
better replacement for it.