From: Yuri T. <yu...@si...> - 2008-06-06 05:57:55
> I thought about ElementTree, but there is a problem with entities
> escaping and I didn't find any beautiful solution yet.
Sure. I am not saying we should switch to ElementTree. Just that
this might be worth considering at this point. Sticking with Nanodom
and fixing the discrepancies from minidom API might be less work.
> Thanks for the suggestion. Do you mean that the first round of parsing
> should be without regexps?
No, I didn't mean that we should avoid regexps in parsing, just that
this wouldn't be a simple substitution. Avoiding regular expressions
would be neither necessary nor feasible. I would actually prefer that
you keep the current parsing code as is for now.
Right now the Markdown class has a single _transform() method which takes
markdown source and returns HTML. _transform() calls
_processSection() on the source, which is itself recursive. I
wouldn't mind getting rid of this recursion, but let's not worry about
this right now. Instead, let's extricate _handleInline from all of
this. That is, instead of the single call to _transform, I would
rather have two functions:
    markdown_to_tree(markdown_source) - takes markdown, returns a
    Nanodom tree, WITHOUT applying inline patterns

    apply_inline_patterns(nanodom_tree) - takes a Nanodom tree and
    applies inline patterns to all nodes that need it (returning either
    the modified tree or a copy of it).
So, one would be able to do the conversion with:

    m = Markdown()
    return m.apply_inline_patterns(m.markdown_to_tree(my_source)).to_xml()

Or maybe attach the second function as a method to Nanodom:

    return m.markdown_to_tree(my_source).apply_inline_patterns().to_xml()
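
To make the split concrete, here is a rough sketch of the two passes. It is only an illustration under my own assumptions: Node is a hypothetical stand-in for a Nanodom element (not the real API), the block pass just splits on blank lines, and emphasis is the only inline pattern.

```python
import re
from xml.sax.saxutils import escape


class Node:
    """Hypothetical stand-in for a Nanodom element (not the real API)."""

    def __init__(self, tag, children=None):
        self.tag = tag
        # Children are either Node instances or plain text strings.
        self.children = children if children is not None else []

    def to_xml(self):
        parts = []
        for child in self.children:
            if isinstance(child, Node):
                parts.append(child.to_xml())
            else:
                parts.append(escape(child))  # escape text only at output
        return "<%s>%s</%s>" % (self.tag, "".join(parts), self.tag)


EMPHASIS = re.compile(r"\*([^*]+)\*")  # a single toy inline pattern


def markdown_to_tree(source):
    """Block pass only: one <p> per blank-line-separated chunk."""
    root = Node("div")
    for chunk in re.split(r"\n\s*\n", source.strip()):
        root.children.append(Node("p", [chunk]))
    return root


def _split_emphasis(text):
    """Turn 'a *b* c' into ['a ', <em>b</em>, ' c']."""
    pieces, pos = [], 0
    for match in EMPHASIS.finditer(text):
        if match.start() > pos:
            pieces.append(text[pos:match.start()])
        pieces.append(Node("em", [match.group(1)]))
        pos = match.end()
    if pos < len(text):
        pieces.append(text[pos:])
    return pieces


def apply_inline_patterns(node):
    """Inline pass: expand patterns in text children, modifying in place."""
    new_children = []
    for child in node.children:
        if isinstance(child, Node):
            new_children.append(apply_inline_patterns(child))
        else:
            new_children.extend(_split_emphasis(child))
    node.children = new_children
    return node


def convert(source):
    return apply_inline_patterns(markdown_to_tree(source)).to_xml()
```

The point is that the tree coming out of markdown_to_tree() still holds the raw "*...*" text, and only the second pass turns it into elements.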
This would gain us two things. First, we'll have a better separation
of the code into two areas. I think this will make it easier to read
and maintain, and it will put us in a good position to change how
inline patterns are handled. Second, this will give the caller more
options: they can modify the tree before applying inline patterns.
They could also come up with their own way of handling inline
patterns.
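
As an illustration of the second point, a caller could edit the tree between the two passes, or hand in an inline handler of their own. This is a sketch under assumptions, not a proposal for the actual signatures: Node, the handler argument, default_inline and shout are all made up for the example, and escaping is left out for brevity.

```python
import re


class Node:
    """Hypothetical minimal tree node; the real Nanodom API differs."""

    def __init__(self, tag, children=None):
        self.tag = tag
        self.children = children or []

    def to_xml(self):
        # Escaping is omitted here to keep the sketch short.
        body = "".join(c.to_xml() if isinstance(c, Node) else c
                       for c in self.children)
        return "<%s>%s</%s>" % (self.tag, body, self.tag)


def markdown_to_tree(source):
    """Toy block pass: every non-empty line becomes a <p>."""
    return Node("div", [Node("p", [line])
                        for line in source.splitlines() if line.strip()])


def default_inline(text):
    """Toy inline handler: *x* becomes an <em> node."""
    out, pos = [], 0
    for m in re.finditer(r"\*([^*]+)\*", text):
        out.append(text[pos:m.start()])
        out.append(Node("em", [m.group(1)]))
        pos = m.end()
    out.append(text[pos:])
    return [piece for piece in out if piece != ""]


def apply_inline_patterns(node, handler=default_inline):
    """Run `handler` over each text child; the caller may supply its own."""
    children = []
    for child in node.children:
        if isinstance(child, Node):
            children.append(apply_inline_patterns(child, handler))
        else:
            children.extend(handler(child))
    node.children = children
    return node


# Option 1: do stuff to the tree before the inline pass.
tree = markdown_to_tree("draft note\nkeep *this*")
tree.children = [p for p in tree.children
                 if not p.children[0].startswith("draft")]
with_default = apply_inline_patterns(tree).to_xml()


# Option 2: replace the inline handling altogether.
def shout(text):
    """A caller-defined handler that ignores the usual patterns."""
    return [text.upper()]


with_custom = apply_inline_patterns(markdown_to_tree("keep *this*"),
                                    shout).to_xml()
```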
- yuri
--
http://sputnik.freewisdom.org/