From: Yuri T. <yu...@si...> - 2008-06-06 05:57:55
> I thought about ElementTree, but there is a problem with entities
> escaping and I didn't find any beautiful solution yet.

Sure. I am not saying we should switch to ElementTree, just that this might be worth considering at this point. Sticking with Nanodom and fixing its discrepancies from the minidom API might be less work.

> Thanks for the suggestion. Do you mean that the first round of parsing
> should be without regexps?

No, I didn't mean to say that we should avoid regexps in parsing, just that this wouldn't be a simple substitution. Avoiding regular expressions would be neither necessary nor feasible. I would actually prefer that you keep the current parsing code as is for now.

Right now the Markdown class has a single _transform() method which takes markdown source and returns HTML. _transform() calls _processSection() on the source, which is itself recursive. I wouldn't mind getting rid of this recursion, but let's not worry about this right now. Instead, let's extricate _handleInline from all of this. That is, instead of the single call to _transform(), I would rather have two functions:

markdown_to_tree (markdown_source) - takes markdown, returns a Nanodom tree, WITHOUT applying inline patterns

apply_inline_patterns (nanodom_tree) - takes a Nanodom tree and applies inline patterns to all nodes that need it (returning either the modified tree or a copy of it).

So, one would be able to do conversion with:

    m = Markdown()
    return m.apply_inline_patterns(m.markdown_to_tree(my_source)).to_xml()

Or maybe attach the second function as a method to NanoDom:

    return m.markdown_to_tree(my_source).apply_inline_patterns().to_xml()

What this would gain us is two things. First, we'll have better separation of code into two areas. I think this will make it easier to read and maintain. This will put us in a good position to change how inline patterns are handled. Second, this will give the caller more options: they can do stuff to the tree before applying inline patterns.
They could also come up with their own way of handling inline patterns.

- yuri

--
http://sputnik.freewisdom.org/
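
To make the proposed split concrete, here is a rough sketch of the two passes. It is only an illustration under stated assumptions: the Node class is a toy stand-in and not the real Nanodom API, the block pass only knows about paragraphs, and the single emphasis regex is a placeholder for the real inline patterns. Only the names markdown_to_tree, apply_inline_patterns and to_xml come from the message above.

```python
import re
from xml.sax.saxutils import escape


class Node:
    """Hypothetical stand-in for a Nanodom element (not the real API)."""

    def __init__(self, tag, children=None):
        self.tag = tag
        # Children are either Node instances or plain strings (text).
        self.children = children if children is not None else []

    def to_xml(self):
        inner = "".join(
            c.to_xml() if isinstance(c, Node) else escape(c)
            for c in self.children
        )
        return "<%s>%s</%s>" % (self.tag, inner, self.tag)


# Assumed inline pattern for the sketch: *text* becomes <em>text</em>.
EMPHASIS = re.compile(r"\*([^*]+)\*")


def markdown_to_tree(source):
    """Block pass only: one <p> per blank-line-separated chunk.

    Text is stored raw; no inline patterns are applied here.
    """
    root = Node("div")
    for block in re.split(r"\n\s*\n", source.strip()):
        if block:
            root.children.append(Node("p", [block]))
    return root


def apply_inline_patterns(node):
    """Inline pass: return a copy of the tree with text nodes expanded."""
    new_children = []
    for child in node.children:
        if isinstance(child, Node):
            new_children.append(apply_inline_patterns(child))
            continue
        pos = 0
        for m in EMPHASIS.finditer(child):
            if m.start() > pos:
                new_children.append(child[pos:m.start()])
            new_children.append(Node("em", [m.group(1)]))
            pos = m.end()
        if pos < len(child):
            new_children.append(child[pos:])
    return Node(node.tag, new_children)


def convert(source):
    """Chain the two passes, mirroring the one-liner in the message."""
    return apply_inline_patterns(markdown_to_tree(source)).to_xml()
```

Because the tree returned by markdown_to_tree still holds raw text, a caller could inspect or modify it before the inline pass, or swap in a different function in place of apply_inline_patterns.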