From: Yuri T. <qar...@gm...> - 2008-06-04 17:55:04
> - Some code refactoring, there are a lot of ideas about it in Greg
> Wilson's review, and I as well have some ideas.
> For instance, in current Markdown DOM implementation there are some
> differences from standard DOM libraries, for example usually in
> Element.replaceChild the first argument is newNode and the second is
> oldNode, in Markdown implementation first is oldNode and the second is
> newNode. Usually Element's parent property name is "parentNode", but
> here it's just "parent" etc.
> I realize that it'll break some extensions, but I think it'll help
> people in future to avoid reading code if they already know some DOM
> library.

It's OK to break backwards compatibility with extensions, if this buys us enough. Similarly, we can think of dropping Python 2.3 support - again, if this really buys us something big.

As far as the tree representation goes, we can consider a couple of options in particular:

1. Stick with NanoDOM, fixing the problems you mentioned
2. Switch to ElementTree

Let's discuss those options. (Or others.)

(Artem: can you make a page for this project on the wiki, to keep track of the questions and the decisions, and then start separate threads for each question, though probably not all at the same time?)

> - There is something to do with Inline patterns, I didn't decide yet
> what is the best way to fix it. That was discussed in list and there are
> some ideas. I thought of writing syntax/lexical parser instead of
> current Inline Patterns mechanism, but I think it'll be a bit slower.

I agree with Waylan, this should be the focus. I would avoid trying to do serious parsing. I think this part might be best done using straight regular expressions, and we might even be able to "steal" a lot of code from Trent's markdown2.
In other words, my suggestion is that the first round of parsing should turn the document into a tree of blocks, where nodes in the tree represent individual simple paragraphs, list items, block quotes, code segments, block-level HTML elements, etc. The client will then ideally be able to get this tree back if they want. The second round of parsing would then simply go through this tree and run a different set of regular expressions on each node, depending on the type of the node.

If Python had a good PEG implementation, it would perhaps make sense to consider rewriting markdown using PEG. But at this point I think it's premature.

> - I'll try to boost performance, I think choosing the right way of
> inline patterns modification is the best way to boost Markdown.

Yes, let's first do inline patterns, and see what this gives us. Another thing that could be done is avoiding excessive recursion in block parsing. But let's do one thing at a time.

> - An extension for Crunchy to load files using the Markdown syntax.

Sounds good, but don't get too sidetracked on this. Also, we need to make sure there is someone on the Crunchy project who is actually interested in this and will make use of it.

> - Test suite extension

Yes, in fact, I would consider doing this _first_. I.e., it would be good to put together a unified test suite that combines all of our and Trent's tests and gives us a better idea of where the two implementations stand relative to each other.

BTW, I would urge you to make sure that your modified version of python markdown passes all of our tests (and an increasing number of Trent's) at least once a week. Let's avoid the "Version 2" problem.

> - Some additional documentation, maybe adding more examples about
> writing extension modules.

Let's put this off till later. If we are making serious changes, it makes more sense to work on the documentation after the work is done.
But it would be good if you could at least document any changes that you end up making before the end of the summer.

> Also I wrote to Django community asking if they need something special,
> but they said that nothing Django-specific, but API stability. Someone
> suggested adding markdown extras, but Waylan said that it was already
> almost done.

Yes, let's assume that the basic API will stay the same.

> As I understand code must be compatible with python 2.3, 2.4, 2.5, isn't
> it?

We can reconsider this decision, if there is a good reason. We just shouldn't take it lightly.

- yuri

--
http://sputnik.freewisdom.org/
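
P.S. To make the two-pass idea above a bit more concrete, here is a rough sketch of what I have in mind: a first pass that builds a tree of blocks, and a second pass that runs a different list of regular expressions on each node depending on its type. This is not code from python-markdown or markdown2. The names Block, parse_blocks, INLINE_RULES and render are made up for illustration, it is written for a current Python rather than 2.3, and it only knows about paragraphs, block quotes and indented code.

```python
import html
import re
from dataclasses import dataclass, field


@dataclass
class Block:
    """One node in the block tree (hypothetical structure, for illustration)."""
    kind: str                  # "document", "paragraph", "quote" or "code"
    text: str = ""
    children: list = field(default_factory=list)


def parse_blocks(source):
    """Pass 1: split on blank lines and classify each chunk as a block."""
    root = Block("document")
    for chunk in re.split(r"\n(?:[ \t]*\n)+", source.strip("\n")):
        if not chunk.strip():
            continue
        lines = chunk.split("\n")
        if all(line.startswith("    ") for line in lines):
            # Indented code: drop the four-space prefix, keep the rest as-is.
            root.children.append(Block("code", "\n".join(l[4:] for l in lines)))
        elif all(line.startswith(">") for line in lines):
            # Block quote: strip the marker and parse the inside recursively.
            inner = "\n".join(re.sub(r"^> ?", "", l) for l in lines)
            root.children.append(Block("quote", children=parse_blocks(inner).children))
        else:
            root.children.append(Block("paragraph", " ".join(l.strip() for l in lines)))
    return root


# Pass 2 configuration: which inline regexes apply to which block type.
# Code blocks get an empty list, so their contents are never touched.
INLINE_RULES = {
    "paragraph": [
        (re.compile(r"\*\*(.+?)\*\*"), r"<strong>\1</strong>"),
        (re.compile(r"\*(.+?)\*"), r"<em>\1</em>"),
    ],
    "code": [],
}


def render(block):
    """Pass 2: walk the tree and apply the rules for each node's type."""
    if block.kind == "document":
        return "\n".join(render(child) for child in block.children)
    if block.kind == "quote":
        inner = "\n".join(render(child) for child in block.children)
        return "<blockquote>\n" + inner + "\n</blockquote>"
    text = html.escape(block.text, quote=False)
    for pattern, replacement in INLINE_RULES.get(block.kind, []):
        text = pattern.sub(replacement, text)
    if block.kind == "code":
        return "<pre><code>" + text + "</code></pre>"
    return "<p>" + text + "</p>"
```

Since parse_blocks returns the tree itself, a client that wants the block structure can stop after the first pass and never call render. Adding a new block type would mean one more branch in parse_blocks plus one more entry in INLINE_RULES, which is the part I'd expect extensions to care about.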