From: Waylan L. <wa...@gm...> - 2008-11-05 03:56:57
|
I'm happy to report that I just pushed a Definition List extension [1] to the repo. I know this has been one of the most requested extensions. After Yuri's refactor of the core into a separate MarkdownParser, it was much easier to do. I haven't written docs for it yet, but I have included tests [2], which show how it works. See PHP Markdown Extra's docs [3] for the hows and whys. Other than the included tests, I haven't really tested it much, so please run it through the paces. I'd like to say we have everything PHP Markdown Extra has when we release 2.0 and this was the biggest hurdle. Now we just need to wrap up the edge cases I'm sure I missed. Any bug reports, patches, test cases, comments, and criticisms are welcome. [1]: http://gitorious.org/projects/python-markdown/repos/mainline/blobs/master/markdown_extensions%2Fdef_list.py [2]: http://gitorious.org/projects/python-markdown/repos/mainline/trees/master/tests/extensions-x-def_list [3]: http://michelf.com/projects/php-markdown/extra/#def-list ps: I believe the only other things we're missing from extra are abbreviations and markdown inside html. If I'm wrong, let me know and I'll add it to my todo list. I have an old abbreviations extension which needs to be refactored to run with the many recent changes to markdown. I haven't even looked at markdown inside html yet. -- ---- Waylan Limberg wa...@gm... |
From: John S. <jo...@sz...> - 2008-11-05 08:14:27
|
On Tue, Nov 4, 2008 at 10:56 PM, Waylan Limberg <wa...@gm...> wrote: > I'm happy to report that I just pushed a Definition List extension [1] > to the repo. I know this has been one of the most requested > extensions. After Yuri's refactor of the core into a separate > MarkdownParser, it was much easier to do. Woot! I've been eager to have this as well. > I haven't written docs for it yet, but I have included tests [2], > which show how it works. See PHP Markdown Extra's docs [3] for the > hows and whys. Other than the included tests, I haven't really tested > it much, so please run it through the paces. I'd like to say we have > everything PHP Markdown Extra has when we release 2.0 and this was the > biggest hurdle. Now we just need to wrap up the edge cases I'm sure I > missed. That would be awesome. > Any bug reports, patches, test cases, comments, and criticisms are welcome. > > [1]: http://gitorious.org/projects/python-markdown/repos/mainline/blobs/master/markdown_extensions%2Fdef_list.py > [2]: http://gitorious.org/projects/python-markdown/repos/mainline/trees/master/tests/extensions-x-def_list > [3]: http://michelf.com/projects/php-markdown/extra/#def-list I haven't tried testing it yet, but I'll definitely be trying it out soon. But I do have a couple other comments. The main markdown parser marked some methods with __'s... which is supposed to be the indicator that the method is private. So it's a little bit disturbing that DefListParser is drilling into those internals. Perhaps it's indicative that those methods should be exposed more, or that we need an extra hook point into the code? It looks like in _MarkdownParser__processParagraph() that self.__processHR() and self.__processHeader() need to be self._MarkdownParser__processHR() and self._MarkdownParser__processHeader(). By default, python will try calling self._DefListParser__processHR() and self._DefListParser__processHeader(), which don't exist. It looks like the only difference in your process paragraph is that it also watches for 'dd'. Is there some way to either expose that, or just add this edge to the actual MarkdownParser? I realize that markdown doesn't natively produce those tags. But it seems like a shame to have to edit the same code in two places if you find a bug (if you remember the other one exists!). > ps: I believe the only other things we're missing from extra are > abbreviations and markdown inside html. If I'm wrong, let me know and > I'll add it to my todo list. I have an old abbreviations extension > which needs to be refactored to run with the many recent changes to > markdown. I haven't even looked at markdown inside html yet. I think it'd be nice for the table extension to function like PHP markdown's implementation. But that's just a nicety. :-) -John |
From: Waylan L. <wa...@gm...> - 2008-11-05 13:45:55
|
On Wed, Nov 5, 2008 at 3:14 AM, John Szakmeister <jo...@sz...> wrote: > > The main markdown parser marked some methods with __'s... which is > supposed to be the indicator that the method is private. So it's a > little bit disturbing that DefListParser is drilling into those > internals. Perhaps it's indicative that those methods should be > exposed more, or that we need an extra hook point into the code? That's a valid point and one I've thought about as well. Unfortunately, I don't have a good answer yet. Part of my motivation in doing this extension was to raise these questions and see how things go. > > It looks like in _MarkdownParser__processParagraph() that > self.__processHR() and self.__processHeader() need to be > self._MarkdownParser__processHR() and > self._MarkdownParser__processHeader(). By default, python will try > calling self._DefListParser__processHR() and > self._DefListParser__processHeader(), which don't exist. Ah, right. It occurred to after I went to bed last night that I never tested to make sure all other markdown syntax was still parsd correctly. I'll get on that next. > It looks > like the only difference in your process paragraph is that it also > watches for 'dd'. Is there some way to either expose that, or just > add this edge to the actual MarkdownParser? I realize that markdown > doesn't natively produce those tags. But it seems like a shame to > have to edit the same code in two places if you find a bug (if you > remember the other one exists!). I had similar thoughts, but that was the easiest way to get things working. At the same time, I don't look forward to maintaining this in parallel with the markdown core. Certainly valid concerns. > > I think it'd be nice for the table extension to function like PHP > markdown's implementation. But that's just a nicety. :-) > Oh yeah, I forgot about that one. Thanks for the comments John. You've given me some things to think about and reminded me of a few others. -- ---- Waylan Limberg wa...@gm... |
From: Waylan L. <wa...@gm...> - 2008-11-05 14:57:55
|
On Wed, Nov 5, 2008 at 8:45 AM, Waylan Limberg <wa...@gm...> wrote: >> >> It looks like in _MarkdownParser__processParagraph() that >> self.__processHR() and self.__processHeader() need to be >> self._MarkdownParser__processHR() and >> self._MarkdownParser__processHeader(). By default, python will try >> calling self._DefListParser__processHR() and >> self._DefListParser__processHeader(), which don't exist. > > Ah, right. It occurred to after I went to bed last night that I never > tested to make sure all other markdown syntax was still parsd > correctly. I'll get on that next. > I just pushed the fix for this and added a test. Thanks John. -- ---- Waylan Limberg wa...@gm... |
From: Waylan L. <wa...@gm...> - 2008-11-05 21:04:43
|
On Tue, Nov 4, 2008 at 10:56 PM, Waylan Limberg <wa...@gm...> wrote: > ps: I believe the only other things we're missing from extra are > abbreviations and markdown inside html. If I'm wrong, let me know and > I'll add it to my todo list. I have an old abbreviations extension > which needs to be refactored to run with the many recent changes to > markdown. I haven't even looked at markdown inside html yet. > I just pushed the Abbreviation Extension. Note that this is a simple extension that does nothing more than PHP Markdown Extra. If you want the extra features of my old extension (including the one-of-a-kind inline abbrs defs developed by Seemant Kulleen or loading abbrs from external files) then you can always write your own. Feel free to use my old one as a base. What's really interesting to me, is that with both these recent extensions, I had to do minor patches to markdown.py itself (which were committed separately). However, in each case, the patch should have happened regardless of the specific extension. I'll call it testing the extension API. -- ---- Waylan Limberg wa...@gm... |
From: Yuri T. <qar...@gm...> - 2008-11-10 09:37:34
|
>> The main markdown parser marked some methods with __'s... which is >> supposed to be the indicator that the method is private. So it's a >> little bit disturbing that DefListParser is drilling into those >> internals. Perhaps it's indicative that those methods should be >> exposed more, or that we need an extra hook point into the code? > > That's a valid point and one I've thought about as well. > Unfortunately, I don't have a good answer yet. Part of my motivation > in doing this extension was to raise these questions and see how > things go. What worries me more, actually, is the fact that the extension ends up duplicating much of the code in markdown.py and then substitutes the parser. Substituting the parser should be a no-no for extensions. (Note that if several extensions do this, then one will break another.) Can you explain what changes you are trying to make to the parser? Perhaps we can find a way to make the parser a little more flexible, so that def_list extension will just need to add a hander to it, rather than having to swap in a new parser. - yuri -- http://sputnik.freewisdom.org/ |
From: Waylan L. <wa...@gm...> - 2008-11-10 14:23:11
|
On Mon, Nov 10, 2008 at 4:37 AM, Yuri Takhteyev <qar...@gm...> wrote: >>> The main markdown parser marked some methods with __'s... which is >>> supposed to be the indicator that the method is private. So it's a >>> little bit disturbing that DefListParser is drilling into those >>> internals. Perhaps it's indicative that those methods should be >>> exposed more, or that we need an extra hook point into the code? >> >> That's a valid point and one I've thought about as well. >> Unfortunately, I don't have a good answer yet. Part of my motivation >> in doing this extension was to raise these questions and see how >> things go. > > What worries me more, actually, is the fact that the extension ends up > duplicating much of the code in markdown.py and then substitutes the > parser. Substituting the parser should be a no-no for extensions. > (Note that if several extensions do this, then one will break > another.) I am very aware of this problem and don't particularly like it myself. However, the nature of the definition list syntax makes it much easier to do within the core than elsewhere. Currently, the only way to add a new block type to the core is to override the parseChunk method and add a call to your code within the various if statements in that method. And then the paragraph code is not really generic enough to the point that many other block types deffer to it to do the right thing. > Can you explain what changes you are trying to make to the parser? > Perhaps we can find a way to make the parser a little more flexible, > so that def_list extension will just need to add a hander to it, > rather than having to swap in a new parser. Actually, over the last few days I've put together a completely new core processor which works very differently. Like the other plugable parts of markdown, it loops through a list (OrderedDict actually) of BlockProcessors and parses the source text one block at a time. An extension could easily add, remove, replace individual pieces of the parser without the problems we currently have. Additionally, it runs faster (by about 1 second for 1000 iterations) than the current code and uses less recursion. It also overcomes some of the parsing differances unique to the python implementation (i.e.: <p>s in lists match pl and php behavior) although an extension could easily provide the current behavior instead. It's not quite ready yet, but in the next few days I'll post something here for others to review. -- ---- Waylan Limberg wa...@gm... |