Thread: [Python-markdown-discuss] Announcing Definition List Extension - call for testing

Brought to you by: qaramazov, waylanhl

python-markdown-discuss

[Python-markdown-discuss] Announcing Definition List Extension - call for testing

From: Waylan L. <wa...@gm...> - 2008-11-05 03:56:57

I'm happy to report that I just pushed a Definition List extension [1]
to the repo. I know this has been one of the most requested
extensions. After Yuri's refactor of the core into a separate
MarkdownParser, it was much easier to do.

I haven't written docs for it yet, but I have included tests [2],
which show how it works. See PHP Markdown Extra's docs [3] for the
hows and whys. Other than the included tests, I haven't really tested
it much, so please run it through the paces. I'd like to say we have
everything PHP Markdown Extra has when we release 2.0 and this was the
biggest hurdle. Now we just need to wrap up the edge cases I'm sure I
missed.

Any bug reports, patches, test cases, comments, and criticisms are welcome.

[1]: http://gitorious.org/projects/python-markdown/repos/mainline/blobs/master/markdown_extensions%2Fdef_list.py
[2]: http://gitorious.org/projects/python-markdown/repos/mainline/trees/master/tests/extensions-x-def_list
[3]: http://michelf.com/projects/php-markdown/extra/#def-list

ps: I believe the only other things we're missing from extra are
abbreviations and markdown inside html. If I'm wrong, let me know and
I'll add it to my todo list. I have an old abbreviations extension
which needs to be refactored to run with the many recent changes to
markdown. I haven't even looked at markdown inside html yet.

-- 
----
Waylan Limberg
wa...@gm...

Re: [Python-markdown-discuss] Announcing Definition List Extension - call for testing

From: John S. <jo...@sz...> - 2008-11-05 08:14:27

On Tue, Nov 4, 2008 at 10:56 PM, Waylan Limberg <wa...@gm...> wrote:
> I'm happy to report that I just pushed a Definition List extension [1]
> to the repo. I know this has been one of the most requested
> extensions. After Yuri's refactor of the core into a separate
> MarkdownParser, it was much easier to do.

Woot!  I've been eager to have this as well.

> I haven't written docs for it yet, but I have included tests [2],
> which show how it works. See PHP Markdown Extra's docs [3] for the
> hows and whys. Other than the included tests, I haven't really tested
> it much, so please run it through the paces. I'd like to say we have
> everything PHP Markdown Extra has when we release 2.0 and this was the
> biggest hurdle. Now we just need to wrap up the edge cases I'm sure I
> missed.

That would be awesome.

> Any bug reports, patches, test cases, comments, and criticisms are welcome.
>
> [1]: http://gitorious.org/projects/python-markdown/repos/mainline/blobs/master/markdown_extensions%2Fdef_list.py
> [2]: http://gitorious.org/projects/python-markdown/repos/mainline/trees/master/tests/extensions-x-def_list
> [3]: http://michelf.com/projects/php-markdown/extra/#def-list

I haven't tried testing it yet, but I'll definitely be trying it out
soon.  But I do have a couple other comments.

The main markdown parser marked some methods with __'s... which is
supposed to be the indicator that the method is private.  So it's a
little bit disturbing that DefListParser is drilling into those
internals.  Perhaps it's indicative that those methods should be
exposed more, or that we need an extra hook point into the code?

It looks like in _MarkdownParser__processParagraph() that
self.__processHR() and self.__processHeader() need to be
self._MarkdownParser__processHR() and
self._MarkdownParser__processHeader().  By default, python will try
calling self._DefListParser__processHR() and
self._DefListParser__processHeader(), which don't exist.  It looks
like the only difference in your process paragraph is that it also
watches for 'dd'.  Is there some way to either expose that, or just
add this edge to the actual MarkdownParser?  I realize that markdown
doesn't natively produce those tags.  But it seems like a shame to
have to edit the same code in two places if you find a bug (if you
remember the other one exists!).

> ps: I believe the only other things we're missing from extra are
> abbreviations and markdown inside html. If I'm wrong, let me know and
> I'll add it to my todo list. I have an old abbreviations extension
> which needs to be refactored to run with the many recent changes to
> markdown. I haven't even looked at markdown inside html yet.

I think it'd be nice for the table extension to function like PHP
markdown's implementation.  But that's just a nicety. :-)

-John

Re: [Python-markdown-discuss] Announcing Definition List Extension - call for testing

From: Waylan L. <wa...@gm...> - 2008-11-05 13:45:55

On Wed, Nov 5, 2008 at 3:14 AM, John Szakmeister <jo...@sz...> wrote:
>
> The main markdown parser marked some methods with __'s... which is
> supposed to be the indicator that the method is private.  So it's a
> little bit disturbing that DefListParser is drilling into those
> internals.  Perhaps it's indicative that those methods should be
> exposed more, or that we need an extra hook point into the code?

That's a valid point and one I've thought about as well.
Unfortunately, I don't have a good answer yet. Part of my motivation
in doing this extension was to raise these questions and see how
things go.

>
> It looks like in _MarkdownParser__processParagraph() that
> self.__processHR() and self.__processHeader() need to be
> self._MarkdownParser__processHR() and
> self._MarkdownParser__processHeader().  By default, python will try
> calling self._DefListParser__processHR() and
> self._DefListParser__processHeader(), which don't exist.

Ah, right. It occurred to after I went to bed last night that I never
tested to make sure all other markdown syntax was still parsd
correctly. I'll get on that next.

> It looks
> like the only difference in your process paragraph is that it also
> watches for 'dd'.  Is there some way to either expose that, or just
> add this edge to the actual MarkdownParser?  I realize that markdown
> doesn't natively produce those tags.  But it seems like a shame to
> have to edit the same code in two places if you find a bug (if you
> remember the other one exists!).

I had similar thoughts, but that was the easiest way to get things
working. At the same time, I don't look forward to maintaining this in
parallel with the markdown core. Certainly valid concerns.

>
> I think it'd be nice for the table extension to function like PHP
> markdown's implementation.  But that's just a nicety. :-)
>
Oh yeah, I forgot about that one.

Thanks for the comments John. You've given me some things to think
about and reminded me of a few others.

-- 
----
Waylan Limberg
wa...@gm...

Re: [Python-markdown-discuss] Announcing Definition List Extension - call for testing

From: Waylan L. <wa...@gm...> - 2008-11-05 14:57:55

On Wed, Nov 5, 2008 at 8:45 AM, Waylan Limberg <wa...@gm...> wrote:
>>
>> It looks like in _MarkdownParser__processParagraph() that
>> self.__processHR() and self.__processHeader() need to be
>> self._MarkdownParser__processHR() and
>> self._MarkdownParser__processHeader().  By default, python will try
>> calling self._DefListParser__processHR() and
>> self._DefListParser__processHeader(), which don't exist.
>
> Ah, right. It occurred to after I went to bed last night that I never
> tested to make sure all other markdown syntax was still parsd
> correctly. I'll get on that next.
>

I just pushed the fix for this and added a test. Thanks John.

-- 
----
Waylan Limberg
wa...@gm...

Re: [Python-markdown-discuss] Announcing Definition List Extension - call for testing

From: Waylan L. <wa...@gm...> - 2008-11-05 21:04:43

On Tue, Nov 4, 2008 at 10:56 PM, Waylan Limberg <wa...@gm...> wrote:
> ps: I believe the only other things we're missing from extra are
> abbreviations and markdown inside html. If I'm wrong, let me know and
> I'll add it to my todo list. I have an old abbreviations extension
> which needs to be refactored to run with the many recent changes to
> markdown. I haven't even looked at markdown inside html yet.
>

I just pushed the Abbreviation Extension. Note that this is a simple
extension that does nothing more than PHP Markdown Extra. If you want
the extra features of my old extension (including the one-of-a-kind
inline abbrs defs developed by Seemant Kulleen or loading abbrs from
external files) then you can always write your own. Feel free to use
my old one as a base.

What's really interesting to me, is that with both these recent
extensions, I had to do minor patches to markdown.py itself (which
were committed separately). However, in each case, the patch should
have happened regardless of the specific extension. I'll call it
testing the extension API.

-- 
----
Waylan Limberg
wa...@gm...

Re: [Python-markdown-discuss] Announcing Definition List Extension - call for testing

From: Yuri T. <qar...@gm...> - 2008-11-10 09:37:34

>> The main markdown parser marked some methods with __'s... which is
>> supposed to be the indicator that the method is private.  So it's a
>> little bit disturbing that DefListParser is drilling into those
>> internals.  Perhaps it's indicative that those methods should be
>> exposed more, or that we need an extra hook point into the code?
>
> That's a valid point and one I've thought about as well.
> Unfortunately, I don't have a good answer yet. Part of my motivation
> in doing this extension was to raise these questions and see how
> things go.

What worries me more, actually, is the fact that the extension ends up
duplicating much of the code in markdown.py  and then substitutes the
parser.  Substituting the parser should be a no-no for extensions.
(Note that if several extensions do this, then one will break
another.)

Can you explain what changes you are trying to make to the parser?
Perhaps we can find a way to make the parser a little more flexible,
so that def_list extension will just need to add a hander to it,
rather than having to swap in a new parser.

- yuri

-- 
http://sputnik.freewisdom.org/

Re: [Python-markdown-discuss] Announcing Definition List Extension - call for testing

From: Waylan L. <wa...@gm...> - 2008-11-10 14:23:11

On Mon, Nov 10, 2008 at 4:37 AM, Yuri Takhteyev <qar...@gm...> wrote:
>>> The main markdown parser marked some methods with __'s... which is
>>> supposed to be the indicator that the method is private.  So it's a
>>> little bit disturbing that DefListParser is drilling into those
>>> internals.  Perhaps it's indicative that those methods should be
>>> exposed more, or that we need an extra hook point into the code?
>>
>> That's a valid point and one I've thought about as well.
>> Unfortunately, I don't have a good answer yet. Part of my motivation
>> in doing this extension was to raise these questions and see how
>> things go.
>
> What worries me more, actually, is the fact that the extension ends up
> duplicating much of the code in markdown.py  and then substitutes the
> parser.  Substituting the parser should be a no-no for extensions.
> (Note that if several extensions do this, then one will break
> another.)

I am very aware of this problem and don't particularly like it myself.
However, the nature of the definition list syntax makes it much easier
to do within the core than elsewhere. Currently, the only way to add a
new block type to the core is to override the parseChunk method and
add a call to your code within the various if statements in that
method. And then the paragraph code is not really generic enough to
the point that many other block types deffer to it to do the right
thing.

> Can you explain what changes you are trying to make to the parser?
> Perhaps we can find a way to make the parser a little more flexible,
> so that def_list extension will just need to add a hander to it,
> rather than having to swap in a new parser.

Actually, over the last few days I've put together a completely new
core processor which works very differently. Like the other plugable
parts of markdown, it loops through a list (OrderedDict actually) of
BlockProcessors and parses the source text one block at a time. An
extension could easily add, remove, replace individual pieces of the
parser without the problems we currently have. Additionally, it runs
faster (by about 1 second for 1000 iterations) than the current code
and uses less recursion. It also overcomes some of the parsing
differances unique to the python implementation (i.e.: <p>s in lists
match pl and php behavior) although an extension could easily provide
the current behavior instead.

It's not quite ready yet, but in the next few days I'll post something
here for others to review.

-- 
----
Waylan Limberg
wa...@gm...