From: Waylan L. <wa...@gm...> - 2010-10-17 15:11:58
|
dtyschenko, Thanks for you report on Ticket 80[1] for Python-Markdown. I'm responding to your report via the list [2] as I'm not sure how or if we should fix the reported problem. At first I was a little confused by your report as we don't do anything with comments - if they exist in the source they are pulled out as rawhtml and never see ElementTree. Then I realized you must be using a third party extension which create html comments and inserts them into the tree. That is certainly something we want to support and allow you to do. However, it appears you are also using the html4 serializer - which we borrowed (with permission) from the ElementTree 1.3 Preview (still in alpha since 2007 AFAICT). Any changes to that serializer would ideally be changes made upstream as well. In fact, your proposed fix, to also import parts of ElementTree when using cElementTree feels wrong and defeats the purpose of cElementTree in the first place. In fact, I see in the notes to ElementTree 1.3 there is a note [3] that it only works with cElementTree 1.0.6 or later. What version are you using? I'm going to say that that is a limitation of the html4 serializer. If you are using cElementTree, you must be using version 1.0.6 or later. In other words, this is a bug in cElmentTree, not a bug in Markdown. Even if you are using a version greater than 1.0.6, I'd say this is still a bug in cElementTree. Of course, if anyone has any better suggestions, I'm open to entertain them - and we accept patches. Oh, and regarding your final comment regarding `markdown.isBlockLevel` choking in Comment nodes, I'll patch that. We definitely should support using Comment nodes there. [1]: http://www.freewisdom.org/projects/python-markdown/Tickets/000080 [2]: https://lists.sourceforge.net/mailman/listinfo/python-markdown-discuss [3]: http://effbot.org/zone/elementtree-13-intro.htm#performance Waylan Limberg |
From: Waylan L. <wa...@gm...> - 2010-10-17 15:49:22
|
On Sun, Oct 17, 2010 at 11:11 AM, Waylan Limberg <wa...@gm...> wrote: [snip] > > I'm going to say that that is a limitation of the html4 serializer. If > you are using cElementTree, you must be using version 1.0.6 or later. > In other words, this is a bug in cElmentTree, not a bug in Markdown. > Even if you are using a version greater than 1.0.6, I'd say this is > still a bug in cElementTree. I should point out that anyone can provide there own serializers, so if the above is really not an option, you can always use your own serializer that does what you want. I'm a little hesitant to do so with our markdown.html4 module because that is under a different license than the rest of the project and copyrighted to the author of ElementTree. > Of course, if anyone has any better suggestions, I'm open to entertain > them - and we accept patches. > > Oh, and regarding your final comment regarding `markdown.isBlockLevel` > choking in Comment nodes, I'll patch that. We definitely should > support using Comment nodes there. Hmm, now I'm looking at the code and all isBlockLevel does it take a tag, which is a string. Perhaps elsewhere in the code, we are assuming every ElementTree element has a tag and that is where the exception is being raised. Like in markdown.treeprocessors.PrettifyTreeprocessor. Should we be checking that an element has a tag first and behave differently if not? Oh, and in the most recent development codebase, its at `markdown.util.isBlockLevel`. You can get the code here: http://gitorious.org/python-markdown -- ---- \X/ /-\ `/ |_ /-\ |\| Waylan Limberg |
From: Dmitry T. <dty...@gm...> - 2010-10-18 08:40:59
|
> At first I was a little confused by your report as we don't do > anything with comments - if they exist in the source they are pulled > out as rawhtml and never see ElementTree. Then I realized you must be > using a third party extension which create html comments and inserts > them into the tree. That is certainly something we want to support and > allow you to do. I'm writing my own. You know this annoying bug in one popular browser, when empty block height will be not zero until you put some content inside, even if this content is comment. > However, it appears you are also using the html4 serializer - which we > borrowed (with permission) from the ElementTree 1.3 Preview (still in > alpha since 2007 AFAICT). Any changes to that serializer would ideally > be changes made upstream as well. > In fact, your proposed fix, to also import parts of ElementTree when > using cElementTree feels wrong and defeats the purpose of cElementTree > in the first place. In fact, I see in the notes to ElementTree 1.3 > there is a note [3] that it only works with cElementTree 1.0.6 or > later. What version are you using? 1.0.6 It's not mine idea to import ElementTree.Comment, that is what cElementTree is doing. Problem is in checking that node is Comment. There is no such thing as nodeType like in DOM. So ElementTree putting into tag field link to ElementTree.Comment function and then making conditions like: if node.tag is ElementTree.Comment: Even if node was created using cElementTree implementation node.tag will point to ElementTree.Comment. That is why i was thinking that ElementTree.Comment implementation is called inside. If you look at ElementTree.Comment you'll see why node.tag is ElementTree.Comment: def Comment(text=None): element = Element(Comment) element.text = text return element > Hmm, now I'm looking at the code and all isBlockLevel does it take a > tag, which is a string. Perhaps elsewhere in the code, we are assuming > every ElementTree element has a tag and that is where the exception is > being raised. Like in markdown.treeprocessors.PrettifyTreeprocessor. > Should we be checking that an element has a tag first and behave > differently if not? As you now know node.tag isn't always a string, it can be function! So isBlockLevel is raising TypeError exception. So to fix these bugs the question should be answered: what is the propper way to check that node is Comment? |
From: Waylan L. <wa...@gm...> - 2010-10-29 15:07:24
Attachments:
fix_comment.diff
|
Ok, I think I have a working patch for this. I've attached a diff. Dimitry, any possibility you could test this against your code before I commit it? On Mon, Oct 18, 2010 at 4:40 AM, Dmitry Tyschenko <dty...@gm...> wrote: >> At first I was a little confused by your report as we don't do >> anything with comments - if they exist in the source they are pulled >> out as rawhtml and never see ElementTree. Then I realized you must be >> using a third party extension which create html comments and inserts >> them into the tree. That is certainly something we want to support and >> allow you to do. > > I'm writing my own. You know this annoying bug in one popular browser, > when empty block height will be not zero until you put some content > inside, even if this content is comment. > >> However, it appears you are also using the html4 serializer - which we >> borrowed (with permission) from the ElementTree 1.3 Preview (still in >> alpha since 2007 AFAICT). Any changes to that serializer would ideally >> be changes made upstream as well. > >> In fact, your proposed fix, to also import parts of ElementTree when >> using cElementTree feels wrong and defeats the purpose of cElementTree >> in the first place. In fact, I see in the notes to ElementTree 1.3 >> there is a note [3] that it only works with cElementTree 1.0.6 or >> later. What version are you using? > > 1.0.6 > > It's not mine idea to import ElementTree.Comment, that is what > cElementTree is doing. > > Problem is in checking that node is Comment. There is no such thing as > nodeType like in DOM. So ElementTree putting into tag field link to > ElementTree.Comment function and then making conditions like: > > if node.tag is ElementTree.Comment: > > Even if node was created using cElementTree implementation node.tag > will point to ElementTree.Comment. That is why i was thinking that > ElementTree.Comment implementation is called inside. If you look at > ElementTree.Comment you'll see why node.tag is ElementTree.Comment: > > def Comment(text=None): > element = Element(Comment) > element.text = text > return element > >> Hmm, now I'm looking at the code and all isBlockLevel does it take a >> tag, which is a string. Perhaps elsewhere in the code, we are assuming >> every ElementTree element has a tag and that is where the exception is >> being raised. Like in markdown.treeprocessors.PrettifyTreeprocessor. >> Should we be checking that an element has a tag first and behave >> differently if not? > > As you now know node.tag isn't always a string, it can be function! So > isBlockLevel is raising TypeError exception. > > So to fix these bugs the question should be answered: what is the > propper way to check that node is Comment? > -- ---- \X/ /-\ `/ |_ /-\ |\| Waylan Limberg |
From: Waylan L. <wa...@gm...> - 2010-10-30 01:44:20
|
After a little more testing, I have committed this. The good news is I have added tests, so while we don't use comments within markdown itself, we shouldn't break this in the future. Thanks for the report. The latest is available via git. On Fri, Oct 29, 2010 at 11:06 AM, Waylan Limberg <wa...@gm...> wrote: > Ok, I think I have a working patch for this. I've attached a diff. > Dimitry, any possibility you could test this against your code before > I commit it? > > On Mon, Oct 18, 2010 at 4:40 AM, Dmitry Tyschenko <dty...@gm...> wrote: >>> At first I was a little confused by your report as we don't do >>> anything with comments - if they exist in the source they are pulled >>> out as rawhtml and never see ElementTree. Then I realized you must be >>> using a third party extension which create html comments and inserts >>> them into the tree. That is certainly something we want to support and >>> allow you to do. >> >> I'm writing my own. You know this annoying bug in one popular browser, >> when empty block height will be not zero until you put some content >> inside, even if this content is comment. >> >>> However, it appears you are also using the html4 serializer - which we >>> borrowed (with permission) from the ElementTree 1.3 Preview (still in >>> alpha since 2007 AFAICT). Any changes to that serializer would ideally >>> be changes made upstream as well. >> >>> In fact, your proposed fix, to also import parts of ElementTree when >>> using cElementTree feels wrong and defeats the purpose of cElementTree >>> in the first place. In fact, I see in the notes to ElementTree 1.3 >>> there is a note [3] that it only works with cElementTree 1.0.6 or >>> later. What version are you using? >> >> 1.0.6 >> >> It's not mine idea to import ElementTree.Comment, that is what >> cElementTree is doing. >> >> Problem is in checking that node is Comment. There is no such thing as >> nodeType like in DOM. So ElementTree putting into tag field link to >> ElementTree.Comment function and then making conditions like: >> >> if node.tag is ElementTree.Comment: >> >> Even if node was created using cElementTree implementation node.tag >> will point to ElementTree.Comment. That is why i was thinking that >> ElementTree.Comment implementation is called inside. If you look at >> ElementTree.Comment you'll see why node.tag is ElementTree.Comment: >> >> def Comment(text=None): >> element = Element(Comment) >> element.text = text >> return element >> >>> Hmm, now I'm looking at the code and all isBlockLevel does it take a >>> tag, which is a string. Perhaps elsewhere in the code, we are assuming >>> every ElementTree element has a tag and that is where the exception is >>> being raised. Like in markdown.treeprocessors.PrettifyTreeprocessor. >>> Should we be checking that an element has a tag first and behave >>> differently if not? >> >> As you now know node.tag isn't always a string, it can be function! So >> isBlockLevel is raising TypeError exception. >> >> So to fix these bugs the question should be answered: what is the >> propper way to check that node is Comment? >> > > > > -- > ---- > \X/ /-\ `/ |_ /-\ |\| > Waylan Limberg > -- ---- \X/ /-\ `/ |_ /-\ |\| Waylan Limberg |
From: Dmitry T. <dty...@gm...> - 2010-11-02 16:14:23
|
> Which means when using > cElementTree, we need both cElementTree.Comment and > ElementTree.Comment available. Crazy! :-) > Personally, I consider this a bug in cElementTree (it works with the > builtin serializers because cElementTree actually uses ElementTree's > serializers and they test again ElementTree.Comment) as it makes it > ridiculously difficult to test for comment nodes in our own code when > we are supporting both ElementTree and cElementTree at the same time. I agree with you. It "defeats the purpose of cElementTree in the first place". > However, given the current rate of ElementTree's development right > now, I would expect that a fix would take a long time to propagate to > all our users, so I've just committed what should be a working > workaround. Thanks, now I can remove my workarounds. > Please let me know if you run into any problems with it. No problems discovered. |