From: Artem Y. <se...@sp...> - 2009-01-11 00:32:59
On Sat, Jan 10, 2009 at 11:27 PM, Waylan Limberg <wa...@gm...> wrote:
> On Sat, Jan 10, 2009 at 7:32 AM, Eric Abrahamsen <gi...@gm...> wrote:
> >> We had talked at one point of having markdown import
> >> lxml rather than ElementTree if it was available. Don't remember why
> >> we decided not to. The list archives would answer that. However, if
> >> you could provide a patch that works - I'll likely commit it.
> >
> > Looking through the mailing list archives, it looks like things stopped at
> > "yes, that would be a good idea". As far as I can tell there wasn't any
> > further action. I tried adding lxml to the import cascade in
> > etree_loader.py, and it imports okay, but fails tests (I can provide more
> > details if necessary).
>
> Yeah, I seem to recall there being some differences between the two. I
> wasn't the one who wrote that code, so the details aren't as clear to
> me. Perhaps the final decision was made by the core devs off list.

As far as I remember, there were some problems with several tests, and no
great performance boost compared with cElementTree (only 4%), so we decided
to go with the standard cElementTree/ElementTree.

Eric, if you want HTML output you can use ElementTree 1.3 [1]:

    tree.write("out.html", method="html")

[1]: http://effbot.org/zone/elementtree-13-intro.htm

> >
> > Then I realized that the lxml etree implementation is
> > essentially unrelated to the lxml.html.xhtml_to_html function. Before
> > messing with things further, I want to make sure this is the right way: I'm
> > thinking of adding a "html" keyword argument to the Markdown class
> > definition. If it's set to true we try to import lxml.html.xhtml_to_html. If
> > that fails we log a warning and then ignore it. If it succeeds, run
> > xhtml_to_html right after the treeprocessors (I guess, would have to
> > experiment). Does this seem generally sound?
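
To make the ElementTree 1.3 suggestion concrete, here is a minimal sketch. It assumes a Python whose standard-library xml.etree.ElementTree already includes the 1.3 API, and it serializes to a string rather than to "out.html" so the difference is easy to see. The tiny tree is made up for illustration; it is not the tree Markdown builds internally.

```python
import xml.etree.ElementTree as ET

# Hand-built example tree (illustrative only): <p>line one<br/>line two</p>
root = ET.Element("p")
root.text = "line one"
br = ET.SubElement(root, "br")
br.tail = "line two"

# Default XML serialization keeps the XHTML-style empty tag.
as_xml = ET.tostring(root, encoding="unicode", method="xml")

# method="html" writes HTML void elements such as <br> without " />".
as_html = ET.tostring(root, encoding="unicode", method="html")

print(as_xml)   # <p>line one<br />line two</p>
print(as_html)  # <p>line one<br>line two</p>
```

Writing to a file works the same way: wrap the root in ET.ElementTree(root) and call write("out.html", method="html") on it.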
>
> The general idea is good, except that all xhtml_to_html does (afaict)
> is remove the xml namespace for each element - which we never set to
> begin with. What we need is something to convert from ``<br />`` to
> ``<br>`` and the like. The only way I'm aware of would be to build up
> the tree with lxml.html from the start. But that would require
> everyone have lxml installed or we maintain 2 versions of the code.
> Neither is a practical option.
>
> However, I would love to be proved wrong on that assessment.
>
> As an aside, for a quick-fix, you could write a postprocessor which
> simply does something like ``text.replace(" />", ">")``. However,
> there are a few edge cases where that won't quite cut it. Therefore,
> we don't offer it as a builtin option. However, it may be good enough
> for many people's needs.
>
> --
> ----
> Waylan Limberg
> wa...@gm...
>
> _______________________________________________
> Python-markdown-discuss mailing list
> Pyt...@li...
> https://lists.sourceforge.net/lists/listinfo/python-markdown-discuss