Re: [Python-markdown-discuss] GSoC progress

On 4-Jul-08, at 10:05 AM, Artem Yunusov wrote:
> Yuri Takhteyev wrote:
>> Interesting.  It looks like lxml is way way faster than ElementTree.
>> Also, the website for lxml seems to suggest that ElementTree has some
>> serious problems in handling unicode
>> (http://codespeak.net/lxml/compatibility.html, third bullet).  This
>> really worries me, more so than performance.  This may not affect us,
>> but we need to make sure that ElementTree can handle unicode properly
>> if we would be using it.  However, it looks like lxml is included with
>> nothing at this point, and would require building stuff from C, which
>> may raise the bar for using markdown...
> lxml supports ElementTree API, so we could write something like this:
> try:
>   from lxml import etree
>   print "running with lxml.etree"
> ...
>        except ImportError:
>           print "Failed to import ElementTree from any known place"
>
> We can suggest to use lxml, but by default cElementTree will be used on
> python 2.5
I'd agree that this is the best way to go.
From what I've read and heard, lxml is faster and better, but it's also
not in the standard library, and I went through hell trying to install
it about a month ago (I don't think I succeeded, either...)
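
For what it's worth, the fallback Artem describes could be sketched roughly like this. The helper name load_etree and the exact candidate list are my own assumptions for illustration, not anything in the markdown code; it simply walks the modules in order of preference and returns the first one that imports.

```python
import importlib

# Candidate modules in order of preference. This ordering is an assumption
# made for illustration: lxml first, then the stdlib and standalone
# ElementTree variants.
CANDIDATES = (
    "lxml.etree",
    "xml.etree.cElementTree",
    "xml.etree.ElementTree",
    "cElementTree",
    "elementtree.ElementTree",
)


def load_etree(candidates=CANDIDATES):
    """Return (module, name) for the first importable ElementTree implementation."""
    for name in candidates:
        try:
            return importlib.import_module(name), name
        except ImportError:
            # Not installed (or not available on this Python); try the next one.
            continue
    raise ImportError("Failed to import ElementTree from any known place")


etree, etree_name = load_etree()
```

The rest of the code would then use only the shared ElementTree API (etree.Element, etree.SubElement, etree.tostring and so on), so lxml stays an optional speed-up rather than a requirement.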

Also, as for ElementTree's handling of Unicode: in the twelve
months I was working on DrProject, which uses ElementTree for all
sorts of things, I can't remember any problems giving it Unicode (and
part of my work was getting 100% Unicode support).
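
If we want something more than my memory to go on, a quick round-trip check along these lines might do. This is only a sketch: the function name round_trips and the sample strings are made up, and it exercises the standard-library xml.etree.ElementTree rather than anything markdown-specific.

```python
import xml.etree.ElementTree as ET


def round_trips(text, attr_value):
    """Serialize an element holding Unicode text and an attribute, then re-parse it."""
    root = ET.Element("p")
    root.text = text
    root.set("title", attr_value)

    # Serialize to UTF-8 bytes, then parse those bytes back into an element.
    data = ET.tostring(root, encoding="utf-8")
    parsed = ET.fromstring(data)
    return parsed.text == text and parsed.get("title") == attr_value


# Sample strings (hypothetical test data): Cyrillic, accented Latin, and CJK.
SAMPLE_TEXT = u"\u041f\u0440\u0438\u0432\u0435\u0442, caf\u00e9 \u65e5\u672c\u8a9e"
SAMPLE_ATTR = u"\u00dcn\u00efc\u00f6d\u00e9"

ok = round_trips(SAMPLE_TEXT, SAMPLE_ATTR)
```

If ok comes back False for some input, that input would be worth posting to the list before we commit to ElementTree.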