From: David W. <wo...@cs...> - 2008-07-04 15:05:01
On 4-Jul-08, at 10:05 AM, Artem Yunusov wrote:

> Yuri Takhteyev wrote:
>> Interesting. It looks like lxml is way way faster than ElementTree.
>> Also, the website for lxml seems to suggest that ElementTree has some
>> serious problems in handling unicode
>> (http://codespeak.net/lxml/compatibility.html, third bullet). This
>> really worries me, more so than performance. This may not affect us,
>> but we need to make sure that ElementTree can handle unicode properly
>> if we would be using it. However, it looks like lxml is included with
>> nothing at this point, and would require building stuff from C, which
>> may raise the bar for using markdown...
>
> lxml supports ElementTree API, so we could write something like this:
>
> try:
>     from lxml import etree
>     print "running with lxml.etree"
>     ...
> except ImportError:
>     print "Failed to import ElementTree from any known place"
>
> We can suggest to use lxml, but by default cElementTree will be used on
> python 2.5

I'd agree that this is the best way to go. From what I've read and
heard, lxml is faster/better, but it's also not standard and I went
through hell trying to install it about a month ago (I don't think I
succeeded either...)

Also, as far as ElementTree's handling of Unicode: in the twelve months
I was working on DrProject, which uses ElementTree for all sorts of
things, I can't remember any problems giving it Unicode (and part of my
work was getting 100% Unicode support).
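
For anyone who wants to try the fallback idea and the Unicode question together, here is a minimal sketch in present-day Python. It is not the code from the thread: the helper names `load_etree` and `unicode_round_trip`, the candidate order, and the sample string are all assumptions made for illustration. It tries lxml first, then the standard-library modules, and checks that non-ASCII text survives a serialize/parse round trip in whichever implementation was found.

```python
import importlib

# Candidate modules in order of preference. The ordering is an assumption
# based on the thread: lxml if installed, otherwise the stdlib versions.
CANDIDATES = (
    "lxml.etree",
    "xml.etree.cElementTree",  # only exists on Python 2.5 through 3.8
    "xml.etree.ElementTree",
)


def load_etree(candidates=CANDIDATES):
    """Return (module, name) for the first importable ElementTree API.

    Hypothetical helper, not part of markdown, lxml, or the stdlib.
    """
    for name in candidates:
        try:
            return importlib.import_module(name), name
        except ImportError:
            continue
    raise ImportError("Failed to import ElementTree from any known place")


def unicode_round_trip(etree, text):
    """Serialize an element holding `text` as UTF-8 and parse it back."""
    element = etree.Element("p")
    element.text = text
    data = etree.tostring(element, encoding="utf-8")
    return etree.fromstring(data).text


etree, etree_name = load_etree()
sample = u"na\u00efve \u65e5\u672c\u8a9e"  # arbitrary non-ASCII sample
print("running with %s" % etree_name)
print(unicode_round_trip(etree, sample) == sample)
```

Looping over module names keeps the fallback order in one place, so suggesting lxml while defaulting to the bundled implementation only means reordering `CANDIDATES`. A single round trip like this is a smoke test, not proof of full Unicode support.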