Re: [Python-markdown-discuss] GSoC progress

Yuri Takhteyev wrote:
> Interesting.  It looks like lxml is way way faster than ElementTree.
> Also, the website for lxml seems to suggest that ElementTree has some
> serious problems in handling unicode
> (http://codespeak.net/lxml/compatibility.html, third bullet).  This
> really worries me, more so than performance.  This may not affect us,
> but we need to make sure that ElementTree can handle unicode properly
> if we would be using it.  However, it looks like lxml is included with
> nothing at this point, and would require building stuff from C, which
> may raise the bar for using markdown...
>   
lxml supports the ElementTree API, so we could write something like this:

try:
  # lxml, if it is installed
  from lxml import etree
  print("running with lxml.etree")
except ImportError:
  try:
    # Python 2.5+: C implementation from the standard library
    import xml.etree.cElementTree as etree
    print("running with cElementTree on Python 2.5+")
  except ImportError:
    try:
      # Python 2.5+: pure-Python implementation from the standard library
      import xml.etree.ElementTree as etree
      print("running with ElementTree on Python 2.5+")
    except ImportError:
      try:
        # normal cElementTree install
        import cElementTree as etree
        print("running with cElementTree")
      except ImportError:
        try:
          # normal ElementTree install
          import elementtree.ElementTree as etree
          print("running with ElementTree")
        except ImportError:
          print("Failed to import ElementTree from any known place")

We can suggest using lxml, but by default cElementTree will be used on 
Python 2.5.
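
To illustrate why the choice of backend should not matter to the rest of
the code, here is a minimal sketch that sticks to the common subset of
the API (Element, SubElement, tostring). The render_paragraph helper is
hypothetical, not something that exists in markdown, and the fallback
is shortened to two levels to keep the example small:

```python
try:
    # Preferred backend, if it happens to be installed
    from lxml import etree
except ImportError:
    # Standard-library fallback (Python 2.5 and later)
    import xml.etree.ElementTree as etree


def render_paragraph(text):
    """Build <p><em>text</em></p> and return it serialized.

    Hypothetical helper for illustration only; it uses calls that
    both lxml.etree and ElementTree provide.
    """
    p = etree.Element("p")
    em = etree.SubElement(p, "em")
    em.text = text
    return etree.tostring(p)


print(render_paragraph("hi"))
```

Anything outside this shared subset (lxml-only features such as XPath
extensions, for example) would tie us to one backend, so it is probably
best avoided.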
I didn't understand what the real problem with unicode is; the lxml 
site only describes it in general terms, and I think that if the problem 
were really serious, ElementTree wouldn't have been included in the 
standard Python library.
I ran some tests with Russian unicode data and haven't found any problems 
yet, but I think this issue needs more thorough investigation.
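
A check of this kind could look roughly like the sketch below. It is
only an assumption about what is worth testing: the roundtrip helper
and the sample string are made up for illustration, it uses the
standard-library ElementTree, and it covers just element text and one
attribute value, not every case mentioned on the lxml site.

```python
# -*- coding: utf-8 -*-
import xml.etree.ElementTree as etree


def roundtrip(text):
    """Serialize an element carrying `text` to UTF-8 and parse it back.

    Returns the element text and the attribute value as read from the
    re-parsed document, so the caller can compare them to the input.
    """
    root = etree.Element("p")
    root.text = text
    root.set("title", text)
    data = etree.tostring(root, encoding="utf-8")
    parsed = etree.fromstring(data)
    return parsed.text, parsed.get("title")


sample = u"Привет, мир"
text, title = roundtrip(sample)
if text == sample and title == sample:
    print("unicode round trip OK")
else:
    print("unicode round trip FAILED")
```

Passing this says nothing about mixing byte strings and unicode strings
in the same tree, which would need separate tests.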