From: Artem Y. <ne...@gm...> - 2008-07-07 22:25:30
I added ElementTree support. The results: cElementTree is ~13% faster than NanoDOM and uses 4.5 times less memory. lxml is ~4% faster than cElementTree, but cElementTree wins on memory usage (two times less). ElementTree is a little bit faster than NanoDOM, and ElementTree also wins on memory usage (2.5 times less).

Concerning the html/xhtml output, I discovered that this option is supported only by newer versions of ElementTree (1.3) and lxml (2.0), so it won't be available for now on the standard Python 2.5 ElementTree. Maybe we can make it optional.

There is one problem with lxml: the misc/boldlinks test causes this error:

  File "etree.pyx", line 693, in etree._Element.text.__set__
  File "apihelpers.pxi", line 344, in etree._setNodeText
  File "apihelpers.pxi", line 648, in etree._utf8
  AssertionError: All strings must be XML compatible, either Unicode or ASCII

I suppose that is because in this test we try to assign data that contains placeholders to el.text, and for some reason lxml treats the placeholder values (u'\u0001' and u'\u0002') as neither Unicode nor ASCII.

New markdown with ElementTree:

construction:0.000000:0.000000
amps-and-angle-encoding:0.070000:0.000000
auto-links:0.080000:0.000000
backlash-escapes:0.200000:0.000000
blockquotes-with-dode-blocks:0.030000:0.000000
hard-wrapped:0.010000:0.000000
horizontal-rules:0.160000:0.000000
inline-html-advanced:0.030000:0.000000
inline-html-comments:0.030000:0.000000
inline-html-simple:0.140000:0.000000
links-inline:0.050000:0.000000
links-reference:0.070000:0.000000
literal-quotes:0.030000:0.000000
markdown-documentation-basics:0.440000:0.000000
markdown-syntax:1.980000:1908736.000000
nested-blockquotes:0.030000:0.000000
ordered-and-unordered-list:0.310000:0.000000
strong-and-em-together:0.040000:0.000000
tabs:0.040000:0.000000
tidyness:0.040000:0.000000

New markdown with cElementTree:

construction:0.000000:0.000000
amps-and-angle-encoding:0.050000:135168.000000
auto-links:0.070000:0.000000
backlash-escapes:0.190000:0.000000
blockquotes-with-dode-blocks:0.020000:0.000000
hard-wrapped:0.020000:0.000000
horizontal-rules:0.140000:0.000000
inline-html-advanced:0.020000:0.000000
inline-html-comments:0.030000:0.000000
inline-html-simple:0.120000:0.000000
links-inline:0.050000:0.000000
links-reference:0.060000:0.000000
literal-quotes:0.020000:0.000000
markdown-documentation-basics:0.410000:274432.000000
markdown-syntax:1.810000:1138688.000000
nested-blockquotes:0.020000:0.000000
ordered-and-unordered-list:0.260000:0.000000
strong-and-em-together:0.030000:0.000000
tabs:0.040000:0.000000
tidyness:0.030000:0.000000

New markdown with lxml:

construction:0.000000:0.000000
amps-and-angle-encoding:0.060000:0.000000
auto-links:0.070000:147456.000000
backlash-escapes:0.170000:135168.000000
blockquotes-with-dode-blocks:0.020000:0.000000
hard-wrapped:0.010000:0.000000
horizontal-rules:0.140000:0.000000
inline-html-advanced:0.030000:0.000000
inline-html-comments:0.030000:0.000000
inline-html-simple:0.120000:0.000000
links-inline:0.060000:0.000000
links-reference:0.080000:0.000000
literal-quotes:0.030000:0.000000
markdown-documentation-basics:0.370000:450560.000000
markdown-syntax:1.750000:2011136.000000
nested-blockquotes:0.020000:0.000000
ordered-and-unordered-list:0.250000:0.000000
strong-and-em-together:0.030000:0.000000
tabs:0.040000:0.000000
tidyness:0.030000:0.000000

New markdown with NanoDOM:

construction:0.000000:0.000000
amps-and-angle-encoding:0.060000:0.000000
auto-links:0.070000:0.000000
backlash-escapes:0.220000:135168.000000
blockquotes-with-dode-blocks:0.020000:0.000000
hard-wrapped:0.020000:0.000000
horizontal-rules:0.150000:0.000000
inline-html-advanced:0.030000:0.000000
inline-html-comments:0.030000:0.000000
inline-html-simple:0.140000:0.000000
links-inline:0.050000:0.000000
links-reference:0.080000:0.000000
literal-quotes:0.030000:0.000000
markdown-documentation-basics:0.450000:868352.000000
markdown-syntax:2.080000:5160960.000000
nested-blockquotes:0.020000:0.000000
ordered-and-unordered-list:0.290000:0.000000
strong-and-em-together:0.030000:0.000000
tabs:0.040000:0.000000
tidyness:0.030000:0.000000
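[For what it's worth, the lxml failure can be reproduced outside markdown; a hypothetical minimal repro (not the actual test), assuming the cause is that u'\u0001' and u'\u0002' are control characters that XML 1.0 does not allow:

    # Hypothetical minimal reproduction: lxml validates assigned text,
    # and u'\u0001'/u'\u0002' are not legal XML 1.0 characters, so the
    # assignment is rejected before serialization ever happens.
    from lxml import etree

    el = etree.Element("p")
    try:
        el.text = u"placeholder: \u00010\u0002"
    except (AssertionError, ValueError), err:
        # lxml 2.0 raised AssertionError here; later versions use ValueError
        print "rejected:", err

Plain ElementTree, by contrast, accepts the string at assignment time, which would explain why only the lxml backend trips over the placeholders.]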
From: Yuri T. <qar...@gm...> - 2008-07-08 05:44:09
> cElementTree is ~13% faster than NanoDOM and uses 4.5 times less memory.
> lxml is ~4% faster than cElementTree, but cElementTree wins on memory
> usage (two times less). ElementTree is a little bit faster than NanoDOM,
> and ElementTree also wins on memory usage (2.5 times less).

That's good!

> Concerning the html/xhtml output, I discovered that this option is
> supported only by newer versions of ElementTree (1.3) and lxml (2.0), so
> it won't be available for now on the standard Python 2.5 ElementTree.
> Maybe we can make it optional.

Again, I wouldn't worry too much about this. If someone wants HTML output, converting XHTML to HTML4 should be easy enough.

> There is one problem with lxml: the misc/boldlinks test causes this error:
>
>   File "etree.pyx", line 693, in etree._Element.text.__set__
>   File "apihelpers.pxi", line 344, in etree._setNodeText
>   File "apihelpers.pxi", line 648, in etree._utf8
>   AssertionError: All strings must be XML compatible, either Unicode or ASCII
>
> I suppose that is because in this test we try to assign data that
> contains placeholders to el.text, and for some reason lxml treats the
> placeholder values (u'\u0001' and u'\u0002') as neither Unicode nor ASCII.

We could re-think our choice of placeholders if we know that this is the reason. But it sounds like ElementTree is the way to go.

A few minor things. The current version in git fails on non-ASCII files (e.g., tests/misc/russian.txt). That's because we end up encoding the content too early: line 1889 writes the etree to xml, utf8-encoded, after which we try to run textPostProcessors on it. That's not good. This seems to fix it:

    xml = codecs.decode(etree.tostring(root, encoding="utf8"), "utf8")

(I am assuming that standard etree doesn't have an option of serializing to non-encoded unicode. If it does, use that instead.)

Note that in my experience there is only one way to get Unicode right with Python: assume that all strings are unicode. So, for this reason, I've been following the policy of decoding data when it comes into my world and encoding it only when it goes out, without _ever_ passing encoded strings around. Encoded strings are evil.

Another thing: lots of tests seem to fail now because of whitespace differences. I am guessing that the way to solve it is to first extend test-markdown.py to add an option of reflowing XHTML before diffing. Then, once we know that all tests pass except for whitespace differences, we can change the expected output.

- yuri

--
http://sputnik.freewisdom.org/
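[For concreteness, a sketch of how that decode-early, encode-late policy fits around the serialization step; the function shape and the textPostProcessors interface are simplified assumptions, not the actual markdown.py code:

    # Sketch under assumptions: 'root' is the document tree and each
    # post-processor exposes run(text); both names are illustrative.
    import codecs
    import xml.etree.cElementTree as etree

    def serialize(root, text_postprocessors):
        # tostring() returns utf8-encoded bytes; decode immediately so
        # only unicode strings ever reach the post-processors
        xml = codecs.decode(etree.tostring(root, encoding="utf8"), "utf8")
        for pp in text_postprocessors:
            xml = pp.run(xml)
        return xml  # still unicode; encode once, at the output boundary
]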
From: Artem Y. <ne...@gm...> - 2008-07-09 00:59:12
Yuri Takhteyev wrote:
> We could re-think our choice of placeholders if we know that this is
> the reason. But it sounds like ElementTree is the way to go.

Yes, I agree.

> A few minor things. The current version in git fails on non-ASCII
> files (e.g., tests/misc/russian.txt). That's because we end up
> encoding the content too early: line 1889 writes the etree to xml,
> utf8-encoded, after which we try to run textPostProcessors on it.
> That's not good. This seems to fix it:
>
>     xml = codecs.decode(etree.tostring(root, encoding="utf8"), "utf8")

Strange, I didn't notice it on my version. Thanks for the fix.

> (I am assuming that standard etree doesn't have an option of
> serializing to non-encoded unicode. If it does, use that instead.)
>
> Note that in my experience there is only one way to get Unicode right
> with Python: assume that all strings are unicode. So, for this
> reason, I've been following the policy of decoding data when it comes
> into my world and encoding it only when it goes out, without _ever_
> passing encoded strings around. Encoded strings are evil.

Thanks for the advice.

> Another thing: lots of tests seem to fail now because of whitespace
> differences. I am guessing that the way to solve it is to first
> extend test-markdown.py to add an option of reflowing XHTML before
> diffing. Then, once we know that all tests pass except for whitespace
> differences, we can change the expected output.

Maybe we should worry about whitespace right away, since we'll need to fix the failing tests anyway.
From: Blake W. <bw...@la...> - 2008-07-08 11:40:24
Yuri Takhteyev wrote:
>> Concerning the html/xhtml output, I discovered that this option is
>> supported only by newer versions of ElementTree (1.3) and lxml (2.0),
>> so it won't be available for now on the standard Python 2.5
>> ElementTree. Maybe we can make it optional.
>
> Again, I wouldn't worry too much about this. If someone wants HTML
> output, converting XHTML to HTML4 should be easy enough.

Is this still true if you have inline not-necessarily-legal-XML blocks? (i.e. will it still be easy to convert:

    **Foo**
    <br>
    blah blah blah
    'bar'

?)

>> There is one problem with lxml: the misc/boldlinks test causes this error:
>>
>>   File "etree.pyx", line 693, in etree._Element.text.__set__
>>   File "apihelpers.pxi", line 344, in etree._setNodeText
>>   File "apihelpers.pxi", line 648, in etree._utf8
>>   AssertionError: All strings must be XML compatible, either Unicode or ASCII
>>
>> I suppose that is because in this test we try to assign data that
>> contains placeholders to el.text, and for some reason lxml treats the
>> placeholder values (u'\u0001' and u'\u0002') as neither Unicode nor ASCII.
>
> We could re-think our choice of placeholders if we know that this is
> the reason. But it sounds like ElementTree is the way to go.

What if we went with the BOM character (0xFEFF) as the replacement? It's legal unicode, and _extremely_ unlikely to occur in the middle of text. The only thing to watch out for is having it occur at the start of the file.

Later,
Blake.
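[A rough sketch of what the BOM scheme could look like; the stash structure and both function names are made up for illustration:

    # Illustrative only: hide raw snippets behind BOM-delimited markers.
    # The one caveat noted above: strip a file-level BOM before stashing.
    BOM = u'\ufeff'

    def stash(snippets, text):
        if text.startswith(BOM):
            text = text[1:]  # a leading BOM is byte-order metadata, drop it
        for i, snippet in enumerate(snippets):
            text = text.replace(snippet, u"%s%d%s" % (BOM, i, BOM))
        return text

    def unstash(snippets, text):
        # swap each BOM<i>BOM marker back for the stashed snippet
        for i, snippet in enumerate(snippets):
            text = text.replace(u"%s%d%s" % (BOM, i, BOM), snippet)
        return text
]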
From: Waylan L. <wa...@gm...> - 2008-07-08 13:52:13
On Tue, Jul 8, 2008 at 1:44 AM, Yuri Takhteyev <qar...@gm...> wrote:
> But it sounds like ElementTree is the way to go.

I agree. It doesn't appear that lxml adds any real value. Add in the trouble of installing it, and I doubt many would ever use it. I'd say leave it out for now. If things improve in the future, it won't be that hard to add it back in.

> Another thing: lots of tests seem to fail now because of whitespace
> differences. I am guessing that the way to solve it is to first
> extend test-markdown.py to add an option of reflowing XHTML before
> diffing. Then, once we know that all tests pass except for whitespace
> differences, we can change the expected output.

Interestingly, I had started work on this at some point, but never got very far. My intended approach was to feed the output and the expected output both into an x/html parser, normalize whitespace, and then diff the output of each. Thing is, I couldn't find a python tool that actually did that. Well, there always is BeautifulSoup, but that could very easily alter some of the html and hide bugs - defeating the purpose of testing. Considering that whitespace is insignificant in x/html and the number of x/html tools available in python, you'd think whitespace normalization would be a standard feature. Ah well.

I thought about doing a simple whitespace normalization on the string using string.replace or re.sub. But then we'd lose all linebreaks, so that the entire doc is on one line. That's kind of hard to diff. Looping through a dom and normalizing each string was more than I wanted to do.

I then found lxml's htmldiff tool [1], which provided an easy (better??) way to compare html docs, but it still hung up on some (not all) whitespace. Additionally, it didn't exactly provide an easily readable output to display in the test output. If you're interested, I can forward the code I have - that is, if I can find it.

What I'd consider doing is actually taking the most recent markdown with NanoDom, altering NanoDom's whitespace to match ET, and running a little script that loops through all the tests and outputs new expected html files. It shouldn't be all that hard.

[1]: http://codespeak.net/lxml/lxmlhtml.html#html-diff

--
----
Waylan Limberg
wa...@gm...
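[For what it's worth, the simple re.sub approach can keep the diff readable if the collapse step re-breaks lines between adjacent tags; a sketch, not tested against the actual suite:

    import re

    def normalize(html):
        # collapse every run of whitespace to a single space...
        html = re.sub(r'\s+', ' ', html.strip())
        # ...then restore a linebreak between adjacent tags so the
        # whole document doesn't end up on one line
        return re.sub(r'>\s+<', '>\n<', html)

Since both the actual and expected output go through the same normalization before diffing, the inter-tag whitespace this discards doesn't affect the comparison.]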
From: Yuri T. <qar...@gm...> - 2008-07-08 16:40:23
> Is this still true if you have inline not-necessarily-legal-XML blocks?
> (i.e. will it still be easy to convert:
>
>     **Foo**
>     <br>
>     blah blah blah
>     'bar'
>
> ?)

I meant a simple RE-based substitution. Correct me if I am wrong, but converting XHTML into HTML largely involves changing <$x/> and <$x></$x> to <$x> for certain values of $x.

> What if we went with the BOM character (0xFEFF) as the replacement?
> It's legal unicode, and _extremely_ unlikely to occur in the middle of
> text. The only thing to watch out for is having it occur at the start
> of the file.

First, my original intention was to use not \u0001 and \u0002 but rather \u0002 and \u0003 - "start of text" (STX) and "end of text" (ETX). The nice thing about them is that they come as a pair - start and end. Also, if we use BOM we'll have to worry about HTML, etc. occurring at the beginning of the text. But this is an option to keep in mind. Alternatively, we can look into the private-use ranges, though then we have to make sure that our use does not conflict with possible private uses by the caller.

- yuri

--
http://sputnik.freewisdom.org/
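[That substitution might look roughly like this, ignoring attributes for brevity; the tag list is an assumption and far from exhaustive:

    import re

    VOID_TAGS = 'br|hr'  # assumed subset; img, input, etc. need attribute handling

    def xhtml_to_html(text):
        # <br /> or <br/>  ->  <br>
        text = re.sub(r'<(%s)\s*/>' % VOID_TAGS, r'<\1>', text)
        # <br></br>  ->  <br>
        text = re.sub(r'<(%s)></\1>' % VOID_TAGS, r'<\1>', text)
        return text
]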
From: Waylan L. <wa...@gm...> - 2008-07-08 17:52:55
On Tue, Jul 8, 2008 at 12:40 PM, Yuri Takhteyev <qar...@gm...> wrote:
>> Is this still true if you have inline not-necessarily-legal-XML blocks?
>> (i.e. will it still be easy to convert:
>>
>>     **Foo**
>>     <br>
>>     blah blah blah
>>     'bar'
>>
>> ?)
>
> I meant a simple RE-based substitution. Correct me if I am wrong, but
> converting XHTML into HTML largely involves changing <$x/> and
> <$x></$x> to <$x> for certain values of $x.

Yeah, that *should* cover the basics. Of course, anyone could always pass Markdown's output into uTidylib [1] or ElementTree Tidy [2] if they want a solid conversion. Unfortunately, it will likely slow things down too much to offer that option in Markdown directly. However, it may not be a bad idea to have an extension for those who want it.

Hmm, now to get back on-subject - I wonder if either of those tools will do whitespace normalization only, without making any other changes to the output. It's worth exploring for the tests.

[1]: http://utidylib.berlios.de/
[2]: http://effbot.org/zone/element-tidylib.htm

> What if we went with the BOM character (0xFEFF) as the replacement?
> It's legal unicode, and _extremely_ unlikely to occur in the middle of
> text. The only thing to watch out for is having it occur at the start
> of the file.
>
> First, my original intention was to use not \u0001 and \u0002 but
> rather \u0002 and \u0003 - "start of text" (STX) and "end of text"
> (ETX). The nice thing about them is that they come as a pair - start
> and end. Also, if we use BOM we'll have to worry about HTML, etc.
> occurring at the beginning of the text. But this is an option to keep
> in mind. Alternatively, we can look into the private-use ranges, though
> then we have to make sure that our use does not conflict with possible
> private uses by the caller.

--
----
Waylan Limberg
wa...@gm...
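[If someone does take the tidy route, uTidylib's entry point is roughly the following; the exact option names are an assumption based on tidy's config keys:

    # Sketch: run markdown output through HTML Tidy via uTidylib.
    import tidy

    html = "<p>Some <em>markdown</em> output<br /></p>"  # stand-in for real output
    fixed = tidy.parseString(html, output_xhtml=1, indent=1, tidy_mark=0)
    print str(fixed)
]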
From: Artem Y. <ne...@gm...> - 2008-07-09 00:59:17
Waylan Limberg wrote:
> I then found lxml's htmldiff tool [1], which provided an easy
> (better??) way to compare html docs, but it still hung up on some (not
> all) whitespace. Additionally, it didn't exactly provide an easily
> readable output to display in the test output. If you're interested, I
> can forward the code I have - that is, if I can find it.

Yep, it would be interesting.

> What I'd consider doing is actually taking the most recent markdown
> with NanoDom, altering NanoDom's whitespace to match ET, and running a
> little script that loops through all the tests and outputs new
> expected html files. It shouldn't be all that hard.

Yes, for now that seems like a reasonable solution. Also, ET doesn't do any output indentation, so I wrote a function that does some indentation for ET. Another solution would be to tune this function to match the previous markdown output. I also tried loading the data from the tests' html files into ET and then serializing it, but there were some issues and I didn't succeed.
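[Such an indentation helper is likely close to the well-known effbot recipe; this is that recipe as a sketch, not Artem's actual function:

    def indent(elem, level=0):
        # pretty-print by rewriting .text/.tail whitespace in place
        i = "\n" + level * "  "
        if len(elem):
            if not elem.text or not elem.text.strip():
                elem.text = i + "  "
            if not elem.tail or not elem.tail.strip():
                elem.tail = i
            for child in elem:
                indent(child, level + 1)
            # the last child's tail closes out the parent's indentation
            if not child.tail or not child.tail.strip():
                child.tail = i
        else:
            if level and (not elem.tail or not elem.tail.strip()):
                elem.tail = i
]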
From: Waylan L. <wa...@gm...> - 2008-07-09 01:52:07
Attachments:
differ.py
On Tue, Jul 8, 2008 at 9:00 PM, Artem Yunusov <ne...@gm...> wrote:
> Waylan Limberg wrote:
>> I then found lxml's htmldiff tool [1], which provided an easy
>> (better??) way to compare html docs, but it still hung up on some (not
>> all) whitespace. Additionally, it didn't exactly provide an easily
>> readable output to display in the test output. If you're interested, I
>> can forward the code I have - that is, if I can find it.
>
> Yep, it would be interesting.

Hmm, all I can find is a very simple little script that uses xmldiff [1]. I doubt this is very useful, but I've attached it anyway.

[1]: http://www.logilab.org/859

--
----
Waylan Limberg
wa...@gm...