Re: [Python-markdown-discuss] GSoC ElementTree support

Yuri Takhteyev wrote:
>> Concerning the HTML/XHTML output, I discovered that this option is
>> supported only by newer versions of ElementTree (1.3) and lxml (2.0), so it
>> won't be available for now in the standard Python 2.5 ElementTree. Maybe we
>> can make it optional.
> Again, I wouldn't worry too much about this.  If someone wants HTML
> output, converting XHTML to HTML4 should be easy enough.

Is this still true if you have inline not-necessarily-legal-XML blocks?
(i.e. will it still be easy to convert:
**Foo**
<br>
blah blah blah
'bar'
?)

>> There is one problem with lxml: the misc/boldlinks test causes this error:
>>  File "etree.pyx", line 693, in etree._Element.text.__set__
>>  File "apihelpers.pxi", line 344, in etree._setNodeText
>>  File "apihelpers.pxi", line 648, in etree._utf8
>> AssertionError: All strings must be XML compatible, either Unicode or ASCII
>>
>> I suppose that is because in this test we are trying to assign data that
>> contains placeholders to el.text, and maybe for some reason lxml treats the
>> placeholder values (u'\u0001' and u'\u0002') as neither Unicode nor ASCII.
> We could re-think our choice of placeholders if we know that this is
> the reason.  But it sounds like elementTree is the way to go.
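
My guess at the reason, for what it's worth: XML 1.0 only allows tab, LF, CR
and characters from U+0020 upwards (minus surrogates and U+FFFE/U+FFFF), so
U+0001 and U+0002 are not legal XML characters at all, and lxml is probably
refusing them for that reason despite what the message says. A quick sketch
of that check, where is_xml_char is a made-up helper and not part of
ElementTree or lxml:

```python
# Sketch only: test characters against the XML 1.0 "Char" production.
# is_xml_char is a hypothetical helper, not an ElementTree or lxml API.

def is_xml_char(ch):
    """Return True if the single character ch is allowed in XML 1.0."""
    cp = ord(ch)
    return (
        cp in (0x9, 0xA, 0xD)          # tab, line feed, carriage return
        or 0x20 <= cp <= 0xD7FF        # everything below the surrogates
        or 0xE000 <= cp <= 0xFFFD      # private use up to U+FFFD
        or 0x10000 <= cp <= 0x10FFFF   # supplementary planes
    )

# The two current placeholder characters, plus an ordinary letter.
for ch in (u"\u0001", u"\u0002", u"a"):
    print("U+%04X legal in XML 1.0: %s" % (ord(ch), is_xml_char(ch)))
```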

What if we went with the BOM character (0xFEFF) as the replacement?
It's legal Unicode, and _extremely_ unlikely to occur in the middle of
text.  The only thing to watch out for is having it occur at the start
of the file.
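
Something like the following is what I have in mind. It is only a sketch: the
function names and the placeholder format are invented for illustration and
are not what markdown.py does today. U+FEFF falls inside the U+E000 to U+FFFD
range that XML 1.0 permits, so it should not trip the same check.

```python
# Sketch only: using U+FEFF as the placeholder delimiter.
# strip_leading_bom, make_placeholder and the placeholder format are
# hypothetical names chosen for this example.

BOM = u"\ufeff"

def strip_leading_bom(text):
    """Drop a BOM at the very start of the input so it can't be mistaken
    for a placeholder delimiter later on."""
    if text.startswith(BOM):
        return text[len(BOM):]
    return text

def make_placeholder(index):
    """Wrap a numeric index in BOM characters (assumed format)."""
    return u"%s%d%s" % (BOM, index, BOM)

source = strip_leading_bom(u"\ufeff**Foo** blah blah blah")
if BOM in source:
    # Extremely unlikely, but worth detecting rather than silently
    # colliding with our own placeholders.
    raise ValueError("input contains U+FEFF outside the start of the file")

print(repr(make_placeholder(3)))
```

If a BOM does turn up mid-text, we would need to escape or drop it before
inserting placeholders; the sketch just raises an error in that case.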

Later,
Blake.