From: Blake W. <bw...@la...> - 2008-07-08 11:40:24
Yuri Takhteyev wrote:
>> Concerning the html/xhtml output, I discovered that this option is supported
>> only by new versions of ElementTree (1.3) and lxml (2.0), so it won't be
>> available for now on the standard Python 2.5 ElementTree. Maybe we can make it
>> optional.
> Again, I wouldn't worry too much about this. If someone wants HTML
> output, converting XHTML to HTML4 should be easy enough.
Is this still true if you have inline not-necessarily-legal-XML blocks?
(i.e. will it still be easy to convert:
**Foo**
<br>
blah blah blah
'bar'
?)
>> There is one problem with lxml: the misc/boldlinks test causes this error:
>> File "etree.pyx", line 693, in etree._Element.text.__set__
>> File "apihelpers.pxi", line 344, in etree._setNodeText
>> File "apihelpers.pxi", line 648, in etree._utf8
>> AssertionError: All strings must be XML compatible, either Unicode or ASCII
>>
>> I suppose that is because in this test we are trying to assign to el.text
>> data that contains placeholders, and maybe for some reason lxml treats the
>> placeholder values (u'\u0001' and u'\u0002') as neither Unicode nor ASCII.
> We could re-think our choice of placeholders if we know that this is
> the reason. But it sounds like elementTree is the way to go.
What if we went with the BOM character (0xFEFF) as the replacement?
It's legal unicode, and _extremely_ unlikely to occur in the middle of
text. The only thing to watch out for is having it occur at the start
of the file.
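To make that concrete, here is a rough sketch (Python 3 syntax, nothing from our code base) that checks candidate placeholders against the Char production of the XML 1.0 spec. U+0001 and U+0002 fall outside the allowed ranges, which would explain the assertion if lxml enforces that rule, while U+FEFF falls inside them. The helper names is_xml_char and strip_leading_bom are made up for illustration.

```python
# Sketch only: helper names are hypothetical, not part of Markdown or lxml.

BOM = "\ufeff"  # candidate placeholder character


def is_xml_char(ch):
    """Return True if ch is allowed by the XML 1.0 Char production."""
    cp = ord(ch)
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def strip_leading_bom(text):
    """Drop a BOM at the start of the input so it cannot be mistaken
    for a placeholder later on."""
    return text[1:] if text.startswith(BOM) else text


# Compare the current placeholders with the proposed one.
for candidate in ("\u0001", "\u0002", BOM):
    print("U+%04X allowed in XML: %s" % (ord(candidate), is_xml_char(candidate)))
```

Stripping a leading BOM from the input before any placeholders are inserted should cover the start-of-file case.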
Later,
Blake.