From: Yuri T. <qar...@gm...> - 2007-09-16 10:10:49
Thanks for reporting this. I will look into it.
- yuri
On 9/12/07, Kent Johnson <ke...@td...> wrote:
> Hi,
>
> Markdown 1.6b doesn't work with UTF-8-encoded text. It fails with a
> UnicodeDecodeError in removeBOM():
>
> In [3]: import markdown
> In [4]: text = u'\xe2'.encode('utf-8')
> In [6]: print text
> â
> In [7]: print markdown.markdown(text)
> ------------------------------------------------------------
> Traceback (most recent call last):
> File "<ipython console>", line 1, in <module>
> File
> "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/markdown.py",
> line 1722, in markdown
> return md.convert(text)
> File
> "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/markdown.py",
> line 1614, in convert
> self.source = removeBOM(self.source, self.encoding)
> File
> "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/markdown.py",
> line 74, in removeBOM
> if text.startswith(bom):
> <type 'exceptions.UnicodeDecodeError'>: 'ascii' codec can't decode byte
> 0xc3 in position 0: ordinal not in range(128)
>
> The problem is that the BOM being tested is unicode so to execute
> text.startswith(bom)
> Python tries to convert text to Unicode using the default encoding
> (ascii). This fails because the text is not ascii.
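The implicit conversion described above can be reproduced by decoding explicitly. This is a minimal sketch, not code from markdown.py: the helper name decode_error is hypothetical, and it only shows that the UTF-8 bytes for u'\xe2' cannot be decoded as ASCII, which is what Python 2 attempts when a byte string meets a unicode BOM in startswith().

```python
# The same input as in the report: the two bytes 0xc3 0xa2.
text = u'\xe2'.encode('utf-8')


def decode_error(data, encoding):
    """Return the UnicodeDecodeError raised by decoding, or None on success.

    Hypothetical helper for illustration only.
    """
    try:
        data.decode(encoding)
    except UnicodeDecodeError as exc:
        return exc
    return None


# Decoding as ASCII fails on the first byte, as in the traceback.
ascii_error = decode_error(text, 'ascii')

# Decoding with the encoding the text was actually written in succeeds.
utf8_error = decode_error(text, 'utf-8')
```

Python 3 refuses to mix bytes and str in startswith() and raises a TypeError instead, so the explicit decode is the portable way to see the underlying failure.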
>
> I'm trying to understand what the encoding parameter is for; it doesn't
> seem to do much. There also seems to be some confusion with the use of
> encoding in markdownFromFile() vs markdown(); the file is converted to
> Unicode on input so I don't understand why the same encoding parameter
> is passed to markdown()?
>
> ISTM the encoding passed to markdown should match the encoding of the
> text passed to markdown, and the values in the BOMS table should be in
> the encoding of the key, not in unicode. Then the __unicode__() method
> should actually decode. Or is the intent that the text passed to
> markdown() should always be ascii or unicode?
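One way to read the suggestion above is sketched below: keep the BOM values as byte strings keyed by encoding, strip the BOM from the raw input, and only then decode. The table contents and the function names remove_bom and to_unicode are assumptions made for illustration; they are not the actual BOMS table or API in markdown.py.

```python
import codecs

# Assumed table: byte-string BOMs keyed by encoding name, so that the
# comparison below is always bytes against bytes.
BOMS = {
    'utf-8': codecs.BOM_UTF8,
    'utf-16-le': codecs.BOM_UTF16_LE,
    'utf-16-be': codecs.BOM_UTF16_BE,
}


def remove_bom(data, encoding):
    """Strip a leading BOM from a byte string, if one is present."""
    bom = BOMS.get(encoding.lower())
    if bom and data.startswith(bom):
        return data[len(bom):]
    return data


def to_unicode(data, encoding='utf-8'):
    """Strip the BOM, then decode using the caller's declared encoding."""
    return remove_bom(data, encoding).decode(encoding)


# With or without a BOM, the UTF-8 input from the report decodes cleanly.
with_bom = to_unicode(codecs.BOM_UTF8 + u'\xe2'.encode('utf-8'))
without_bom = to_unicode(u'\xe2'.encode('utf-8'))
```

Under this reading, the encoding argument describes the bytes handed in, and everything after the decode step works on unicode only.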
>
> I can put together a patch if you like but I wanted to make sure that I
> am not missing some grand plan...
>
> Kent
>
-- 
Yuri Takhteyev
Ph.D. Candidate, UC Berkeley School of Information
http://takhteyev.org/, http://www.freewisdom.org/