From: Yuri T. <qar...@gm...> - 2007-09-16 10:10:49
Thanks for reporting this. I will look into it.

- yuri

On 9/12/07, Kent Johnson <ke...@td...> wrote:
> Hi,
>
> Markdown 1.6b doesn't work with UTF-8-encoded text. It fails with a
> UnicodeDecodeError in removeBOM():
>
> In [3]: import markdown
> In [4]: text = u'\xe2'.encode('utf-8')
> In [6]: print text
> â
> In [7]: print markdown.markdown(text)
> ------------------------------------------------------------
> Traceback (most recent call last):
>   File "<ipython console>", line 1, in <module>
>   File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/markdown.py", line 1722, in markdown
>     return md.convert(text)
>   File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/markdown.py", line 1614, in convert
>     self.source = removeBOM(self.source, self.encoding)
>   File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/markdown.py", line 74, in removeBOM
>     if text.startswith(bom):
> <type 'exceptions.UnicodeDecodeError'>: 'ascii' codec can't decode byte
> 0xc3 in position 0: ordinal not in range(128)
>
> The problem is that the BOM being tested is unicode so to execute
> text.startswith(bom)
> Python tries to convert text to Unicode using the default encoding
> (ascii). This fails because the text is not ascii.
>
> I'm trying to understand what the encoding parameter is for; it doesn't
> seem to do much. There also seems to be some confusion with the use of
> encoding in markdownFromFile() vs markdown(); the file is converted to
> Unicode on input so I don't understand why the same encoding parameter
> is passed to markdown()?
>
> ISTM the encoding passed to markdown should match the encoding of the
> text passed to markdown, and the values in the BOMS table should be in
> the encoding of the key, not in unicode. Then the __unicode__() method
> should actually decode. Or is the intent that the text passed to
> markdown() should always be ascii or unicode?
>
> I can put together a patch if you like but I wanted to make sure that I
> am not missing some grand plan...
>
> Kent

-- 
Yuri Takhteyev
Ph.D. Candidate, UC Berkeley School of Information
http://takhteyev.org/, http://www.freewisdom.org/
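
For reference, the approach Kent suggests above (BOM values stored as bytes in the encoding of their key, followed by an explicit decode) might look roughly like the sketch below. The names `BOMS`, `remove_bom`, and `to_unicode` are hypothetical and chosen for illustration only; this is not the actual markdown.py code, just one way the idea could be arranged.

```python
import codecs

# Hypothetical BOM table: each value is a byte string in the encoding
# named by its key, so it can be compared against undecoded input
# without any implicit ASCII conversion.
BOMS = {
    "utf-8": codecs.BOM_UTF8,
    "utf-16-le": codecs.BOM_UTF16_LE,
    "utf-16-be": codecs.BOM_UTF16_BE,
}


def remove_bom(data, encoding):
    """Strip a leading BOM from a byte string (bytes in, bytes out)."""
    bom = BOMS.get(encoding.lower())
    if bom and data.startswith(bom):
        return data[len(bom):]
    return data


def to_unicode(data, encoding="utf-8"):
    """Strip any BOM, then decode explicitly with the caller's encoding."""
    return remove_bom(data, encoding).decode(encoding)


text = u"\xe2".encode("utf-8")  # the two bytes 0xc3 0xa2 from the report
assert to_unicode(text) == u"\xe2"
assert to_unicode(codecs.BOM_UTF8 + text) == u"\xe2"
```

Because the comparison in `remove_bom` is bytes against bytes, no implicit conversion through the default codec takes place, which is the step that raised the UnicodeDecodeError in the traceback.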