From: Waylan L. <wa...@gm...> - 2012-12-10 16:19:29
On Mon, Dec 10, 2012 at 10:48 AM, Brian Neal <bg...@gm...> wrote:
> On Mon, Dec 10, 2012 at 9:16 AM, Brian Neal <bg...@gm...> wrote:
>> You can also set the environment variable PYTHONIOENCODING if you find
>> yourself needing to do a lot of redirection or piping.
>
> Having said that, I went back to the script that I wrote that was
> having this problem. I fixed it by using unicode everywhere but then
> picking an encoding on output, something like:
>
> print my_string.encode('utf-8')
>
> It does look to me like Markdown is trying to do something like this
> if you run it as a module. Perhaps Waylan can chime in here.
>

Yes, I have striven to have the code do the same thing regardless of
which version of Python is being used. Within the markdown.FromFile
method, input is decoded to unicode (regardless of input source) and
then forwarded to markdown, and the output is encoded (regardless of
output method: file, stdout, etc.) and written out as bytes (even in
Python 3). In both cases (decoding and encoding) the character encoding
used is the same user-defined encoding (defaulting to utf8 if none is
given).

There were a few edge cases where the code was failing to take the
encoding into account (PYTHONIOENCODING would have made a difference);
however, I don't think that is the case anymore. Perhaps the fallback
default should be taken from PYTHONIOENCODING rather than hardcoded to
utf8, though.

The relevant code is here:
https://github.com/waylan/Python-Markdown/blob/master/markdown/__init__.py#L324

--
----
\X/ /-\ `/ |_ /-\ |\|
Waylan Limberg
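
For anyone reading this in the archive, the decode-on-input, encode-on-output flow described above can be sketched roughly as follows. This is only an illustration written for Python 3, not the actual Python-Markdown code: the names default_encoding and convert_stream are hypothetical, and the PYTHONIOENCODING fallback is the idea floated in the message, assumed here rather than confirmed as the library's behaviour.

```python
import io
import os


def default_encoding():
    """Hypothetical helper: use PYTHONIOENCODING if set, else fall back to utf-8."""
    value = os.environ.get("PYTHONIOENCODING", "")
    # The variable has the form "encodingname:errorhandler"; both parts are
    # optional, so keep only what comes before the first colon.
    name = value.split(":", 1)[0]
    return name or "utf-8"


def convert_stream(source, sink, transform, encoding=None):
    """Read bytes, decode to text, transform the text, encode, write bytes.

    `source` and `sink` are binary file-like objects, and `transform` is any
    text-to-text callable (a stand-in for the Markdown conversion step).
    """
    encoding = encoding or default_encoding()
    text = source.read().decode(encoding)   # bytes -> unicode on the way in
    result = transform(text)                # all processing happens on text
    sink.write(result.encode(encoding))     # unicode -> bytes on the way out


if __name__ == "__main__":
    src = io.BytesIO("caf\u00e9".encode("utf-8"))
    dst = io.BytesIO()
    convert_stream(src, dst, str.upper, encoding="utf-8")
    print(dst.getvalue())
```

Because the same encoding is used at both ends and the output is always written as bytes, the result does not depend on how the interpreter has configured stdout, which is the point of the approach.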