From: Waylan L. <wa...@gm...> - 2012-12-10 16:19:29
On Mon, Dec 10, 2012 at 10:48 AM, Brian Neal <bg...@gm...> wrote:
> On Mon, Dec 10, 2012 at 9:16 AM, Brian Neal <bg...@gm...> wrote:
>> You can also set the environment variable PYTHONIOENCODING if you find
>> yourself needing to do a lot of redirection or piping.
>
> Having said that, I went back to the script that I wrote that was
> having this problem. I fixed it by using unicode everywhere but then
> picking an encoding on output, something like:
>
> print my_string.encode('utf-8')
>
> It does look to me like Markdown is trying to do something like this
> if you run it as a module. Perhaps Waylan can chime in here.
>
Yes, I have striven to have the code do the same thing regardless of
which version of Python is being used. Within the markdown.FromFile
method, input is decoded to Unicode (regardless of input source) and
then forwarded to markdown, and the output is encoded (regardless of
output method: file, stdout, etc.) and written out as bytes (even in
Python 3). In both instances (encoding and decoding) the character
encoding used is the same user-defined encoding (or defaults to utf8
if not defined).
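
As a rough illustration of that decode, convert, encode flow, here is a
minimal Python 3 sketch. It is not the Python-Markdown source: the
names convert_stream and transform are hypothetical, and str.upper
merely stands in for the real conversion step.

```python
import io


def convert_stream(src, dst, transform, encoding="utf-8"):
    """Hypothetical helper: bytes in, text processing, bytes out."""
    # Decode the raw input bytes to text using the chosen encoding.
    text = src.read().decode(encoding)
    # Do all processing on text (str), never on bytes.
    result = transform(text)
    # Encode with the same encoding and write bytes to the destination.
    dst.write(result.encode(encoding))


# In-memory byte streams stand in for a file or stdout here.
source = io.BytesIO("caf\u00e9".encode("utf-8"))
sink = io.BytesIO()
convert_stream(source, sink, str.upper)
```

To write to standard output in Python 3 the same way, sys.stdout.buffer
can be passed as dst, since it accepts bytes.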

There were a few edge cases where the code was failing to take the
encoding into account (PYTHONIOENCODING would have made a difference);
however, I don't think that is the case anymore. That said, perhaps
the fallback default should be taken from PYTHONIOENCODING rather than
hardcoded to utf8.

The relevant code is here:
https://github.com/waylan/Python-Markdown/blob/master/markdown/__init__.py#L324
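
For the idea of falling back to the environment rather than a
hardcoded utf8, something along these lines might work. This is only a
sketch and not what Python-Markdown does; default_encoding is a
hypothetical name. It relies on the documented format of
PYTHONIOENCODING, which is encodingname:errorhandler with both parts
optional.

```python
import os


def default_encoding(environ=os.environ):
    """Hypothetical helper: pick a fallback encoding from the environment."""
    value = environ.get("PYTHONIOENCODING", "")
    # Keep only the encoding name; drop any ":errorhandler" suffix.
    name = value.split(":", 1)[0]
    # Fall back to utf-8 when the variable is unset or names no encoding.
    return name or "utf-8"
```

A user-defined encoding would still take priority; this would only
replace the hardcoded default.
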
--
----
\X/ /-\ `/ |_ /-\ |\|
Waylan Limberg