Re: [Rest2web-develop] unicode problems
From: Michael F. <fuz...@vo...> - 2006-08-06 16:24:40
martin f krafft wrote:
> Michael,
>
> I see restutils.encode uses the string encode function. I don't
> think this is what you want.
>
> < madduck> so i am baffled
> < madduck> >>> type('bla'.encode('utf-8'))
> < madduck> <type 'str'>
> < cracki> encode returns 8 bit
> < madduck> or even worse,
> < madduck> >>> type(u'bla'.encode('utf-8'))
> < madduck> <type 'str'>
> < cracki> you want unicode("bla")
> < cracki> encode encodes to binary representations
> < madduck> what's the point of "encode('utf-8')" then?
> < cracki> unicode("foo", "utf-8")
> < cracki> encode(u"someunicodestr", "weirdencoding") transforms to a
> binary representation
> < cracki> in memory, unicode strings are multibyte, constant width
>
> Please also see
> http://docs.python.org/tut/node5.html#SECTION005130000000000000000
> http://www.reportlab.com/i18n/python_unicode_tutorial.html
>
>
Another good tutorial on Unicode:
http://www.pyzine.com/Issue008/Section_Articles/article_Encodings.html
:-)
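
For reference, the encode/decode distinction from the IRC log can be sketched as follows. This is only an illustration, written in Python 3 syntax, where `bytes` plays the role of the Python 2 `str` and `str` the role of `unicode`; the variable names are made up for the example.

```python
# Python 3 sketch: bytes ~ Python 2 str, str ~ Python 2 unicode.
text = "Z\u00fcrich"                 # text string; \u00fc is the u-umlaut

# encode: text -> byte string in a concrete encoding
data = text.encode("utf-8")
print(type(data).__name__, len(text), len(data))

# decode: byte string -> text, given the encoding the bytes were written in
roundtrip = data.decode("utf-8")
print(roundtrip == text)
```

So `encode` always yields a byte string, which is why `type(u'bla'.encode('utf-8'))` reports `str` under Python 2; going from bytes to a unicode object is `decode` (or `unicode(data, 'utf-8')` in Python 2).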
> The reason I am posting this is because I am getting an error
>
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in
> position 2555: ordinal not in range(128)
>
> This is due to a file that says "Zürich", and the file itself is
> UTF-8, as is the template:
>
> lapse:~/phd/web> head -15 imprint.txt [390]
> restindex
> encoding: utf8
> template-encoding:
> /restindex
> [...]
> 8050 Zürich
>
> The exception is thrown in line 75 of embedded_code.py:
>
> template = template.replace(occ, value)
>
> when template holds the template text just after body had been
> filled in with the result from the imprint.txt transformed to HTML.
> Template is a str, not a unicode object, which is the root of all
> evil.
>
> Am I doing something wrong?
>
I think it is the other way round: by the time they are rendered, they
should all be byte-strings rather than unicode.
Anyway, I'm going round in circles trying to chase this one down.
Can you try it with an explicit 'output-encoding' of 'utf-8' and see if
you have the same problem?
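
A rough sketch of what may be going on, assuming the cause is a byte-string template being combined with a unicode value (not yet confirmed). Under Python 2 that mix triggers an implicit ASCII decode of the byte string; the Python 3 code below makes that decode explicit. The template text, placeholder, and value are hypothetical and are not taken from embedded_code.py.

```python
# Hypothetical template as UTF-8 bytes, roughly as it would be read from disk.
template = "<p>8050 Z\u00fcrich</p> <% title %>".encode("utf-8")
occ, value = "<% title %>", "Imprint"    # made-up placeholder and replacement

# The decode Python 2 attempts implicitly when str and unicode are mixed.
try:
    template.decode("ascii")
    error = None
except UnicodeDecodeError as exc:
    error = str(exc)
print(error)

# One way around it: decode once with the declared encoding, do the
# replacement on text, and encode again only when writing the output.
text = template.decode("utf-8")
rendered = text.replace(occ, value).encode("utf-8")
print(rendered)
```

Whichever type is used for the replacement, the point is that every operand has to be the same type at that step.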
Thanks
Michael
http://www.voidspace.org.uk/python/index.shtml