Adam Chodorowski wrote:
> [swedish mappings]
>
>> Thanks Adam. But Python modules should be 7-bit clean; please convert the
>> accented characters to "\uXXXX" Unicode escapes (and the strings to "u''"
>> Unicode strings). 8-bit strings will work as long as the modules aren't
>> transferred across platforms, but they'll fail if poorly converted. Best to
>> leave nothing to chance.
>
> Have to say that this is the first project I've seen to require that; most
> other projects assume iso-8859-1 if nothing else is specified.
A Latin-1 assumption is bad, IMHO. I think most projects ignore MacOS. I
use a Mac at home, a very old one running MacOS 8.6, which knows nothing
about iso-8859-1. I couldn't even read your patch properly; I had to
convert it first *using* Python. Once converted, it's useless to Python,
since it's a different encoding (MacRoman). MacOS X may work better, but my
machine is too old to run it.
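
To illustrate the mismatch, here is a minimal sketch. It is written in Python 3 syntax, which postdates this thread, and it uses the sample string quoted further down rather than the actual patch contents. The same bytes decode to different text under Latin-1 and MacRoman, while pure-ASCII source with Unicode escapes is unambiguous.

```python
# The sample string's bytes as an ISO-8859-1 editor would have saved them.
raw = b"awera\xe4\xf6\xe5"

as_latin1 = raw.decode("iso-8859-1")   # the text the author intended
as_macroman = raw.decode("mac_roman")  # what a MacRoman-only system reads

# The ASCII prefix survives either way; the accented letters do not.
print(as_latin1 == as_macroman)          # False
print(as_latin1[:5] == as_macroman[:5])  # True

# A 7-bit-clean spelling with Unicode escapes needs no encoding guess.
escaped = "awera\u00e4\u00f6\u00e5"
print(escaped == as_latin1)              # True
```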
I do a lot of testing from my SourceForge shell account. It's faster. ;-)
> Anyway, I'll fix this ASAP.
Thank you!
> Hmmm... I guess the below is a good method for escaping the string?
>
> >>> "aweraäöå".decode('iso-8859-1')
> u'awera\xe4\xf6\xe5'
>
> I wonder what the difference between \xXX and \uXXXX is..?
In Unicode context, there isn't any difference between \xXY and \u00XY;
"\xXY" is simply the representation of "\u00XY". The first 256 code points
of Unicode correspond to ISO-8859-1/Latin-1 (0-127 are ASCII). I think the
representation of Unicode strings *should* say \u00XY (or even \uXY),
because \xXY loses the "Unicode" association and implies a Latin-1
assumption. In other words, without the "u" string prefix, "\xXY" looks
like an 8-bit string and requires knowledge of the encoding; "\u00XY" is
obviously Unicode and there is no encoding to know about. But I don't know
the details or the rationale behind the Unicode implementation or this
particular decision.
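
A small sketch of the points above, again in Python 3 syntax (where every string literal is Unicode, so no "u" prefix is needed). The helper name `ascii_escape` is hypothetical, not part of Python or Docutils; it only shows one way to spell non-ASCII characters as "\uXXXX" escapes so that a module stays 7-bit clean.

```python
# "\xXY" and "\u00XY" name the same code point in a Unicode string.
assert "\xe4" == "\u00e4"

# Each Latin-1 byte decodes to the Unicode code point with the same number.
assert all(bytes([i]).decode("iso-8859-1") == chr(i) for i in range(256))


def ascii_escape(text):
    """Hypothetical helper: spell every non-ASCII character as an escape.

    Backslashes already present in the text are left untouched.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            out.append(ch)              # plain ASCII passes through
        elif cp < 0x10000:
            out.append("\\u%04x" % cp)  # four-digit escape
        else:
            out.append("\\U%08x" % cp)  # eight-digit escape beyond U+FFFF
    return "".join(out)


print(ascii_escape("awera\xe4\xf6\xe5"))  # awera\u00e4\u00f6\u00e5
```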
--
David Goodger <go...@us...> Open-source projects:
- Python Docutils: http://docutils.sourceforge.net/
(includes reStructuredText: http://docutils.sf.net/rst.html)
- The Go Tools Project: http://gotools.sourceforge.net/