      From: David G. <go...@us...> - 2002-07-21 00:38:42
      
Adam Chodorowski wrote:
> [swedish mappings]
>
>> Thanks Adam.  But Python modules should be 7-bit clean; please convert the
>> accented characters to "\uXXXX" Unicode escapes (and the strings to "u''"
>> Unicode strings).  8-bit strings will work as long as the modules aren't
>> transferred across platforms, but they'll fail if poorly converted.  Best to
>> leave nothing to chance.
>
> Have to say that this is the first project I've seen to require that; most
> other projects assume iso-8859-1 if nothing else is specified.
A Latin-1 assumption is bad, IMHO.  I think most projects ignore MacOS.  I
use a Mac at home, a very old one running MacOS 8.6, which knows nothing
about iso-8859-1.  I couldn't even read your patch properly; I had to
convert it first *using* Python.  Once converted, it's useless to Python,
since it's a different encoding (MacRoman).  MacOS X may work better, but my
machine is too old to run it.
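
To illustrate the encoding mismatch described above, here is a minimal sketch. It assumes a modern Python 3 interpreter (not the Python of this thread) and its standard "iso-8859-1" and "mac_roman" codecs. The sample word is the one from Adam's message; the variable names are only illustrative.

```python
# Assumption: Python 3, where str is always Unicode and bytes hold 8-bit data.
text = "awera\u00e4\u00f6\u00e5"   # 7-bit clean source using Unicode escapes

# The same text stored as 8-bit bytes under the Latin-1 encoding.
latin1_bytes = text.encode("iso-8859-1")

# A platform that assumes MacRoman reads those bytes as different characters.
misread = latin1_bytes.decode("mac_roman")

# The bytes only round-trip when the reader knows the original encoding.
reread = latin1_bytes.decode("iso-8859-1")

print(misread == text)   # False
print(reread == text)    # True
```

The escaped source line itself contains only ASCII, so it should survive a transfer between platforms regardless of which 8-bit encoding each one assumes.
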
I do a lot of testing from my SourceForge shell account.  It's faster.  ;-)
> Anyway, I'll fix this ASAP.
Thank you!
> Hmmm... I guess the below is a good method for escaping the string?
>
>>>> "aweraäöå".decode('iso-8859-1')
> u'awera\xe4\xf6\xe5'
>
> I wonder what the difference between \xXX and \uXXXX is..?
In Unicode context, there isn't any difference between \xXY and \u00XY;
"\xXY" is simply the representation of "\u00XY".  The first 256 code points
of Unicode correspond to ISO-8859-1/Latin-1 (0-127 are ASCII).  I think the
representation of Unicode strings *should* say \u00XY (or even \uXY),
because \xXY loses the "Unicode" association and implies a Latin-1
assumption.  In other words, without the "u" string prefix, "\xXY" looks
like an 8-bit string and requires knowledge of the encoding; "\u00XY" is
obviously Unicode and there is no encoding to know about.  But I don't know
the details or the rationale behind the Unicode implementation or this
particular decision.
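
A small sketch of the equivalence discussed above, assuming Python 3 (where every str is Unicode; under the Python of this thread the literals would need the u'' prefix). The helper name to_u_escapes is hypothetical, not part of Python or Docutils; it just spells each non-ASCII character in the \uXXXX form.

```python
# Assumption: Python 3. to_u_escapes is a hypothetical helper for illustration.
def to_u_escapes(text):
    """Return text with every non-ASCII character written as a Unicode escape."""
    parts = []
    for ch in text:
        code = ord(ch)
        if code < 128:
            parts.append(ch)                  # plain ASCII stays as-is
        elif code <= 0xFFFF:
            parts.append("\\u%04x" % code)    # 4-digit form, e.g. \u00e4
        else:
            parts.append("\\U%08x" % code)    # 8-digit form beyond U+FFFF
    return "".join(parts)

# In a Unicode string, both spellings name the same code point, U+00E4.
same = "\xe4" == "\u00e4"

# Decode the Latin-1 bytes from the example, then spell the result with \u escapes.
decoded = b"awera\xe4\xf6\xe5".decode("iso-8859-1")
escaped = to_u_escapes(decoded)

print(same)      # True
print(escaped)   # awera\u00e4\u00f6\u00e5
```

The printed form can be pasted into a Unicode string literal, keeping the module 7-bit clean.
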
-- 
David Goodger  <go...@us...>  Open-source projects:
  - Python Docutils: http://docutils.sourceforge.net/
    (includes reStructuredText: http://docutils.sf.net/rst.html)
  - The Go Tools Project: http://gotools.sourceforge.net/