Re: [Gramps-devel] Python string problem


On Saturday 05 June 2010 21.35.03, Gerald Britton wrote:
> so does that mean you're SOL, or is there a way to handle it?
Yes, it can be handled.
Convert the str to unicode before calling split(), or do str.decode('utf-8').split() directly.
In this case it's best to convert to unicode anyway, because the string is shortened later and you
want to avoid cutting it in the middle of a UTF-8 sequence.
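
A minimal sketch of the decode-first approach, written to run on Python 3, where bytes plays the role that str has in Python 2. The sample string and the variable names are made up for illustration and are not taken from the Gramps code.

```python
# Hypothetical sample: UTF-8 bytes for "caf" + a-grave + " noir".
# The a-grave is encoded as the two bytes C3 A0.
raw = b"caf\xc3\xa0 noir"

# Decode first, then split and shorten on characters rather than bytes.
text = raw.decode("utf-8")
words = text.split()   # splits only at the real space
short = text[:4]       # four characters, the a-grave stays intact

# Slicing the raw bytes instead cuts the two-byte sequence in half.
broken = raw[:4]       # ends with a lone C3 lead byte
try:
    broken.decode("utf-8")
    cut_ok = True
except UnicodeDecodeError:
    cut_ok = False     # the truncated bytes are no longer valid UTF-8
```

In Python 2 the same idea would be raw.decode('utf-8') producing a unicode object before any split() or slicing.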

/Peter

 
> On Sat, Jun 5, 2010 at 1:32 PM, Peter Landgren <pet...@te...> wrote:
> > Hi,
> > when working with http://www.gramps-project.org/bugs/view.php?id=3935
> >
> > I have found out why this happens.
> >
> > For certain characters, at least for any of "àĠŠƠǠȠɠʠΠР", the
> > str.split() function gives different results on Linux and Windows.
> >
> > str.split() tries to decode the supplied string using the current
> > encoding. If this is UTF-8, as is normal on Linux, str.split() works fine. If
> > the encoding is cp1252, as is normal on Windows, the second byte of each of
> > these characters has the hex value \xa0, which is interpreted as
> > whitespace, and thus the character is split within a UTF-8 sequence. This
> > generates errors further down.
> >
> > It's similar to slicing a string in the middle of a UTF-8 sequence.
> >
> > /Peter
>
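
The byte pattern described in the quoted message can be checked with a short sketch. It assumes Python 3, writes the ten listed characters as escapes, and only simulates the Windows behaviour by treating \xa0 as a space before splitting. It does not reproduce the actual locale-dependent code path, and the sample text is hypothetical.

```python
# The ten characters from the message, written as code point escapes.
chars = "\u00e0\u0120\u0160\u01a0\u01e0\u0220\u0260\u02a0\u03a0\u0420"

# Each one encodes to two UTF-8 bytes; collect the second byte of each.
second_bytes = [c.encode("utf-8")[1:2] for c in chars]

# Hypothetical sample text starting with the first character (a-grave).
raw = "\u00e0 la carte".encode("utf-8")

# Simulation only: treat \xa0 as whitespace, as the message says happens
# under cp1252, by replacing it with a space before splitting.
pieces = raw.replace(b"\xa0", b" ").split()
```

Here every entry of second_bytes is b"\xa0", and the first element of pieces is the orphaned lead byte b"\xc3", which can no longer be decoded as UTF-8 on its own.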
