I forgot to copy everyone on my original reply to Doug. Doug provided several good patches that will be incorporated. The unicode/latin-1 problem concerns me, so here is my extended response on the character encoding problem with Python 2.0.
I thought I'd give you a better explanation of what I think is wrong with the unicode/latin-1 issue. While your patches work, I think they are treating the symptom and not the source of the problem.
All internal strings in gramps should be latin-1 encoded. The fact that you have to translate them to latin-1 indicates that there is a problem somewhere that is allowing non-latin-1 characters to get into the data. There are three sources of input at this time: entry into the interface, the gramps input file, and GEDCOM import.
Entry into the interface should always return latin-1, since gnome does not currently handle unicode, so I think we can probably eliminate this one. The gramps input file under Python 2.0 is read through an encoded file wrapper that is supposed to translate from unicode to latin-1 as the data is read in. In ReadXML.py, around line 71, you should see the following line:
xml_file = EncodedFile(gzip.open(filename,"rb"),'utf-8','latin-1')
It is possible that this is not doing what I think it should.
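One way to test that suspicion is to feed a known UTF-8 byte sequence through EncodedFile with the two encodings in each order and compare the results. The sketch below is written against the current Python 3 codecs module, not 2.0, so it is only an assumption that 2.0 treats the arguments the same way. The in-memory BytesIO stands in for the gzip file, and the variable names are made up for illustration.

```python
import codecs
import io

# "e with acute accent" as it would be stored in a UTF-8 file: bytes C3 A9.
utf8_bytes = "\u00e9".encode("utf-8")

# First encoding = what the caller receives, second = what is in the file.
# Here the file is treated as UTF-8 and the caller gets latin-1 bytes.
as_latin1 = codecs.EncodedFile(io.BytesIO(utf8_bytes), "latin-1", "utf-8").read()

# Same argument order as the ReadXML.py line: the file is treated as
# latin-1 and the caller gets UTF-8, so UTF-8 input is encoded twice.
swapped = codecs.EncodedFile(io.BytesIO(utf8_bytes), "utf-8", "latin-1").read()

print(repr(as_latin1))
print(repr(swapped))
```

If the two results differ on real data in the same way under 2.0, the argument order in ReadXML.py would be worth a second look.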
The third possibility is the GEDCOM import. My bet is on this one. It looks as if under 2.0 I am not decoding unicode properly. My guess is that changing lines 32-36 of latin_utf8.py to:
might do the trick. I think this patch probably needs to be made.
A couple of questions:
1. Did you originally import your data from a GEDCOM file?
2. Was the GEDCOM file encoded as ASCII, ANSEL, UNICODE, or UTF-8? (check the CHAR line towards the top of the file)
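If it is easier than opening the file by hand, the CHAR value can be pulled out with a few lines of Python. This is only a rough sketch: find_gedcom_char is a hypothetical helper, not part of gramps, and it assumes the usual GEDCOM layout where the header starts with "0 HEAD" and contains a level-1 line such as "1 CHAR ANSEL".

```python
def find_gedcom_char(lines):
    """Return the value of the header's CHAR line, or None if there is none."""
    in_head = False
    for raw in lines:
        # Drop a possible byte-order mark and surrounding whitespace,
        # then split into level, tag, and the rest of the line.
        parts = raw.lstrip("\ufeff").strip().split(None, 2)
        if len(parts) < 2:
            continue
        level, tag = parts[0], parts[1]
        if level == "0":
            if in_head:
                break  # the header record has ended without a CHAR line
            in_head = (tag == "HEAD")
        elif in_head and level == "1" and tag == "CHAR":
            return parts[2].strip() if len(parts) > 2 else None
    return None


# Made-up sample header, for illustration only.
sample = [
    "0 HEAD",
    "1 SOUR EXAMPLE",
    "1 CHAR ANSEL",
    "0 @I1@ INDI",
    "1 NAME John /Smith/",
    "0 TRLR",
]
print(find_gedcom_char(sample))
```

For a real file, something like find_gedcom_char(open(path, encoding="latin-1")) should work for the single-byte encodings. A file written as 16-bit UNICODE would need to be opened with a UTF-16 encoding instead.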