Thread: [Gramps-devel] trunk problem importing GEDCOM files created by Windows programs

[Gramps-devel] trunk problem importing GEDCOM files created by Windows programs

From: Enno B. <enn...@gm...> - 2013-02-18 17:20:48

Hi,

When I try to import a GEDCOM file created by PAF, I see that the 
current code in trunk is not able to detect the file type by 
interpreting the 1st 2 bytes of that file. Instead, I see the error 
shown below:

/home/test/trunk/gramps/plugins/lib/libgedcom.py:7445: UnicodeWarning: 
Unicode equal comparison failed to convert both arguments to Unicode - 
interpreting them as being unequal
   if line == "\xef\xbb":
/home/test/trunk/gramps/plugins/lib/libgedcom.py:7449: UnicodeWarning: 
Unicode equal comparison failed to convert both arguments to Unicode - 
interpreting them as being unequal
   elif line == "\xff\xfe":
/home/test/trunk/gramps/plugins/lib/libgedcom.py:7455: UnicodeWarning: 
Unicode equal comparison failed to convert both arguments to Unicode - 
interpreting them as being unequal
   elif line[0] == "\x00" or line[1] == "\x00":
2013-02-18 18:00:08.556: WARNING: libgedcom.py: line 7482: Foute lijn1 
in GEDCOM bestand.

My GEDCOM file starts like this (od -t x1):

0000000 ef bb bf 30 20 48 45 41 44 0d 0a 31 20 53 4f 55
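
The dump can be checked by hand: the first three bytes are the UTF-8 byte order mark, and the rest is the start of the GEDCOM header. A minimal sketch in Python 3 syntax follows. It is not the libgedcom code, and the variable names are only illustrative:

```python
import codecs

# First 16 bytes of the file, copied from the od output above.
head = bytes.fromhex("ef bb bf 30 20 48 45 41 44 0d 0a 31 20 53 4f 55")

# codecs.BOM_UTF8 is b"\xef\xbb\xbf", so a two-byte read yields b"\xef\xbb".
has_utf8_bom = head.startswith(codecs.BOM_UTF8)
first_two = head[:2]

# "utf-8-sig" decodes UTF-8 and drops a leading BOM if there is one.
text = head.decode("utf-8-sig")

print(has_utf8_bom)  # True
print(repr(text))    # '0 HEAD\r\n1 SOU'
```

So the two bytes that the detection code reads are the first two bytes of the UTF-8 BOM, and the file itself looks well formed.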

So I know that I need to get the above code working to be able to import 
files that I create in PAF or RM. It's in __detect_file_decoder(self, 
input_file), and it works OK in 3.4. Can anyone help me here? I read a 
few things about the subject on Stack Overflow, but I'm still lost. 
I use Python 2.7.3 on Mint 14.

I tried opening the file in another mode, but that didn't help. Putting 
str() at the right side of == helps me to get rid of the Unicode error 
messages, but it doesn't help to get the import running. I assume that 
line is sort of a byte string that can't be converted to Unicode, so 
maybe the whole comparison code must be rewritten, but I don't know how.

My bug report is here:

http://www.gramps-project.org/bugs/view.php?id=6462

And I have enough time to experiment, once I know where to start.

thanks,

Enno

Re: [Gramps-devel] trunk problem importing GEDCOM files created by Windows programs

From: John R. <jr...@ce...> - 2013-02-18 18:16:33

On Feb 18, 2013, at 9:20 AM, Enno Borgsteede <enn...@gm...> wrote:

> Hi,
> 
> When I try to import a GEDCOM file created by PAF, I see that the 
> current code in trunk is not able to detect the file type by 
> interpreting the 1st 2 bytes of that file. Instead, I see the error 
> shown below:
> 
> /home/test/trunk/gramps/plugins/lib/libgedcom.py:7445: UnicodeWarning: 
> Unicode equal comparison failed to convert both arguments to Unicode - 
> interpreting them as being unequal
>   if line == "\xef\xbb":
> /home/test/trunk/gramps/plugins/lib/libgedcom.py:7449: UnicodeWarning: 
> Unicode equal comparison failed to convert both arguments to Unicode - 
> interpreting them as being unequal
>   elif line == "\xff\xfe":
> /home/test/trunk/gramps/plugins/lib/libgedcom.py:7455: UnicodeWarning: 
> Unicode equal comparison failed to convert both arguments to Unicode - 
> interpreting them as being unequal
>   elif line[0] == "\x00" or line[1] == "\x00":
> 2013-02-18 18:00:08.556: WARNING: libgedcom.py: line 7482: Foute lijn1 
> in GEDCOM bestand.
> 
> My GEDCOM file starts like this (od -t x1):
> 
> 0000000 ef bb bf 30 20 48 45 41 44 0d 0a 31 20 53 4f 55
> 
> So I know that I need to get above code working to be able to import 
> files that I create in PAF or RM. It's in __detect_file_decoder(self, 
> input_file), and it works OK in 3.4. Can anyone help me here? I read a 
> few things about the subject on stackoverflow, but I'm still lost here. 
> I use Python 2.7.3, on Mint 14.
> 
> I tried opening the file in another mode, but that didn't help. Putting 
> str() at the right side of == helps me to get rid of the Unicode error 
> messages, but it doesn't help to get the import running. I assume that 
> line is sort of a byte string that can't be converted to Unicode, so 
> maybe the whole comparison code must be rewritten, but I don't know how.
> 
> My bug report is here:
> 
> http://www.gramps-project.org/bugs/view.php?id=6462
> 
> And I have enough time to experiment, once I know where to start.

Two suggestions:

Put a print(line) at libgedcom.py:7445 to see what it's actually reading,
and figure out where the file is actually opened, making sure that it's opened in binary ("rb").

Regards,
John Ralls.

Re: [Gramps-devel] trunk problem importing GEDCOM files created by Windows programs

From: Enno B. <enn...@gm...> - 2013-02-20 14:30:05

Hi Benny,
> After a GEDCOM import with problems, it is a good idea to run the
> check and repair database tool
> and the
> rebuild secondary indexes / rebuild reference map tools
>
> It has been suggested on this list to do that automatically, but I'm 
> personally not in favour of this.
I ran a search for 'unknown' on Mantis, and found a report that gave me 
the impression that typeless objects are created on purpose, when 
unresolved references (to repos, notes, whatever) are found in a GEDCOM 
file.

While I understand the reasoning behind this, I would personally prefer 
the creation of dummy objects with a type that does not create problems 
on subsequent GEDCOM export. Then it's up to the user to check those 
some time later, just like one can check for GEDCOM-import notes.

regards,

Enno

Re: [Gramps-devel] trunk problem importing GEDCOM files created by Windows programs

From: Tim L. <guy...@gm...> - 2013-02-21 21:39:43

enno wrote
> I ran a search for 'unknown' on Mantis, and found a report that gave me 
> the impression that typeless objects are created on purpose, when 
> unresolved references (to repos, notes, whatever) are found in a GEDCOM 
> file.
> 
> While I understand the reasoning behind this, I would personally prefer 
> the creation of dummy objects with a type that does not create problems 
> on subsequent GEDCOM export. Then it's up to the user to check those 
> some time later, just like one can check for GEDCOM-import notes.


(I haven't actually looked at the code just now, but...)

It is not actually creating typeless objects; indeed, there would be no way
to do that. It is creating objects of whatever type is required, but is
setting some 'characterising attribute' of the object to 'unknown'. Indeed,
when we implemented this, we changed the message from talking about unknown
type to talking about unknown characterising attribute, as 'unknown type' was
both wrong and confusing.

From your error messages, it looks as though it is trying to create a
Repository object, and to link the note about 'this is a note about a
repository object that has been created' to the repository. But evidently
something is going wrong, and possibly the repository object is not
correctly created, and possibly the note is not correctly linked. At least
the link from an object to the repository (I think it must be a link from a
source to a repository, because that is the only type of object that links
to repositories) does not seem to be created properly.

Thanks for the info about the problem - it is supposed to be doing what you
suggest, but is just not doing it right.




Re: [Gramps-devel] trunk problem importing GEDCOM files created by Windows programs

From: Enno B. <enn...@gm...> - 2013-02-22 21:22:02

Hi Tim,
> (I haven't actually looked at the code just now, but...) It is not 
> actually creating typeless objects, indeed there would be no way to do 
> that. It is creating objects of whatever type is required, but is 
> setting some 'characterising attribute' of the object to 'unknown'
Right, and my analysis was wrong. I have a RootsMagic GEDCOM file that 
has a lot of missing notes, and when I import that, Gramps creates note 
objects of unknown type. My mistake was that I thought the unknown type 
was causing the subsequent GEDCOM export problems. It's not.

I can say that now, because the export runs fine after running check and 
repair, and after that, those notes of unknown type are still there. In 
other words, things are indeed like you say they are, meaning that 
objects are created, but some links are wrong, or not all needed objects 
are created.

> From your error messages, it looks as though it is trying to create a 
> Repository object, and to link the note about 'this is a note about a 
> repository object that has been created' to the repository. But 
> evidently something is going wrong, and possibly the repository object 
> is not correctly created, and possibly the note is not correctly linked.
In my latest test, which goes wrong in 3.4.3 and trunk, it's not about 
repositories, but notes, and I can upload the offending GEDCOM as 
private if that helps. During import more than 1400 notes are created, 
but apparently that's not enough, since check and repair reports about 
12 notes missing.
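
One way to find such missing notes before importing is to scan the GEDCOM file for pointers that have no matching record. The sketch below is only an illustration, not Gramps code: the sample data, the regular expressions, and the function name `find_dangling` are all made up for this example, and real GEDCOM files have more cases to handle (for instance escaped `@@` in text values).

```python
import re

# Hypothetical miniature GEDCOM: @N2@ is referenced but never defined.
SAMPLE = """\
0 HEAD
0 @I1@ INDI
1 NAME John /Doe/
1 NOTE @N1@
1 NOTE @N2@
0 @N1@ NOTE A defined note
0 TRLR
"""

# A record definition: level 0, an xref ID, then the record tag.
DEFINITION = re.compile(r"^0 @([^@]+)@ (\w+)")
# A pointer: level 1 or deeper, a tag, and nothing but an xref as value.
POINTER = re.compile(r"^[1-9]\d* (\w+) @([^@]+)@\s*$")


def find_dangling(lines):
    """Return sorted (tag, xref) pairs that point at undefined records."""
    defined = set()
    used = set()
    for line in lines:
        line = line.strip()
        match = DEFINITION.match(line)
        if match:
            defined.add(match.group(1))
            continue
        match = POINTER.match(line)
        if match:
            used.add((match.group(1), match.group(2)))
    return sorted(pair for pair in used if pair[1] not in defined)


print(find_dangling(SAMPLE.splitlines()))  # [('NOTE', 'N2')]
```

The list that comes out could be compared with what check and repair reports after the import.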

The error message that I get looks very much like the one in

http://www.gramps-project.org/bugs/view.php?id=5184

which was closed because there was not enough feedback to reproduce it. 
I can provide that feedback, for 3.4.3 or trunk, so I suggest that above 
bug is reopened. Jerome may not like that, because he closed it just 
yesterday ...

thanks,

Enno

Re: [Gramps-devel] trunk problem importing GEDCOM files created by Windows programs

From: jerome <rom...@ya...> - 2013-02-23 08:28:55

> http://www.gramps-project.org/bugs/view.php?id=5184

> which was closed because there was not enough feedback to
> reproduce it. 
> I can provide that feedback, for 3.4.3 or trunk, so I
> suggest that above 
> bug is reopened. Jerome may not like that, because he closed
> it just 
> yesterday ...

Yes, feel free to reopen it.
I suppose no special access rights are needed on the tracker for this?
There is a button for it between the 'Relations' section and the 'Attachments' section.

Note that this report was for Gramps 3.3.0, the first release of the 3.3.x branch.
It was reported on 2011-08-26.

I tried to understand where or what could be wrong.
I asked the reporter to run 'Check and Repair' after a GEDCOM import, and whether there was a test-case database for reproducing it.

Tim has reviewed the citation section and the GEDCOM handling.
Since we could not reproduce it, the code may have changed since then, the reporter gave no additional information, and the report was for 3.3.0, closing it seemed like one solution. If someone else can reproduce it, we can reopen it or file a new bug report.

Note that while playing with an experimental GenoPro-to-Gramps converter, built on Python core modules and the XML file format[1], I often ran into broken or missing references. I also used an 'unknown' string when I hit this type of problem... We cannot fix all types of errors, and we cannot reject a complete file for one error (even if we can check the syntax and validate the file).

In the experimental converter, the main issues were family relations, status and individual locations. Python has nice and simple ways of handling these (map, list, dict, etc.). I did not use the simplest ones, but I suppose I got the expected ideas and concepts across! The second issue was related to notes: every time I saw no right place for some data, I stored the content in a Note object.

But like with Gedcom file format, there is no index for notes...
As for families, we can check the references in the individuals and in the complete family. How many passes does a complete family take?

As said, there are so many ways of generating GEDCOM sequences...
Running 'Check and Repair' after a GEDCOM import can be useful.
Otherwise, keep in mind that a GEDCOM file can get corrupted (e.g. have you ever sent a GEDCOM file with non-ASCII characters through a mail server that uses a different encoding from the file?). Is there any way to know whether the IDs set in the GEDCOM file are good enough for a trusted index? Does the program that generated the file aim at exchange and genealogical collaboration? Does it spend time on handling GEDCOM as well as possible? Etc.


[1] http://www.gramps-project.org/bugs/view.php?id=2757


Jérôme



Re: [Gramps-devel] trunk problem importing GEDCOM files created by Windows programs

From: Enno B. <enn...@gm...> - 2013-02-23 17:12:44

Hi Jerome,

I created a new issue for 3.4.2 as a follow-up to this:

http://www.gramps-project.org/bugs/view.php?id=5184

I was not able to reopen it, and I guess that’s because my Mantis status is reporter, not developer.
> But like with Gedcom file format, there is no index for notes...
Hm, I don’t understand what you mean by that. Sorry.

regards,

Enno

Re: [Gramps-devel] trunk problem importing GEDCOM files created by Windows programs

From: Jérôme <rom...@ya...> - 2013-02-24 09:08:22

Enno,

> I was not able to reopen it, and I guess that’s because my Mantis status
> is reporter, not developer

Maybe only the reporter and people having developer status are allowed 
to do that?

> I created a new issue for 3.4.2 as a followup to this:
>
>       http://www.gramps-project.org/bugs/view.php?id=5184

OK, thank you.

Yes, it seems to be the same problem, still present in the latest stable 
release. It also makes the per-project 'summary' tab clearer[1]!

> But like with Gedcom file format, there is no index for notes...
>
> Hm, I don’t understand what you mean by that. Sorry.

You can put a note almost anywhere: as a simple note with an ID, as source 
text, as a transcription in a source/citation, as a comment, etc.

The same goes for sources and objects, which can appear in multiple places 
and all have an ID.

See the table[2], which gives an overview.

Without a safe ID (xref or whatever key is used for the mapping) you can 
easily corrupt and lose data... See the quick tests[3] on FAM tags and relations.

That only some comments, texts and notes have an ID is a GEDCOM design 
choice.

=> One could say that Gramps should follow the GEDCOM logic...

Otherwise, when should a person exist:

1. once there is a gender value
2. once there is a name
3. once there is an ID
4. once there is a relation with someone else: parent of, child of, 
spouse of, partner of, witness at a shared event, noted in a source or 
transcription, etc.?

In a GEDCOM family, we assume that a parent with gender male is the 
father and a parent with gender female is the mother.

What happens in a GEDCOM file with a gender issue: someone noted as 
gender:male (INDI) set as wife (FAM), or someone noted as gender:female (INDI) 
set as husband (FAM)?
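
A consistency check of the kind this question implies could look like the sketch below. It is purely illustrative: the dictionaries stand for hypothetical, already parsed records, and none of the names come from Gramps or from a GEDCOM library.

```python
# Hypothetical, already parsed records: gender per individual, roles per family.
individuals = {"I1": "M", "I2": "F", "I3": "M"}
families = {
    "F1": {"HUSB": "I1", "WIFE": "I2"},
    "F2": {"HUSB": "I2", "WIFE": "I3"},  # roles contradict the gender values
}

# The assumption discussed above: HUSB implies male, WIFE implies female.
EXPECTED = {"HUSB": "M", "WIFE": "F"}


def role_conflicts(individuals, families):
    """List (family, role, person, gender) where role and gender disagree."""
    conflicts = []
    for fam_id in sorted(families):
        for role in sorted(families[fam_id]):
            person = families[fam_id][role]
            gender = individuals.get(person)
            # Unknown or missing gender is not reported as a conflict.
            if gender is not None and gender != EXPECTED[role]:
                conflicts.append((fam_id, role, person, gender))
    return conflicts


print(role_conflicts(individuals, families))
# [('F2', 'HUSB', 'I2', 'F'), ('F2', 'WIFE', 'I3', 'M')]
```

Whether an importer should reject, swap, or simply keep such roles is a separate design decision that this sketch leaves open.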

It seems that Gramps inherited these limitations/issues...

Where am I going with this?

Well, it is time to have a gedcom 5.5.x replacement!

'bettergedcom' or 'gedcomx' sound better...

At the least, everything around individual relations and families, 
sources/citations, places, notes and objects, and privacy is lacking for exchange.

I do not care about other specification goals related to the data business.


Anyway, maybe the subject title should be updated?

The title 'trunk problem importing GEDCOM files created by Windows 
programs' could become something less about the OS and more about 
GEDCOM customizations/variants and how Gramps currently handles them 
(also on 3.4.x).



[1] http://www.gramps-project.org/bugs/summary_page.php
[2] http://bertrand-maillard.pagesperso-orange.fr/sylvain/tabletag.html
[3] http://www.gramps-project.org/bugs/view.php?id=2710


regards,
Jérôme



Re: [Gramps-devel] problems importing GEDCOM files

From: Enno B. <enn...@gm...> - 2013-02-25 17:43:21

Hi Jérôme,
> Enno,
>
>> I was not able to reopen it, and I guess that’s because my Mantis status
>> is reporter, not developer
>
> Maybe only the reporter and people having developer status are allowed 
> to do that?
I think so, yes.
> Without a safe ID (xref or whatever key is used for the mapping) you can 
> easily corrupt and lose data... See the quick tests[3] on FAM tags and relations.
I know, and I have a new thought about that. Here it is:

After changing the transaction stuff for citation merge, I tried .gramps 
import with the test file with 100 001 citations. It takes more than an 
hour here, but it doesn't crash, and while it was running, I looked at 
some import source, and found that transactions are OFF during GEDCOM 
import. Can that be related to the corruptions we find?

I assume that transactions were switched off, because GEDCOM imports 
would also be very slow otherwise, and I haven't tested what happens 
when I switch them off in XML import too. It's worth a try though, I 
think, later ...
> Well, it is time to have a gedcom 5.5.x replacement!
>
> 'bettergedcom' or 'gedcomx' sound better...
I don't see much happening at better gedcom, but gedcomx sure does 
interesting things. One thing I see there, is that they got rid of 
citations, but do allow for nested sources! And the FamilySearch tree 
also takes a very simple approach to sources, which looks like the 
opposite of what others are doing. I like that.

I'm experimenting a lot with the FS tree, using RootsMagic, and 
previously Ancestral Quest, and if that's the future of genealogy, we 
may see more exchange of small datasets, i.e. single persons, families, 
and associated objects like places, notes, and sources. Do you see that too?

cheers,

Enno

Re: [Gramps-devel] trunk problem importing GEDCOM files created by Windows programs

From: Peter L. <pet...@te...> - 2013-02-18 18:42:42

On Monday 18 February 2013 18.20.26, Enno Borgsteede wrote:
> Hi,
> 
> When I try to import a GEDCOM file created by PAF, I see that the
> current code in trunk is not able to detect the file type by
> interpreting the 1st 2 bytes of that file. Instead, I see the error
> shown below:
> 
> /home/test/trunk/gramps/plugins/lib/libgedcom.py:7445: UnicodeWarning:
> Unicode equal comparison failed to convert both arguments to Unicode -
> interpreting them as being unequal
>    if line == "\xef\xbb":
> /home/test/trunk/gramps/plugins/lib/libgedcom.py:7449: UnicodeWarning:
> Unicode equal comparison failed to convert both arguments to Unicode -
> interpreting them as being unequal
>    elif line == "\xff\xfe":
> /home/test/trunk/gramps/plugins/lib/libgedcom.py:7455: UnicodeWarning:
> Unicode equal comparison failed to convert both arguments to Unicode -
> interpreting them as being unequal
>    elif line[0] == "\x00" or line[1] == "\x00":
> 2013-02-18 18:00:08.556: WARNING: libgedcom.py: line 7482: Foute lijn1
> in GEDCOM bestand.
libgedcom tries to compare a bytestring in "line" with an ordinary string (unicode) which fails.

/Peter

> My GEDCOM file starts like this (od -t x1):
> 
> 0000000 ef bb bf 30 20 48 45 41 44 0d 0a 31 20 53 4f 55
> 
> So I know that I need to get above code working to be able to import
> files that I create in PAF or RM. It's in __detect_file_decoder(self,
> input_file), and it works OK in 3.4. Can anyone help me here? I read a
> few things about the subject on stackoverflow, but I'm still lost here.
> I use Python 2.7.3, on Mint 14.
> 
> I tried opening the file in another mode, but that didn't help. Putting
> str() at the right side of == helps me to get rid of the Unicode error
> messages, but it doesn't help to get the import running. I assume that
> line is sort of a byte string that can't be converted to Unicode, so
> maybe the whole comparison code must be rewritten, but I don't know how.
> 
> My bug report is here:
> 
> http://www.gramps-project.org/bugs/view.php?id=6462
> 
> And I have enough time to experiment, once I know where to start.
> 
> thanks,
> 
> Enno
> 
> 

-- 
Peter Landgren

Re: [Gramps-devel] trunk problem importing GEDCOM files created by Windows programs

From: Enno B. <enn...@gm...> - 2013-02-18 20:08:22

Hi Peter,
> On Monday 18 February 2013 18.20.26, Enno Borgsteede wrote:
>> /home/test/trunk/gramps/plugins/lib/libgedcom.py:7445: UnicodeWarning:
>> Unicode equal comparison failed to convert both arguments to Unicode -
>> interpreting them as being unequal
>>     if line == "\xef\xbb":
>>
> libgedcom tries to compare a bytestring in "line" with an ordinary string (unicode) which fails.
>
> /Peter
I got that, but I have no idea on the repair strategy. Tried backquotes 
around line, and extra ' and \ to the right of the == and that works, 
but there must be a better way, like forcing the right parameter to be 
interpreted as a bytestring, just like I would do in C(++).

I tried str() on the right string, which let me get rid of the error, 
but the compare didn't return the right encoding. Elsewhere, I read 
about using encode, but I haven't tried that yet.

thanks,

Enno

Re: [Gramps-devel] trunk problem importing GEDCOM files created by Windows programs

From: Enno B. <enn...@gm...> - 2013-02-18 19:54:38

Hi John,
> /home/test/trunk/gramps/plugins/lib/libgedcom.py:7445: UnicodeWarning:
> Unicode equal comparison failed to convert both arguments to Unicode -
> interpreting them as being unequal
>    if line == "\xef\xbb":
> Two suggestions:
>
> Put a print(lines) at libgedcom.py:7445 to see what it's actually reading,
> and figure out where it's actually opening the file and make sure that it's doing so in binary ("rb").
I added

         print(`line`)

and got this

'\xef\xbb'

so the file is being read like it should, even though it seems to be 
opened as "r".

In other words, I still have to repair the ifs, so that they compare 
byte strings instead of unicode.

Changing the 1st if to

         if `line` == "'\\xef\\xbb'":

and making the same change to the other ones, including one in 
UTF8Reader works, but I hope there is a nicer way to accomplish this.

thanks,

Enno

Re: [Gramps-devel] trunk problem importing GEDCOM files created by Windows programs

From: John R. <jr...@ce...> - 2013-02-18 20:30:57

On Feb 18, 2013, at 11:54 AM, Enno Borgsteede <enn...@gm...> wrote:

> Hi John,
>> /home/test/trunk/gramps/plugins/lib/libgedcom.py:7445: UnicodeWarning:
>> Unicode equal comparison failed to convert both arguments to Unicode -
>> interpreting them as being unequal
>>   if line == "\xef\xbb":
>> Two suggestions:
>> 
>> Put a print(lines) at libgedcom.py:7445 to see what it's actually reading,
>> and figure out where it's actually opening the file and make sure that it's doing so in binary ("rb").
> I added
> 
>        print(`line`)
> 
> and got this
> 
> '\xef\xbb'
> 
> so the file is being read like it should, even though it seems to be opened as "r".
> 
> In other words, I still have to repair the ifs, so that they compare byte strings instead of unicode.
> 
> Changing the 1st if to
> 
>        if `line` == "'\\xef\\xbb'":
> 
> and making the same change to the other ones, including one in UTF8Reader works, but I hope there is a nicer way to accomplish this.
> 

I think that's a band-aid. line shouldn't be unicode. That it is suggests the file is getting opened with io.open(encoding="utf-8"), which it clearly shouldn't be: It should be opened binary ("rb") with no encoding.

Then, so that it will work with both Py2 and Py3, the tests should be
  if line == b"\xef\xbb"

Regards,
John Ralls
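
The approach described in this message, reading bytes and comparing them with byte literals, can be sketched as follows. This is a hedged illustration and not the actual `__detect_file_decoder`: the function name `detect_bom` and the returned labels are assumptions made for the example, and only the three comparisons are taken from the warnings quoted in this thread.

```python
import io


def detect_bom(stream):
    """Classify a binary stream by its first two bytes, then rewind it."""
    line = stream.read(2)
    stream.seek(0)
    if line == b"\xef\xbb":
        return "utf-8"
    elif line == b"\xff\xfe":
        return "utf-16"
    # Slicing keeps this comparison bytes-to-bytes on Python 2 and 3;
    # on Python 3, line[0] alone would be an int.
    elif line[0:1] == b"\x00" or line[1:2] == b"\x00":
        return "utf-16-no-bom"
    return None


# io.BytesIO stands in for a file opened with open(filename, "rb").
print(detect_bom(io.BytesIO(b"\xef\xbb\xbf0 HEAD\r\n")))  # utf-8
print(detect_bom(io.BytesIO(b"0 HEAD\r\n")))              # None
```

Because every literal carries the b prefix, the comparisons stay byte-to-byte no matter whether the module uses unicode_literals.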

Re: [Gramps-devel] trunk problem importing GEDCOM files created by Windows programs

From: Enno B. <enn...@gm...> - 2013-02-18 21:21:27

Hi John,
> I think that's a band-aid. line shouldn't be unicode.
I agree.
> That it is suggests the file is getting opened with 
> io.open(encoding="utf-8"), which it clearly shouldn't be: It should be 
> opened binary ("rb") with no encoding.
So, this section in GedcomInfoDB

         if sys.version_info[0] < 3:
             try:
                 filepath = os.path.join(DATA_DIR, "gedcom.xml")
                 ged_file = open(filepath.encode('iso8859-1'), "r")
             except:
                 return
         else:
             try:
                 filepath = os.path.join(DATA_DIR, "gedcom.xml")
                 ged_file = open(filepath, "rb")
             except:
                 return

can be reduced to the else part. Sounds good.
> Then, so that it will work with both Py2 and Py3, the tests should be 
> if line == b"\xef\xbb"
That's the info that I needed. Thanks!

regards,

Enno

Re: [Gramps-devel] trunk problem importing GEDCOM files created by Windows programs

From: Enno B. <enn...@gm...> - 2013-02-18 22:18:05

On 18-02-13 21:30, John Ralls wrote:
> Then, so that it will work with both Py2 and Py3, the tests should be 
> if line == b"\xef\xbb"
Something weird here: The Python site says that the b prefix is ignored 
in Python 2, but my experience shows that it definitely makes a 
difference in 2.7.3.

cheers,

Enno

Re: [Gramps-devel] trunk problem importing GEDCOM files created by Windows programs

From: John R. <jr...@ce...> - 2013-02-18 23:16:56

On Feb 18, 2013, at 2:17 PM, Enno Borgsteede <enn...@gm...> wrote:

> On 18-02-13 21:30, John Ralls wrote:
>> Then, so that it will work with both Py2 and Py3, the tests should be if line == b"\xef\xbb"
> Something weird here: The Python site says that the b prefix is ignored in Python 2, but my experience shows that it definitely makes a difference in 2.7.3.

Really?

Python 2.7.3 (default, Jan  6 2013, 14:08:41) 
[GCC 4.2.1 (Apple Inc. build 5659)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> p b"\xec\xbb"
  File "<stdin>", line 1
    p b"\xec\xbb"
                ^
SyntaxError: invalid syntax
>>> b"\xec\xbb"
'\xec\xbb'
>>> "\xec\xbb" == b"\xec\xbb"
True
>>> u"\xec\xbb" == b"\xec\xbb"
__main__:1: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
False
>>> u"\xec\xbb"
u'\xec\xbb'
>>> u"\xec\xbb" == "\xec\xbb"
False
>>> 

Tells me that if the b isn't ignored, it's not doing anything. But you still need it for Py3.

Regards,
John Ralls

Re: [Gramps-devel] trunk problem importing GEDCOM files created by Windows programs

From: Benny M. <ben...@gm...> - 2013-02-19 09:19:34

2013/2/19 John Ralls <jr...@ce...>

>
> On Feb 18, 2013, at 2:17 PM, Enno Borgsteede <enn...@gm...> wrote:
>
> > On 18-02-13 21:30, John Ralls wrote:
> >> Then, so that it will work with both Py2 and Py3, the tests should be
> if line == b"\xef\xbb"
> > Something weird here: The Python site says that the b prefix is ignored
> in Python 2, but my experience shows that it definitely makes a difference
> in 2.7.3.
>
> Really?
>
> Python 2.7.3 (default, Jan  6 2013, 14:08:41)
> [GCC 4.2.1 (Apple Inc. build 5659)] on darwin
> Type "help", "copyright", "credits" or "license" for more information.
> >>> p b"\xec\xbb"
>   File "<stdin>", line 1
>     p b"\xec\xbb"
>                 ^
> SyntaxError: invalid syntax
> >>> b"\xec\xbb"
> '\xec\xbb'
> >>> "\xec\xbb" == b"\xec\xbb"
> True
> >>> u"\xec\xbb" == b"\xec\xbb"
> __main__:1: UnicodeWarning: Unicode equal comparison failed to convert
> both arguments to Unicode - interpreting them as being unequal
> False
> >>> u"\xec\xbb"
> u'\xec\xbb'
> >>> u"\xec\xbb" == "\xec\xbb"
> False
> >>>
>
> Tells me that if the b isn't ignored, it's not doing anything. But you
> still need it for Py3.
>

Enno,

Python 2 behaves differently because of pieces like:

         if sys.version_info[0] < 3:
             try:
                 filepath = os.path.join(DATA_DIR, "gedcom.xml")
                 ged_file = open(filepath.encode('iso8859-1'), "r")
             except:
                 return
         else:
             try:
                 filepath = os.path.join(DATA_DIR, "gedcom.xml")
                 ged_file = open(filepath, "rb")
             except:
                 return
The first part is the code from Gramps 3.4, which is working at the moment.

So if that now fails, something somewhere else has changed.
Even if you keep only the else part in your change, Python 2 might still
behave differently, because the conversion functions that are used behave
differently.
See what is imported from constfunc, e.g. cuni, ...

Benny

Re: [Gramps-devel] trunk problem importing GEDCOM files created by Windows programs

From: Enno B. <enn...@gm...> - 2013-02-19 13:25:14

Hi Benny,
> python 2 behaves different because of pieces like:
>
>          if sys.version_info[0] < 3:
>              try:
>                  filepath = os.path.join(DATA_DIR, "gedcom.xml")
>                  ged_file = open(filepath.encode('iso8859-1'), "r")
>              except:
>                  return
>          else:
>              try:
>                  filepath = os.path.join(DATA_DIR, "gedcom.xml")
>                  ged_file = open(filepath, "rb")
>              except:
>                  return
> the first part is the code from gramps 3.4, which is working at the 
> moment.
John reminded me that this is not the actual GEDCOM, and to tell you the
truth, I have no idea why this code is there. I know that it's used, but
in the meantime I found the opening of the actual GEDCOM in
importgedcom.py:

         if sys.version_info[0] < 3:
             ifile = open(filename, "rU")
         else:
             ifile = open(filename, "rb")

I wonder what will happen when I change that to "rb".
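
As a rough illustration of what that mode change affects (a minimal sketch, not the Gramps code): binary mode hands back the raw bytes, including the CR LF line endings that Windows programs write, while a text-mode read with universal newlines decodes the data and translates the line endings. The temporary file, the sample records and the latin-1 encoding below are all assumptions made for the example.

```python
import io
import os
import tempfile

# Hypothetical sample: two GEDCOM-like records with Windows (CR LF) line endings.
sample = b"0 HEAD\r\n1 SOUR PAF\r\n"

path = os.path.join(tempfile.mkdtemp(), "sample.ged")
with io.open(path, "wb") as handle:
    handle.write(sample)

# Binary mode: lines come back as byte strings, line endings untouched.
with io.open(path, "rb") as handle:
    raw_lines = handle.readlines()

# Text mode with universal newlines (newline=None): lines are decoded
# and CR LF is translated to a single LF.
with io.open(path, "r", encoding="latin-1", newline=None) as handle:
    text_lines = handle.readlines()

print(raw_lines)
print(text_lines)
```

Anything that later compares the first bytes of a line against a literal has to use the same kind of string as the mode produced, which is where the b prefix comes in.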
> So if you now have a fail with that, something somewhere else changed.
> Even if you only do the else part in your change, python2 might still 
> behave different because the conversion functions that are used behave 
> different.
> See what is imported from constfunc, eg cuni, ...
You mean things like this (in importgedcom.py):

from gramps.gen.constfunc import STRTYPE

Things like that probably explain why the if that gave me a problem, 
works OK in a python shell, where nothing special is imported.

I see lots of 'interesting' imports in libgedcom.py too, like:

from __future__ import print_function, unicode_literals

if sys.version_info[0] < 3:
     from cStringIO import StringIO
else:
     from io import StringIO

But I tend to leave them as is, assuming that they were put there for a 
reason.
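
A small sketch of why that unicode_literals import could matter for the b prefix (one possible reading, not verified against libgedcom.py): with it in effect on Python 2, an unprefixed literal is a unicode object, just as on Python 3, while data read from a file opened with "rb" stays a byte string. The u prefix below stands in for the effect of the import, and the in-memory stream is an assumption replacing a real file.

```python
import io

# Stand-in for a GEDCOM file opened in binary mode (hypothetical content).
stream = io.BytesIO(b"\xef\xbb\xbf0 HEAD\r\n")
raw = stream.read(2)

# What an unprefixed "\xef\xbb" becomes under unicode_literals (and on Python 3).
text_literal = u"\xef\xbb"
# What the b prefix guarantees on both Python 2 and Python 3.
bytes_literal = b"\xef\xbb"

# The data read from the stream is not the same type as the text literal ...
same_type_as_text = isinstance(raw, type(text_literal))
# ... but it compares cleanly against the bytes literal.
matches_bytes = (raw == bytes_literal)

print(same_type_as_text, matches_bytes)
```

In a plain Python 2 shell without the import, both literals are byte strings, which would fit the observation that the same comparison works there.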

Latest news: When I change the open mentioned above to "rb" for all 
python versions, my RootsMagic GEDCOM file still imports OK, but the b's 
before the string literals are still needed. I think I can live with 
that, especially when someone else can test the attached PAF GEDCOM with 
python 3.

thanks,

Enno

Re: [Gramps-devel] trunk problem importing GEDCOM files created by Windows programs

From: Benny M. <ben...@gm...> - 2013-02-19 13:47:46

2013/2/19 Enno Borgsteede <enn...@gm...>

> Hi Benny,
>
>  python 2 behaves different because of pieces like:
>>
>>          if sys.version_info[0] < 3:
>>              try:
>>                  filepath = os.path.join(DATA_DIR, "gedcom.xml")
>>                  ged_file = open(filepath.encode('iso8859-1'), "r")
>>              except:
>>                  return
>>          else:
>>              try:
>>                  filepath = os.path.join(DATA_DIR, "gedcom.xml")
>>                  ged_file = open(filepath, "rb")
>>              except:
>>                  return
>> the first part is the code from gramps 3.4, which is working at the
>> moment.
>>
> John reminded me that this is not the actual GEDCOM, and to tell you the
> truth, I have no idea why this code is there. I know that it's used, but in
> the mean time I found the opening of the actual GEDCOM in importgedcom.py:
>
>         if sys.version_info[0] < 3:
>             ifile = open(filename, "rU")
>         else:
>             ifile = open(filename, "rb")
>
> I wonder what will happen when I change that to "rb".


The < 3 codepath is the old code which is well tested and is working at the
moment in version 3.4. So there should be no need to change that, or a lot of
things will have to be changed.
If things fail for you in python 2, it is because somebody changed
something somewhere else since the 3.4 code.

Benny

>
>  So if you now have a fail with that, something somewhere else changed.
>> Even if you only do the else part in your change, python2 might still
>> behave different because the conversion functions that are used behave
>> different.
>> See what is imported from constfunc, eg cuni, ...
>>
> You mean things like this (in importgedcom.py):
>
> from gramps.gen.constfunc import STRTYPE
>
> Things like that probably explain why the if that gave me a problem, works
> OK in a python shell, where nothing special is imported.
>
> I see lots of 'interesting' imports in libgedcom.py too, like:
>
> from __future__ import print_function, unicode_literals
>
> if sys.version_info[0] < 3:
>     from cStringIO import StringIO
> else:
>     from io import StringIO
>
> But I tend to leave them as is, assuming that they were put there for a
> reason.
>
> Latest news: When I change the open mentioned above to "rb" for all python
> versions, my RootsMagic GEDCOM file still imports OK, but the b's before
> the string literals are still needed. I think I can live with that,
> especially when someone else can test the attached PAF GEDCOM with python 3.
>
> thanks,
>
> Enno
>
>

Re: [Gramps-devel] trunk problem importing GEDCOM files created by Windows programs

From: Enno B. <enn...@gm...> - 2013-02-19 13:59:17

Hi John,
> On Feb 18, 2013, at 2:17 PM, Enno Borgsteede <enn...@gm...> wrote:
>
>> On 18-02-13 21:30, John Ralls wrote:
>>> Then, so that it will work with both Py2 and Py3, the tests should be if line == b"\xef\xbb"
>> Something weird here: The Python site says that the b prefix is ignored in Python 2, but my experience shows that it definitely makes a difference in 2.7.3.
> Really?
Yes. And I guess that's because of the types that are imported in 
Gramps, like Benny said. That's why your experiments in a python shell, 
and the ones I did here before, yield different results.

The latest code that I run here opens all files as "rb", and still needs 
those b's before string literals. It works OK on Windows files (PAF and 
RM) and GEDCOM files created by Gramps itself, in Linux Mint. The only 
thing that I haven't tested yet is importing a GEDCOM created by a 
Windows Gramps (3.4.2).
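
For reference, a minimal sketch of byte-order-mark detection on data read in binary mode, using the BOM constants from the standard codecs module instead of hand-written literals. guess_encoding_from_bom is a hypothetical helper, not the __detect_file_decoder method from libgedcom.py, and it deliberately ignores UTF-32 and files without a BOM.

```python
import codecs


def guess_encoding_from_bom(first_bytes):
    """Return a codec name suggested by a leading BOM, or None if there is none."""
    if first_bytes.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if first_bytes.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le"
    if first_bytes.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be"
    return None


# First bytes of the PAF file from the od dump: ef bb bf 30 20 48 45 41 44
paf_start = b"\xef\xbb\xbf0 HEAD"
print(guess_encoding_from_bom(paf_start))
print(guess_encoding_from_bom(b"0 HEAD"))
```

Because the constants are byte strings on both Python 2 and Python 3, the comparison does not depend on how string literals are interpreted in the importing module.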

I must add that in order to test GEDCOM files written by Gramps trunk, I 
first had to change the export code, like this:

Index: gramps/plugins/export/exportgedcom.py
===================================================================
--- gramps/plugins/export/exportgedcom.py    (revision 21372)
+++ gramps/plugins/export/exportgedcom.py    (working copy)
@@ -236,7 +236,7 @@
          """

          self.dirname = os.path.dirname (filename)
-        self.gedcom_file = io.open(filename, "w", encoding='utf-8')
+        self.gedcom_file = open(filename, "w")
          self._header(filename)
          self._submitter()
          self._individuals()

Which puts me in a situation where I read all GEDCOMs as rb, and write 
them as w. What logic is that?

regards,

Enno

Re: [Gramps-devel] trunk problem importing GEDCOM files created by Windows programs

From: Benny M. <ben...@gm...> - 2013-02-19 16:39:51

2013/2/19 Enno Borgsteede <enn...@gm...>

> Hi John,
> > On Feb 18, 2013, at 2:17 PM, Enno Borgsteede <enn...@gm...> wrote:
> >
> >> On 18-02-13 21:30, John Ralls wrote:
> >>> Then, so that it will work with both Py2 and Py3, the tests should be
> >>> if line == b"\xef\xbb"
> >> Something weird here: The Python site says that the b prefix is ignored
> >> in Python 2, but my experience shows that it definitely makes a difference
> >> in 2.7.3.
> > Really?
> Yes. And I guess that's because of the types that are imported in
> Gramps, like Benny said. That's why your experiments in a python shell,
> and the ones I did here before, yield different results.
>
> The latest code that I run here opens all files as "rb", and still needs
> those b's before string literals. It works OK on Windows files (PAF and
> RM) and GEDCOM files created by Gramps itself, in Linux Mint. The only
> thing that I haven't tested yet is importing a GEDCOM created by a
> Windows Gramps (3.4.2).
>
> I must add that in order to test GEDCOM files written by Gramps trunk, I
> first had to change the export code, like this:
>
> Index: gramps/plugins/export/exportgedcom.py
> ===================================================================
> --- gramps/plugins/export/exportgedcom.py    (revision 21372)
> +++ gramps/plugins/export/exportgedcom.py    (working copy)
> @@ -236,7 +236,7 @@
>           """
>
>           self.dirname = os.path.dirname (filename)
> -        self.gedcom_file = io.open(filename, "w", encoding='utf-8')
> +        self.gedcom_file = open(filename, "w")
>           self._header(filename)
>           self._submitter()
>           self._individuals()
>

Hmm. It is not wrong of Gramps to create its own output in a fixed known
encoding, and utf-8 should be the one to use then.
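
A minimal sketch of what a fixed UTF-8 output looks like with io.open (the file name and the write_line helper are made up for the example, not taken from exportgedcom.py): a file opened this way accepts only text, so every line has to be a unicode string before it is written, and a byte string is rejected with a TypeError.

```python
import io
import os
import tempfile


def write_line(handle, level, token):
    """Format one GEDCOM-style line as text and write it."""
    handle.write(u"%d %s\n" % (level, token))


path = os.path.join(tempfile.mkdtemp(), "export.ged")
rejected_bytes = False
with io.open(path, "w", encoding="utf-8") as handle:
    write_line(handle, 0, u"HEAD")
    write_line(handle, 1, u"NAME Ren\xe9")
    try:
        handle.write(b"0 TRLR\n")  # byte string: not accepted by a text-mode file
    except TypeError:
        rejected_bytes = True

# Read the result back as raw bytes to see the encoding that was applied.
with io.open(path, "rb") as handle:
    written = handle.read()

print(rejected_bytes)
print(written)
```

Reading the file back in binary mode shows the non-ASCII character stored as a two-byte UTF-8 sequence, whatever the platform default encoding is.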

Benny

>
> Which puts me in a situation where I read all GEDCOMs as rb, and write
> them as w. What logic is that?
>
> regards,
>
> Enno
>
>
>

Re: [Gramps-devel] trunk problem importing GEDCOM files created by Windows programs

From: John R. <jr...@ce...> - 2013-02-19 17:07:58

On Feb 19, 2013, at 8:39 AM, Benny Malengier <ben...@gm...> wrote:

> 
> 
> 
> 2013/2/19 Enno Borgsteede <enn...@gm...>
> Hi John,
> > On Feb 18, 2013, at 2:17 PM, Enno Borgsteede <enn...@gm...> wrote:
> >
> >> On 18-02-13 21:30, John Ralls wrote:
> >>> Then, so that it will work with both Py2 and Py3, the tests should be if line == b"\xef\xbb"
> >> Something weird here: The Python site says that the b prefix is ignored in Python 2, but my experience shows that it definitely makes a difference in 2.7.3.
> > Really?
> Yes. And I guess that's because of the types that are imported in
> Gramps, like Benny said. That's why your experiments in a python shell,
> and the ones I did here before, yield different results.
> 
> The latest code that I run here opens all files as "rb", and still needs
> those b's before string literals. It works OK on Windows files (PAF and
> RM) and GEDCOM files created by Gramps itself, in Linux Mint. The only
> thing that I haven't tested yet is importing a GEDCOM created by a
> Windows Gramps (3.4.2).
> 
> I must add that in order to test GEDCOM files written by Gramps trunk, I
> first had to change the export code, like this:
> 
> Index: gramps/plugins/export/exportgedcom.py
> ===================================================================
> --- gramps/plugins/export/exportgedcom.py    (revision 21372)
> +++ gramps/plugins/export/exportgedcom.py    (working copy)
> @@ -236,7 +236,7 @@
>           """
> 
>           self.dirname = os.path.dirname (filename)
> -        self.gedcom_file = io.open(filename, "w", encoding='utf-8')
> +        self.gedcom_file = open(filename, "w")
>           self._header(filename)
>           self._submitter()
>           self._individuals()
> 
> Hmm. It is not wrong of Gramps to create its own output in a fixed known encoding, and utf-8 should be the one to use then.
> 
> 
> Which puts me in a situation where I read all GEDCOMs as rb, and write
> them as w. What logic is that?
> 

I think the message here is that the gedcom import/export plugins are doing things which have side effects on reading and
writing files. It might have seemed a brilliant hack at the time, and it might even have been necessary, but it's making it difficult
now to figure out how to get everything wired up correctly for both Py2 and Py3.

Regards,
John Ralls

Re: [Gramps-devel] trunk problem importing GEDCOM files created by Windows programs

From: John R. <jr...@ce...> - 2013-02-19 19:53:55

On Feb 19, 2013, at 5:59 AM, Enno Borgsteede <enn...@gm...> wrote:
> 
> I must add that in order to test GEDCOM files written by Gramps trunk, I first had to change the export code, like this:
> 
> Index: gramps/plugins/export/exportgedcom.py
> ===================================================================
> --- gramps/plugins/export/exportgedcom.py    (revision 21372)
> +++ gramps/plugins/export/exportgedcom.py    (working copy)
> @@ -236,7 +236,7 @@
>         """
> 
>         self.dirname = os.path.dirname (filename)
> -        self.gedcom_file = io.open(filename, "w", encoding='utf-8')
> +        self.gedcom_file = open(filename, "w")
>         self._header(filename)
>         self._submitter()
>         self._individuals()

Is that because otherwise you get this?
Traceback (most recent call last):
  File "/Users/john/Development/Gramps-Build/Gramps-svn/src/gramps-git/gramps/gui/plug/export/_exportassistant.py", line 484, in do_prepare
    success = self.save()
  File "/Users/john/Development/Gramps-Build/Gramps-svn/src/gramps-git/gramps/gui/plug/export/_exportassistant.py", line 589, in save
    self.option_box_instance)
  File "/Users/john/Development/Gramps-Build/Gramps-svn/src/gramps-git/gramps/plugins/export/exportgedcom.py", line 1447, in export_data
    ret = ged_write.write_gedcom_file(filename)
  File "/Users/john/Development/Gramps-Build/Gramps-svn/src/gramps-git/gramps/plugins/export/exportgedcom.py", line 240, in write_gedcom_file
    self._header(filename)
  File "/Users/john/Development/Gramps-Build/Gramps-svn/src/gramps-git/gramps/plugins/export/exportgedcom.py", line 323, in _header
    self._writeln(0, "HEAD")
  File "/Users/john/Development/Gramps-Build/Gramps-svn/src/gramps-git/gramps/plugins/export/exportgedcom.py", line 282, in _writeln
    self.gedcom_file.write("%d %s\n" % (level, token))
TypeError: must be unicode, not str

The correct fix for that is:

--- a/gramps/plugins/export/exportgedcom.py
+++ b/gramps/plugins/export/exportgedcom.py
@@ -271,15 +271,15 @@ class GedcomWriter(UpdateCallback):
                 # make it unicode so that breakup below does the right thin.
                 text = cuni(text)
                 if limit:
-                    prefix = "\n%d CONC " % (level + 1)
+                    prefix = cuni("\n%d CONC " % (level + 1))
                     txt = prefix.join(breakup(text, limit))
                 else:
                     txt = text
-                self.gedcom_file.write("%d %s %s\n" % (token_level, token, txt)
+                self.gedcom_file.write(cuni("%d %s %s\n" % (token_level, token,
                 token_level = level + 1
                 token = "CONT"
         else:
-            self.gedcom_file.write("%d %s\n" % (level, token))
+            self.gedcom_file.write(cuni("%d %s\n" % (level, token)))
     
     def _header(self, filename):
         """

After applying that, I'm able to round-trip with the following error:
Warn: ADDR overwritten             Line    19: 2 ADR1 Not Provided
Error: REPO '' (input as @@) not in input GEDCOM. Record with typifying attribute 'Unknown' created

The imported file was not self-contained.
To correct for that, 1 objects were created and
their typifying attribute was set to 'Unknown'.
Where possible these 'Unknown' objects are 
referenced by note N0002.

I'll commit that change after I test it with Py3.

Regards,
John Ralls

Re: [Gramps-devel] trunk problem importing GEDCOM files created by Windows programs

From: Enno B. <enn...@gm...> - 2013-02-19 20:43:21

Hi John,
> Is that because otherwise you get this?
> Traceback (most recent call last):
>    File "/Users/john/Development/Gramps-Build/Gramps-svn/src/gramps-git/gramps/gui/plug/export/_exportassistant.py", line 484, in do_prepare
>      success = self.save()
>    File "/Users/john/Development/Gramps-Build/Gramps-svn/src/gramps-git/gramps/gui/plug/export/_exportassistant.py", line 589, in save
>      self.option_box_instance)
>    File "/Users/john/Development/Gramps-Build/Gramps-svn/src/gramps-git/gramps/plugins/export/exportgedcom.py", line 1447, in export_data
>      ret = ged_write.write_gedcom_file(filename)
>    File "/Users/john/Development/Gramps-Build/Gramps-svn/src/gramps-git/gramps/plugins/export/exportgedcom.py", line 240, in write_gedcom_file
>      self._header(filename)
>    File "/Users/john/Development/Gramps-Build/Gramps-svn/src/gramps-git/gramps/plugins/export/exportgedcom.py", line 323, in _header
>      self._writeln(0, "HEAD")
>    File "/Users/john/Development/Gramps-Build/Gramps-svn/src/gramps-git/gramps/plugins/export/exportgedcom.py", line 282, in _writeln
>      self.gedcom_file.write("%d %s\n" % (level, token))
> TypeError: must be unicode, not str
>
> The correct fix for that is:
>
> --- a/gramps/plugins/export/exportgedcom.py
> +++ b/gramps/plugins/export/exportgedcom.py
> @@ -271,15 +271,15 @@ class GedcomWriter(UpdateCallback):
>                   # make it unicode so that breakup below does the right thin.
>                   text = cuni(text)
>                   if limit:
> -                    prefix = "\n%d CONC " % (level + 1)
> +                    prefix = cuni("\n%d CONC " % (level + 1))
>                       txt = prefix.join(breakup(text, limit))
>                   else:
>                       txt = text
> -                self.gedcom_file.write("%d %s %s\n" % (token_level, token, txt)
> +                self.gedcom_file.write(cuni("%d %s %s\n" % (token_level, token,
>                   token_level = level + 1
>                   token = "CONT"
>           else:
> -            self.gedcom_file.write("%d %s\n" % (level, token))
> +            self.gedcom_file.write(cuni("%d %s\n" % (level, token)))
>       
>       def _header(self, filename):
>           """
That's it, right. I filed a bug report for that

http://www.gramps-project.org/bugs/view.php?id=6461

and I plan to test your patch later tonight. I'm in a RootsDev 
conference right now.
> After applying that, I'm able to round-trip with the following error:
> Warn: ADDR overwritten             Line    19: 2 ADR1 Not Provided
> Error: REPO '' (input as @@) not in input GEDCOM. Record with typifying attribute 'Unknown' created
>
> The imported file was not self-contained.
> To correct for that, 1 objects were created and
> their typifying attribute was set to 'Unknown'.
> Where possible these 'Unknown' objects are
> referenced by note N0002.
Ah, yes, I know those errors. I filed a report for ADDR stuff earlier 
myself, for 3.4. And I saw that the 'Unknown' objects may create fatal 
errors on the next GEDCOM export. I'll have to retest that, and see if 
there's a report for that.
> I'll commit that change after I test it with Py3.
Thanks.

Enno

Re: [Gramps-devel] trunk problem importing GEDCOM files created by Windows programs

From: John R. <jr...@ce...> - 2013-02-19 22:27:58

On Feb 19, 2013, at 12:43 PM, Enno Borgsteede <enn...@gm...> wrote:

> Hi John,
>> Is that because otherwise you get this?
>> Traceback (most recent call last):
>>   File "/Users/john/Development/Gramps-Build/Gramps-svn/src/gramps-git/gramps/gui/plug/export/_exportassistant.py", line 484, in do_prepare
>>     success = self.save()
>>   File "/Users/john/Development/Gramps-Build/Gramps-svn/src/gramps-git/gramps/gui/plug/export/_exportassistant.py", line 589, in save
>>     self.option_box_instance)
>>   File "/Users/john/Development/Gramps-Build/Gramps-svn/src/gramps-git/gramps/plugins/export/exportgedcom.py", line 1447, in export_data
>>     ret = ged_write.write_gedcom_file(filename)
>>   File "/Users/john/Development/Gramps-Build/Gramps-svn/src/gramps-git/gramps/plugins/export/exportgedcom.py", line 240, in write_gedcom_file
>>     self._header(filename)
>>   File "/Users/john/Development/Gramps-Build/Gramps-svn/src/gramps-git/gramps/plugins/export/exportgedcom.py", line 323, in _header
>>     self._writeln(0, "HEAD")
>>   File "/Users/john/Development/Gramps-Build/Gramps-svn/src/gramps-git/gramps/plugins/export/exportgedcom.py", line 282, in _writeln
>>     self.gedcom_file.write("%d %s\n" % (level, token))
>> TypeError: must be unicode, not str
>> 
>> The correct fix for that is:
>> 
>> --- a/gramps/plugins/export/exportgedcom.py
>> +++ b/gramps/plugins/export/exportgedcom.py
>> @@ -271,15 +271,15 @@ class GedcomWriter(UpdateCallback):
>>                  # make it unicode so that breakup below does the right thin.
>>                  text = cuni(text)
>>                  if limit:
>> -                    prefix = "\n%d CONC " % (level + 1)
>> +                    prefix = cuni("\n%d CONC " % (level + 1))
>>                      txt = prefix.join(breakup(text, limit))
>>                  else:
>>                      txt = text
>> -                self.gedcom_file.write("%d %s %s\n" % (token_level, token, txt)
>> +                self.gedcom_file.write(cuni("%d %s %s\n" % (token_level, token,
>>                  token_level = level + 1
>>                  token = "CONT"
>>          else:
>> -            self.gedcom_file.write("%d %s\n" % (level, token))
>> +            self.gedcom_file.write(cuni("%d %s\n" % (level, token)))
>>
>>      def _header(self, filename):
>>          """
> That's it, right. I filed a bug report for that
> 
> http://www.gramps-project.org/bugs/view.php?id=6461
> 
> and I plan to test your patch later tonight. I'm in a RootsDev conference right now.
>> After applying that, I'm able to round-trip with the following error:
>> Warn: ADDR overwritten             Line    19: 2 ADR1 Not Provided
>> Error: REPO '' (input as @@) not in input GEDCOM. Record with typifying attribute 'Unknown' created
>> 
>> The imported file was not self-contained.
>> To correct for that, 1 objects were created and
>> their typifying attribute was set to 'Unknown'.
>> Where possible these 'Unknown' objects are
>> referenced by note N0002.
> Ah, yes, I know those errors. I filed a report for ADDR stuff earlier myself, for 3.4. And I saw that the 'Unknown' objects may create fatal errors on the next GEDCOM export. I'll have to retest that, and see if there's a report for that.
>> I'll commit that change after I test it with Py3.

r21375, backported to gramps40 r21378, and I resolved the bug. Thanks for pointing it out, I hadn't made the connection.

Regards,
John Ralls
