#355 GedMerge v0.7 - Merge GEDCOM files

Status: open
Owner: Greg Roach
Priority: 5
Updated: 2009-07-26
Created: 2006-10-20
Creator: Greg Roach
Private: No

Anyone who has merged two gedcom files will know that
it is a slow and error-prone task. This module allows
a site administrator to merge (import) the indi/fam
facts from one gedcom into another. It automates the
entire process, leaving the user to simply confirm each
step.

The whole process is incremental, so you can stop at
any time and when you restart it will resume where you
left off. It also means that if you make a small
change in your import gedcom, then the next time you
import it you will only have to process the changes,
not the whole file.

It works by using the _UID tags to identify objects
that are the same in both files. These are copied
between gedcoms or created if not already present.
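
As a rough illustration of the _UID matching idea (not the module's actual code, which is PHP), the sketch below indexes each record by its _UID value and pairs up records whose values appear in both files. The function names uid_index and match_by_uid are hypothetical, and the parser assumes one level-1 _UID line per record.

```python
def uid_index(gedcom_text):
    """Map each _UID value to the xref of the record that carries it."""
    index = {}
    xref = None
    for line in gedcom_text.splitlines():
        parts = line.strip().split(" ", 2)
        if len(parts) == 3 and parts[0] == "0" and parts[1].startswith("@"):
            # Start of a record such as "0 @I1@ INDI"
            xref = parts[1].strip("@")
        elif parts[0] == "0":
            # Level-0 line without an xref (HEAD, TRLR): no current record
            xref = None
        elif len(parts) == 3 and parts[0] == "1" and parts[1] == "_UID" and xref:
            index[parts[2].strip()] = xref
    return index


def match_by_uid(main_text, import_text):
    """Return sorted (main_xref, import_xref) pairs that share a _UID."""
    main = uid_index(main_text)
    imported = uid_index(import_text)
    return sorted((main[uid], imported[uid]) for uid in main.keys() & imported.keys())


MAIN = "0 HEAD\n0 @I1@ INDI\n1 NAME X\n1 _UID AAA\n0 @I2@ INDI\n1 _UID BBB\n0 TRLR"
IMPORT = "0 @I7@ INDI\n1 _UID BBB\n0 @I8@ INDI\n1 _UID CCC"
print(match_by_uid(MAIN, IMPORT))
```

Records left unmatched on either side are the ones that would then go through the relationship-based matching described next.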

From each pair of matched objects, it tries to match
further objects using the FAMS/FAMC relationships. For
example, if the main gedcom has person X married to person
Y1 and the import gedcom has person X married to person
Y2, then there are two options: Y1 and Y2 are the same
person (so match them) or they are different persons
(so import Y2 into the main gedcom). Simply click on
the appropriate option. The "most likely" option is
generally the one at the top of the list.

(NB it is possible here that person Y2 should in fact
be married to someone completely different. If you
find a "structural" difference between the gedcoms, you
will need to stop and fix it before continuing.)

For each pair of matched objects, a "smart import"
process is then used that will import only facts that
are not already present. There is a degree of
intelligence built in, so that if you already have

1 DEAT
2 DATE 21 JAN 1900
2 PLAC Westminster, London, England

The system won't import any of these:

1 DEAT Y

1 DEAT
2 DATE ABT 1900

1 DEAT
2 PLAC Westminster
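
One way such a check could work is sketched below. It is an assumption-laden illustration, not the module's code: a candidate fact is treated as redundant when every one of its sub-values is a substring of the matching sub-value on an existing fact with the same tag. Stripping the ABT/EST/CAL prefix from dates is an extra assumption made here so that the three examples above are all rejected; the names parse_fact, normalise and is_redundant are hypothetical.

```python
QUALIFIERS = ("ABT", "EST", "CAL")  # assumption: approximate-date prefixes to ignore


def parse_fact(text):
    """Split a fact into (tag, value, [(subtag, subvalue), ...])."""
    lines = [ln.strip().split(" ", 2) for ln in text.strip().splitlines()]
    head = lines[0]
    value = head[2] if len(head) > 2 else ""
    subs = [(p[1], p[2] if len(p) > 2 else "") for p in lines[1:]]
    return head[1], value, subs


def normalise(tag, value):
    """Upper-case the value and drop a leading date qualifier, if any."""
    words = value.upper().split()
    if tag == "DATE" and words and words[0] in QUALIFIERS:
        words = words[1:]
    return " ".join(words)


def is_redundant(existing, candidate):
    """True if the candidate fact adds nothing to the existing fact."""
    e_tag, e_val, e_subs = parse_fact(existing)
    c_tag, c_val, c_subs = parse_fact(candidate)
    if e_tag != c_tag:
        return False
    if c_val not in ("", "Y", e_val):
        return False
    # Every candidate sub-value must be contained in an existing one.
    for tag, val in c_subs:
        if not any(tag == t and normalise(tag, val) in normalise(t, v)
                   for t, v in e_subs):
            return False
    return True


EXISTING = "1 DEAT\n2 DATE 21 JAN 1900\n2 PLAC Westminster, London, England"
for cand in ("1 DEAT Y", "1 DEAT\n2 DATE ABT 1900", "1 DEAT\n2 PLAC Westminster"):
    print(repr(cand), is_redundant(EXISTING, cand))
```

A plain substring test is crude (for instance, "190" would count as part of "1900"), so this only removes some duplicates rather than all of them.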

All imported facts are "tagged" with the same source
record, allowing you to easily identify and review them
after the import.
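
The tagging could look something like the sketch below, which appends a SOUR reference one level beneath each imported fact and later filters facts by that reference. This is a guess at the general shape, not the module's code; the helper names and the source xref S9 are made up for the example.

```python
def tag_with_source(fact, source_xref):
    """Append a SOUR reference one level below the fact's own level."""
    lines = fact.strip().splitlines()
    level = int(lines[0].split(" ", 1)[0])
    lines.append(f"{level + 1} SOUR @{source_xref}@")
    return "\n".join(lines)


def facts_from_source(facts, source_xref):
    """Return only the facts that cite the given source record."""
    marker = f"SOUR @{source_xref}@"
    return [
        fact for fact in facts
        if any(ln.strip().split(" ", 1)[-1] == marker for ln in fact.splitlines())
    ]


imported = tag_with_source("1 DEAT\n2 DATE 21 JAN 1900", "S9")
print(imported)
print(facts_from_source([imported, "1 BIRT\n2 DATE 1850"], "S9"))
```

Filtering on the shared source reference is what makes it practical to review everything a given import added.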

Note that there is no "undo" facility (it doesn't
currently work with the accept/reject mechanism), so
make a backup before you start :-)

Discussion

1 2 3 > >> (Page 1 of 3)
  • Greg Roach
    Greg Roach
    2006-10-20

     
    Attachments
  • Greg Roach
    Greg Roach
    2006-10-20

    • assigned_to: nobody --> fisharebest
    • summary: GedMerge v0.1 --> GedMerge v0.1 - Merge GEDCOM files
     
  • Greg Roach
    Greg Roach
    2006-12-02

    • summary: GedMerge v0.1 - Merge GEDCOM files --> GedMerge v0.2 - Merge GEDCOM files
     
  • Greg Roach
    Greg Roach
    2006-12-02

    v0.2

     
    Attachments
  • Greg Roach
    Greg Roach
    2006-12-02

    Logged In: YES
    user_id=1466942
    Originator: YES

    v0.2 - fix bug spotted by booma

     
  • Greg Roach
    Greg Roach
    2006-12-04

    Logged In: YES
    user_id=1466942
    Originator: YES

    v0.3 - better matching on names and facts

     
  • Greg Roach
    Greg Roach
    2006-12-04

    v0.3

     
    Attachments
  • Greg Roach
    Greg Roach
    2006-12-04

    • summary: GedMerge v0.2 - Merge GEDCOM files --> GedMerge v0.3 - Merge GEDCOM files
     
  • Greg Roach
    Greg Roach
    2006-12-04

    • summary: GedMerge v0.3 - Merge GEDCOM files --> GedMerge v0.4 - Merge GEDCOM files
     
  • Greg Roach
    Greg Roach
    2006-12-04

    Logged In: YES
    user_id=1466942
    Originator: YES

    v0.4 - bugfix

     
  • Greg Roach
    Greg Roach
    2006-12-04

    v0.4

     
    Attachments
  • Logged In: YES
    user_id=1223195
    Originator: NO

    I don't know why, but I'm not finding GEDMERGE very intuitive. I'm trying to merge two GEDCOMs that have a single individual in common, but I seem to be in a loop on a FAM record - unable to get past it. Here's how I'm interpreting the action buttons:

    "Link these" - Means "These two records represent the same entity. Merge them."

    "Import this" - Means "This entity is not found in the primary GEDCOM. Add it."

    Is it this simple, or am I misinterpreting?

    GEDMERGE is a much-needed function. I'm probably just suffering from a mental block - not uncommon. I really want this thing to work!

     
  • Greg Roach
    Greg Roach
    2007-02-21

    Logged In: YES
    user_id=1466942
    Originator: YES

    <<Is it this simple, or am I misinterpreting?>>

    It *is* that simple.

    The process is fairly repetitive (e.g. Match person X. Match spouses of X. Match children of X, etc.), so might give the impression of being in a loop. Hopefully you're being given slightly different options each time.

    If you do think you've found a bug, I'd be grateful if you could email the details - especially if you can send some gedcoms (or excerpts) that reproduce it.

     
  • Laurie
    Laurie
    2007-03-22

    Logged In: YES
    user_id=1246980
    Originator: NO

    I attempted to use version 4 of this today and got the following error:

    ERROR 8: Undefined index: Lewis
    0 Error occurred on line 196 of file gedmerge.php in function extract_all
    1 called from line 329 of file gedmerge.php

    ERROR 8: Undefined index: Lewis
    0 Error occurred on line 222 of file gedmerge.php in function extract_all
    1 called from line 329 of file gedmerge.php

    Any ideas??

    Thanks

    Laurie

     
  • Greg Roach
    Greg Roach
    2007-03-22

    Logged In: YES
    user_id=1466942
    Originator: YES

    You need to import the new gedcom file before you try to run the merge. (You can delete it afterwards, but you can only merge databases when they are both already imported in PGV.)

    The error suggests you didn't do this.

    If it is something else, please email me more details and I'll investigate.

     
  • Rollike
    Rollike
    2007-06-05

    Logged In: YES
    user_id=1802699
    Originator: NO

    Hello.

    I would like to know what Dmiddle found out.
    I seem to get the same result: it loops, and I can't see any difference in what is happening.
    I have two options:
    I can click "Link these" or I can click "Import this". Isn't gedmerge.php supposed to jump to the next person after importing?
    I guess I have clicked that button 20 times now, and the same family appears again and again.
    The only thing that differs after importing is this line:
    "Linking Valo.ged:F610 and Sine_Nielsen.ged:F131 by _UID ..."
    Valo.ged:F610 is one number higher every time, and when I search my PhpGedView for the new family numbers, they do not exist.

    I would like to know if this sounds like a bug.
    I have downloaded v0.4.

    Thanks in advance
    Mette

     
  • Rollike
    Rollike
    2007-06-05

    Logged In: YES
    user_id=1802699
    Originator: NO

    Ehrm....
    SORRY!!! Hmm.. Okay, you can call me a noob!
    I had not enabled the auto-accept function!

    Now all is working perfectly! :-)

     
  • Greg Roach
    Greg Roach
    2007-06-05

    Logged In: YES
    user_id=1466942
    Originator: YES

    Do you have "auto-accept changes for this user" enabled? You need to.

     
  • Warren Meads
    Warren Meads
    2007-10-10

    Logged In: YES
    user_id=1505352
    Originator: NO

    I have used this twice over the last couple of days and have the following comments.

    This is a great addition to the functionality and should be added to the main admin menu.

    1 It does not work where there are blanks in the gedcom file name.
    2 I understand there is some intelligence built into this program for birth dates, but I found that where there was a year in the original record and a fuller date in the imported record, both records remained. Perhaps more work is needed around the christening tag as well.

    3 The Global ID from both records ended up in the merged record. This needs an option to remove the original one and replace it with the new one. This is difficult, as each situation may be quite different.

    4 The source record ended up in every fact being merged, including the name and gender. Perhaps options on these should be given, as my personal preference is not to put these particular level 2 sources in. I would have preferred a level 1 source record on new records, without level 2 on name and gender. I am OK with level 2 on the other tags, and with only level 2 on the facts transferred across on merged records.

    5 Given that user input is needed on every new (and merged) record, perhaps the same functionality as the individual merge should be provided, where you can choose which facts are transferred.

    Warren

     
  • Greg Roach
    Greg Roach
    2007-10-10

    Logged In: YES
    user_id=1466942
    Originator: YES

    Hi Warren, thanks for the feedback.

    1) I'll look into this

    2) The code looks for one date as a substring of another. Thus "1869" is a substring of "12 JAN 1869", and the code can assume the latter one is the "better" one. However, "ABT 1869" isn't a substring, so this test fails. I guess this is what is happening. The same logic is used on place records. This was just a bit of "quick+dirty" logic to eliminate *some* of the duplicates. I agree it could be better.

    3) This is deliberate. It means that if you get an updated copy of this gedcom in the future, all the linking will already be done for you. Personally, I use the privacy settings to hide _UID fields, so I don't see them.

    4) Good point. I'll look into changing this.

    5) The existing edit_merge.php could also be improved. One day, when I get lots of free time(!), I'll sort this out....

    Greg

     
  • Todd
    Todd
    2007-11-26

    Logged In: YES
    user_id=1764459
    Originator: NO

    OK, I cannot download the files... are they still available? Or is there a new way to merge the gedcom files from within PGV?

     
  • Greg Roach
    Greg Roach
    2007-11-26

    Logged In: YES
    user_id=1466942
    Originator: YES

    What exactly do you mean by <<I can not download the files>> ?

    The links at the bottom of this page work fine for me.

     
  • Todd
    Todd
    2007-11-26

    Logged In: YES
    user_id=1764459
    Originator: NO

    No problems now... I was getting a 500 error last night.

     
  • Warren Meads
    Warren Meads
    2008-02-21

    Logged In: YES
    user_id=1505352
    Originator: NO

    I have used this to merge a couple of gedcoms over the last 2 months. I generally think the program runs OK, but I came across a couple of issues.

    1 I have a web host who has restricted time to 50000 (although loading a gedcom only works on 20 secs), and this meant that the source gedcom could only contain about 100 individual records; any more and it timed out. I think the same time-out handling used in the gedcom load needs to be incorporated into gedmerge.

    2 During the processing (and I can only assume it timed out part way through a creation), it created new individual records in the destination gedcom with the same ID number as in the source. This meant that some original records in the destination gedcom were overwritten. This needs to be trapped and avoided.

    3 There were a couple of records in the source gedcom where the "name" was repeatedly written to the destination gedcom on every pass to find a new record to merge. In the end I removed the record from the source gedcom to stop this happening, and had to edit the destination individual record to remove the additional name tags.

     