#355 GedMerge v0.7 - Merge GEDCOM files

open
5
2009-07-26
2006-10-20
Greg Roach
No

Anyone who has merged two gedcom files will know that
it is a slow and error-prone task. This module allows
a site administrator to merge (import) the indi/fam
facts from one gedcom into another. It automates the
entire process, leaving the user to simply confirm each
step.

The whole process is incremental, so you can stop at
any time and when you restart it will resume where you
left off. It also means that if you make a small
change in your import gedcom, then the next time you
import it you will only have to process the changes,
not the whole file.

It works by using the _UID tags to identify objects
that are the same in both files. These are copied
between gedcoms or created if not already present.
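
As a rough illustration of the _UID matching step, here is a Python sketch (not the module's PHP code). The record layout (raw GEDCOM text keyed by xref) and the helper names `uid_index` and `match_by_uid` are assumptions made for this example only.

```python
import re

def uid_index(records):
    """Hypothetical helper: map each _UID value to the xref that carries it."""
    index = {}
    for xref, gedrec in records.items():
        m = re.search(r"^1 _UID (\S+)", gedrec, re.MULTILINE)
        if m:
            index[m.group(1)] = xref
    return index

def match_by_uid(main_records, import_records):
    """Return sorted (import_xref, main_xref) pairs for records sharing a _UID."""
    main_index = uid_index(main_records)
    pairs = []
    for uid, import_xref in uid_index(import_records).items():
        if uid in main_index:
            pairs.append((import_xref, main_index[uid]))
    return sorted(pairs)

# Made-up sample data: only John Smith carries the same _UID in both files.
main = {
    "I1": "0 @I1@ INDI\n1 NAME John /Smith/\n1 _UID AAA111",
    "I2": "0 @I2@ INDI\n1 NAME Mary /Jones/\n1 _UID BBB222",
}
incoming = {
    "I7": "0 @I7@ INDI\n1 NAME John /Smith/\n1 _UID AAA111",
    "I8": "0 @I8@ INDI\n1 NAME Ann /Brown/\n1 _UID CCC333",
}
pairs = match_by_uid(main, incoming)
```

In this sample, I7 in the import gedcom pairs with I1 in the main gedcom, and I8 is left for the relationship-based matching.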

From each pair of matched objects, it tries to match
further objects using the FAMS/FAMC relationships. For
example, if the main gedcom has person X married to person
Y1 and the import gedcom has person X married to person
Y2, then there are two options: Y1 and Y2 are the same
person (so match them) or they are different persons
(so import Y2 into the main gedcom). Simply click on
the appropriate option. The "most likely" option is
generally the one at the top of the list.
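
The list of options for one unmatched spouse could be built along these lines. This is a Python sketch, not the module's code: the function name `spouse_options` is hypothetical, and ranking the candidates by name similarity is an assumption, since the description does not say how the "most likely" option is chosen.

```python
from difflib import SequenceMatcher

def spouse_options(import_spouse, main_spouses):
    """List the choices for one unmatched spouse from the import gedcom.

    import_spouse and each entry of main_spouses are (xref, name) tuples.
    """
    xref, name = import_spouse
    scored = []
    for main_xref, main_name in main_spouses:
        score = SequenceMatcher(None, name.lower(), main_name.lower()).ratio()
        scored.append((score, ("match", xref, main_xref)))
    # Best-scoring candidate first; ranking by name similarity is an assumption.
    scored.sort(key=lambda item: item[0], reverse=True)
    options = [option for _, option in scored]
    # "Import as a different person" is always offered as the last choice.
    options.append(("import", xref, None))
    return options

options = spouse_options(
    ("I8", "Mary Jones"),
    [("I3", "Ann Brown"), ("I2", "Mary Jones")],
)
```

Each tuple corresponds to one button the user could click: match Y2 to an existing spouse, or import Y2 as a new person.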

(NB it is possible here that person Y2 should in fact
be married to someone completely different. If you
find a "structural" difference between the gedcoms, you
will need to stop and fix it before continuing.)

For each pair of matched objects, a "smart import"
process is then used that will import only facts that
are not already present. There is a degree of
intelligence built in, so that if you already have

1 DEAT
2 DATE 21 JAN 1900
2 PLAC Westminster, London, England

The system won't import any of these:

1 DEAT Y

1 DEAT
2 DATE ABT 1900

1 DEAT
2 PLAC Westminster
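
One way such a "nothing new here" test might look is sketched below in Python. The rules are assumptions inferred from the three examples above, not the module's actual logic: a bare event adds nothing, a less precise date adds nothing, and a place that is a leading part of the existing place adds nothing. Facts are represented as plain dictionaries, and all function names are hypothetical.

```python
import re

QUALIFIER = re.compile(r"^(ABT|EST|CAL|BEF|AFT|BET|FROM|TO)\b")

def date_covered(new, old):
    """True if date `new` says nothing that date `old` does not already say."""
    if new == old:
        return True
    if QUALIFIER.match(old):
        return False  # an imprecise existing date covers only an identical one
    # Drop an approximation keyword, then compare tokens from the right, so
    # "ABT 1900" and "JAN 1900" are both covered by "21 JAN 1900".
    new_tokens = re.sub(r"^(ABT|EST|CAL)\s+", "", new).split()
    old_tokens = old.split()
    if not new_tokens or len(new_tokens) > len(old_tokens):
        return False
    return new_tokens == old_tokens[-len(new_tokens):]

def place_covered(new, old):
    """True if the parts of `new` are the leading parts of `old`."""
    new_parts = [p.strip().lower() for p in new.split(",")]
    old_parts = [p.strip().lower() for p in old.split(",")]
    return new_parts == old_parts[:len(new_parts)]

def is_redundant(new_fact, existing_facts):
    """True if some existing fact with the same tag already covers new_fact."""
    for old in existing_facts:
        if old["tag"] != new_fact["tag"]:
            continue
        date_ok = "DATE" not in new_fact or (
            "DATE" in old and date_covered(new_fact["DATE"], old["DATE"]))
        plac_ok = "PLAC" not in new_fact or (
            "PLAC" in old and place_covered(new_fact["PLAC"], old["PLAC"]))
        if date_ok and plac_ok:
            return True
    return False

existing = [{"tag": "DEAT", "DATE": "21 JAN 1900",
             "PLAC": "Westminster, London, England"}]
```

With `existing` as above, `{"tag": "DEAT"}`, `{"tag": "DEAT", "DATE": "ABT 1900"}` and `{"tag": "DEAT", "PLAC": "Westminster"}` are all treated as redundant, while a death dated 22 JAN 1900 is not.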

All imported facts are "tagged" with the same source
record, allowing you to easily identify and review them
after the import.

Note that there is no "undo" facility (it doesn't
currently work with the accept/reject mechanism), so
make a backup before you start :-)

Discussion

<< < 1 2 3 > >> (Page 2 of 3)
  • Greg Roach

    Greg Roach - 2008-02-21

    Logged In: YES
    user_id=1466942
    Originator: YES

    <1> This would involve a disproportionate amount of effort. It is only done on import, as there is no real alternative. You could always install a db/web server on your PC and run the merge locally. This would remove all CPU/RAM limits. Packages such as xampp are a one-click installation and work well.

    <2> New records are created using PGV's existing functions for creating new records (with new IDs). If you're getting duplicates, then it suggests your next_id table is corrupted. This would be an issue with the PGV library functions, not gedmerge.

    <3> Was there anything unusual about this name record?

     
  • Nick Jenkins

    Nick Jenkins - 2008-03-20

    Logged In: YES
    user_id=132985
    Originator: NO

    I'd like to report two possible bugs and one request with gedmerge.php v0.4, on PhpGedView v4.1.2 and PHP v5.1.2:

    First possible bug, my ged file was called "descendents of Beatrice Pearl Marshall.ged", and when I went to use gedmerge.php, after selecting the two files, I got these PHP errors:
    --------------------------------
    ERROR 8: Undefined index: descendents
    0 Error occurred on line 196 of file gedmerge.php in function extract_all
    1 called from line 329 of file gedmerge.php

    ERROR 8: Undefined index: descendents
    0 Error occurred on line 222 of file gedmerge.php in function extract_all
    1 called from line 329 of file gedmerge.php
    Identify the same individual in both GEDCOM files.

    ERROR 8: Undefined index: descendents
    0 Error occurred on line 388 of file gedmerge.php
    --------------------------------

    .... when I deleted that GED and renamed it to "descendents-of-Beatrice-Pearl-Marshall.ged" (i.e. replace all spaces with dashes) and reimported, then the above error went away (i.e. does not work with spaces in the GED file name).

    Second possible bug: Corrected the above and specified the individual that matches, and then got these PHP warnings / errors:
    -------------------------------
    Look for records that need matching up or importing

    ERROR 8: Undefined offset: 1
    0 Error occurred on line 53 of file gedmerge.php in function url_of_object
    1 called from line 424 of file gedmerge.php

    ERROR 8: Undefined offset: 1
    0 Error occurred on line 73 of file gedmerge.php in function simple_gedrec
    1 called from line 427 of file gedmerge.php

    ERROR 8: Undefined offset: 1
    0 Error occurred on line 53 of file gedmerge.php in function url_of_object
    1 called from line 81 of file gedmerge.php in function simple_gedrec
    2 called from line 427 of file gedmerge.php

    ERROR 8: Undefined offset: 1
    0 Error occurred on line 53 of file gedmerge.php in function url_of_object
    1 called from line 93 of file gedmerge.php in function simple_gedrec
    2 called from line 427 of file gedmerge.php

    ERROR 8: Undefined offset: 1
    0 Error occurred on line 53 of file gedmerge.php in function url_of_object
    1 called from line 93 of file gedmerge.php in function simple_gedrec
    2 called from line 427 of file gedmerge.php

    ERROR 8: Undefined offset: 1
    0 Error occurred on line 53 of file gedmerge.php in function url_of_object
    1 called from line 93 of file gedmerge.php in function simple_gedrec
    2 called from line 427 of file gedmerge.php

    ERROR 8: Undefined offset: 1
    0 Error occurred on line 53 of file gedmerge.php in function url_of_object
    1 called from line 93 of file gedmerge.php in function simple_gedrec
    2 called from line 427 of file gedmerge.php

    ERROR 8: Undefined offset: 1
    0 Error occurred on line 53 of file gedmerge.php in function url_of_object
    1 called from line 93 of file gedmerge.php in function simple_gedrec
    2 called from line 427 of file gedmerge.php
    ------------------------------

    ... then on the next screen it gave these PHP errors / warnings:
    ------------------------------
    Look for records that need matching up or importing

    ERROR 8: Undefined index: F24
    0 Error occurred on line 427 of file gedmerge.php

    ERROR 8: Undefined offset: 1
    0 Error occurred on line 73 of file gedmerge.php in function simple_gedrec
    1 called from line 427 of file gedmerge.php

    ERROR 8: Undefined offset: 1
    0 Error occurred on line 53 of file gedmerge.php in function url_of_object
    1 called from line 81 of file gedmerge.php in function simple_gedrec
    2 called from line 427 of file gedmerge.php

    ERROR 8: Undefined index: F24
    0 Error occurred on line 431 of file gedmerge.php

    ERROR 8: Undefined index: F24
    0 Error occurred on line 435 of file gedmerge.php

    ERROR 8: Undefined index: F24
    0 Error occurred on line 431 of file gedmerge.php

    ERROR 8: Undefined index: F24
    0 Error occurred on line 435 of file gedmerge.php

    ERROR 8: Undefined index: F24
    0 Error occurred on line 459 of file gedmerge.php

    ERROR 8: Undefined index: F24
    0 Error occurred on line 463 of file gedmerge.php

    ERROR 8: Undefined offset: 1
    0 Error occurred on line 73 of file gedmerge.php in function simple_gedrec
    1 called from line 471 of file gedmerge.php

    ERROR 8: Undefined offset: 1
    0 Error occurred on line 53 of file gedmerge.php in function url_of_object
    1 called from line 81 of file gedmerge.php in function simple_gedrec
    2 called from line 471 of file gedmerge.php

    ERROR 8: Undefined offset: 1
    0 Error occurred on line 53 of file gedmerge.php in function url_of_object
    1 called from line 93 of file gedmerge.php in function simple_gedrec
    2 called from line 471 of file gedmerge.php

    ERROR 8: Undefined offset: 1
    0 Error occurred on line 53 of file gedmerge.php in function url_of_object
    1 called from line 93 of file gedmerge.php in function simple_gedrec
    2 called from line 471 of file gedmerge.php

    ERROR 8: Undefined offset: 1
    0 Error occurred on line 53 of file gedmerge.php in function url_of_object
    1 called from line 93 of file gedmerge.php in function simple_gedrec
    2 called from line 471 of file gedmerge.php

    ERROR 8: Undefined offset: 1
    0 Error occurred on line 53 of file gedmerge.php in function url_of_object
    1 called from line 93 of file gedmerge.php in function simple_gedrec
    2 called from line 471 of file gedmerge.php

    ERROR 8: Undefined offset: 1
    0 Error occurred on line 53 of file gedmerge.php in function url_of_object
    1 called from line 93 of file gedmerge.php in function simple_gedrec
    2 called from line 471 of file gedmerge.php

    ERROR 8: Undefined index: F24
    0 Error occurred on line 459 of file gedmerge.php

    ERROR 8: Undefined index: F24
    0 Error occurred on line 463 of file gedmerge.php
    -----------------------------------

    and then later other errors like:
    -----------------------------------
    ERROR 8: Undefined index: I76
    0 Error occurred on line 161 of file gedmerge.php in function merge_uids
    1 called from line 331 of file gedmerge.php
    ERROR 20: Invalid GEDCOM 5.5 format.
    -----------------------------------

    ... and ...
    -----------------------------------
    ERROR 8: Undefined index: I76
    0 Error occurred on line 459 of file gedmerge.php

    ERROR 8: Undefined index: I76
    0 Error occurred on line 463 of file gedmerge.php

    ERROR 8: Undefined index: I76
    0 Error occurred on line 459 of file gedmerge.php
    -----------------------------------

    The request I would make is that it's too chatty and requires way too much manual intervention. For example, sometimes there would be only one choice, and I would need to click the button to continue. If there is only one choice, can it please auto click or even better, just skip that screen? There's not much benefit to the user in presenting them with a screen with one and only one choice :-) But even better would be if there was an option to say "always take first option", which would be like clicking the first button all the time. This is because all I did was click the first button repeatedly, and if the script could do this for me, then that would be even better :-) The reason for this is that the tree that I was merging had only one common person, and the rest was unique to both trees. So ideally it would prompt just _once_ for the common person, and say "which of these two do you want to use?", and then do the rest automatically, since aside from this, there was no conflict whatsoever between the two. As it is, it currently requires a huge amount of clicking (for every individual and family, as far as I can tell). However, apart from one click to resolve that single conflict, I can't see why any further clicking should be needed.

    -- All the best,
    Nick.

     
  • Greg Roach

    Greg Roach - 2008-03-21

    Logged In: YES
    user_id=1466942
    Originator: YES

    1) gedcoms with spaces in the name.
    Probably easy to fix. This module needs a bit of a rewrite/overhaul, so I'll consider that when I do it.

    2) ERROR 8: Undefined offset: 1.
    This could be caused by having non-standard spacing in the level 0 record, ie. multiple spaces in "0 @I1@ INDI".

    3) ERROR 8: Undefined index: I76
    This will happen if you have not enabled the "automatically accept changes" option.
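
    If irregular spacing in the level 0 line is indeed the cause of item 2, a parser that accepts any run of whitespace between the fields would sidestep it. A minimal Python sketch follows (the module itself is PHP, and `parse_level0` is a hypothetical name):

```python
import re

# Accept any run of spaces or tabs between the level, the xref and the tag.
LEVEL0 = re.compile(r"^0[ \t]+@([^@]+)@[ \t]+(\w+)")

def parse_level0(line):
    """Return (xref, tag) for a level 0 record line, or None if it does not match."""
    m = LEVEL0.match(line)
    return (m.group(1), m.group(2)) if m else None

# Splitting on a single space leaves empty strings when spaces are doubled,
# which is the kind of thing that leads to a missing array element later on.
naive_fields = "0  @I1@  INDI".split(" ")
```

    Here `parse_level0` returns the same result for single-spaced and multi-spaced lines, whereas `naive_fields` contains empty entries.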

    <<it's too chatty>>

    True, but I can't see a foolproof way of improving it. In my tree I have families with two children of the same name, and people who married two spouses of the same name. While we can be 99.9% sure that two children called John are the same person, assuming this will break the 0.1%.

     
  • Nick Jenkins

    Nick Jenkins - 2008-03-26

    Logged In: YES
    user_id=132985
    Originator: NO

    > gedcoms with spaces in the name.
    > Probably easy to fix. This module needs a bit of a rewrite/overhaul, so
    > I'll consider that when I do it.

    Thank you.

    > 2) ERROR 8: Undefined offset: 1.
    > This could be caused by having non-standard spacing in the
    > level 0 record, ie. multiple spaces in "0 @I1@ INDI".

    Yes, that's exactly what it has in this file: "0 @I1@ INDI".
    The software that it came from was a GEDCOM export from "PAF version 5.2.18.0". I have no idea whether that software is any good or not, but someone gave me a file in the PAF format, and the PAF software seemed to be the easiest way to export it into GEDCOM format so that I could use it with PhpGedView and graft it onto my main tree.

    > ERROR 8: Undefined index: I76
    > This will happen if you have not enabled the "automatically accept
    > changes" option.

    On the "Update User Account" page for my account, I had "Automatically accept changes made by this user" ticked.

    However, I had the "Access level" set to "Access" rather than "Accept" for these two GED files, and I have changed this now - but it sounds like this was the cause of the problem.

    > While we can be 99.9% sure that two children
    > called John are the same person, assuming this will break the 0.1%.

    I have no problem with that case. What I'm concerned about is that I have a tree, and then I'm grafting on a tree with only 1 single person who overlaps, then it should ideally prompt only for the conflicts. Instead, what I think it's doing, is copying a bit from the source tree to the destination tree, then thinking there's a conflict between the two because now some information is in both (information which the merge script copied). In reality though, there is no conflict. In other words: Where both trees have identical information, do not prompt (it seems to prompt at the moment). When one tree has a bit of information, and the other does not, then use the tree that has the information, and do not prompt (it seems to prompt at the moment). If these two cases were eliminated, then, I _think_ it would only prompt once, instead of the hundred or so times that it seemed to prompt currently when doing a merge.
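
    The two no-prompt cases described above can be written as a small decision rule. This is only a Python sketch of the suggestion, with a hypothetical function name `resolve` and `None` standing for "this tree has no such information"; it says nothing about how gedmerge currently decides when to prompt.

```python
def resolve(main_value, import_value):
    """Return (action, value); only a genuine conflict asks the user."""
    if main_value == import_value:
        return ("keep", main_value)      # identical in both trees: no prompt
    if main_value is None:
        return ("import", import_value)  # only the import tree has it: no prompt
    if import_value is None:
        return ("keep", main_value)      # only the main tree has it: no prompt
    return ("prompt", None)              # both have it and they differ
```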

     
  • steve

    steve - 2008-08-13

    Logged In: YES
    user_id=1293159
    Originator: NO

    It looks like I'm not the first to run gedmerge without autoaccept enabled and then wonder why it wasn't working. I made these changes in my copy to first check that autoaccept is actually enabled:

    321a322,327
    > // first check that auto-accept is on
    > $user_id = get_user_id(getUserName());
    > if (get_user_setting($user_id, 'auto_accept') != 'Y') {
    > print "<p style='font-size:large;color:red;'>WARNING: you must enable auto-accept of changes for this user for gedmerge to work!</p>";
    > }
    >

     
  • jaymax2

    jaymax2 - 2008-08-14

    Logged In: YES
    user_id=1406656
    Originator: NO

    Where does one find the "auto-accept" option? It must be somewhere in the Admin area, but I am not finding it.

    With all the options that PGV has in so many diverse locations, there ought to be an index of options.

     
  • glen stadig

    glen stadig - 2008-08-30

    Logged In: YES
    user_id=2189168
    Originator: NO

    I have a large GEDCOM and want to import some smaller GEDCOMs from rootweb.com that may not have a common link. Actually I doubt they will have a common link (if they do, it would be good to know). Would this be the tool to use to merge the data sets?
    I can upload the GEDCOM to import and read it in PHPGEDVIEW, but I want to append the new records to my existing data, saving the manual entry.

    If this is the way to go, how do I configure it?
    I assume I copy gedmerge.php up to the server and place it in the PHPGEDVIEW root.
    Then, how do I call it to start the whole process?

     
  • Greg Roach

    Greg Roach - 2008-08-31

    Logged In: YES
    user_id=1466942
    Originator: YES

    If these other gedcoms have no common person, then simply import them as separate/independent gedcoms.

    PGV is designed for multiple gedcoms.

    Then, when you find a common link, you can merge.

    To start the process, simply type gedmerge.php into the address-bar of your web-browser.

     
  • Julio Sánchez Fernández

    There have been no further updates to gedmerge, right?

    To make it work in the SVN trunk there are a number of issues. First, there are a couple of functions whose names conflict with others in PGV and in at least one case the interface is different. Second, functions_import.php is imported multiple times and complains. Third, since import_request_variables is no longer active globally, local changes are needed.

    Anyway, a very interesting facility. I miss, however, the possibility of "skipping" merges, i.e. of neither adding nor merging entries, because I am not interested in importing some branch (because it goes in the wrong direction, for example). I may want to import all descendants of someone from another GEDCOM but not all the other ancestors of those descendants.

     
  • Greg Roach

    Greg Roach - 2008-10-28

    Hi Julio,

    If you have already updated gedmerge.php to work with the latest SVN, can you let me have the new code?

    <<I miss, however, the possibility of "skipping" merges>>

    The code "spiders" outwards from a starting point, to find new records to merge. A "skip" option would not work. Instead, you could use the clippings-cart to extract descendants/etc. from the new gedcom, and then merge the extract.

     
  • Julio Sánchez Fernández

    I have sent to you the needed fixes. They are crude and should actually be done better.

    I understand why skip would not work for the current gedmerge. What it would need is to keep in the session a list of tree branches that have been "skipped", something like a "no follow" list, so that the main loop would not follow them and proceed to choose a different candidate, just as if that branch had already been traversed successfully. "Skip" would just add one ID to the list of "no follow" IDs.

    Repeating the process would require "skipping" by hand the unwanted branches again. But in my opinion, it requires less effort than selecting subtrees in one GEDCOM, sending them to the clippings cart, downloading the result, uploading it again so that it can be imported and importing it.
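
    The "no follow" idea can be sketched as an outward traversal that treats skipped IDs as already handled. This is Python rather than the module's PHP, and the `spider` function and the `links` layout (each record ID mapped to the IDs reachable from it through FAMS/FAMC) are assumptions made for the illustration.

```python
from collections import deque

def spider(start, links, no_follow=frozenset()):
    """Visit records outward from `start`, never entering skipped IDs."""
    visited = []
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        visited.append(current)
        for neighbour in links.get(current, []):
            # A skipped ID is treated as if it had already been traversed,
            # so nothing that is reachable only through it gets visited.
            if neighbour in seen or neighbour in no_follow:
                continue
            seen.add(neighbour)
            queue.append(neighbour)
    return visited

# Made-up relationship graph for the example.
links = {"I1": ["I2", "I3"], "I2": ["I4"], "I3": ["I5"], "I5": ["I6"]}
full = spider("I1", links)
pruned = spider("I1", links, no_follow={"I3"})
```

    With I3 on the "no follow" list, the branch I3, I5, I6 is never offered, while the rest of the tree is traversed as before.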

     
  • Anonymous

    Anonymous - 2008-11-03

    Julio or Greg - is an updated version of gedmerge available anywhere?

     
  • Greg Roach

    Greg Roach - 2008-11-03

    Julio sent me his "quick-and-dirty" patch. I haven't had a chance to look at it yet.

     
  • Greg Roach

    Greg Roach - 2008-11-04

    v0.5

     
  • Greg Roach

    Greg Roach - 2008-11-04

    v0.5 added, for compatibility with PGV4.1.6

     
  • Greg Roach

    Greg Roach - 2008-11-04
    • summary: GedMerge v0.4 - Merge GEDCOM files --> GedMerge v0.5 - Merge GEDCOM files
     
  • Warren Meads

    Warren Meads - 2008-12-29

    Any chance of having this updated for the new SQL schema in svn and the addition of the new table "_name"?

     
  • Greg Roach

    Greg Roach - 2008-12-31

    New version attached. I don't have time to test it.
    File Added: gedmerge.php

     
  • Greg Roach

    Greg Roach - 2008-12-31

    v0.6

     
  • Wes Groleau

    Wes Groleau - 2009-02-09

    > I have no problem with that case. What I'm concerned about is that I have
    > a tree, and then I'm grafting on a tree with only 1 single person who
    > overlaps, then it should ideally prompt only for the conflicts. Instead,

    For this special case, it would be a lot easier to export your entire GEDCOM,
    paste in the new one right before TRLR, re-import and then use the merge on
    the admin screen for that single individual.

     
  • cv55

    cv55 - 2009-05-24

    GedMerge requires the "auto-accept feature" to be activated. I can find no such entry in the manual, and even after switching to the English GUI I cannot work out what this means (it seems it is not active in my PGV, since gedmerge does not work...). Could you give me a tip on where to find it?
    Thanks, volker.

     
  • Anonymous

    Anonymous - 2009-05-24

    Volker
    1 - 'auto-accept' refers to the setting in your user profile "Automatically accept changes made by this user". It needs to be ticked.
    BUT
    2 - I'm not sure if GedMerge will work for recent releases of PGV (later than 4.1.7). Try it, but be sure to have good backup copies of everything first.

     
  • sali40

    sali40 - 2009-07-26

    I'm trying gedmerge on svn version of phpgedview.
    First error was simple to fix.
    Row 27 had to be turned from "require("includes/functions_edit.php");"
    into "require("includes/functions/functions_edit.php");"

    Now I'm facing another error: "Fatal error: Call to a member function escapeSimple() on a non-object in C:\xampp\htdocs\gedmerge.php on line 186"

    This is the content of the row:
    " $sql="SELECT i_id, i_name, i_gedcom FROM ".$TBLPREFIX."individuals WHERE i_file='".$DBCONN->escapeSimple($GEDCOMS[$ged1]["id"])."'"; "

    I think that the "non object" is "($GEDCOMS[$ged1]["id"]) "
    If the problem lies there, there are other rows that need the same change.

    How should I change it to fix the error?

     
  • Greg Roach

    Greg Roach - 2009-07-26

    PGV 4.2.2 includes a rewrite to the database access layer. This module needs a rewrite to match.

    Updated version attached.

     
  • Greg Roach

    Greg Roach - 2009-07-26

    v0.7 (for PGV 4.2.2 and later)

     