Thread: [Gramps-devel] Database compare and merge


Hello Developers,
this is quite a long message about a difficult matter, so bear with me.

Please find attached a python script for comparing data in 2 Gramps xml
files.
http://gramps.1791082.n4.nabble.com/file/n3832887/GrampsCompare.py
The comparison is done in both databases starting with the "same" person,
which you have to specify.
To test:
- Create Gramps xml files by unzipping two Gramps archives.
- Find the IDs of the same "key" person in both files.
- Start the script with the parameters firstFile, firstID, secondFile, secondID.
- Output is written to the screen. You might want to redirect it to a file.
It is not (yet) a tool to compare entries in 2 different databases but you
can already find the changes that have been done to a database you shared
with some other person or to a backup you did some time ago.

I'm currently working with Gramps 3.2.5-1 on WinXP and used xml-files from
this version for program development and test. But as long as the attributes
"id" "handle" and "hlink" are in the xml it should work for other versions
as well.
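
To illustrate how the three attributes work together, here is a minimal sketch (not the code from GrampsCompare.py). It assumes that every primary node carries "id" and "handle" and that references carry "hlink". The sample xml is heavily simplified and the function names are made up for this example. The loops do not look at tag names, so an xml namespace on the tags does not matter.

```python
import xml.etree.ElementTree as ET

# Heavily simplified stand-in for a Gramps xml file (assumption, not a real export).
SAMPLE = """<database>
  <people>
    <person id="I0001" handle="_a1" change="1316000000">
      <gender>M</gender>
      <eventref hlink="_e1" role="Primary"/>
    </person>
  </people>
  <events>
    <event id="E0001" handle="_e1" change="1316000000">
      <type>Birth</type>
    </event>
  </events>
</database>"""


def index_by_handle(root):
    """Map every handle in the tree to the element that carries it."""
    return {el.get("handle"): el
            for el in root.iter() if el.get("handle") is not None}


def find_by_id(root, gramps_id):
    """Return the first node with the given id that also has a handle."""
    for el in root.iter():
        if el.get("id") == gramps_id and el.get("handle") is not None:
            return el
    return None


def referenced_nodes(node, by_handle):
    """Follow every hlink below `node` to the node it points to."""
    return [by_handle[ref.get("hlink")]
            for ref in node.iter() if ref.get("hlink") in by_handle]


root = ET.fromstring(SAMPLE)
by_handle = index_by_handle(root)
person = find_by_id(root, "I0001")      # the "key" person
targets = referenced_nodes(person, by_handle)
```

Building the handle index once per file, instead of searching the tree for every reference, is one plausible way to keep the run time short on a few thousand people.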

My intention was to find a possible way to do a database compare and merge in
Gramps. I have followed the devs and users mailing lists for quite a while now.
This matter came up from time to time, but I found no proposed solution
discussed and no hint that someone is working on this right now.
GrampsConnect was mentioned in some posts, but I did not check whether there is a
concept or solution there yet.

After hacking a quick database compare script which took several hours to
complete a run on my database with about 3000 people, I tweaked the script
to now finish in less than a minute. This should be an amount of time that a
user could accept for a complicated function to finish?
Since I'm fairly new to Python and using this script as a way to learn the
language there might be even better ways to do things. But hey, I'm proud of
what I achieved in these few evenings! :-)
I added tons of comments to the code to make you guys understand what the
script is doing! So have fun with it. I don't claim any (copy)rights.

Now, what could a database compare and merge look like in the GUI, and what
is still to be done?

First to the GUI of a compare and merge. If you look at the compare and
merge window for a person in Gramps you see the person and related info side
by side.
This could be changed to a display as shown in the attached cmpwin.png.
http://gramps.1791082.n4.nabble.com/file/n3832887/cmpwin.png
The changes are:
- For every subnode type (tag) in the database there is a "section" in the
window. (For a person, e.g. gender, names, ...)
- Only subnodes without handles are displayed for comparison.
- All subnodes referring to other nodes with handles are shown in two lists.
(If a compare by script was performed beforehand, the list entries might show
different colors for identical, changed and missing references.) The first
list contains the items that "match" items in the other database; the second
one shows the items that do not match or could match more than one item in
the other database.
- There is a means to "link" nodes from both databases. (The button with the
"=" between the lower lists. The broken or unbroken chain symbol on a button
would be more appropriate.) If you see the 2 marriage entries in this
example on the right side, you have to decide which of these is already in
the left database. So you select first the left and then one of the right
marriage events to see the data in the quickview. If you find them to refer
to the same marriage, you select both and press the "=" button to link these
2 entries. They will be moved to the lists above (matched information).
- If you doubleclick (there could be a button for this) an entry in the
matched information list, the content of the window is replaced by the
content of the selected node (e.g. the family from the childof reference).
There could be a "back" or even a "history" button for navigation like in
browsers.
- With the "+" button you add information from the 2nd database, with "<"
you replace information.
- The window looks and works the same for all comparisons, no matter if
events, persons, families, ... are compared.
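
The split into the two lists (matched, and unmatched or ambiguous) could be sketched roughly like this. This is only an illustration under assumptions: the function name, the tuples and the key function are made up, and a real comparison would need a much better key than the event type alone.

```python
from collections import defaultdict


def partition_references(left, right, key):
    """Split references into unambiguous pairs and leftovers.

    A pair counts as matched only if exactly one item on each side
    has the same key; everything else stays unmatched so that the
    user can link it by hand.
    """
    left_groups, right_groups = defaultdict(list), defaultdict(list)
    for item in left:
        left_groups[key(item)].append(item)
    for item in right:
        right_groups[key(item)].append(item)

    matched, unmatched_left, unmatched_right = [], [], []
    # dict.fromkeys keeps the keys in first-seen order without duplicates
    for k in dict.fromkeys(list(left_groups) + list(right_groups)):
        l_items = left_groups.get(k, [])
        r_items = right_groups.get(k, [])
        if len(l_items) == 1 and len(r_items) == 1:
            matched.append((l_items[0], r_items[0]))
        else:
            unmatched_left.extend(l_items)
            unmatched_right.extend(r_items)
    return matched, unmatched_left, unmatched_right


# Hypothetical data: (event type, year) for one person in each database.
left = [("birth", "1900"), ("marriage", "1925")]
right = [("birth", "1900"), ("marriage", "1925"), ("marriage", "1930")]

matched, only_left, only_right = partition_references(
    left, right, key=lambda item: item[0])
```

With this data the birth is matched automatically, while the marriage on the left and the two marriages on the right stay in the lower lists until they are linked with the "=" button.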

What do you think of this GUI concept (not the design, that's far from nice)? Do
you think this could be a way to handle data from 2 databases?

Now to the question of what still has to be done.
- The standalone script first has to be improved and in the end has to be
integrated into Gramps.
- The comparison of the data nodes currently does not take a "closer look"
at the data. The data itself has to be taken into account. E.g. currently
only a check attributesDB1 == attributesDB2 is done. This should ignore the
'change' attribute.
- In the end this function must not rely on same IDs or handles but has to
check the data itself. This might get a bit complicated, but I think with
the approach that you only have to check the data referred to by already
matched nodes it can be solved and handled. Even for comparison with a
Gramps xml from an imported GEDCOM.
- It might be useful to generate an output file with the found differences
for further evaluation in Gramps or other software?
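
Ignoring the 'change' attribute in the check mentioned above could look roughly like this. It is a small sketch, not the code from the script; the function names and the sample values are made up.

```python
def comparable_attributes(attrib, ignored=("change",)):
    """Copy the attributes, leaving out those that must not affect the result."""
    return {name: value for name, value in attrib.items()
            if name not in ignored}


def same_attributes(attrs_db1, attrs_db2):
    """True if both nodes have equal attributes apart from the ignored ones."""
    return comparable_attributes(attrs_db1) == comparable_attributes(attrs_db2)


# Hypothetical attribute dicts of the same person in two files.
attributes_db1 = {"id": "I0001", "handle": "_a1", "change": "1316000000"}
attributes_db2 = {"id": "I0001", "handle": "_a1", "change": "1317000000"}

plain_equal = attributes_db1 == attributes_db2            # False, 'change' differs
relaxed_equal = same_attributes(attributes_db1, attributes_db2)
```

The attrib dictionary of an xml.etree.ElementTree element can be passed in directly, since it is an ordinary dict.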

My apologies for this long post. I hope it was worth reading and clear enough
for you to understand what I am trying to say. My mother tongue is not English,
which makes it a little difficult to explain complicated things.

Kind regards and have fun
Heinz

--
View this message in context: http://gramps.1791082.n4.nabble.com/Database-compare-and-merge-tp3832887p3832887.html
Sent from the GRAMPS - Dev mailing list archive at Nabble.com.
