From: derHeinzi <hei...@ya...> - 2011-09-22 09:47:23
Hello Developers,

this is quite a long message about a difficult matter, so bear with me.

Please find attached a Python script for comparing the data in two Gramps XML files:
http://gramps.1791082.n4.nabble.com/file/n3832887/GrampsCompare.py

The comparison is done in both databases starting with the "same" person, which you have to specify.

To test it:
- Create Gramps XML files by unzipping two Gramps archives.
- Find the IDs of the same "key" person in both files.
- Start the script with the parameters firstFile, firstID, secondFile, secondID.
- Output is written to the screen. You might want to redirect it to a file.

It is not (yet) a tool to compare entries in two different databases, but you can already find the changes that have been made to a database you shared with some other person, or to a backup you made some time ago.

I'm currently working with Gramps 3.2.5-1 on WinXP and used XML files from this version for program development and testing. But as long as the attributes "id", "handle" and "hlink" are in the XML, it should work for other versions as well.

My intention was to find a possible way towards a database compare and merge in Gramps. I have been following the devs and users mailing lists for quite a while now, and this matter came up from time to time, but I found no concept for a solution discussed and no hint that someone is working on this right now. GrampsConnect was mentioned in some posts, but I did not check whether there is a concept or solution there yet.

After hacking together a quick database compare script that took several hours to complete a run on my database of about 3000 people, I tweaked the script so that it now finishes in less than a minute. This should be an amount of time that a user could accept for a complicated function to finish?

Since I'm fairly new to Python and am using this script as a way to learn the language, there might be even better ways to do things. But hey, I'm proud of what I achieved in these few evenings!
:-)

I added tons of comments to the code to help you guys understand what the script is doing! So have fun with it. I don't claim any (copy)rights.

Now, what could a database compare and merge look like in the GUI, and what is still to be done?

First, the GUI of a compare and merge. If you look at the compare and merge window for a person in Gramps, you see the person and related info side by side. This could be changed to a display as shown in the attached cmpwin.png:
http://gramps.1791082.n4.nabble.com/file/n3832887/cmpwin.png

The changes are:
- For every subnode type (tag) in the database there is a "section" in the window (for a person e.g. gender, names, ...).
- Only subnodes without handles are displayed for comparison.
- All subnodes referring to other nodes with handles are shown in two lists. (If a compare by script was performed beforehand, the list entries might show different colors for identical, changed and missing references.) The first list contains the items that "match" items in the other database; the second one shows the items that do not match or could match more than one item in the other database.
- There is a means to "link" nodes from both databases. (This is the button with the "=" between the lower lists. A broken or unbroken chain symbol on the button would be more appropriate.) If you look at the 2 marriage entries on the right side in this example, you have to decide which of them is already in the left database. So you select first the left and then one of the right marriage events to see the data in the quickview. If you find that they refer to the same marriage, you select both and press the "=" button to link these 2 entries. They will be moved to the lists above (matched information).
- If you double-click an entry in the matched information list (there could be a button for this), the content of the window is replaced by the content of the selected node (e.g. the family from the childof reference).
  There could be a "back" or even a "history" button for navigation, like in browsers.
- With the "+" button you add information from the 2nd database; with "<" you replace information.
- The window looks and works the same for all comparisons, no matter whether events, persons, families, ... are compared.

What do you think of this GUI concept (not the design, that's far from nice)? Do you think this could be a way to handle data from 2 databases?

Now to the question of what still has to be done:
- The standalone script first has to be improved and in the end be integrated into Gramps.
- The comparison of the data nodes currently does not take a "closer look" at the data. The data itself has to be taken into account. E.g. currently only a check attributesDB1 == attributesDB2 is done. This should ignore the 'change' attribute.
- In the end this function must not rely on identical IDs or handles but has to check the data itself. This might get a bit complicated, but I think that with the approach of only checking the data referred to by already matched nodes, it can be solved and handled. Even for a comparison with a Gramps XML file from an imported GEDCOM.
- It might be useful to generate an output file with the differences found, for further evaluation in Gramps or other software?

My apologies for this long post. I hope it was worth reading and clear enough for you to understand what I am trying to say. English is not my mother tongue, which makes it a little difficult to explain complicated things.

Kind regards and have fun
Heinz

--
View this message in context: http://gramps.1791082.n4.nabble.com/Database-compare-and-merge-tp3832887p3832887.html
Sent from the GRAMPS - Dev mailing list archive at Nabble.com.
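
For illustration only: the attribute check from the to-do list above (attributesDB1 == attributesDB2, but ignoring the 'change' attribute) could look roughly like the sketch below. This is not taken from GrampsCompare.py. The function names and the tiny sample XML are made up for this example, real Gramps XML files declare a namespace that is left out here, and the sketch still assumes identical handles in both files, which is exactly the limitation described in the post.

```python
import xml.etree.ElementTree as ET

# Assumption: 'change' only records the last modification time,
# so it is excluded from the comparison.
IGNORED = {"change"}


def index_by_handle(root):
    """Map handle -> element for every element that carries a 'handle'."""
    return {el.get("handle"): el for el in root.iter() if el.get("handle")}


def comparable_attributes(element):
    """Return the element's attributes without the ignored ones."""
    return {k: v for k, v in element.attrib.items() if k not in IGNORED}


def same_attributes(a, b):
    """True if both elements agree on all attributes except the ignored ones."""
    return comparable_attributes(a) == comparable_attributes(b)


def changed_handles(root1, root2):
    """Handles present in both trees whose remaining attributes differ."""
    idx1, idx2 = index_by_handle(root1), index_by_handle(root2)
    common = idx1.keys() & idx2.keys()
    return sorted(h for h in common if not same_attributes(idx1[h], idx2[h]))


# Hypothetical, heavily simplified sample data (not real Gramps output).
DB1 = """<database><people>
<person id="I0001" handle="_a1" change="1316000000"/>
<person id="I0002" handle="_a2" change="1316000000"/>
</people></database>"""

DB2 = """<database><people>
<person id="I0001" handle="_a1" change="1316679999"/>
<person id="I0099" handle="_a2" change="1316000000"/>
</people></database>"""

root1 = ET.fromstring(DB1)
root2 = ET.fromstring(DB2)
print(changed_handles(root1, root2))  # ['_a2']
```

Here the person with handle _a1 differs only in 'change' and is therefore treated as identical, while _a2 is reported because its id differs.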