From: Enno B. <enn...@gm...> - 2013-02-10 17:11:45
Hi Doug,

> Gramps XML import now allows some handling of overlapping,
> pre-existing data, but it never does the "merge" for you. There will
> always be cleanup for the human to do afterwards (removing duplicates,
> merging two partials, etc).
> The CSV import does some of this automatically. That is why I
> suggested that you might want to use it, or at least look at the logic
> of that import.

OK, I see your point. One can create a CSV that references an existing event using the same syntax that is already available to address existing persons there.

But as far as I'm concerned the merging doesn't end there. What I mean is that one may find a lot of results in the birth register of a particular town, which all have a source line like the one quoted below, where only the year and page/record numbers differ.

Alle Friezen

birth / registration 31-12-1818 / 02-01-1819
residence of the parents -
child Age Bosma <http://www.allefriezen.nl/en/component/genealogie/?task=persoon_result&zoekmethode=eenvoudig&persoon_1_achternaam=Bosma&persoon_1_tussenvoegsel=&persoon_1_voornaam=Age&persoon_1_patroniem=>
sex m
father Jetze Bosma <http://www.allefriezen.nl/en/component/genealogie/?task=persoon_result&zoekmethode=eenvoudig&persoon_1_achternaam=Bosma&persoon_1_tussenvoegsel=&persoon_1_voornaam=Jetze&persoon_1_patroniem=>
mother Trijntje Ages Baarda <http://www.allefriezen.nl/en/component/genealogie/?task=persoon_result&zoekmethode=eenvoudig&persoon_1_achternaam=Baarda&persoon_1_tussenvoegsel=&persoon_1_voornaam=Trijntje&persoon_1_patroniem=Ages>
source Geboorteregister 1819, Leeuwarderadeel, Pagina B1
further information

When I convert a record like this to CSV, I will likely treat "Alle Friezen" as the repository, and split the source line like this:

source title: Geboorteregister Leeuwarderadeel
citation volume/year: 1819
citation page: B1

There may be a lot of lines in the CSV where the repository name (and URL) are the same, and also lots with similar source titles, and IMO it would be very nice if we could create import code that can automatically merge those too.

For my own tree, I'm more interested in another province, with similar results:

Alle Groningers

Overlijden 04-05-1950 Groningen
Overledene Neeltje Bosma <http://allegroningers.nl/personen/q/persoon_voornaam_t_0/Neeltje/q/persoon_achternaam_t_0/Bosma>
Geslacht v
Leeftijd 97 jaar
Geboorteplaats Leeuwarden
Relatie Dirk Schuitema <http://allegroningers.nl/personen/q/persoon_voornaam_t_0/Dirk/q/persoon_achternaam_t_0/Schuitema>
Vader Age Bosma <http://allegroningers.nl/personen/q/persoon_voornaam_t_0/Age/q/persoon_achternaam_t_0/Bosma>
Moeder Tjamkje van der Werf <http://allegroningers.nl/personen/q/persoon_voornaam_t_0/Tjamkje/q/persoon_tussenvoegsel_t_0/van%20der/q/persoon_achternaam_t_0/Werf>
Bron Overlijdensregister Groningen 1950 Aktenummer 619

And I hope we can find a way to create software that can import both. I mean, apart from the Dutch labels in the 2nd example, the structure of the result is pretty much the same. We need slightly different rules to split the Bron line here, but the elements in it are the same.

What I'm most concerned with myself is:

a. Where to store the URL of a particular record,
b. Where to store event and person details.

Dinner time!

regards,

Enno
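
A rough sketch of the source-line splitting discussed in this message, in Python. Everything here is an assumption for illustration: the names `SourceRef`, `split_source` and `group_by_title` are hypothetical, the two patterns are fitted only to the two sample lines quoted above, and none of this is existing Gramps import code. It shows how both the "source" and the "Bron" line could be reduced to the same three elements, so that lines sharing a source title can be grouped before import.

```python
import re
from typing import Dict, Iterable, List, NamedTuple, Optional, Tuple


class SourceRef(NamedTuple):
    title: str  # source title, e.g. "Geboorteregister Leeuwarderadeel"
    year: str   # citation volume/year
    page: str   # citation page or record number


# Hypothetical patterns, derived only from the two sample lines in this thread.
# "Geboorteregister 1819, Leeuwarderadeel, Pagina B1"
ALLE_FRIEZEN = re.compile(
    r"^(?P<register>\S+) (?P<year>\d{4}), (?P<place>.+), Pagina (?P<page>\S+)$"
)
# "Overlijdensregister Groningen 1950 Aktenummer 619"
ALLE_GRONINGERS = re.compile(
    r"^(?P<register>\S+) (?P<place>.+) (?P<year>\d{4}) Aktenummer (?P<page>\S+)$"
)


def split_source(line: str) -> Optional[SourceRef]:
    """Split a source/Bron line into title, year and page; None if unrecognised."""
    line = line.strip()
    for pattern in (ALLE_FRIEZEN, ALLE_GRONINGERS):
        match = pattern.match(line)
        if match:
            title = f"{match['register']} {match['place']}"
            return SourceRef(title, match["year"], match["page"])
    return None


def group_by_title(lines: Iterable[str]) -> Dict[str, List[Tuple[str, str]]]:
    """Collect (year, page) citations under one shared source title."""
    grouped: Dict[str, List[Tuple[str, str]]] = {}
    for line in lines:
        ref = split_source(line)
        if ref is not None:
            grouped.setdefault(ref.title, []).append((ref.year, ref.page))
    return grouped
```

With rules like these, each source title would only need to be created once, and lines that do not match either pattern are skipped rather than guessed at, so they can be reviewed by hand.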