From: <jul...@gm...> - 2007-10-30 11:19:40
Hi,

AFAIK, FamilySearch does not guarantee unique, eternal identifiers. However, the GEDCOM ID they use in the downloaded GEDCOMs has been permanent for years. It is always the same for each person in the record.

I have a small Perl script that copies that value into some form of the nonstandard _UID attribute. For instance:

    1 _UID IGI::I500077973070

I.e., I qualify the number with a numbering authority code ('IGI'). This way I can tell which records I have already downloaded and merge them together.

Doing this unconditionally, however, is dangerous because there is no guarantee that the FamilySearch IDs will not change in the future, so it should only be done under very controlled circumstances.

I do the same for the Vital Records Index:

    1 _UID VRI-2000-ES::I4611604-1

In this case it is much safer because CDs are immutable, and the concatenation of year and region codes makes the code unique. A new CD edition would change the IDs but, at least, no unwanted merges would happen.

In every case where I repeatedly reuse data from a computer database, I have constructed a specific algorithm to create unique IDs. One other example, from the Guipuzcoa online church records:

    1 _UID DEAH:111500101-0001-0-e7289c-2

It contains the source reference, page number, record number and a partial hash of the name (this was derived by experimentation after several failed tries; you don't wanna know the pathological cases that appear).

Every source that does not assign unique IDs needs ad-hoc handling. Ideally every source should generate a permanent unique ID for records it originates. Records received from other sources are not given new IDs, but records merged from different sources keep all the received IDs.

Regards,

Julio

2007/10/30, Gerald Britton <ger...@gm...>:
>
> If the familysearch gedcoms have unique ids for people, events etc
> then gramps import could discard the duplicates as an option.
> So you could start a new db, import the geds (removing dups), then open
> your good db and import the new db into it.
>
> On 10/30/07, Benny Malengier <ben...@gm...> wrote:
> > 2007/10/30, Douglas S. Blank <db...@cs...>:
> > >
> > > [Moving to developers list.]
> > >
> > > The controversial aspect of this is the automatic merge (which is, of
> > > course, the whole point). We can do this as a "fork" of the current
> > > GEDCOM import, but that wouldn't be useful for anyone else, and would
> > > quickly suffer "bitrot" as the original GEDCOM import continued to
> > > change. Perhaps there is a way that we can work within the GRAMPS
> > > GEDCOM import?
> >
> > Well, in 3.0 you could make an automatic revision, then work on the
> > data, and roll back to that revision if there are problems.
> >
> > > I always planned on working on a better (perhaps two-stage,
> > > interactive) merge-er. But an easier option would be to have some type
> > > of option on the Import Dialog that either: a) kept all duplicates, or
> > > b) attempted automatic merges. This would be less controversial (I
> > > think) if there was an "Undo Import" that was quick and painless, and
> > > readily available. Is there?
> >
> > I see possibilities with automatic merge, but it should be on a unique
> > identifier. I just have too many people with the same name to be able
> > to have the name determine the merging or not (that is, all first sons
> > of all children have the name of their grandfather, and are born in the
> > same decade or so).
> >
> > > Would developers allow a GEDCOM import to do automatic merges in the
> > > same manner that ImportCSV works?
> >
> > I would rather see GEDCOM -> XML, and allow automatic merge of XML.
> > Like this, the typical problem of importing GEDCOM is separated from
> > the merging, we allow merging of gramps family trees without the need
> > to pass over GEDCOM, and we have power over our XML format, so we can
> > add things there: an export to XML could write the extra data needed
> > for possible merging.
> > Note that an overwrite in XML is already as easy as replacing the handle
> > with the handle of an existing object.
> >
> > Benny
>
> _______________________________________________
> Gramps-devel mailing list
> Gra...@li...
> https://lists.sourceforge.net/lists/listinfo/gramps-devel
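
For anyone who wants to experiment with the qualified-ID idea from the top of this message, here is a minimal Python sketch (not the Perl script mentioned above, which is not shown in this thread). It assumes each record is a plain dict with a 'uids' set and a 'facts' dict; that shape, and the names qualified_uid and merge_records, are hypothetical and not part of GRAMPS or of any GEDCOM library. Records are merged only on an exact ID match, never on names, and a merged record keeps every ID it received.

```python
def qualified_uid(authority, local_id):
    """Prefix a source-local ID with a numbering authority code."""
    return f"{authority}::{local_id}"


def merge_records(existing, incoming):
    """Merge incoming records into existing ones that share a qualified ID.

    Hypothetical record shape: {"uids": set of str, "facts": dict}.
    The input lists are not modified; a new list is returned.
    """
    merged = [{"uids": set(r["uids"]), "facts": dict(r["facts"])} for r in existing]
    # Index every known ID so a match is a simple dictionary lookup.
    by_uid = {uid: rec for rec in merged for uid in rec["uids"]}

    for rec in incoming:
        # Look for any ID we have already seen (sorted for determinism).
        target = next(
            (by_uid[uid] for uid in sorted(rec["uids"]) if uid in by_uid), None
        )
        if target is None:
            # Unknown record: keep it as a new entry, with no name matching.
            target = {"uids": set(), "facts": {}}
            merged.append(target)
        # A merged record keeps all the IDs it received.
        target["uids"] |= set(rec["uids"])
        # Existing facts win; incoming facts only fill gaps (an assumption).
        for key, value in rec["facts"].items():
            target["facts"].setdefault(key, value)
        for uid in target["uids"]:
            by_uid[uid] = target
    return merged
```

Calling qualified_uid("IGI", "I500077973070") gives the IGI::I500077973070 value shown above. The sketch deliberately does nothing clever when one incoming record matches two different existing records; that case would need a manual decision, in line with the warning about doing this only under controlled circumstances.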