2013/2/25 Tim Lyons <guy.linton@gmail.com>
I would have thought that starting a new transaction for every merge would slow the citation merging significantly, because disk writes (or possibly delayed writes) would occur for every transaction, for writing before-looks or after-looks or whatever. At least that is the impression I get from comments on other transaction problems in Gramps. But I am not an expert on the database mechanisms we are using. Do you have any figures comparing timings between the two different ways of running the merge?

I suppose the optimum solution might be something like a new transaction every (say) 100 merges, but that would not be so neat to code.
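The "commit every 100 merges" idea above could be sketched roughly like this. Everything here is a stand-in (the `Transaction` class and the merge step are illustrative, not the real Gramps database API); it only shows the chunking pattern.

```python
class Transaction:
    """Stand-in for a database transaction that buffers writes until commit."""
    def __init__(self):
        self.writes = []
        self.committed = False

    def commit(self):
        self.committed = True


def merge_in_batches(pairs, batch_size=100):
    """Merge citation pairs, opening a fresh transaction every batch_size merges.

    pairs: list of (keep_handle, discard_handle) tuples to merge.
    Returns the number of merges committed.
    """
    committed = 0
    txn = Transaction()
    for i, (keep, discard) in enumerate(pairs, start=1):
        txn.writes.append((keep, discard))  # stand-in for the actual merge work
        if i % batch_size == 0:
            txn.commit()
            committed += len(txn.writes)
            txn = Transaction()             # start the next batch
    if txn.writes:                          # flush the final partial batch
        txn.commit()
        committed += len(txn.writes)
    return committed
```

With 250 pairs and a batch size of 100 this commits three transactions (100, 100, 50), which is the awkwardness Tim mentions: the last partial batch needs its own flush.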

What Enno does, giving every source its own transaction for merging citations, seems like a nice implementation overall. In case of a crash while processing a source, all sources already handled remain done.
The logic for each source is independent, so making each independent part atomic is good, no?
Better still would be to only start the transaction if we know some work will be needed.
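A minimal sketch of that per-source scheme, including the "only open the transaction if there is work" refinement. The names (`duplicate_groups`, `open_txn`, the data shapes) are illustrative assumptions, not the real Gramps tool code.

```python
def duplicate_groups(citations):
    """Group citation handles by their content; only groups of 2+ need merging.

    citations: dict of citation handle -> hashable content key.
    """
    groups = {}
    for handle, content in citations.items():
        groups.setdefault(content, []).append(handle)
    return [handles for handles in groups.values() if len(handles) > 1]


def merge_citations_per_source(sources, open_txn):
    """Merge duplicate citations, one transaction per source.

    sources: dict of source id -> dict of citation handle -> content key.
    open_txn: callable taking a source id and returning a context manager
              standing in for a database transaction.
    Returns the number of citations merged away.
    """
    merged = 0
    for source_id, citations in sources.items():
        groups = duplicate_groups(citations)
        if not groups:
            continue                   # no work: skip the transaction entirely
        with open_txn(source_id):      # one atomic unit per source
            for handles in groups:
                keep, *discard = handles
                merged += len(discard)  # stand-in for the real merge operation
    return merged
```

This keeps each transaction small (bounded by one source's citations), so the lock exhaustion Espen hit on one giant transaction should not recur, and a crash loses at most the current source.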

As for timing the difference: as Espen notes, he runs out of locks with the tool, so it cannot run at all on large databases converted from the pre-citation format to the post-citation one.


I think another solution might possibly be even better. As I understand it, most of the citations are actually 'empty' (this is misleadingly called 'no citation' on some of the displays). I don't see any point in creating multiple empty citations, so I think it would be better if GEDCOM import kept a record of any empty citation for a source, and reused that same empty citation whenever it was about to create a new one. Would this work for you? Could you let me see a sample of your GEDCOM import, so I can see how GEDCOM structures your empty citations?
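The reuse idea could look something like the sketch below. The `Citation` shape and the importer method are hypothetical stand-ins for whatever the GEDCOM importer actually does; the point is just the per-source cache of the one shared empty citation.

```python
class Citation:
    """Minimal stand-in for a citation record."""
    def __init__(self, source_handle, page="", confidence=None):
        self.source_handle = source_handle
        self.page = page
        self.confidence = confidence

    def is_empty(self):
        """An 'empty' citation carries no page and no confidence."""
        return not self.page and self.confidence is None


class Importer:
    """Hypothetical import helper that reuses one empty citation per source."""
    def __init__(self):
        self.empty_by_source = {}  # source handle -> the shared empty citation
        self.created = []          # citations actually added to the database

    def get_citation(self, source_handle, page="", confidence=None):
        cit = Citation(source_handle, page, confidence)
        if cit.is_empty():
            # Reuse the empty citation already made for this source, if any.
            if source_handle in self.empty_by_source:
                return self.empty_by_source[source_handle]
            self.empty_by_source[source_handle] = cit
        self.created.append(cit)
        return cit
```

Importing a GEDCOM where a source is cited many times with no page or confidence would then create one citation instead of many, which also makes the later merge tool's job largely unnecessary for such files.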