On Feb 25, 2013, at 4:28 PM, Benny Malengier <firstname.lastname@example.org> wrote:
John,
I think you go too fast.
We pass the db.DB_PRIVATE flag (gen/db/write.py), so we open the environment for a single process. So that part of your assertion we already do.
However, we also initialize the locks with db.DB_INIT_LOCK, which is needed because GTK is multithreaded, so three different gramplets might be going over the database at the same time as a view. If you do a save while the cursor is over a view column, GTK will be retrieving data while the save is ongoing.
As far as I remember, that is why we need the LOCK system. Now, it might be true that we can remove the LOCK: I have never seen a deadlock-type error with Gramps, which really amazed me when I started with Gramps, coming from database programming. But then, we don't normally write in two transactions at the same time, so problems drop off quickly when we combine that with short view and write transactions, plus locking the application behind a progress bar during batch transactions.
Anyway, as the doc says:
- Initialize the locking subsystem. This subsystem should be used when multiple processes or threads are going to be reading and writing a Berkeley DB database, so that they do not interfere with each other. If all threads are accessing the database(s) read-only, locking is unnecessary. When the DB_INIT_LOCK flag is specified, it is usually necessary to run a
- I do think that GTK being multithreaded requires us to have it. Hence, shorter transactions for import or batch operations are an option, as long as they are done intelligently enough.
I don't think there is a big problem doing this in many tools. On import, though, a crash would leave the data partly imported, and reimporting would then cause a problem. Other workarounds might be devised there, though.
I'm not going too fast, 'cause I'm not going anywhere. I seriously don't have time to muck about in the db backend. That said:
Whatever makes you think that Gtk is multithreaded? It *supports* multithreading, but only if the client application is written to be. In fact there's been a lot of discussion on the gtk-dev list over the years about how it's important to have only one thread operating on GtkWidgets. For portability that needs to be the main thread, because OSX and Windows will only send events there.
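To illustrate the "only one thread operating on GtkWidgets" rule, here is a minimal sketch using only the Python standard library. It is not Gramps code: `load_records`, `drain_on_main_thread` and the `rows` list standing in for a widget's model are all hypothetical names. In a real Gtk application the hand-off to the main thread would more likely go through something like `GLib.idle_add`; a plain queue is used here only so the example runs without Gtk.

```python
import queue
import threading


def load_records(count, outbox):
    """Worker thread: does the slow work and never touches a widget."""
    for i in range(count):
        outbox.put(f"record {i}")
    outbox.put(None)  # sentinel: no more results


def drain_on_main_thread(inbox, widget_rows):
    """Main thread: the only place where the (stand-in) widget is updated."""
    while True:
        item = inbox.get()
        if item is None:
            break
        widget_rows.append(item)


inbox = queue.Queue()
rows = []  # hypothetical stand-in for a widget's data model

worker = threading.Thread(target=load_records, args=(3, inbox))
worker.start()
drain_on_main_thread(inbox, rows)  # runs on the main thread
worker.join()
```

The worker only produces data; every change to `rows` happens on the thread that called `drain_on_main_thread`.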
AFAICT Gramps isn't multithreaded: grampsgui.py calls GObject.threads_init(), but that's it. The webapp has a single call to Python's threading.Thread.start(), but it's not part of Gramps proper, and I didn't dig into it to see if it's touching the database in both the main and the worker threads.
As Enno pointed out, the problem with having locking turned on and all-inclusive transactions is that it doesn't scale. Every record the import touches gets a lock which isn't released until the transaction is either committed or rolled back, and eventually bdb runs out of space in the lock table. IIRC, you can checkpoint at the beginning of the "batch" operation and easily roll back to the checkpoint if you crash in the middle: no need to try to do everything in a single txn.
Regards,
John Ralls
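The scaling argument about the lock table can be shown with a toy model. This is a sketch in plain Python, not the Berkeley DB API: `ToyLockTable`, `LockTableFull`, `import_records` and the limit of 1000 locks are all made up for the example. The only behaviour it models is the one described above: a lock taken inside a transaction is held until that transaction commits.

```python
class LockTableFull(Exception):
    """Raised when the toy lock table has no free slots left."""


class ToyLockTable:
    """Toy stand-in for a fixed-size lock table (not the real bdb one)."""

    def __init__(self, max_locks):
        self.max_locks = max_locks
        self.held = set()

    def lock(self, key):
        # A new lock needs a free slot; re-locking a held key is free.
        if key not in self.held and len(self.held) >= self.max_locks:
            raise LockTableFull(f"no room for a lock on {key!r}")
        self.held.add(key)

    def commit(self):
        # Committing (or rolling back) releases every lock of the txn.
        self.held.clear()


def import_records(table, keys, batch_size=None):
    """Lock each key in turn; commit every batch_size keys if given.

    Returns the peak number of locks held at any one time.
    """
    peak = 0
    for n, key in enumerate(keys, start=1):
        table.lock(key)
        peak = max(peak, len(table.held))
        if batch_size and n % batch_size == 0:
            table.commit()
    table.commit()
    return peak


# 5000 records in batches of 500 never hold more than 500 locks.
peak_batched = import_records(ToyLockTable(1000), range(5000), batch_size=500)

# The same import as one all-inclusive transaction overflows the table.
try:
    import_records(ToyLockTable(1000), range(5000))
    overflowed = False
except LockTableFull:
    overflowed = True
```

With batching, a crash in the middle leaves the earlier batches committed, which is the partial-import concern raised in the quoted message; the checkpoint-and-roll-back idea is aimed at that case and is not modelled here.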