From: Dimitry S. <sd...@ib...> - 2014-05-13 10:08:52
|
Hello, All.

If a record has an uncommitted head version created by an active transaction and some garbage in the backversions "tail", can anyone (background GC thread, sweep or a parallel transaction) wipe this garbage from the tail while the transaction is still active?

I can't understand why list_staying() is so picky: for every next version it scans from the beginning of the chain, which raises its complexity to O(N^2).

--
WBR, SD. |
From: Vlad K. <hv...@us...> - 2014-05-13 11:36:15
|
> If a record has uncommitted head version created by active transaction and some garbage
> in backversions "tail", can anyone (background GC thread, sweep or parallel transaction)
> wipe this garbage from the tail while the transaction is still active?

Yes.

> I can't understand why list_staying() is so picky: it scans for every next version from
> the beginning which raise its complexity to O(N^2).

list_staying is used to undo a dead record version. It is not a frequent operation. And yes, it is terribly inefficient. It restarts from the beginning of the version chain whenever it needs the next record version because:

a) it uses handoffs to make sure the back pointers are valid
b) the data page with the target version is released

Regards,
Vlad |
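The cost of restarting from the head can be sketched with a toy model. This is not Firebird code: `VersionChain`, `fetch_nth` and the fetch counter are hypothetical names, and a "fetch" here only stands in for following one pointer in the chain.

```python
class VersionChain:
    """Toy model of a record version chain; not Firebird's actual structures."""

    def __init__(self, length):
        self.length = length
        self.fetches = 0  # counts simulated version fetches

    def fetch_nth(self, n):
        """Reach version n (0 = head) by restarting from the head every time."""
        position = 0
        self.fetches += 1          # fetch the head version
        while position < n:
            position += 1
            self.fetches += 1      # follow one back pointer
        return position


def walk_with_restart(chain):
    """Visit every version, restarting from the head for each one."""
    for n in range(chain.length):
        chain.fetch_nth(n)
    return chain.fetches


def walk_once(chain):
    """Hypothetical single pass that could keep its position between steps."""
    chain.fetches += chain.length
    return chain.fetches
```

For a chain of N versions the restarting walk performs 1 + 2 + ... + N = N(N+1)/2 fetches, which is the O(N^2) behaviour asked about. The single pass is shown only for comparison; reasons (a) and (b) above are why the real code cannot simply hold its position.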
From: Dimitry S. <sd...@ib...> - 2014-05-13 11:52:29
|
13.05.2014 13:36, Vlad Khorsun wrote:
> Yes

How? If I remember your lesson right, garbage in a record is collected only if the head version is marked with a transaction number less than OAT.

> list_staying used to undo dead record version. It is not a most often operation.

Actually, it is used in the following cases:

1) Undo with VIO_backout().
2) Undo of update_in_place() in VIO_verb_cleanup().
3) Applying the third and following changes of the same record in one transaction.

--
WBR, SD. |
From: Vlad K. <hv...@us...> - 2014-05-13 12:25:50
|
> 13.05.2014 13:36, Vlad Khorsun wrote:
>> Yes
>
> How? If I remember your lesson right, garbage in a record is collected only if head
> version is marked with transaction number lesser that OAT.

This is the most frequent case. But if you look at VIO_chase_record_version, you'll see that concurrency (snapshot) transactions can pass over record versions that are not visible to them and stop at some point. At that point the record version could be

- visible to the current tx
- mature for GC
- still have backversions

In this case cooperative GC could happen (i.e. purge will be called) before returning to the caller (VIO_next or VIO_data).

>> list_staying used to undo dead record version. It is not a most often operation.
>
> Actually, it is used in following cases:
>
> 1) Undo with VIO_backout().
> 2) Undo of update_in_place() in VIO_verb_cleanup().
> 3) Applying third and following changes of the same record in one transaction.

I know. I'm just a bit lazy when I need to write too many words, especially in English :)
In any case, all your points above are cases of an undo operation.

Regards,
Vlad |
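The three conditions above can be written down as a small check. This is a deliberately simplified sketch, not the logic of VIO_chase_record_version: the function `chase`, its parameters, and the visibility rule (a version is visible if its creating transaction is committed and has a number below the reader's) are all assumptions made for illustration.

```python
def chase(chain_tx, committed, snapshot_tx, oldest_active_tx):
    """Toy model of a snapshot reader walking a version chain.

    chain_tx:          creating transaction numbers, newest version first
    committed:         set of transaction numbers assumed to be committed
    snapshot_tx:       number of the reading snapshot transaction
    oldest_active_tx:  assumed OAT at the time of the read

    Returns (index of the first visible version, whether cooperative GC
    of the versions behind it would be allowed in this model).
    """
    for index, tx in enumerate(chain_tx):
        # Skip versions the reader cannot see (uncommitted or too new).
        if tx not in committed or tx >= snapshot_tx:
            continue
        mature = tx < oldest_active_tx                   # nobody needs anything older
        has_backversions = index < len(chain_tx) - 1     # something left to purge
        return index, mature and has_backversions
    return None, False
```

With a chain `[50, 20, 10]` where transaction 50 is still active, a reader with number 40 that is also the oldest active transaction stops at the version created by 20 and may purge the version created by 10, even though the head is uncommitted. If a transaction numbered 15 were still active, the same reader would stop at the same version but could not purge.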
From: Dimitry S. <sd...@ib...> - 2014-05-13 14:00:11
|
13.05.2014 14:25, Vlad Khorsun wrote:
>> Actually, it is used in following cases:
>>
>> 1) Undo with VIO_backout().
>> 2) Undo of update_in_place() in VIO_verb_cleanup().
>> 3) Applying third and following changes of the same record in one transaction.
> I know. I just a bit lazy when need to write too much words, especially in English :)
> In any case - all your points above is cases of undo operation.

No. Number 3 is a savepoint merge, which is done on every operation. And I made a mistake: it happens on every call of update_in_place().

--
WBR, SD. |
From: Jim S. <ji...@ji...> - 2014-05-13 14:28:42
|
On 5/13/2014 6:08 AM, Dimitry Sibiryakov wrote:
> Hello, All.
>
> If a record has uncommitted head version created by active transaction and some garbage
> in backversions "tail", can anyone (background GC thread, sweep or parallel transaction)
> wipe this garbage from the tail while the transaction is still active?
> I can't understand why list_staying() is so picky: it scans for every next version from
> the beginning which raise its complexity to O(N^2).
>

Index garbage collection is a very thorny problem.

The original version predates threads in any operating system. Garbage collection occurred only when a record was visited. Garbage collection of an uncommitted record version was highly problematic since it was impossible to know whether the current version of the record would be going or staying, hence impossible in some cases to know whether an unreachable record version had an index key that could be garbage collected or not.

We introduced threading in version 3. It was a nightmare. Virtually all of the operating systems that supported threading were buggy as hell (I wrote our own threading package for VMS, which didn't yet support threads). Sun's threading examples didn't even compile. An infrequent hang on Apollo was finally diagnosed as an "unfixable" design flaw where a signal arriving during a thread switch would be dropped on the floor, requiring that all signal-based mechanisms in Interbase be rewritten to use threads. Revisiting garbage collection just wasn't in the cards.

For most of their existence, Interbase and Firebird have had only one thread active in the engine. This kept garbage collection and record update synchronized, which kept garbage collection simple and sane. Fine-grained multi-threading, however, opens a major can of worms.

My subsequent database systems do all garbage collection other than record backout in a dedicated garbage collection thread. Worker threads note when garbage collection is in order for a record and post it for the garbage collector to take up on its next cycle. I don't know the current state of Firebird, but the idea of trying to synchronize cooperative (i.e. worker thread) and dedicated thread garbage collection is something I wouldn't want to take on.

However, in direct answer to your question, let me ask another: how do you garbage collect old record versions without visiting them? And how can you even tell whether there are collectable old versions without traversing the live versions first?

There are probably lots of things that could be done to improve garbage collection. The one I think most fruitful is the ability to garbage collect unreachable intermediate records between the active versions at the front of a chain and the older versions accessible only by very long running transactions, so that very long running transactions don't impede all garbage collection. |
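The last idea, collecting unreachable intermediate versions, can be illustrated with a toy model. It is a sketch under strong assumptions and not a description of Firebird or of any other engine: every version in the chain is treated as committed, each active reader is reduced to a single snapshot number, a reader is assumed to see the newest version whose commit number does not exceed its snapshot, and `collectable_versions` is a hypothetical name.

```python
def collectable_versions(chain, snapshots):
    """Return indexes of versions that no active reader can reach.

    chain:     commit numbers of the record versions, newest first
    snapshots: snapshot numbers of the currently active readers
    """
    needed = {0}  # the head version always stays in this model
    for snapshot in snapshots:
        # Find the newest version this reader is allowed to see.
        for index, commit_no in enumerate(chain):
            if commit_no <= snapshot:
                needed.add(index)
                break
    return [i for i in range(len(chain)) if i not in needed]
```

For a chain `[90, 80, 70, 60, 50, 10]` with one recent reader (snapshot 95) and one very long running reader (snapshot 12), the model reports the four versions in the middle as collectable: the old reader pins only the version committed at 10. A real engine would also have to relink the chain and cope with back versions that depend on their newer neighbours, which this sketch ignores.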