From: Lester C. <le...@ls...> - 2012-01-02 20:56:05
Dimitry Sibiryakov wrote:
>> In my own cases, ANY downtime during office hours is unacceptable, so any
>> failure that brings the whole system down would result in penalties! Organised
>> downtime is possible on many sites, but some sites are running 24/7, so we
>> maintain data in a manner that the system will work with elements down, but the
>> database operation must be maintained. Firebird has been running reliably for
>> many years on these sites, when other services HAVE crashed, to the extent that
>> services have been moving onto our framework due simply to its reliability :)
>>
>> Customers care very much about downtime ... especially if it HAS to happen
>> simply for maintenance reasons.
> But maintenance doesn't inevitably cause downtime. Maintenance of one piece of a
> system can't stop the whole system if other pieces will do all the job. This is
> the main idea behind RAID 1-6, for example. What's wrong with your system if it
> cannot work without only one part of it?..

Backup and restore take a finite and growing amount of time with 10+ years' worth
of data. It is rare to need to run a cycle, but when the need does arise it has to
be handled.

The point about losing part of the system relates to things like losing a ticket
printer or display device ... not critical, since the users can work around the
losses, and as long as *A* copy of the database can be seen by a web server, they
can continue to work. I've even had RAID systems fail in the past, so nowadays it's
a lot more reliable to have simple duplicate machines on the system, each capable
of providing the services needed.

--
Lester Caine - G8HFL
-----------------------------
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk//
Firebird - http://www.firebirdsql.org/index.php
From: Kjell R. <kje...@da...> - 2012-01-03 20:04:19
Den 2012-01-01 21:31 skrev Kjell Rilbe såhär:
> 3. Stay at 32 bit id:s but somehow "compact" all id:s older than OIT(?)
> into OIT-1. Allow id:s to wrap around.
> Pros:
> - No extra disk space.
> - Probably rather simple code changes for id usage.
> Cons:
> - May be difficult to find an effective way to compact old id:s.
> - Even if an effective way to compact old id:s is found, it may be
> complicated to implement.
> - May be difficult or impossible to perform compacting without large
> amounts of write locks.
> - Potential for problems if OIT isn't incremented (but such OIT problems
> should be solved anyway).

I'm thinking about this solution, in case a solution is actually needed (see the
recent subthread).

I assume the sweep only looks at record versions that are deleted, and "marks
them" (?) for garbage collection if they have a transaction id less than OIT.
Correct?

This is not sufficient for the "consolidation" of old transaction id:s. What's
needed is, in principle, a task that reads through ALL record versions and, for
each one with a transaction id < OIT, changes it to OIT - 1. When it has done that
for the entire database, it can move the max usable transaction id to OIT - 2.
Then it can wait until the database starts to exhaust the "transaction id space"
again before repeating the cycle.

To make this a bit less work intensive, would it be possible and a good idea to
mark each database page with the lowest transaction id in use on that page? In
that case, the task could skip all pages where this value is >= OIT - 1. But would
it require a lot of overhead to keep this info up to date? I don't know how a page
is used to access a record on it....

But doesn't cooperative garbage collection mean that on each page access, all
deleted record versions on that page are marked for garbage collection? In that
case I assume it will read the transaction id and deletion state of all record
versions on the page anyway, and that's all that's needed to keep the page's
lowest transaction id up to date. Or am I missing something (likely...)?

I assume the lowest transaction id on a page can never become lower, and a new
page will always have a "near-the-tip" lowest transaction id. So the consolidation
task would not have to re-check a page that is updated after the task checks it
but before the task completes the cycle. But how to make sure it checks all pages?
Is there any well-defined order in which it could check all pages, without running
the risk of missing some of them, even if the database is "live"?

Kjell
--
--------------------------------------
Kjell Rilbe
DataDIA AB
E-post: kj...@da...
Telefon: 08-761 06 55
Mobil: 0733-44 24 64
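A minimal sketch of the consolidation pass described above, including the optional
per-page lowest-id field. The page and record-version structures here are
hypothetical placeholders, not Firebird's ODS or internal API; a real implementation
would also have to latch each page exclusively and mark it dirty before writing.

```cpp
// Sketch of the "transaction id consolidation" pass discussed above.
// DataPage, RecordVersion and the per-page lowest_txn field are hypothetical
// simplifications, not Firebird's real on-disk structures.

#include <algorithm>
#include <cstdint>
#include <vector>

using TraNumber = uint32_t;

struct RecordVersion {
    TraNumber txn_id;      // transaction that created this version
    bool committed;        // known committed (e.g. via a TIP lookup)
};

struct DataPage {
    std::vector<RecordVersion> versions;
    TraNumber lowest_txn;  // per-page minimum, lets the pass skip "cold" pages
};

// Walk every data page once; any committed version older than OIT is relabelled
// as OIT - 1, shrinking the live id range so the counter can later wrap safely.
void consolidate_old_ids(std::vector<DataPage>& pages, TraNumber oit)
{
    for (DataPage& page : pages)
    {
        if (page.lowest_txn >= oit - 1)
            continue;                     // nothing old on this page, no write needed

        TraNumber new_lowest = UINT32_MAX;
        for (RecordVersion& rv : page.versions)
        {
            if (rv.committed && rv.txn_id < oit)
                rv.txn_id = oit - 1;      // "compact" the old id
            new_lowest = std::min(new_lowest, rv.txn_id);
        }
        page.lowest_txn = new_lowest;     // page must be marked dirty and written back
    }
    // Only after every page has been visited may ids below OIT - 2 be reused.
}
```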
From: Dmitry Y. <fir...@ya...> - 2012-01-03 20:14:56
04.01.2012 0:04, Kjell Rilbe wrote:
> To make this a bit less work intensive, would it be possible and a good
> idea to mark each database page with the lowest transaction id in use on
> that page? In that case, the task could skip all pages where this value
> is >= OIT - 1.

It could avoid page writes but not page reads. The latter could also be avoided,
but that solution is going to be even more complicated.

Dmitry
From: Dimitry S. <sd...@ib...> - 2012-01-03 20:12:10
03.01.2012 21:04, Kjell Rilbe wrote:
> What's needed is, in principle, a task that reads through ALL record
> versions and, for each one with a transaction id < OIT, changes it to
> OIT - 1. When it has done that for the entire database, it can move the
> max usable transaction id to OIT - 2.

It means fetching/reading every page with an exclusive lock, modifying it, marking
it as dirty and writing it back to disk (when?). Crazy I/O load and long lock waits
are guaranteed.

--
SY, SD.
From: Kjell R. <kje...@da...> - 2012-01-03 20:39:45
Den 2012-01-03 21:11 skrev Dimitry Sibiryakov såhär:
> 03.01.2012 21:04, Kjell Rilbe wrote:
>> What's needed is, in principle, a task that reads through ALL record
>> versions and, for each one with a transaction id < OIT, changes it to
>> OIT - 1. When it has done that for the entire database, it can move the
>> max usable transaction id to OIT - 2.
> It means fetching/reading every page with an exclusive lock, modifying it,
> marking it as dirty and writing it back to disk (when?). Crazy I/O load and
> long lock waits are guaranteed.

Yes, but perhaps it's more tolerable to have a somewhat slower system for two days
than to have one day of downtime? Assuming the cluster/replication solution is not
used...

And I would assume that it would only need to lock a single page at a time, and
that it would take a very short time to do the job on that single page. So, while
it would incur a lot of write locks on a lot of pages, there will only be a single
page lock at a time, and each lock will be very short lived. So, no long waits, but
a whole lot of very short waits.

Still, I am not really arguing that this is the best solution. It seems to me
you're right, Dimitry, that a replication/cluster solution is better. But could
that perhaps be made a bit easier? I consider myself to be pretty good at SQL and
at high level system development, but I've never set up a cluster, nor FB
replication. So, for me this is a real stumbling block. On top of all the other
things I need to do yesterday, I would also have to learn how to set this up. I'm
sure I'm not the only one...

So, what could be done with FB to make it easier to back up and restore a live
database, then sync the new copy with all updates made since the start of the
backup, and finally bring the copy live instead of the old one? Perhaps this is
where devel resources would be best spent to solve the issue?

Kjell
--
--------------------------------------
Kjell Rilbe
DataDIA AB
E-post: kj...@da...
Telefon: 08-761 06 55
Mobil: 0733-44 24 64
From: Leyne, S. <Se...@br...> - 2012-01-03 17:51:39
Jesus,

>> Single server without downtime is a myth anyway.
>
> The problem is not downtime, it is how much downtime. Backup and restore is
> so much downtime.

If that is the case, how much downtime is acceptable?

There are a couple of possible solutions which would reduce the downtime:

- a new backup/restore tool which would use multiple readers/writers to minimize
execution time,
- a "data port" utility which would allow data to be ported from a live database to
a new database while the live one is active, but which would need a finalization
step where the live database is shut down to apply the final data changes and add
FK constraints.

There are, however, certain "realities" which cannot be overcome: disk
throughput/IO performance.

Sean
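As a rough illustration of the first idea, a multi-reader/writer tool could simply
partition tables across worker threads. The table names and the row-copy step below
are placeholders; a real tool would drive the Firebird client or services API, which
this sketch does not attempt to show.

```cpp
// Sketch of a multi-reader/writer "data pump": each worker copies a disjoint
// subset of tables from the live database to the new one. copy_table() is a
// placeholder for the actual read-rows/insert-rows work.

#include <cstddef>
#include <iostream>
#include <string>
#include <thread>
#include <vector>

void copy_table(const std::string& table)
{
    std::cout << "copying " << table << "\n";   // stands in for real row transfer
}

void parallel_data_port(const std::vector<std::string>& tables, unsigned workers)
{
    std::vector<std::thread> pool;
    for (unsigned w = 0; w < workers; ++w)
    {
        pool.emplace_back([&tables, workers, w] {
            // Static round-robin partitioning: worker w takes tables w, w+N, w+2N, ...
            for (size_t i = w; i < tables.size(); i += workers)
                copy_table(tables[i]);
        });
    }
    for (auto& t : pool)
        t.join();
    // FK constraints would be added afterwards, in the finalization step.
}

int main()
{
    parallel_data_port({"CUSTOMERS", "ORDERS", "ORDER_LINES", "AUDIT_LOG"}, 2);
}
```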
From: Dimitry S. <sd...@ib...> - 2012-01-03 18:27:19
03.01.2012 18:51, Leyne, Sean wrote:
> - a "data port" utility which would allow data to be ported from a live database
> to a new database while the live one is active, but which would need a
> finalization step where the live database is shut down to apply the final data
> changes and add FK constraints.

And exactly this utility is called a "replicator". If made right, it doesn't need
FK deactivation and can do the "finalization step" when the new database is already
in use. Aren't you tired of inventing the wheel?..

--
SY, SD.
From: Ann H. <aha...@nu...> - 2012-01-03 18:44:51
Dimitry,

>> - a "data port" utility which would allow data to be ported from a live database
>> to a new database while the live one is active, but which would need a
>> finalization step where the live database is shut down to apply the final data
>> changes and add FK constraints.
>
> And exactly this utility is called a "replicator". If made right, it doesn't need
> FK deactivation and can do the "finalization step" when the new database is
> already in use. Aren't you tired of inventing the wheel?..

Different vehicles need different wheels. The wheels on my bicycle wouldn't do at
all for a cog railway, and cog-railway wheels work very badly on airplanes.
Airplane wheels are no use at all in a grandfather clock. Engineering is all about
creating new wheels. Right now, what we're looking for is a wheel that can reset
transaction ids. I'm not sure that either replication or the mechanism Sean is
proposing (similar to either the start of a shadow database or nbackup) can solve
the overflowing transaction id problem.

Cheers,

Ann
From: Dimitry S. <sd...@ib...> - 2012-01-03 19:09:21
03.01.2012 19:44, Ann Harrison wrote:
> I'm not sure that either replication or the
> mechanism Sean is proposing (similar to either the start of a shadow
> database or nbackup) can solve the overflowing transaction id problem.

Simply: let's say we have two synchronous databases. One is primary and the second
is... well... secondary. When the transaction counter reaches, say, 1000000000
transactions, we shut down replication and perform a backup-restore of the
secondary database. Then we continue replication and after some time we again have
two synchronous databases. In the primary database the transaction counter is big,
in the secondary it is low. Now we switch the roles: the primary database becomes a
secondary and vice versa. After that we can repeat the previous step to reset the
transaction counter in the ex-primary database without stopping the whole system.
Voila.

--
SY, SD.
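The cycle described above can be written down as a short orchestration sketch. Every
helper function below is a stub standing in for external tooling (a replicator plus
gbak-style backup/restore); none of them is an existing Firebird or replicator API.

```cpp
// Orchestration sketch of the primary/secondary role-swap cycle described above.
// The stubs only print the step they represent.

#include <iostream>
#include <string>

struct Node { std::string name; };   // connection info for one database copy

void stop_replication(Node& a, Node& b)  { std::cout << "stop "  << a.name << " -> " << b.name << "\n"; }
void start_replication(Node& a, Node& b) { std::cout << "start " << a.name << " -> " << b.name << "\n"; }
void backup_restore(Node& db)            { std::cout << "backup/restore of " << db.name << "\n"; }
void wait_until_in_sync(Node&, Node&)    { std::cout << "waiting for replication to catch up\n"; }
void switch_roles(Node& p, Node& s)      { std::cout << s.name << " is now primary, " << p.name << " secondary\n"; }

// One pass: reset the secondary's transaction counter, catch up, then swap roles.
void reset_cycle(Node& primary, Node& secondary)
{
    stop_replication(primary, secondary);   // secondary stops receiving changes
    backup_restore(secondary);              // its transaction counter restarts near zero
    start_replication(primary, secondary);  // replay everything missed meanwhile
    wait_until_in_sync(primary, secondary);
    switch_roles(primary, secondary);       // repeat later with arguments reversed
}

int main()
{
    Node a{"db_a"}, b{"db_b"};
    reset_cycle(a, b);   // later: reset_cycle(b, a) to reset the ex-primary as well
}
```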
From: Woody <woo...@su...> - 2012-01-03 19:14:54
From: "Ann Harrison" <aha...@nu...>
> Dimitry,
>
>> And exactly this utility is called a "replicator". If made right, it
>> doesn't need FK deactivation and can do the "finalization step" when the
>> new database is already in use.
>> Aren't you tired of inventing the wheel?..
>
> Different vehicles need different wheels. The wheels on my bicycle
> wouldn't do at all for a cog railway, and cog-railway wheels work very
> badly on airplanes. Airplane wheels are no use at all in a
> grandfather clock. Engineering is all about creating new wheels.
> Right now, what we're looking for is a wheel that can reset
> transaction ids. I'm not sure that either replication or the
> mechanism Sean is proposing (similar to either the start of a shadow
> database or nbackup) can solve the overflowing transaction id problem.

Maybe I'm a little dense (probably :), but doesn't FB already know what the oldest
interesting transaction id is? Why couldn't transaction numbers be allowed to wrap
back around up to that point? As long as transactions are committed at some point,
the oldest transaction would move and it would solve most problems being run into
now, IMO.

I will accept any and all ridicule if this seems idiotic, since I really don't know
the code and haven't even looked at it. I'm just amazed and impressed at how easy
it is to set up and use FB in everything I do. :)

Woody (TMW)
From: Ann H. <aha...@nu...> - 2012-01-03 19:48:10
Woody,

> Maybe I'm a little dense (probably :), but doesn't FB already know what the
> oldest interesting transaction id is? Why couldn't transaction numbers be
> allowed to wrap back around up to that point? As long as transactions are
> committed at some point, the oldest transaction would move and it would
> solve most problems being run into now.

The oldest interesting transaction is the oldest one that is not known to be
committed. If the oldest interesting transaction is 34667 and you're transaction
55778, you know that anything created by transaction 123 is certain to be
committed.

Now let's assume that you're transaction 4294967000 and the oldest interesting
transaction was 4294000000 when you started. (Probably ought to mention that (IIRC)
a transaction picks up the value of the oldest interesting transaction on startup.)
Then the transaction counter rolls around and some new transaction 3 starts
creating new versions... You know they're committed, so you read the new data. More
generally, how do you know the difference between the old transaction 3 record
versions, which you do need to read, and the new transaction 3 record versions,
which you don't want to read?

> I will accept any and all ridicule if this seems idiotic ...

Not at all idiotic. This stuff is complicated.

Cheers,

Ann
From: Kjell R. <kje...@da...> - 2012-01-03 19:49:33
Den 2012-01-03 20:14 skrev Woody såhär:
> Maybe I'm a little dense (probably :), but doesn't FB already know what the
> oldest interesting transaction id is? Why couldn't transaction numbers be
> allowed to wrap back around up to that point? As long as transactions are
> committed at some point, the oldest transaction would move and it would
> solve most problems being run into now, IMO.

As far as I understand, with very limited knowledge about the ODS, each version of
each record contains the id of the transaction that created that record version.
So, if the transaction with id 5 created the record version, it will say "5" in
there, always, as long as that record version is still in existence.

Now, old record versions get garbage collected. If a record version is not the
current one and its id is lower than the OIT, the disk space for that record
version is marked as free.

But this happens ONLY for record versions that are not "current". Consider a lookup
table that's created when the database is created. Those records will possibly
never change. Same goes for log records. So that old "5" will stay there forever,
regardless of the OIT.

Kjell
--
--------------------------------------
Kjell Rilbe
DataDIA AB
E-post: kj...@da...
Telefon: 08-761 06 55
Mobil: 0733-44 24 64
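To make the point concrete, here is a simplified, purely illustrative model of a
record version and of the garbage-collection test as described in the message above.
It is not the actual ODS layout and not Firebird's real GC rule, just a sketch of why
an old id can persist indefinitely in a never-updated row.

```cpp
// Illustration: every record version carries the id of the transaction that
// created it, and only NON-current old versions ever get reclaimed, so a current,
// never-updated row keeps its original id forever. Simplified model only.

#include <cstdint>
#include <optional>
#include <string>

using TraNumber = uint32_t;

struct RecordVersion {
    TraNumber created_by;                  // e.g. 5, written once, never rewritten
    std::optional<TraNumber> deleted_by;   // set when a later transaction deletes it
    std::string data;
};

// Per the description above: reclaimable only if it is not the current version
// and its creating transaction is older than the OIT.
bool can_garbage_collect(const RecordVersion& rv, bool is_current, TraNumber oit)
{
    return !is_current && rv.created_by < oit;
}
```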
From: Ann H. <aha...@nu...> - 2012-01-03 18:36:37
Sean,

>> The problem is not downtime, it is how much downtime. Backup and restore is
>> so much downtime.
>
> There are a couple of possible solutions which would reduce the downtime:
> - a new backup/restore tool which would use multiple readers/writers to minimize
> execution time,

Here we're talking about a logical backup that can be used to restart transaction
numbers. Record numbers are based loosely on record storage location. Since a
logical backup/restore changes storage location, and thus record numbers, and
indexes link values to record numbers, indexes must be recreated.

The problem with a multi-threaded logical backup is that all the threads contend
for the same I/O bandwidth and possibly the same CPU time. Much of the restore time
is spent sorting keys to recreate indexes, and multiple threads would contend for
the same temporary disk I/O.

> - a "data port" utility which would allow data to be ported from a live database
> to a new database while the live one is active, but which would need a
> finalization step where the live database is shut down to apply the final data
> changes and add FK constraints.

It's not immediately obvious to me how that sort of backup/restore could reset
transaction numbers.

> There are, however, certain "realities" which cannot be overcome: disk
> throughput/IO performance.
From: Thomas S. <ts...@ib...> - 2012-01-03 19:49:49
>>> The problem is not downtime, it is how much downtime. Backup and restore is
>>> so much downtime.
>>
>> There are a couple of possible solutions which would reduce the downtime:
>> - a new backup/restore tool which would use multiple readers/writers to minimize
>> execution time,
>
> Here we're talking about a logical backup that can be used to restart
> transaction numbers. Record numbers are based loosely on record
> storage location. Since a logical backup/restore changes storage
> location, and thus record numbers, and indexes link values to record
> numbers, indexes must be recreated.
>
> The problem with a multi-threaded logical backup is that all the
> threads contend for the same I/O bandwidth and possibly the same CPU
> time. Much of the restore time is spent sorting keys to recreate
> indexes, and multiple threads would contend for the same temporary disk
> I/O.

While the restore process is pretty much I/O bound, creating indices loves RAM, as
I have seen in some tests I made in the past. So a restore might get a speed-up
when more RAM can be utilized. There is a -bu(ffers) option for the restore, but I
think this really overrides the database page buffers rather than acting as a
larger temporary page cache.

Another option might be to restore to a RAM disk, if the database file fits onto
it, and then move the restored database to persistent storage.

>> - a "data port" utility which would allow data to be ported from a live database
>> to a new database while the live one is active, but which would need a
>> finalization step where the live database is shut down to apply the final data
>> changes and add FK constraints.
>
> It's not immediately obvious to me how that sort of backup/restore
> could reset transaction numbers.
>
>> There are, however, certain "realities" which cannot be overcome: disk
>> throughput/IO performance.

True, but things get better if we can do more stuff in RAM that would otherwise go
to disk, especially temporary data, e.g. when creating indices.

Regards,
Thomas
From: Leyne, S. <Se...@br...> - 2012-01-03 20:10:01
Thomas,

> While the restore process is pretty much I/O bound,

While that is true for desktop PCs with HDDs (not SSDs) or servers without cached
RAID controllers, it is certainly not true as a blanket statement. There is plenty
of room for more throughput.

> Another option might be to restore to a RAM disk, if the database file fits onto
> it, and then move the restored database to persistent storage.

Agreed, and a restore process with multiple writers and multiple index rebuilds
would be of even more significant benefit!

A RAM disk based backup/restore is possible today without any change to FB. I am
referring to a truly kick-butt restore process.

Sean
From: Dmitry Y. <fir...@ya...> - 2012-01-03 20:06:19
03.01.2012 23:49, Kjell Rilbe wrote:
> As far as I understand, with very limited knowledge about the ODS, each version
> of each record contains the id of the transaction that created that record
> version. So, if the transaction with id 5 created the record version, it will
> say "5" in there, always, as long as that record version is still in existence.
>
> Now, old record versions get garbage collected. If a record version is not the
> current one and its id is lower than the OIT, the disk space for that record
> version is marked as free.
>
> But this happens ONLY for record versions that are not "current". Consider a
> lookup table that's created when the database is created. Those records will
> possibly never change. Same goes for log records. So that old "5" will stay
> there forever, regardless of the OIT.

Theoretically, it could be worked around. A regular or manually started
"something-like-a-sweep-but-different" activity could visit all the committed
record versions and reset their txn IDs to e.g. OIT-1, thus making the ID space
denser and making a wrap-around more-or-less safe. But it's going to be terribly
slow (almost all data pages have to be modified). Also, with wrapping allowed, the
whole logic that handles txn IDs would become much more complicated (simple checks
like MAX(ID1, ID2) won't work anymore).

In the past, I liked the idea of wrapping the txn IDs, but now I'm more and more
keen to consider other solutions instead.

Dmitry
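For illustration, this is roughly what a wrap-aware comparison ("serial number
arithmetic", in the style of RFC 1982) looks like, and why a plain numeric MAX breaks
after a wrap. It assumes any two live ids are never more than half the 32-bit range
apart, which is exactly the invariant a consolidation pass would have to maintain.
This is not Firebird code.

```cpp
// Wrap-aware transaction id comparison vs. plain numeric comparison.

#include <cassert>
#include <cstdint>

using TraNumber = uint32_t;

// True if a is "newer than" b, valid only while the two ids are less than
// half the 32-bit space apart.
bool newer_than(TraNumber a, TraNumber b)
{
    return static_cast<int32_t>(a - b) > 0;   // unsigned subtraction, signed test
}

int main()
{
    // Without wrapping, the plain comparison and the wrap-aware one agree.
    assert(newer_than(55778u, 34667u));

    // After a wrap: the new transaction 3 really is newer than 4294967000,
    // even though 3 < 4294967000 numerically, so MAX() would pick the wrong one.
    assert(newer_than(3u, 4294967000u));
    assert(!newer_than(4294967000u, 3u));
}
```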
From: Ann H. <aha...@nu...> - 2012-01-03 21:00:07
On Tue, Jan 3, 2012 at 3:06 PM, Dmitry Yemanov <fir...@ya...> wrote:
>
> In the past, I liked the idea of wrapping the txn IDs, but now I'm more and more
> keen to consider other solutions instead.

Completely agree - including having changed my mind. I agree with Dimitry S. that
his replicate/backup/restore/reverse strategy works, assuming that the load is
light enough that the newly restored replicant can eventually catch up. At the same
time, one of Firebird's strong points is the limited amount of expertise necessary
to manage the system, so in the longer run, either a variable-length transaction id
or a per-record-version flag indicating the size of the transaction id has a lot of
merit.

Best regards,

Ann
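As an illustration of the flag idea, a single size marker per stored id could keep
most record versions at 32 bits while allowing 64-bit ids where needed. The encoding
below is only a sketch of that concept, not an actual or planned ODS format.

```cpp
// Sketch of a "size flag + id" encoding: 1 flag byte, then 4 or 8 id bytes.

#include <cstddef>
#include <cstdint>
#include <vector>

// Appends the encoded id to 'out' and returns the number of bytes written.
size_t append_txn_id(std::vector<uint8_t>& out, uint64_t id)
{
    const bool wide = id > UINT32_MAX;
    out.push_back(wide ? 1 : 0);                       // size flag
    if (wide) {
        uint64_t v = id;                               // 8-byte form for big ids
        out.insert(out.end(), reinterpret_cast<uint8_t*>(&v),
                   reinterpret_cast<uint8_t*>(&v) + sizeof v);
        return 1 + sizeof v;
    }
    uint32_t v = static_cast<uint32_t>(id);            // compact 4-byte form
    out.insert(out.end(), reinterpret_cast<uint8_t*>(&v),
               reinterpret_cast<uint8_t*>(&v) + sizeof v);
    return 1 + sizeof v;
}

int main()
{
    std::vector<uint8_t> page;
    append_txn_id(page, 55778u);          // stored in 5 bytes
    append_txn_id(page, 8589934592ull);   // beyond 32 bits, stored in 9 bytes
}
```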
From: Kjell R. <kje...@da...> - 2012-01-03 21:35:07
Den 2012-01-03 21:59 skrev Ann Harrison såhär:
> On Tue, Jan 3, 2012 at 3:06 PM, Dmitry Yemanov <fir...@ya...> wrote:
>
>> In the past, I liked the idea of wrapping the txn IDs, but now I'm more and more
>> keen to consider other solutions instead.
>
> Completely agree - including having changed my mind. I agree with
> Dimitry S. that his replicate/backup/restore/reverse strategy works,
> assuming that the load is light enough that the newly restored
> replicant can eventually catch up. At the same time, one of
> Firebird's strong points is the limited amount of expertise necessary
> to manage the system, so in the longer run, either a variable-length
> transaction id or a per-record-version flag indicating the size of the
> transaction id has a lot of merit.

Or more automated and built-in support to do such a replicate/backup/restore/reverse.
For me it's a question of time. Sure, I could learn how to set up a cluster and
replication. But there are dozens of other things I also need to do yesterday, so
having to learn this on top of everything else is a stumbling block.

Could the procedure be "packaged" into some kind of utility program or something?

I'm thinking that nbackup locks the master file while keeping track of changed
pages in a separate file. Perhaps a transaction id consolidation similar to what
happens on backup/restore could be performed on a locked database master while
logging updates in a separate file, and then the consolidated master could be
brought up to date again.

If this is very difficult, perhaps there's no point - devel resources are better
spent elsewhere. But if it would be a fairly simple task...?

Kjell
--
--------------------------------------
Kjell Rilbe
DataDIA AB
E-post: kj...@da...
Telefon: 08-761 06 55
Mobil: 0733-44 24 64
From: Ann H. <aha...@nu...> - 2012-01-03 22:09:21
Kjell,

> Or more automated and built-in support to do such a
> replicate/backup/restore/reverse. For me it's a question of time. Sure, I
> could learn how to set up a cluster and replication. But there are dozens
> of other things I also need to do yesterday, so having to learn this on
> top of everything else is a stumbling block.
>
> Could the procedure be "packaged" into some kind of utility program or
> something?

The short answer is probably just "No."

Could someone build a robot that would identify a flat tire, take your spare tire
out of your trunk, jack up your car, remove the flat, put on the spare, lower the
car, and put the flat tire back in the trunk? Probably. Would it be easier than
learning to change a tire? Somewhat unlikely.

On a heavily loaded system, the replicated database (replicant in my jargon) can't
share a disk and set of CPUs with the primary database. (That's the trunk part of
the analogy.) Once established, the replicant has to create a foundation copy of
the primary database (jacking up the car), then process updates until it's
approximately current with the primary database (removing the old tire), then
initiate a backup/restore and wait for the restore to complete successfully
(installing the new tire), then swap in the newly created database and catch up to
the primary again (lowering the car). Finally, once the newly restored replicant is
absolutely current, the system must quiesce for a few seconds to swap the primary
and replicant databases (getting the old tire into the trunk).

> I'm thinking that nbackup locks the master file while keeping track of changed
> pages in a separate file. Perhaps a transaction id consolidation similar to what
> happens on backup/restore could be performed on a locked database master while
> logging updates in a separate file, and then the consolidated master could be
> brought up to date again.

nbackup works at the page level, which is simpler than handling record-level
changes. Unlike records, pages never go away, nor do they change their primary
identifier.

> If this is very difficult, perhaps there's no point - devel resources are better
> spent elsewhere. But if it would be a fairly simple task...?

Alas, I doubt that it's simple.

Best regards,

Ann
From: Leyne, S. <Se...@br...> - 2012-01-03 19:37:45
> 03.01.2012 18:51, Leyne, Sean wrote:
>> - a "data port" utility which would allow data to be ported from a live database
>> to a new database while the live one is active, but which would need a
>> finalization step where the live database is shut down to apply the final data
>> changes and add FK constraints.
>
> And exactly this utility is called a "replicator". If made right, it doesn't need
> FK deactivation and can do the "finalization step" when the new database is
> already in use.
> Aren't you tired of inventing the wheel?..

As Ann poetically replied, there are different wheels for different vehicles.

I was/am thinking that a replicator requires more setup than a multi-threaded data
pump utility would. The problem is not a perpetual problem, it is relatively
one-time only.

As for the FKs, I see the tool as being something which needs to have maximum
performance. FKs are required for referential integrity, but they slow down bulk
move operations, since the FK values need to be checked. Further, since the data
integrity would already be enforced in the live database, the target database would
not need FKs until the "go live" pre-launch.

Also, FKs would require that the database schema be analyzed to process tables/rows
in the correct order, which is an obstacle to maximum performance.

Sean
From: Dimitry S. <sd...@ib...> - 2012-01-03 19:51:03
03.01.2012 20:37, Leyne, Sean wrote:
> As for the FKs, I see the tool as being something which needs to have maximum
> performance.

If you are going to move the whole database at once - yes. Fortunately, in most
cases it is not necessary.

--
SY, SD.
From: Dimitry S. <sd...@ib...> - 2011-12-27 21:32:02
27.12.2011 22:17, Leyne, Sean wrote:
> I should have said:
>
> "That type of solution is not what immediately comes to mind for me, since I see
> a shared disk solution (using redundant SAN storage) to be much easier to
> implement for FB."

Unfortunately, a shared-storage cluster doesn't solve the transaction limit
problem. BTW, a distributed lock manager also isn't a trivial thing. And it is
quite slow.

--
SY, SD.