From: Alexander P. <pes...@ma...> - 2010-01-06 09:03:50
On Wednesday 06 January 2010 00:36:58 Ann W. Harrison wrote:
> Gary Franklin wrote:
> > Our company's testing department created a stress test that creates a
> > large number of tables, populates them, performs queries and then
> > deletes the tables, and does this over and over many times.
>
> When you say "deletes the tables" do you mean that it deletes the
> records or that it drops the tables - or both? Is there any chance
> that some other connection has an open transaction while the test
> is running?
>
> Could you run gstat -a -r on the database after it gets bloated.
> That will show where the space is being lost, I think. And also
> run gstat -h while the test is running for a definitive answer
> on whether there are old transactions around during the test,
> not just after it has completed.

There is such data later in Gary's letter; at least gstat -h output is
present. It looks like there is no such long-running transaction.

> The numbers are very disturbing - and it's slightly interesting
> that the file sizes are exactly the same for the database after
> it's been backed up and restored and the first run. I'd expect
> some variation unless this is a single-user test - or at least
> a single-writer test.

Provided the empty database is restored once again (remember, all tables
were dropped), the sizes may well come out the same.

> > I would expect the size of the database to grow the first time I run
> > the test (from 802,816 bytes to 3,600,384 bytes), but why does the
> > database continue to grow each and every run after that? Shouldn't
> > the unused pages be used instead of the database continually getting
> > bigger?
>
> Yes, of course unused pages should be reused. It may grow a bit because
> garbage collection happens later in one run than another or record
> placement changes from run to run.
>
> > Is there any way to run the garbage collection after a specific number
> > of transactions have taken place, say 10,000, rather than being
> > triggered off the sweep interval?
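To illustrate the kind of check Ann suggests with gstat -h: a small Python
sketch that pulls the transaction counters out of a captured header and
reports the gap between the oldest active and the next transaction. The
sample header text and counter values below are illustrative, not Gary's
actual output.

```python
import re

# Illustrative sample of a "gstat -h" transaction-counter section;
# the numbers here are made up for the example.
SAMPLE_HEADER = """\
Oldest transaction      1201
Oldest active           1202
Oldest snapshot         1202
Next transaction        95340
Sweep interval:         20000
"""

def parse_counters(header: str) -> dict:
    """Extract the named transaction counters from a gstat -h header."""
    patterns = {
        "oldest": r"Oldest transaction\s+(\d+)",
        "oldest_active": r"Oldest active\s+(\d+)",
        "next": r"Next transaction\s+(\d+)",
    }
    counters = {}
    for name, pat in patterns.items():
        m = re.search(pat, header)
        if m:
            counters[name] = int(m.group(1))
    return counters

def oat_gap(counters: dict) -> int:
    """Gap between oldest active and next transaction; a gap that keeps
    growing across runs suggests a stuck transaction blocking cleanup."""
    return counters["next"] - counters["oldest_active"]

counters = parse_counters(SAMPLE_HEADER)
print(counters, "gap:", oat_gap(counters))
```

If the test were holding an open transaction, that gap would keep growing
from one gstat -h snapshot to the next instead of staying small.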
> Garbage collection runs continuously, at least that's the theory. When
> a transaction reads a record, it checks for old versions that are no
> longer visible to running transactions. If there are, it removes them
> or puts them on a list of records to be removed. If it finds a deleted
> record and there are no transactions that started before the delete was
> committed, it does the same. There are patterns of usage that defeat
> that scheme - for example, if you always use indexed access through an
> index that always increments, delete only the oldest records, and read
> only the newest, the old records are never revisited, so they never go
> away.

It looks like that is the case here. The table is deleted (is it a
DELETE FROM TABLE1 followed by a DROP TABLE TABLE1?), so what would cause
its data pages to be revisited?

> Sweep reads all of every table, getting rid of that sort of
> detritus.
>
> The sweep interval is based on the difference between the oldest
> non-committed transaction and the current transaction, which generally
> works well - you don't want to have the overhead of a transaction
> reading every record in the database unless you need it, and the
> on-going garbage collection should be good enough.

The sweep interval does not make sense for Classic and SuperClassic, only
for SuperServer. Am I wrong?

> > Is there a way to automatically reduce the size of the database
> > without running gbak?
>
> No. The database is internally linked by page number, so it has to
> be completely rewritten to release pages.

I suppose Gary is talking here about the reuse of pages from a dropped
table.
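The cooperative garbage collection Ann describes, and the "never
revisited" pattern that defeats it, can be sketched as a toy model. This
is purely illustrative Python under my own assumptions, not Firebird's
actual implementation: each record keeps a chain of versions, and a visit
from a reader purges versions no longer visible to any running
transaction.

```python
# Toy model: a record holds a list of (created_txn, value) versions.
# Reading a record triggers cooperative GC; records never read again
# keep their old versions forever. Hypothetical names throughout.

class Record:
    def __init__(self):
        self.versions = []  # (created_txn, value), oldest first

    def write(self, txn, value):
        self.versions.append((txn, value))

    def read(self, oldest_active_txn):
        # Keep versions still visible to some running transaction,
        # plus the newest older version (the oldest active transaction
        # may still need to see it); drop the rest as garbage.
        keep = [v for v in self.versions if v[0] >= oldest_active_txn]
        older = [v for v in self.versions if v[0] < oldest_active_txn]
        if older:
            keep.insert(0, older[-1])
        self.versions = keep
        return self.versions[-1][1]

rec_hot = Record()   # revisited on reads -> garbage gets collected
rec_cold = Record()  # written but never read again -> garbage stays
for txn in range(1, 6):
    rec_hot.write(txn, f"v{txn}")
    rec_cold.write(txn, f"v{txn}")

rec_hot.read(oldest_active_txn=5)  # the visit purges stale versions
print(len(rec_hot.versions), len(rec_cold.versions))  # 2 5
```

The cold record ends up exactly like the dropped table's pages in Gary's
test: nothing ever visits them, so their garbage survives until a sweep
(or a backup/restore) touches everything.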