From: Lionel B. <lio...@bo...> - 2007-03-29 20:28:25
Dave Strickler wrote the following on 29.03.2007 21:47 :
> (sorry on the answering-old-threads-to-start-new-ones... It's an old,
> bad habit and I will stop)
>
> The problem we are having is a very old one. In about 24 hours, our
> Connect table will grow to 6 million records. We can purge it down,
> but that takes time, and what really hurts us is doing a VACUUM FULL
> which would write/read lock the database for long periods of time,
> therefore pulling SQLGrey "offline". We are not using pg_autovacuum,
> and perhaps this would be the solution for now.

Do not do a VACUUM FULL. It is for extreme conditions only (when the disk
space used becomes huge). VACUUM ANALYZE should be enough and won't block
SQLgrey (though it will probably slow it down a bit). pg_autovacuum will do
it for you, but not as soon as you launch it: it is designed to wait for
enough changes to a table before analyzing it, so you'll have to launch a
VACUUM ANALYZE manually the first time.

1/ Note that if you stop PostgreSQL or prevent SQLgrey from connecting to
it, SQLgrey will switch to pass-through (handy if you need to take
PostgreSQL offline to tweak it for a few seconds/minutes).

2/ If you never ANALYZE, PostgreSQL may very well do full table scans
instead of using indexes, which will obviously slow down SQLgrey
tremendously.

> However, I have a larger question that asks, "what do we do when the
> Connect table reaches 10 million, or 20 million?" How large can it
> grow before I'm in trouble? Since I don't know its upper limits, I
> fear I may run into them randomly one day.

It depends on your actual setup. Years ago I ran an old PostgreSQL 7.1
instance with around this number of rows and it didn't break, but it could
get too slow. You should compare the time needed to query the connect
table with the rate of connections to your SMTP servers. If your query
time is less than one tenth of the average delay between two connections,
you should be safe. If not, it's time to optimize.
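The rule of thumb above can be sketched in a few lines of Python. The
function name and the example numbers are mine, purely illustrative, not
taken from this thread:

```python
# Sketch of the "one tenth" rule of thumb: the connect-table query time
# should stay below 1/10 of the average delay between two SMTP connections.

def headroom_ok(query_time_s, connections_per_day):
    """True if query time is under one tenth of the average inter-connection delay."""
    avg_delay_s = 86400.0 / connections_per_day   # seconds between connections
    return query_time_s < avg_delay_s / 10.0

# e.g. 500,000 connections/day means one connection every ~0.17 s on
# average, so queries should complete in well under ~17 ms:
print(headroom_ok(0.005, 500_000))  # 5 ms query time -> safe
print(headroom_ok(0.050, 500_000))  # 50 ms query time -> time to optimize
```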
Throw memory at PostgreSQL and increase:

- shared_buffers (careful: beyond a certain threshold the OS can't keep up
  and PostgreSQL won't start), which should help keep the indexes in
  memory,
- max_fsm_pages and max_fsm_relations (which should prevent too much
  fragmentation of the on-disk data). Run VACUUM VERBOSE to check whether
  these parameters are enough; at the end you will see something like:

  INFO: free space map: 12 relations, 2319 pages stored; 2464 total pages needed
  DETAIL: Allocated FSM size: 1000 relations + 20000 pages = 182 kB shared memory.

  Be sure to be at least around the values reported by the INFO line,
- commit_delay, which should help commit several writes at the same time,
- reduce commit_siblings to 2 (with SQLgrey you can have at most as many
  siblings as SQLgrey servers; if you have only one, the commit_* settings
  won't change your performance).

Lionel.
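As a sketch, the settings above would go into postgresql.conf roughly like
this. The values are illustrative placeholders for an era-appropriate
PostgreSQL (7.x/8.x, where shared_buffers is counted in 8 kB pages and the
FSM parameters still exist), not recommendations for any particular load:

```
# postgresql.conf -- illustrative values only, tune to your RAM and traffic
shared_buffers = 20000       # in 8 kB pages; set it too high and the kernel
                             # refuses the shared memory segment and
                             # PostgreSQL won't start (raise SHMMAX if needed)
max_fsm_pages = 200000       # compare with the "total pages needed" figure
max_fsm_relations = 1000     # reported by VACUUM VERBOSE
commit_delay = 100           # microseconds to wait, grouping several commits
commit_siblings = 2          # min. concurrent transactions before delaying
```

After editing, PostgreSQL must be restarted for shared_buffers and the FSM
settings to take effect.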