|
From: Ben C. <ben...@gm...> - 2010-11-10 22:56:35
|
Hi Max,
We gave that a go but still had issues, and surprisingly this now looks like
a load issue. Using the perl debugger we found the backup job is failing on
this line:
DELETE FROM node_ip WHERE mac NOT IN (SELECT mac FROM node);
The number of records in node_ip was approaching 1,000,000. For as yet
unexplained reasons this has been growing very fast for the last few months
- up from only 30,000 records in June.
But the killer seems to be the query plan which ignores the available
indices:
netdisco=> EXPLAIN DELETE FROM node_ip WHERE mac NOT IN (SELECT mac FROM node);
QUERY PLAN
-----------------------------------------------------------------------------
Seq Scan on node_ip (cost=7603.26..1849477612.26 rows=435168 width=6)
Filter: (NOT (subplan))
SubPlan
-> Materialize (cost=7603.26..11242.04 rows=244478 width=6)
-> Seq Scan on node (cost=0.00..6164.78 rows=244478 width=6)
(5 rows)
This query was eventually dieing after around several hours. Just by
deleting old entries in the table the performance has improved. At around
300,000 records it's tolerable.
Leaving the excess records aside, I am wondering if this problem is due to
the version of postgres we are running. I would really appreciate it if
someone out there has a >8.1 server they can run the EXPLAIN on and copy the
list.
thanks!
On Tue, Nov 9, 2010 at 4:28 AM, Max Baker <ma...@wa...> wrote:
> Ben,
>
> It looks like your database might be corrupted. I would suggest you run
> an fsck and check /var/log/messages to see if you see any other issues .
> I'm afraid I don't know what the procedure is to fix a Postgres db any more.
>
> -m
>
>
|