|
From: Ben C. <ben...@gm...> - 2010-11-11 01:22:17
|
>
>
> Replace:
> my $rows = sql_do("DELETE FROM node_ip WHERE mac NOT IN (SELECT mac FROM
> node)");
> With:
> my $rows = sql_do("DELETE FROM node_ip WHERE mac IN (SELECT DISTINCT mac
> FROM node_ip LEFT JOIN node USING (mac) WHERE port IS NULL)");
>
Ah, that's much better!
netdisco=# EXPLAIN DELETE FROM node_ip WHERE mac IN (SELECT DISTINCT mac
FROM node_ip LEFT JOIN node USING (mac) WHERE port IS NULL);
QUERY PLAN
----------------------------------------------------------------------------------------------
Hash Join (cost=139124.99..170640.85 rows=402903 width=6)
Hash Cond: ("outer".mac = "inner".mac)
-> Seq Scan on node_ip (cost=0.00..20436.03 rows=402903 width=12)
-> Hash (cost=139110.72..139110.72 rows=5708 width=6)
-> Unique (cost=93267.11..139053.64 rows=5708 width=6)
-> Merge Left Join (cost=93267.11..138046.38 rows=402903
width=6)
Merge Cond: ("outer".mac = "inner".mac)
Filter: ("inner".port IS NULL)
-> Sort (cost=61882.63..62889.89 rows=402903 width=6)
Sort Key: public.node_ip.mac
-> Seq Scan on node_ip (cost=0.00..20436.03
rows=402903 width=6)
-> Sort (cost=31384.48..31995.46 rows=244390
width=22)
Sort Key: node.mac
-> Seq Scan on node (cost=0.00..6170.90
rows=244390 width=22)
(14 rows)
netdisco -B now runs in under a minute.
BTW, it turns out the root cause was lowering our DHCP times to one hour on
several large networks to alleviate address contention issues. This caused
the node_ip to increase from a few tens of thousands to a million entries.
cheers!
|