|
From: <pos...@gz...> - 2013-08-08 21:01:06
|
We have a server with 64 cores and 384GB of RAM. We'd like to take
advantage of that hardware to speed up some queries that take 8+ hours
to run, and Postgres-XC seems like a good fit.
I've setup a cluster of 8 data nodes (I'll increase that to 48 for real
usage), 1 coordinator, and 1 GTM, all running on the same physical
server. I'm using 1.1 beta, on Postgres 9.2.
Here are the relevant commands (near-identical repeated commands are
omitted):
/usr/local/pgsql/bin/initdb -D
/var/lib/pgsql/9.2/postgres-xc/data_coord1 --nodename coord1
/usr/local/pgsql/bin/initdb -D
/var/lib/pgsql/9.2/postgres-xc/data_datanode1 --nodename datanode1
...
/usr/local/pgsql/bin/initdb -D
/var/lib/pgsql/9.2/postgres-xc/data_datanode8 --nodename datanode8
/usr/local/pgsql/bin/initgtm -D /var/lib/pgsql/9.2/postgres-xc/data_gtm
-Z gtm
/usr/local/pgsql/bin/gtm -D /var/lib/pgsql/9.2/postgres-xc/data_gtm >>
/var/lib/pgsql/9.2/postgres-xc/logfile 2>&1 &
/usr/local/pgsql/bin/postgres -X -p 15432 -D
/var/lib/pgsql/9.2/postgres-xc/data_datanode1 >>
/var/lib/pgsql/9.2/postgres-xc/logfile 2>&1 &
...
/usr/local/pgsql/bin/postgres -X -p 15439 -D
/var/lib/pgsql/9.2/postgres-xc/data_datanode8 >>
/var/lib/pgsql/9.2/postgres-xc/logfile 2>&1 &
/usr/local/pgsql/bin/postgres -C -p 5477 -D
/var/lib/pgsql/9.2/postgres-xc/data_coord1 >>
/var/lib/pgsql/9.2/postgres-xc/logfile 2>&1 &
/usr/local/pgsql/bin/psql -p 5477 -c "CREATE NODE datanode1 WITH (TYPE =
'datanode', PORT = 15432)" postgres
...
/usr/local/pgsql/bin/psql -p 5477 -c "CREATE NODE datanode8 WITH (TYPE =
'datanode', PORT = 15439)" postgres
/usr/local/pgsql/bin/psql -p 5477 -c "SELECT pgxc_pool_reload()" postgres
/usr/local/pgsql/bin/createdb -p 5477 test
/usr/local/pgsql/bin/psql -p 5477 test
I then created the following tables:
CREATE TABLE trails1 (
id text,
a_lat double precision,
a_long double precision,
b_lat double precision,
b_long double precision,
trail_id character varying(20),
type character varying(4),
distance numeric(10,5)
);
CREATE INDEX table1_a_lat ON table1 USING btree (a_lat);
CREATE INDEX table1_a_long ON table1 USING btree (a_long);
CREATE INDEX table1_b_lat ON table1 USING btree (b_lat);
CREATE INDEX table1_b_long ON table1 USING btree (b_long);
CREATE INDEX table1_type ON table1 USING btree (type);
CREATE INDEX table1_distance ON table1 USING btree (distance);
CREATE TABLE trails2 (
a_lat double precision,
a_long double precision,
b_lat double precision,
b_long double precision,
type character varying(5),
distance numeric(16,8)
);
CREATE INDEX table2_a_lat ON table2 USING btree (a_lat);
CREATE INDEX table2_a_long ON table2 USING btree (a_long);
CREATE INDEX table2_b_lat ON table2 USING btree (b_lat);
CREATE INDEX table2_b_long ON table2 USING btree (b_long);
CREATE INDEX table2_type ON table2 USING btree (type);
CREATE INDEX table2_distance ON table2 USING btree (distance);
I think I should have had something in the table definition about how to
partition the data.
I populated them using copy, with 331,106 rows in table1, and 1,124,421
rows in table2.
Simple queries return the right values:
select count(*) as count from table1;
331106
select count(*) as count from table2;
1124421
When I do a big query (the one that normally takes hours to complete) it
is only using one CPU core. I expected to see Postgres processes using
near 100% CPU on 8 cores.
SELECT
count(*) as count
FROM
trails1
WHERE
not exists (
SELECT
'x'
FROM
trails2
WHERE
trails2.a_lat >= trails1.a_lat - 0.000833 AND
trails2.a_lat <= trails1.a_lat + 0.000833 AND
trails2.a_long >= trails1.a_long - 0.000833 AND
trails2.a_long <= trails1.a_long + 0.000833 AND
trails2.b_lat >= trails1.b_lat - 0.000833 AND
trails2.b_lat <= trails1.b_lat + 0.000833 AND
trails2.b_long >= trails1.b_long - 0.000833 AND
trails2.b_long <= trails1.b_long + 0.000833 AND
(
trails2.type = trails1.type OR
trails2.type = 'S'
) AND
trails2.distance >= trails1.distance - 1.0 AND
trails2.distance <= trails1.distance + 1.0
);
I haven't let it run to completion. After seeing the lack of CPU usage
I went looking for other problems (and found them).
If I reload the pool and then check it, I get true:
SELECT pgxc_pool_reload();
t
SELECT pgxc_pool_check();
t
But after I interact with the data at all, it fails:
select count(*) as count from table1;
331106
SELECT pgxc_pool_check();
f
So I don't think it's working properly. The logs don't show anything.
Any suggestions?
Thanks!
|