|
From: Tatsuo I. <is...@po...> - 2012-11-06 01:06:03
|
Hi, PostgreSQL Enterprise Consortium is planning to do a benchmark against Postges-XC. If we would use standard pgbench workload(pgbench default, -N, -S), what is a recommended portioning plan for pgbench_accounts? Any suggestions will be appreciated. -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp |
|
From: Michael P. <mic...@gm...> - 2012-11-06 01:20:37
|
On Tue, Nov 6, 2012 at 10:05 AM, Tatsuo Ishii <is...@po...> wrote: > Hi, > > PostgreSQL Enterprise Consortium is planning to do a benchmark against > Postges-XC. If we would use standard pgbench workload(pgbench default, > -N, -S), what is a recommended portioning plan for pgbench_accounts? > If you want to show up the scalability, I recommend that you use pgbench with option -k for initialization and launching, which is an option that has been added in the pgbench version of XC available in its source code. This allows to to a benchmark test by using bid as a distribution key so this minimizes the amount of 2PC done when write operations involve several nodes in a transaction. $ pgbench --help Initialization options: -k distribute by primary key branch id - bid Benchmarking options: -k query with default key and additional key branch id (bid) Depending on your cluster structure, I would also recommend you also to use PREFERRED node with ALTER NODE (ALTER NODE nodename WITH (PREFERRED)) for example with the Datanode that is on the same server as a Coordinator if you use a structure of 1 Coordinator and 1 Datanode per server. This also reduces the network load by having replicated table read being done on the preferred node in priority. This is especially better if the node is local of course. > > Any suggestions will be appreciated. > -- > Tatsuo Ishii > SRA OSS, Inc. Japan > English: http://www.sraoss.co.jp/index_en.php > Japanese: http://www.sraoss.co.jp > > > ------------------------------------------------------------------------------ > LogMeIn Central: Instant, anywhere, Remote PC access and management. > Stay in control, update software, and manage PCs from one command center > Diagnose problems and improve visibility into emerging IT issues > Automate, monitor and manage. Do more in less time with Central > http://p.sf.net/sfu/logmein12331_d2d > _______________________________________________ > Postgres-xc-general mailing list > Pos...@li... > https://lists.sourceforge.net/lists/listinfo/postgres-xc-general > -- Michael Paquier http://michael.otacoo.com |
|
From: Michael P. <mic...@gm...> - 2012-11-15 05:29:49
|
On Thu, Nov 15, 2012 at 1:47 PM, Tatsuo Ishii <is...@po...> wrote: > >> PostgreSQL Enterprise Consortium is planning to do a benchmark against > >> Postges-XC. If we would use standard pgbench workload(pgbench default, > >> -N, -S), what is a recommended portioning plan for pgbench_accounts? > >> > > If you want to show up the scalability, I recommend that you use pgbench > > with option -k for initialization and launching, which is an option that > > has been added in the pgbench version of XC available in its source code. > > This allows to to a benchmark test by using bid as a distribution key so > > this minimizes the amount of 2PC done when write operations involve > several > > nodes in a transaction. > > $ pgbench --help > > Initialization options: > > -k distribute by primary key branch id - bid > > Benchmarking options: > > -k query with default key and additional key branch id (bid) > > > > Depending on your cluster structure, I would also recommend you also to > use > > PREFERRED node with ALTER NODE (ALTER NODE nodename WITH (PREFERRED)) for > > example with the Datanode that is on the same server as a Coordinator if > > you use a structure of 1 Coordinator and 1 Datanode per server. This also > > reduces the network load by having replicated table read being done on > the > > preferred node in priority. This is especially better if the node is > local > > of course. > > Thanks for suggestion. We did pgbench -k benchmark and got good > result. Details will be published at PGECONS seminar on December 7th > in Tokyo. > Thanks for letting me know. I'll show up at this presentation I am pretty interested, but no dinner for me :) https://www.pgecons.org/2012/12/07/1527/ > > Also I would like to do some read-only workload benchmark as well. Any > suggestion to get good result? I'm not sure if plain pgbench -S gives > good result. > When using -S you only perform a select on aid of pgbench_accounts, so if you initialize pgbench without -k pgbench_accounts will be hashed with aid as key as the default if no distribution is specified is to take a hash distribution and the first hashable column of relation. So this would be good for scans as you will always scan only a single node based on the value of aid. So yeah do not use -k for read evaluation. -- Michael Paquier http://michael.otacoo.com |
|
From: Tatsuo I. <is...@po...> - 2012-11-15 08:49:18
|
> On Thu, Nov 15, 2012 at 1:47 PM, Tatsuo Ishii <is...@po...> wrote: > >> >> PostgreSQL Enterprise Consortium is planning to do a benchmark against >> >> Postges-XC. If we would use standard pgbench workload(pgbench default, >> >> -N, -S), what is a recommended portioning plan for pgbench_accounts? >> >> >> > If you want to show up the scalability, I recommend that you use pgbench >> > with option -k for initialization and launching, which is an option that >> > has been added in the pgbench version of XC available in its source code. >> > This allows to to a benchmark test by using bid as a distribution key so >> > this minimizes the amount of 2PC done when write operations involve >> several >> > nodes in a transaction. >> > $ pgbench --help >> > Initialization options: >> > -k distribute by primary key branch id - bid >> > Benchmarking options: >> > -k query with default key and additional key branch id (bid) >> > >> > Depending on your cluster structure, I would also recommend you also to >> use >> > PREFERRED node with ALTER NODE (ALTER NODE nodename WITH (PREFERRED)) for >> > example with the Datanode that is on the same server as a Coordinator if >> > you use a structure of 1 Coordinator and 1 Datanode per server. This also >> > reduces the network load by having replicated table read being done on >> the >> > preferred node in priority. This is especially better if the node is >> local >> > of course. >> >> Thanks for suggestion. We did pgbench -k benchmark and got good >> result. Details will be published at PGECONS seminar on December 7th >> in Tokyo. >> > Thanks for letting me know. I'll show up at this presentation I am pretty > interested, but no dinner for me :) > > https://www.pgecons.org/2012/12/07/1527/ >> >> Also I would like to do some read-only workload benchmark as well. Any >> suggestion to get good result? I'm not sure if plain pgbench -S gives >> good result. >> > When using -S you only perform a select on aid of pgbench_accounts, so if > you initialize pgbench without -k pgbench_accounts will be hashed with aid > as key as the default if no distribution is specified is to take a hash > distribution and the first hashable column of relation. So this would be > good for scans as you will always scan only a single node based on the > value of aid. > So yeah do not use -k for read evaluation. I'm confused. If I don't give -k to pgbench -i, then all the data for pgbench_accounts go to the first data node. If CREATE TABLE is not supplied WITH DISTRIBUTE, what is an expected behavior? -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp |
|
From: Koichi S. <koi...@gm...> - 2012-11-06 01:40:34
|
One suggestion. Because set bid value is relatively small, it may be better to distribute tables using MODULO, not HASH. Regards; ---------- Koichi Suzuki 2012/11/6 Michael Paquier <mic...@gm...>: > > > On Tue, Nov 6, 2012 at 10:05 AM, Tatsuo Ishii <is...@po...> wrote: >> >> Hi, >> >> PostgreSQL Enterprise Consortium is planning to do a benchmark against >> Postges-XC. If we would use standard pgbench workload(pgbench default, >> -N, -S), what is a recommended portioning plan for pgbench_accounts? > > If you want to show up the scalability, I recommend that you use pgbench > with option -k for initialization and launching, which is an option that has > been added in the pgbench version of XC available in its source code. This > allows to to a benchmark test by using bid as a distribution key so this > minimizes the amount of 2PC done when write operations involve several nodes > in a transaction. > $ pgbench --help > Initialization options: > -k distribute by primary key branch id - bid > Benchmarking options: > -k query with default key and additional key branch id (bid) > > Depending on your cluster structure, I would also recommend you also to use > PREFERRED node with ALTER NODE (ALTER NODE nodename WITH (PREFERRED)) for > example with the Datanode that is on the same server as a Coordinator if you > use a structure of 1 Coordinator and 1 Datanode per server. This also > reduces the network load by having replicated table read being done on the > preferred node in priority. This is especially better if the node is local > of course. > > >> >> >> Any suggestions will be appreciated. >> -- >> Tatsuo Ishii >> SRA OSS, Inc. Japan >> English: http://www.sraoss.co.jp/index_en.php >> Japanese: http://www.sraoss.co.jp >> >> >> ------------------------------------------------------------------------------ >> LogMeIn Central: Instant, anywhere, Remote PC access and management. >> Stay in control, update software, and manage PCs from one command center >> Diagnose problems and improve visibility into emerging IT issues >> Automate, monitor and manage. Do more in less time with Central >> http://p.sf.net/sfu/logmein12331_d2d >> _______________________________________________ >> Postgres-xc-general mailing list >> Pos...@li... >> https://lists.sourceforge.net/lists/listinfo/postgres-xc-general > > > > > -- > Michael Paquier > http://michael.otacoo.com > > ------------------------------------------------------------------------------ > LogMeIn Central: Instant, anywhere, Remote PC access and management. > Stay in control, update software, and manage PCs from one command center > Diagnose problems and improve visibility into emerging IT issues > Automate, monitor and manage. Do more in less time with Central > http://p.sf.net/sfu/logmein12331_d2d > _______________________________________________ > Postgres-xc-general mailing list > Pos...@li... > https://lists.sourceforge.net/lists/listinfo/postgres-xc-general > |
|
From: Koichi S. <koi...@gm...> - 2012-11-06 10:02:55
|
pgxc_class has distribution definition of each table. I'm using the
following views as a handy tool:
----
[koichi@linker:scripts]$ cat distr_view.sql
DROP VIEW IF EXISTS xc_table_distribution;
CREATE VIEW xc_table_distribution AS
SELECT pg_class.relname relation,
CASE
WHEN pclocatortype = 'H' THEN 'Hash'
WHEN pclocatortype = 'M' THEN 'Modulo'
WHEN pclocatortype = 'N' THEN 'Round Robin'
WHEN pclocatortype = 'R' THEN 'Replicate'
ELSE 'Unknown'
END AS distribution,
pg_attribute.attname attname
FROM pg_class, pgxc_class, pg_attribute
WHERE pg_class.oid = pgxc_class.pcrelid
and pg_class.oid = pg_attribute.attrelid
and pgxc_class.pcattnum = pg_attribute.attnum
UNION
SELECT pg_class.relname relation,
CASE
WHEN pclocatortype = 'H' THEN 'Hash'
WHEN pclocatortype = 'M' THEN 'Modulo'
WHEN pclocatortype = 'N' THEN 'Round Robin'
WHEN pclocatortype = 'R' THEN 'Replicate'
ELSE 'Unknown'
END AS distribution,
'- none -' attname
FROM pg_class, pgxc_class, pg_attribute
WHERE pg_class.oid = pgxc_class.pcrelid
and pg_class.oid = pg_attribute.attrelid
and pgxc_class.pcattnum = 0
;
COMMENT ON VIEW xc_table_distribution IS
'View to show distribution/replication attribute of the given table.';
[koichi@linker:scripts]$
------
Info for pgbench tables are as follows:
-----------
koichi=# select * from xc_table_distribution
koichi-# ;
relation | distribution | attname
------------------+--------------+----------
t | Hash | a
pgbench_branches | Hash | bid
y | Modulo | a
pgbench_tellers | Hash | bid
pgbench_accounts | Hash | bid
x | Round Robin | - none -
pgbench_history | Hash | bid
s | Replicate | - none -
(8 rows)
koichi=#
------
Hope it helps.
Regards;
----------
Koichi Suzuki
2012/11/6 Tatsuo Ishii <is...@po...>:
>> One suggestion. Because set bid value is relatively small, it may be
>> better to distribute tables using MODULO, not HASH.
>
> What is the distribution method when pgbench -k is used?
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> English: http://www.sraoss.co.jp/index_en.php
> Japanese: http://www.sraoss.co.jp
|
|
From: Koichi S. <koi...@gm...> - 2012-11-06 10:03:58
|
The result has several noise. Tables t, y, x, and s are all for my own test. Sorry for inconvenience. ---------- Koichi Suzuki 2012/11/6 Koichi Suzuki <koi...@gm...>: > pgxc_class has distribution definition of each table. I'm using the > following views as a handy tool: > > ---- > [koichi@linker:scripts]$ cat distr_view.sql > DROP VIEW IF EXISTS xc_table_distribution; > > > CREATE VIEW xc_table_distribution AS > > SELECT pg_class.relname relation, > CASE > WHEN pclocatortype = 'H' THEN 'Hash' > WHEN pclocatortype = 'M' THEN 'Modulo' > WHEN pclocatortype = 'N' THEN 'Round Robin' > WHEN pclocatortype = 'R' THEN 'Replicate' > ELSE 'Unknown' > END AS distribution, > pg_attribute.attname attname > FROM pg_class, pgxc_class, pg_attribute > WHERE pg_class.oid = pgxc_class.pcrelid > and pg_class.oid = pg_attribute.attrelid > and pgxc_class.pcattnum = pg_attribute.attnum > UNION > > SELECT pg_class.relname relation, > CASE > WHEN pclocatortype = 'H' THEN 'Hash' > WHEN pclocatortype = 'M' THEN 'Modulo' > WHEN pclocatortype = 'N' THEN 'Round Robin' > WHEN pclocatortype = 'R' THEN 'Replicate' > ELSE 'Unknown' > END AS distribution, > '- none -' attname > FROM pg_class, pgxc_class, pg_attribute > WHERE pg_class.oid = pgxc_class.pcrelid > and pg_class.oid = pg_attribute.attrelid > and pgxc_class.pcattnum = 0 > ; > > COMMENT ON VIEW xc_table_distribution IS > 'View to show distribution/replication attribute of the given table.'; > > [koichi@linker:scripts]$ > ------ > > Info for pgbench tables are as follows: > ----------- > koichi=# select * from xc_table_distribution > koichi-# ; > relation | distribution | attname > ------------------+--------------+---------- > t | Hash | a > pgbench_branches | Hash | bid > y | Modulo | a > pgbench_tellers | Hash | bid > pgbench_accounts | Hash | bid > x | Round Robin | - none - > pgbench_history | Hash | bid > s | Replicate | - none - > (8 rows) > > koichi=# > ------ > > Hope it helps. > > Regards; > ---------- > Koichi Suzuki > > > 2012/11/6 Tatsuo Ishii <is...@po...>: >>> One suggestion. Because set bid value is relatively small, it may be >>> better to distribute tables using MODULO, not HASH. >> >> What is the distribution method when pgbench -k is used? >> -- >> Tatsuo Ishii >> SRA OSS, Inc. Japan >> English: http://www.sraoss.co.jp/index_en.php >> Japanese: http://www.sraoss.co.jp |
|
From: Koichi S. <koi...@gm...> - 2012-11-06 11:07:10
|
Yes, the view works at coordinator locally. CREATE VIEW is propagated to all the coordinators so you should run it only once at one of them. The reason I suggested EXECUTE DIRECT is user table looked involved in the query. Regards; ---------- Koichi Suzuki 2012/11/6 Tatsuo Ishii <is...@po...>: > Thanks. Your view works great! BTW, Should the view be executed on > coordinator node? If so, I'm a little bit confused. Because you said: > >> Because pgxc_class and pg_class are both local to each node, oid is >> also local. To query correctly, they should be issued using EXECUTE >> DIRECT statement. Each EXECUTE DIRECT will return the result local >> to each node. >> 2012/11/6 Tatsuo Ishii <is...@po...>: >>> I sent a query to coordinator node and got following result. Doesn't >>> this means table "pgbench_accounts" distributed? >>> >>> test=# select * from pgxc_class as x join pg_class as c on x.pcrelid = c.oid where c.relname = 'pgbench_accounts'; > >> pgxc_class has distribution definition of each table. I'm using the >> following views as a handy tool: >> >> ---- >> [koichi@linker:scripts]$ cat distr_view.sql >> DROP VIEW IF EXISTS xc_table_distribution; >> >> >> CREATE VIEW xc_table_distribution AS >> >> SELECT pg_class.relname relation, >> CASE >> WHEN pclocatortype = 'H' THEN 'Hash' >> WHEN pclocatortype = 'M' THEN 'Modulo' >> WHEN pclocatortype = 'N' THEN 'Round Robin' >> WHEN pclocatortype = 'R' THEN 'Replicate' >> ELSE 'Unknown' >> END AS distribution, >> pg_attribute.attname attname >> FROM pg_class, pgxc_class, pg_attribute >> WHERE pg_class.oid = pgxc_class.pcrelid >> and pg_class.oid = pg_attribute.attrelid >> and pgxc_class.pcattnum = pg_attribute.attnum >> UNION >> >> SELECT pg_class.relname relation, >> CASE >> WHEN pclocatortype = 'H' THEN 'Hash' >> WHEN pclocatortype = 'M' THEN 'Modulo' >> WHEN pclocatortype = 'N' THEN 'Round Robin' >> WHEN pclocatortype = 'R' THEN 'Replicate' >> ELSE 'Unknown' >> END AS distribution, >> '- none -' attname >> FROM pg_class, pgxc_class, pg_attribute >> WHERE pg_class.oid = pgxc_class.pcrelid >> and pg_class.oid = pg_attribute.attrelid >> and pgxc_class.pcattnum = 0 >> ; >> >> COMMENT ON VIEW xc_table_distribution IS >> 'View to show distribution/replication attribute of the given table.'; >> >> [koichi@linker:scripts]$ >> ------ >> >> Info for pgbench tables are as follows: >> ----------- >> koichi=# select * from xc_table_distribution >> koichi-# ; >> relation | distribution | attname >> ------------------+--------------+---------- >> t | Hash | a >> pgbench_branches | Hash | bid >> y | Modulo | a >> pgbench_tellers | Hash | bid >> pgbench_accounts | Hash | bid >> x | Round Robin | - none - >> pgbench_history | Hash | bid >> s | Replicate | - none - >> (8 rows) >> >> koichi=# >> ------ >> >> Hope it helps. >> >> Regards; >> ---------- >> Koichi Suzuki >> >> >> 2012/11/6 Tatsuo Ishii <is...@po...>: >>>> One suggestion. Because set bid value is relatively small, it may be >>>> better to distribute tables using MODULO, not HASH. >>> >>> What is the distribution method when pgbench -k is used? >>> -- >>> Tatsuo Ishii >>> SRA OSS, Inc. Japan >>> English: http://www.sraoss.co.jp/index_en.php >>> Japanese: http://www.sraoss.co.jp |
|
From: Tatsuo I. <is...@po...> - 2012-11-06 11:14:34
|
Thanks. Your answer cleared my question. BTW, the view created on the coordinator node should be copied to data nodes? I couldn't find the view on the data nodes. If the answer is no, DDL executed on coordinator node is not copied to data node in Postgres-XC? CREATE TABLE seems to be copied to data node. Maybe CREATE VIEW is treated specially? -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp > Yes, the view works at coordinator locally. > > CREATE VIEW is propagated to all the coordinators so you should run it > only once at one of them. > > The reason I suggested EXECUTE DIRECT is user table looked involved in > the query. > > Regards; > ---------- > Koichi Suzuki > > > 2012/11/6 Tatsuo Ishii <is...@po...>: >> Thanks. Your view works great! BTW, Should the view be executed on >> coordinator node? If so, I'm a little bit confused. Because you said: >> >>> Because pgxc_class and pg_class are both local to each node, oid is >>> also local. To query correctly, they should be issued using EXECUTE >>> DIRECT statement. Each EXECUTE DIRECT will return the result local >>> to each node. >>> 2012/11/6 Tatsuo Ishii <is...@po...>: >>>> I sent a query to coordinator node and got following result. Doesn't >>>> this means table "pgbench_accounts" distributed? >>>> >>>> test=# select * from pgxc_class as x join pg_class as c on x.pcrelid = c.oid where c.relname = 'pgbench_accounts'; >> >>> pgxc_class has distribution definition of each table. I'm using the >>> following views as a handy tool: >>> >>> ---- >>> [koichi@linker:scripts]$ cat distr_view.sql >>> DROP VIEW IF EXISTS xc_table_distribution; >>> >>> >>> CREATE VIEW xc_table_distribution AS >>> >>> SELECT pg_class.relname relation, >>> CASE >>> WHEN pclocatortype = 'H' THEN 'Hash' >>> WHEN pclocatortype = 'M' THEN 'Modulo' >>> WHEN pclocatortype = 'N' THEN 'Round Robin' >>> WHEN pclocatortype = 'R' THEN 'Replicate' >>> ELSE 'Unknown' >>> END AS distribution, >>> pg_attribute.attname attname >>> FROM pg_class, pgxc_class, pg_attribute >>> WHERE pg_class.oid = pgxc_class.pcrelid >>> and pg_class.oid = pg_attribute.attrelid >>> and pgxc_class.pcattnum = pg_attribute.attnum >>> UNION >>> >>> SELECT pg_class.relname relation, >>> CASE >>> WHEN pclocatortype = 'H' THEN 'Hash' >>> WHEN pclocatortype = 'M' THEN 'Modulo' >>> WHEN pclocatortype = 'N' THEN 'Round Robin' >>> WHEN pclocatortype = 'R' THEN 'Replicate' >>> ELSE 'Unknown' >>> END AS distribution, >>> '- none -' attname >>> FROM pg_class, pgxc_class, pg_attribute >>> WHERE pg_class.oid = pgxc_class.pcrelid >>> and pg_class.oid = pg_attribute.attrelid >>> and pgxc_class.pcattnum = 0 >>> ; >>> >>> COMMENT ON VIEW xc_table_distribution IS >>> 'View to show distribution/replication attribute of the given table.'; >>> >>> [koichi@linker:scripts]$ >>> ------ >>> >>> Info for pgbench tables are as follows: >>> ----------- >>> koichi=# select * from xc_table_distribution >>> koichi-# ; >>> relation | distribution | attname >>> ------------------+--------------+---------- >>> t | Hash | a >>> pgbench_branches | Hash | bid >>> y | Modulo | a >>> pgbench_tellers | Hash | bid >>> pgbench_accounts | Hash | bid >>> x | Round Robin | - none - >>> pgbench_history | Hash | bid >>> s | Replicate | - none - >>> (8 rows) >>> >>> koichi=# >>> ------ >>> >>> Hope it helps. >>> >>> Regards; >>> ---------- >>> Koichi Suzuki >>> >>> >>> 2012/11/6 Tatsuo Ishii <is...@po...>: >>>>> One suggestion. Because set bid value is relatively small, it may be >>>>> better to distribute tables using MODULO, not HASH. >>>> >>>> What is the distribution method when pgbench -k is used? >>>> -- >>>> Tatsuo Ishii >>>> SRA OSS, Inc. Japan >>>> English: http://www.sraoss.co.jp/index_en.php >>>> Japanese: http://www.sraoss.co.jp |
|
From: Koichi S. <koi...@gm...> - 2012-11-06 11:34:31
|
No, So far, views are only propagated to coordinators. pgxc_* catalogs will be found only at coordinators. Depending upon objects, CREATE/ALTER/DROP are propagated to datanodes as well. View is a kind of exception. Regards; ---------- Koichi Suzuki 2012/11/6 Tatsuo Ishii <is...@po...>: > Thanks. Your answer cleared my question. > > BTW, the view created on the coordinator node should be copied to data > nodes? I couldn't find the view on the data nodes. If the answer is > no, DDL executed on coordinator node is not copied to data node in > Postgres-XC? CREATE TABLE seems to be copied to data node. Maybe > CREATE VIEW is treated specially? > -- > Tatsuo Ishii > SRA OSS, Inc. Japan > English: http://www.sraoss.co.jp/index_en.php > Japanese: http://www.sraoss.co.jp > >> Yes, the view works at coordinator locally. >> >> CREATE VIEW is propagated to all the coordinators so you should run it >> only once at one of them. >> >> The reason I suggested EXECUTE DIRECT is user table looked involved in >> the query. >> >> Regards; >> ---------- >> Koichi Suzuki >> >> >> 2012/11/6 Tatsuo Ishii <is...@po...>: >>> Thanks. Your view works great! BTW, Should the view be executed on >>> coordinator node? If so, I'm a little bit confused. Because you said: >>> >>>> Because pgxc_class and pg_class are both local to each node, oid is >>>> also local. To query correctly, they should be issued using EXECUTE >>>> DIRECT statement. Each EXECUTE DIRECT will return the result local >>>> to each node. >>>> 2012/11/6 Tatsuo Ishii <is...@po...>: >>>>> I sent a query to coordinator node and got following result. Doesn't >>>>> this means table "pgbench_accounts" distributed? >>>>> >>>>> test=# select * from pgxc_class as x join pg_class as c on x.pcrelid = c.oid where c.relname = 'pgbench_accounts'; >>> >>>> pgxc_class has distribution definition of each table. I'm using the >>>> following views as a handy tool: >>>> >>>> ---- >>>> [koichi@linker:scripts]$ cat distr_view.sql >>>> DROP VIEW IF EXISTS xc_table_distribution; >>>> >>>> >>>> CREATE VIEW xc_table_distribution AS >>>> >>>> SELECT pg_class.relname relation, >>>> CASE >>>> WHEN pclocatortype = 'H' THEN 'Hash' >>>> WHEN pclocatortype = 'M' THEN 'Modulo' >>>> WHEN pclocatortype = 'N' THEN 'Round Robin' >>>> WHEN pclocatortype = 'R' THEN 'Replicate' >>>> ELSE 'Unknown' >>>> END AS distribution, >>>> pg_attribute.attname attname >>>> FROM pg_class, pgxc_class, pg_attribute >>>> WHERE pg_class.oid = pgxc_class.pcrelid >>>> and pg_class.oid = pg_attribute.attrelid >>>> and pgxc_class.pcattnum = pg_attribute.attnum >>>> UNION >>>> >>>> SELECT pg_class.relname relation, >>>> CASE >>>> WHEN pclocatortype = 'H' THEN 'Hash' >>>> WHEN pclocatortype = 'M' THEN 'Modulo' >>>> WHEN pclocatortype = 'N' THEN 'Round Robin' >>>> WHEN pclocatortype = 'R' THEN 'Replicate' >>>> ELSE 'Unknown' >>>> END AS distribution, >>>> '- none -' attname >>>> FROM pg_class, pgxc_class, pg_attribute >>>> WHERE pg_class.oid = pgxc_class.pcrelid >>>> and pg_class.oid = pg_attribute.attrelid >>>> and pgxc_class.pcattnum = 0 >>>> ; >>>> >>>> COMMENT ON VIEW xc_table_distribution IS >>>> 'View to show distribution/replication attribute of the given table.'; >>>> >>>> [koichi@linker:scripts]$ >>>> ------ >>>> >>>> Info for pgbench tables are as follows: >>>> ----------- >>>> koichi=# select * from xc_table_distribution >>>> koichi-# ; >>>> relation | distribution | attname >>>> ------------------+--------------+---------- >>>> t | Hash | a >>>> pgbench_branches | Hash | bid >>>> y | Modulo | a >>>> pgbench_tellers | Hash | bid >>>> pgbench_accounts | Hash | bid >>>> x | Round Robin | - none - >>>> pgbench_history | Hash | bid >>>> s | Replicate | - none - >>>> (8 rows) >>>> >>>> koichi=# >>>> ------ >>>> >>>> Hope it helps. >>>> >>>> Regards; >>>> ---------- >>>> Koichi Suzuki >>>> >>>> >>>> 2012/11/6 Tatsuo Ishii <is...@po...>: >>>>>> One suggestion. Because set bid value is relatively small, it may be >>>>>> better to distribute tables using MODULO, not HASH. >>>>> >>>>> What is the distribution method when pgbench -k is used? >>>>> -- >>>>> Tatsuo Ishii >>>>> SRA OSS, Inc. Japan >>>>> English: http://www.sraoss.co.jp/index_en.php >>>>> Japanese: http://www.sraoss.co.jp |
|
From: Tatsuo I. <is...@po...> - 2012-11-06 10:47:01
|
Thanks. Your view works great! BTW, Should the view be executed on coordinator node? If so, I'm a little bit confused. Because you said: > Because pgxc_class and pg_class are both local to each node, oid is > also local. To query correctly, they should be issued using EXECUTE > DIRECT statement. Each EXECUTE DIRECT will return the result local > to each node. > 2012/11/6 Tatsuo Ishii <is...@po...>: >> I sent a query to coordinator node and got following result. Doesn't >> this means table "pgbench_accounts" distributed? >> >> test=# select * from pgxc_class as x join pg_class as c on x.pcrelid = c.oid where c.relname = 'pgbench_accounts'; > pgxc_class has distribution definition of each table. I'm using the > following views as a handy tool: > > ---- > [koichi@linker:scripts]$ cat distr_view.sql > DROP VIEW IF EXISTS xc_table_distribution; > > > CREATE VIEW xc_table_distribution AS > > SELECT pg_class.relname relation, > CASE > WHEN pclocatortype = 'H' THEN 'Hash' > WHEN pclocatortype = 'M' THEN 'Modulo' > WHEN pclocatortype = 'N' THEN 'Round Robin' > WHEN pclocatortype = 'R' THEN 'Replicate' > ELSE 'Unknown' > END AS distribution, > pg_attribute.attname attname > FROM pg_class, pgxc_class, pg_attribute > WHERE pg_class.oid = pgxc_class.pcrelid > and pg_class.oid = pg_attribute.attrelid > and pgxc_class.pcattnum = pg_attribute.attnum > UNION > > SELECT pg_class.relname relation, > CASE > WHEN pclocatortype = 'H' THEN 'Hash' > WHEN pclocatortype = 'M' THEN 'Modulo' > WHEN pclocatortype = 'N' THEN 'Round Robin' > WHEN pclocatortype = 'R' THEN 'Replicate' > ELSE 'Unknown' > END AS distribution, > '- none -' attname > FROM pg_class, pgxc_class, pg_attribute > WHERE pg_class.oid = pgxc_class.pcrelid > and pg_class.oid = pg_attribute.attrelid > and pgxc_class.pcattnum = 0 > ; > > COMMENT ON VIEW xc_table_distribution IS > 'View to show distribution/replication attribute of the given table.'; > > [koichi@linker:scripts]$ > ------ > > Info for pgbench tables are as follows: > ----------- > koichi=# select * from xc_table_distribution > koichi-# ; > relation | distribution | attname > ------------------+--------------+---------- > t | Hash | a > pgbench_branches | Hash | bid > y | Modulo | a > pgbench_tellers | Hash | bid > pgbench_accounts | Hash | bid > x | Round Robin | - none - > pgbench_history | Hash | bid > s | Replicate | - none - > (8 rows) > > koichi=# > ------ > > Hope it helps. > > Regards; > ---------- > Koichi Suzuki > > > 2012/11/6 Tatsuo Ishii <is...@po...>: >>> One suggestion. Because set bid value is relatively small, it may be >>> better to distribute tables using MODULO, not HASH. >> >> What is the distribution method when pgbench -k is used? >> -- >> Tatsuo Ishii >> SRA OSS, Inc. Japan >> English: http://www.sraoss.co.jp/index_en.php >> Japanese: http://www.sraoss.co.jp |
|
From: Tatsuo I. <is...@po...> - 2012-11-06 05:53:11
|
> One suggestion. Because set bid value is relatively small, it may be > better to distribute tables using MODULO, not HASH. What is the distribution method when pgbench -k is used? -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp |
|
From: Tatsuo I. <is...@po...> - 2012-11-15 04:47:58
|
>> PostgreSQL Enterprise Consortium is planning to do a benchmark against >> Postges-XC. If we would use standard pgbench workload(pgbench default, >> -N, -S), what is a recommended portioning plan for pgbench_accounts? >> > If you want to show up the scalability, I recommend that you use pgbench > with option -k for initialization and launching, which is an option that > has been added in the pgbench version of XC available in its source code. > This allows to to a benchmark test by using bid as a distribution key so > this minimizes the amount of 2PC done when write operations involve several > nodes in a transaction. > $ pgbench --help > Initialization options: > -k distribute by primary key branch id - bid > Benchmarking options: > -k query with default key and additional key branch id (bid) > > Depending on your cluster structure, I would also recommend you also to use > PREFERRED node with ALTER NODE (ALTER NODE nodename WITH (PREFERRED)) for > example with the Datanode that is on the same server as a Coordinator if > you use a structure of 1 Coordinator and 1 Datanode per server. This also > reduces the network load by having replicated table read being done on the > preferred node in priority. This is especially better if the node is local > of course. Thanks for suggestion. We did pgbench -k benchmark and got good result. Details will be published at PGECONS seminar on December 7th in Tokyo. https://www.pgecons.org/2012/12/07/1527/ Also I would like to do some read-only workload benchmark as well. Any suggestion to get good result? I'm not sure if plain pgbench -S gives good result. -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp |
|
From: Koichi S. <koi...@gm...> - 2012-11-15 05:27:33
|
Thanks a lot for the info. I registered for the seminar. Looking forward to hearing about XC. Regards; ---------- Koichi Suzuki 2012/11/15 Tatsuo Ishii <is...@po...>: >>> PostgreSQL Enterprise Consortium is planning to do a benchmark against >>> Postges-XC. If we would use standard pgbench workload(pgbench default, >>> -N, -S), what is a recommended portioning plan for pgbench_accounts? >>> >> If you want to show up the scalability, I recommend that you use pgbench >> with option -k for initialization and launching, which is an option that >> has been added in the pgbench version of XC available in its source code. >> This allows to to a benchmark test by using bid as a distribution key so >> this minimizes the amount of 2PC done when write operations involve several >> nodes in a transaction. >> $ pgbench --help >> Initialization options: >> -k distribute by primary key branch id - bid >> Benchmarking options: >> -k query with default key and additional key branch id (bid) >> >> Depending on your cluster structure, I would also recommend you also to use >> PREFERRED node with ALTER NODE (ALTER NODE nodename WITH (PREFERRED)) for >> example with the Datanode that is on the same server as a Coordinator if >> you use a structure of 1 Coordinator and 1 Datanode per server. This also >> reduces the network load by having replicated table read being done on the >> preferred node in priority. This is especially better if the node is local >> of course. > > Thanks for suggestion. We did pgbench -k benchmark and got good > result. Details will be published at PGECONS seminar on December 7th > in Tokyo. > > https://www.pgecons.org/2012/12/07/1527/ > > Also I would like to do some read-only workload benchmark as well. Any > suggestion to get good result? I'm not sure if plain pgbench -S gives > good result. > -- > Tatsuo Ishii > SRA OSS, Inc. Japan > English: http://www.sraoss.co.jp/index_en.php > Japanese: http://www.sraoss.co.jp > > ------------------------------------------------------------------------------ > Monitor your physical, virtual and cloud infrastructure from a single > web console. Get in-depth insight into apps, servers, databases, vmware, > SAP, cloud infrastructure, etc. Download 30-day Free Trial. > Pricing starts from $795 for 25 servers or applications! > http://p.sf.net/sfu/zoho_dev2dev_nov > _______________________________________________ > Postgres-xc-general mailing list > Pos...@li... > https://lists.sourceforge.net/lists/listinfo/postgres-xc-general |