Re: [Postgres-xc-general] pgbench suggestion


> On Thu, Nov 15, 2012 at 1:47 PM, Tatsuo Ishii <is...@po...> wrote:
> 
>> >> PostgreSQL Enterprise Consortium is planning to run a benchmark against
>> >> Postgres-XC. If we use the standard pgbench workloads (pgbench default,
>> >> -N, -S), what is the recommended partitioning plan for pgbench_accounts?
>> >>
>> > If you want to show off the scalability, I recommend running pgbench
>> > with option -k for both initialization and the benchmark run. This option
>> > has been added to the pgbench version shipped in the XC source code.
>> > It lets you run the benchmark with bid as the distribution key, which
>> > minimizes the amount of 2PC done when write operations involve several
>> > nodes in a transaction.
>> > $ pgbench --help
>> > Initialization options:
>> >   -k           distribute by primary key branch id - bid
>> > Benchmarking options:
>> >   -k           query with default key and additional key branch id (bid)
>> >
>> > Depending on your cluster structure, I would also recommend setting a
>> > PREFERRED node with ALTER NODE (ALTER NODE nodename WITH (PREFERRED)), for
>> > example the Datanode that is on the same server as a Coordinator if
>> > you use a structure of 1 Coordinator and 1 Datanode per server. This
>> > reduces the network load because reads of replicated tables go to the
>> > preferred node first, which helps most when that node is local.
>>
>> Thanks for the suggestion. We ran the pgbench -k benchmark and got good
>> results. Details will be published at the PGECONS seminar on December 7th
>> in Tokyo.
>>
> Thanks for letting me know. I'll show up at this presentation, as I am
> pretty interested, but no dinner for me :)
> 
> https://www.pgecons.org/2012/12/07/1527/
>>
>> I would also like to run a read-only workload benchmark. Any
>> suggestions for getting good results? I'm not sure plain pgbench -S gives
>> good results.
>>
> With -S you only perform a SELECT on aid of pgbench_accounts. If you
> initialize pgbench without -k, pgbench_accounts is hashed with aid as the
> key, because when no distribution is specified the default is hash
> distribution on the first hashable column of the relation. This is good
> for scans, as each query only scans a single node, chosen by the value
> of aid.
> So yeah, do not use -k for read evaluation.
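
The routing argument quoted above can be sketched as a toy model. This is only an illustration under stated assumptions: the Datanode names are hypothetical, a plain modulo stands in for Postgres-XC's real hash function, and a query that does not supply the distribution column is assumed to be sent to every Datanode.

```python
# Toy model of Coordinator routing for a pgbench -S style lookup:
#   SELECT abalance FROM pgbench_accounts WHERE aid = :aid
# Assumptions (not taken from Postgres-XC itself): node names are made up,
# and modulo replaces the real hash function.

NODES = ["dn1", "dn2", "dn3", "dn4"]  # hypothetical Datanode names


def node_for(key):
    """Pick the Datanode that owns a given distribution-key value."""
    return NODES[key % len(NODES)]


def nodes_scanned(aid, dist_column):
    """Return the Datanodes a lookup by aid has to visit."""
    if dist_column == "aid":
        # The WHERE clause supplies the distribution key: one node is enough.
        return [node_for(aid)]
    # Distributed by another column (e.g. bid): the query gives no value for
    # it, so in this model the lookup goes to every Datanode.
    return list(NODES)


if __name__ == "__main__":
    print(nodes_scanned(42, "aid"))  # a single Datanode
    print(nodes_scanned(42, "bid"))  # all Datanodes
```

In this model a table hashed on aid answers each lookup from one node, while a table distributed on bid only gets the same pruning if the query also carries bid.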

I'm confused. If I don't give -k to pgbench -i, then all the data for
pgbench_accounts goes to the first data node.

If CREATE TABLE is given without a DISTRIBUTE BY clause, what is the expected
behavior?
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp