Re: [Postgres-xc-general] Node Configuration For High Volume Writes

Hi Nick,
I was going to ask you about the SELECT queries you are firing, but I see
that you have sent those in another thread. So, I will respond more there.

The following factors matter when it comes to coordinators:
1. Number of connections - The coordinator is the point of contact for
applications, so the more connections there are, the higher the load, and
having multiple coordinators helps there. But in your case, you mentioned
that the number of connections is not that large (your current PG system
is probably able to handle it), so you may want to look at the other factor.
2. Load on coordinator - In the case of distributed tables, the coordinator
spends CPU time combining the results returned by the datanodes
(aggregates, sorting etc.), so even with a small number of connections a
coordinator may get loaded because of query processing. So, in your case,
check whether the coordinator machine is reaching its CPU/network/disk
IO/memory limits. If so, try putting the coordinator on a machine different
from those where the datanodes are running. You may choose to share that
machine with the GTM, if necessary. This will give the coordinator the
CPU/network/RAM resources it needs, and it might actually work for you. In
that case, you may want to give the coordinator a machine with more
CPU/core power and more RAM.
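
The CPU part of the check in point 2 can be sketched as below. This is only
a rough illustration, not a Postgres-XC tool: the function names and the
threshold of 1.0 runnable processes per core are assumptions, and it looks
at CPU only. Network, disk IO and memory need separate checks (for example
with vmstat or iostat on the coordinator machine).

```python
import os


def load_per_core(load_avg_1min, cores):
    """Return the 1-minute load average divided by the number of cores."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return load_avg_1min / cores


def looks_cpu_bound(load_avg_1min, cores, threshold=1.0):
    """Return True when load per core is at or above the threshold.

    The default threshold of 1.0 is an assumed rule of thumb, not a
    value taken from the Postgres-XC documentation.
    """
    return load_per_core(load_avg_1min, cores) >= threshold


if __name__ == "__main__":
    # os.getloadavg() is only available on Unix-like systems.
    if hasattr(os, "getloadavg"):
        load1, _, _ = os.getloadavg()
        cores = os.cpu_count() or 1
        print("load per core:", round(load_per_core(load1, cores), 2))
        print("looks CPU bound:", looks_cpu_bound(load1, cores))
```

Run it on the coordinator machine while the INSERT and SELECT workload is
active. If the load per core stays high there but not on the datanodes,
that would support moving the coordinator to its own machine.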

On Tue, Aug 21, 2012 at 8:14 PM, Nick Maludy <nm...@gm...> wrote:

> All,
>
> I am currently exploring PostgresXC as a clustering solution for a project
> I am working on. The use case is as follows:
>
> - Time series data from multiple sensors
> - Sensors report at various rates from 50Hz to once every 5 minutes
> - INSERTs (COPYs) on the order of 1000+/s
> - No UPDATEs; once the data is in the database we consider it immutable
> - Large volumes of data need to be stored (one 50Hz sensor = ~1.5
> billion rows for a year of collection)
> - SELECTs need to run as quickly as possible for UI and data analysis
> - Number of client connections = 10-20, 95%+ of the INSERTs are done by
> one node, 99%+ of the SELECTs are done by the rest of the nodes
> - Very write heavy application, reads are not nearly as frequent as writes
> but usually involve large amounts of data.
>
> My current cluster configuration is as follows
>
> Server A: GTM
> Server B: GTM Proxy, Coordinator
> Server C: Datanode
> Server D: Datanode
> Server E: Datanode
>
> My question is, in your documentation you recommend having a coordinator
> at each datanode, what is the rationale for this?
>
> Do you think it would be appropriate in my situation with so few
> connections?
>
> Would I get better read performance, and not hurt my write performance too
> much (write performance is more important than read)?
>
> Thanks,
> Nick

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Enterprise Postgres Company