|
From: Aaron J. <aja...@re...> - 2014-04-29 06:16:52
|
That was my thinking as well. I indirectly executed the queries independently. Basically, each query takes about 5-10 seconds each for 4 data nodes - specifically, I used psql -h <datanode> -p <port> - to time the individual data node performance individually. So you figure, worst case = 10 seconds x 4 nodes = 40 seconds of aggregate time on a serial request. But I'm seeing 65 seconds which means there's some other overhead that I'm missing. The 65 second aggregate is also the reason why I asked if the requests were parallel or serial because it *feels* serial though it could be other factors. I'll retest and update. ________________________________ From: Ashutosh Bapat [ash...@en...] Sent: Tuesday, April 29, 2014 1:05 AM To: Aaron Jackson Cc: amul sul; pos...@li... Subject: Re: [Postgres-xc-general] Data Node Scan Performance Hi Aaron, Can you please take the timing of executing "EXECUTE DIRECT <query to the datanode>" to some datanode. I suspect that the delay you are seeing is added by the sheer communication between coord and datanode. Some of that would be libpq overhead and some of it will be network overhead. On Tue, Apr 29, 2014 at 10:58 AM, Aaron Jackson <aja...@re...<mailto:aja...@re...>> wrote: Interesting, So, I wonder why I am seeing query times that are more than the sum of the total times required to perform the process without the coordinator. For example, let's say the query was 'SELECT 500 as Id, Foo, Bar from MyTable WHERE Id = 186' - I could perform this query at all 4 nodes and they would take no more than 10 seconds to run individually. However, when performed against the coordinator, this same query takes 65 seconds. That's more than the total aggregate of all data nodes. Any thoughts - is it completely attributed to the coordinator? ________________________________________ From: amul sul [sul...@ya...<mailto:sul...@ya...>] Sent: Tuesday, April 29, 2014 12:23 AM To: Aaron Jackson; pos...@li...<mailto:pos...@li...> Subject: Re: [Postgres-xc-general] Data Node Scan Performance >On Tuesday, 29 April 2014 10:38 AM, Aaron Jackson <aja...@re...<mailto:aja...@re...>> wrote: > my question is, does the coordinator execute the data node scan serially >or in parallel - and if it's serially, >is there any thought around how to make it parallel? IMO, scan on data happens independently i.e parallel, the scan result is collected at co-ordinator and returned to client. Referring distributed table using other than distribution key(in your case it Q instead of k), has little penalty. Regards, Amul Sul ------------------------------------------------------------------------------ "Accelerate Dev Cycles with Automated Cross-Browser Testing - For FREE Instantly run your Selenium tests across 300+ browser/OS combos. Get unparalleled scalability from the best Selenium testing platform available. Simple to use. Nothing to install. Get started now for free." http://p.sf.net/sfu/SauceLabs _______________________________________________ Postgres-xc-general mailing list Pos...@li...<mailto:Pos...@li...> https://lists.sourceforge.net/lists/listinfo/postgres-xc-general -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company |