From: Nirmal S. <sha...@gm...> - 2014-02-01 19:13:39
My query uses aggregates and joins, and it looks like this:

SELECT
    SUM(...),
    SUM(..),
    AVG(..),
    ...
FROM tableA a
INNER JOIN tableB b ON a.col1 = b.col1
INNER JOIN tableC c ON a.col1 = c.col1

All three tables are distributed by hash(col1).
I have 1 coordinator, 6 nodes, and 1 GTM.
When I run this query through the coordinator, it takes 23 seconds in total.
But when I run the same query directly on each individual node, it takes 4 seconds on every node.
Since it is a cluster, it should ideally take 4 seconds plus some overhead to combine the data from each node on the coordinator (at most 2 more seconds), but I don't understand why it takes 23 seconds when run from the coordinator.

Nirmal
> On Feb 1, 2014, at 11:02 AM, Mason Sharp <ms...@tr...> wrote:
>
>
>> On Sat, Feb 1, 2014 at 11:46 AM, Nirmal <sha...@gm...> wrote:
>> Hi Koichi,
>>
>> My tables are not replicated. They are all distributed the way you explained.
>> For example, one table has 600,000 records in total and I have 6 nodes, so every node holds 100,000 records.
>>
>> The issue is that when I run my query directly on a data node, it returns in 5 seconds, and it takes the same time on every node. So it should take about the same time if I run the query through the coordinator, but instead of 5 seconds it takes 22 seconds. It seems that query execution on the nodes is fine, but moving the data from the nodes to the coordinator takes a lot of time.
>
>
>> Please advise.
>
> What does your query look like? A single table? A join? Using aggregates?
>
> Thanks,
>
> Mason
>