From: Michael P. <mic...@gm...> - 2014-05-04 04:24:24
On Sun, May 4, 2014 at 12:59 AM, Dorian Hoxha <dor...@gm...> wrote:
>> You just need commodity INTEL server runnign Linux.
> Are INTEL cpu required ? If not INTEL can be removed ? (also running typo)
Not really... I agree with what you mean here.

>> For datawarehouse applications, you may need separate patch which devides
>> complexed query into smaller chunks which run in datanodes in parallel.
>> StormDB will provide such patche.
>
> Wasn't stormdb bought by another company ? Is there an opensource
> alternative ? Fix the "patche" typo ?
>
> A way to make it simpler is by merging coordinator and datanode into 1 and
> making it possible for a 'node' to not hold data (be a coordinator only),
> like in elastic-search, but you probably already know that.
+1. This would reduce data transfer for cross-node joins where the
Coordinator and Datanodes are on separate servers. You could always put both
nodes on the same server with the current XC... But that doubles the number
of nodes to monitor.

> What exact things does the gtm-proxy do? For example, a single row insert
> wouldn't need the gtm (coordinator just inserts it to the right
> data-node)(asumming no sequences, since for that the gtm is needed)?
It groups messages between Coordinator/Datanode and GTM to reduce the number
of network interactions and improve performance.

> If multiple tables are sharded on the same key (example: user_id). Will all
> the rows, from the same user in different tables be in the same data-node ?
Yep. The node choice algorithm is based on the data type of the key.
--
Michael
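
To illustrate the last answer, here is a minimal sketch of type-dependent, hash-based node selection. It is not Postgres-XC's actual hash function: the node names, `hash_key` and `pick_datanode` are hypothetical, and it assumes every table is distributed over the same list of Datanodes. It only shows why rows that share a key value and key type end up on the same node, whichever table they belong to.

```python
import zlib

# Hypothetical Datanode names; a real cluster defines its own.
DATANODES = ["dn1", "dn2", "dn3", "dn4"]


def hash_key(value):
    """Hash a distribution key; the byte encoding depends on its data type.

    Illustrative only: CRC32 stands in for whatever per-type hash
    function the real system uses.
    """
    if isinstance(value, int):
        data = value.to_bytes(8, "big", signed=True)
    elif isinstance(value, str):
        data = value.encode("utf-8")
    else:
        raise TypeError(f"unsupported key type: {type(value).__name__}")
    return zlib.crc32(data)


def pick_datanode(value, nodes=DATANODES):
    """Map a key value to one node by taking the hash modulo the node count."""
    return nodes[hash_key(value) % len(nodes)]


# Two tables sharded on user_id: the table name never enters the
# computation, so a given user_id resolves to the same Datanode for both.
orders_node = pick_datanode(42)    # row in a hypothetical "orders" table
profiles_node = pick_datanode(42)  # row in a hypothetical "profiles" table
print(orders_node == profiles_node)
```

Because only the key's value and type feed the hash, a join on user_id between such tables can be evaluated on each Datanode without moving rows between nodes, provided both tables use the same node list.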