Thread: [Spock Proxy Devel] SELECT queries with USING INDEX
Status: Alpha
Brought to you by:
kaotao
From: Niv <ni...@tr...> - 2008-12-03 14:39:35
|
Hello. I have a problem with Spock Proxy. I'm trying to run a query on all shards, and the query contains an index hint ("USE INDEX"): SELECT * FROM <table> USE INDEX (index1, index2) WHERE <conditions> ORDER BY <order> LIMIT 0, 10 In the spockproxy log file, I see the warning "could not parse table alias, using default db". When I remove the "USE INDEX", the query works (but takes forever). Can you please offer any assistance? Thanks, Niv Singer |
From: Frank F. <fr...@co...> - 2008-12-04 19:40:49
|
On Dec 3, 2008, at 2:39 PM, Niv wrote: > > I have a problem with Spock Proxy. > I'm trying to run a query on all shards, and the query contains an > index hint ("USE INDEX"): > And the sharding key is actually none (shard_range_directory is > empty), the idea here is to run the query on all shards. OK currently we don't support the USE INDEX - we should (and will in our next build -whenever that is) because it's noting to to proxy to pass it along to each shard. A work around for today might be to build the indexes such that mysql would use them by default. Try adding all the columns you select into them (make it a covered query). > Since we're already on the subject, I have a related question. > > 1. If my query would look like > SELECT * FROM shard1.table WHERE key IN (x, y, z) > UNION ALL > SELECT * FROM shard2.table WHERE key IN (p, q, s); > > Will the two queries run in parallel and their results will be > combined by the proxy? Well no, that is you can't do that at all. That is you cannot specify the shard in the query - the whole point of the proxy is to hide all the sharding from your application (hide it from your programmers SQL) - things will just work. But two ways do occur to me to do what you appear to be trying to do. First off this is only useful if your have sharded data in an intelligent way - that is there is some sort of qualitative difference in the data in shard 1 and the data in shard 2. BEWARE the problem here is supposing you grow and need a shard 3 - you cannot do that, you'll have to add two new shards and change all your code to support this (changing your code is exactly what we're trying to avoid). To see an article on why what I'll call "hard loading" your data is a bad idea read http://qntm.org/?gay an amazingly in depth analysis of how to data model marriage (nothing to do with your issue but the idea that something changes - same sex marriage for example and the fields you called "husband" and "wife" are meaningless). Still if you insist: 1 - combine your query and add the shard key to the where clause (let's say your shard key is sk_id and 1 to 100000 is in shard 1 and 100001 to 200000 is shard 2) your single query becomes: SELECT * FROM shard1.table WHERE ( key IN (x, y, z) AND sk_id BETWEEN 1 AND 100000 ) OR (key IN (p, q, s) AND sk_id BETWEEN 100001 AND 200000); This works but depending on how the tables are indexed and how the optimizer looks at this it may not be very fast. 2 - Create a view in each shard that is a little bit different. in shard 1: CREATE VIEW foo AS SELECT * FROM table WHERE key IN (x, y, z); in shard 2: CREATE VIEW foo AS SELECT * FROM table WHERE key IN (p, q, s); Now through the proxy you can run: SELECT * FROM table --- this will return all data SELECT * FROM foo --- this will return key IN (x, y, z) from shard 1 and key IN (p, q, s) from shard 2 in one results set SELECT * FROM foo WHERE other_key = 10 --- same as above but with the additional other_key = 10 for both shards. > > 2. Is SQL_CALC_FOUND_ROWS supported? Will a post-query of SELECT > FOUND_ROWS() return the total number of rows that would have > returned from all shards without LIMIT? No, sorry but not supported and difficult to support because the proxy also does connection pooling which is great but any kind of post query function will not work because you're not guaranteed the same connection for the next query. > Thank you very much for your assistance and for releasing Spock > Proxy. It looks like it's going to help me a lot, and I would also > be glad to contribute back (for instance, if the features above are > not supported, I would love to help and implement them myself). > > Regards, > Niv Singer > Tracx > > On Wed, Dec 3, 2008 at 7:11 PM, Frank Flynn <fr...@co...> > wrote: > Niv, > > Interesting - I've used "USE INDEX" but I don't use it often. Let > me check with the developer - also it might help if you can send me > the exact SQL so we can try to parse it. Send it to me directly not > to the list if you'd prefer not to post it (probably safer). We > don't need the whole schema but do tell me the sharding key. > > We'll answer on the list when we have one. > > Frank > > > On Dec 3, 2008, at 6:39 AM, Niv wrote: > > Hello. > > I have a problem with Spock Proxy. > I'm trying to run a query on all shards, and the query contains an > index hint ("USE INDEX"): > > SELECT * > FROM <table> > USE INDEX (index1, index2) > WHERE <conditions> > ORDER BY <order> > LIMIT 0, 10 > > In the spockproxy log file, I see the warning "could not parse table > alias, using default db". > When I remove the "USE INDEX", the query works (but takes forever). > > Can you please offer any assistance? > > Thanks, > Niv Singer > ------------------------------------------------------------------------- > This SF.Net email is sponsored by the Moblin Your Move Developer's > challenge > Build the coolest Linux based applications with Moblin SDK & win > great prizes > Grand prize is a trip for two to an Open Source event anywhere in > the world > http://moblin-contest.org/redirect.php?banner_id=100&url=/_______________________________________________ > spockproxy-devel mailing list > spo...@li... > https://lists.sourceforge.net/lists/listinfo/spockproxy-devel > > |