Re: [Spock Proxy Devel] Feature Requests
Status: Alpha
Brought to you by:
kaotao
From: Frank F. <fr...@co...> - 2009-06-11 20:18:01
|
On Jun 10, 2009, at 3:16 AM, Graham Finlayson wrote: > First of all, thanks for the development that has gone into this, as > it looks to be just what I was looking for. I have a couple of > suggestions which I think would be useful additions > > 1. I want to create shards based on a character database key. E.g. > All users from 'a' to 'b' are stored in shard1. The current > implementation only allows the sharding key to be an integer value. > I can easily 'hack' this in by mapping the first character of the > key to an integer value, but it would be nicer to have this properly > supported. Are there any plans to do this? This is less useful than you might think. Consider a schema with a 'users' table sharded as you describe, name starting with 'a' or 'b' in shard 1, 'c' or 'd' in shard 2 and so on. This works well for the users table, but only that table. If we have a 'payment' table which records each user payment that payment table must have the shard key (user_name) in it or the proxy cannot know where to insert a new payment row. That is the same for all the other tables; you would do much better to shard on user_id and index users.name, users.id so that the query to lookup a users id from a name is quick. Now it will be fast to either join this to other tables or to retrieve the users.id and query using this value. There is another aspect about this where I would be very cautious - I once worked for a library who had a reservation system where they would store your requested material by last name. Turned out there that 'm' was almost 1/3 of their customers. I don't have any kind of metrics of how typical this is - perhaps they had a large Scottish population in their customer base but my point is that it can be hard to balance your shards when your key is an attribute that you don't control. An arbitrary key such as an id column is easier to manipulate. > 2. I am looking to use SpockProxy in conjunction with the MySQL > Master Master Replication Monitor, where all the Shards are Master/ > Master Active/Passive MySQL servers (MMM). MMM can define a > writerRole, to write to the currently active MySQL server, and two > readerRoles to allow reads to be done on both MySQL servers. > SpockProxy supports this out of the box if you configure the > host_name to be the MMM writerRole. However, this means that the > Passive MySQL server sits fairly idle, which is a waste of resources > in my mind. Therefore, a useful extention, I think, to SpockProxy > would be to allow the readerRoles to be configured, and have > SpockProxy round robin any "read" requests between the Active and > Passive MySQL servers to balance the load a little when both servers > are active. My thought is that Spockproxy should become a plugin as MySQL proxy starts to support plugins (shortly after version 1 according to Jan Kneschke). Originally I thought this would be very complex but in writing this answer I'm thinking it might not be too horrible. Let's say we have some number of shards (say 4) and the same number of read slaves for a total of 8 databases - and this is a simple case it gets more complex if you have several sets of read slaves. When a query comes in the Spockproxy / Replication Monitor will have to follow some logic like this: Given the pool of all the data servers (except the universal DB which is only ever used for universal writes) Query contains a shard key? YES -> Reduce the pool of available DB servers to only those serving the one needed shard NO -> do nothing Query is READ or WRITE? READ - choose the least busy server(s) from the pool one for each shard in the pool WRITE - Are you writing UNIVERSAL or SHARDED data? UNIVERSAL - choose the one universal db (this is never in the pool) SHARDED - choose the active master server(s) from the pool one for each shard in the pool The issues is Spock proxy does not have any of the Replication Monitor features and as it is we've probably removed some vital components to implement this in our code base (remember we're a fork off of 0.6). This is why I mentioned the plug in as a goal; so folks could mix and match features in order to create a solution to meet their specific needs. In the short term you might run both in series - for each shard pair run an MMM and present these to the Spockproxy as the database shards. That ought to work (let us know if it does). > I am willing to help with any implementation/testing of these > features, but would want to make sure that the design of these > features would be acceptable before I proceed. We are not doing much development on Spockproxy right now; as I said my thought was to move into a simpler plugin but feel the freedom if you like or as MySQL proxy supports plugins help us then move to a plug in. Frank |