From: shane <sh...@lo...> - 2007-07-19 10:33:11
|
Anyone have (or know of) a complete list of the required permissions for a the mysql-user's privileges (which each site's virtual user needs to use) when doing database replication w/ multiple slaves? I ask because we ran into a bit of an issue. We added our 2nd slave. And the slave would never 'go live'. It's isalive='Yes' in the dbs table (as 'reader'). When we would query our slash-jabber-bot for database-status it'd say: lag ? bog 0.0 STOPPED! - it_read2 I had a heck of a time figuring out how to get it to the 'unstopped'. What I ended up figuring out was that it was a permissions problem. The site's VU's mysql user had relatively minimal permissions on the slave. We had it setup that way because, well, it's only ever doing select's on that server, the replication-user is doing everything else. I started looking at permissions because of this: mysql> select * from dbs_readerstatus order by ts DESC limit 4; +---------------------+------+-----------+--------------- +-------------+----------------+----------------+----------- +------------+-------------------+ | ts | dbid | was_alive | was_reachable | was_running | slave_lag_secs | query_bog_secs | bog_rsqid | had_weight | had_weight_adjust | +---------------------+------+-----------+--------------- +-------------+----------------+----------------+----------- +------------+-------------------+ | 2007-07-19 10:13:04 | 6 | yes | yes | no | NULL | 0 | NULL | 1 | 0.1 | | 2007-07-19 10:13:04 | 2 | yes | yes | yes | 0 | 0 | NULL | 1 | 1 | | 2007-07-19 10:12:59 | 6 | yes | yes | no | NULL | 0 | NULL | 1 | 0.1 | | 2007-07-19 10:12:59 | 2 | yes | yes | yes | 3 | 0 | NULL | 1 | 1 | +---------------------+------+-----------+--------------- +-------------+----------------+----------------+----------- +------------+-------------------+ (the id=6 is the new slave). It was never ending up with 'was_running='yes'' which I knew was wrong, because I could ssh into the machine and attach to the processes and see that it was replicating just fine. I ended up figuring out that the VU that is accessing the slave needs Process_priv. ie update user set Process_priv='Y' where User='foo'; Because it does a show full processlist, and it looks for the processes by the 'System user'. If the VU exec'ing that command doesn't have the priv, it won't see that, then the code will put an odd entry in your slashd logs; slashdLog("Process id $hr->{Id} on vu '$vu'" . " has unknown system user state" . " '$state'"); and you'll eventually get marked as stopped: # Determine the two big numbers we care about: how far # behind its slave sql thread is, and what's its worst # repeated bogged-down query. # First check to be sure it has a slave sql thread running. my $slave_sql_id = $slave_sql_id{$vu}; if (!$slave_sql_id) { # If not, its slave is stopped and we can't calculate # its lag. $reader_info->{$vu}{stopped} = 1; } else { # If so, pull the lag of that process in seconds. my $slave_sql_lag = $process{$vu}{$slave_sql_id}{Time}; $reader_info->{$vu}{slave_lag_secs} = $slave_sql_lag; } (I'm pulling all the code from the balance_readers.pl task in themes/ slashcode/tasks) I did look through the docs before I attempted this. I couldn't find a section discussing the specifics of replication (it'd be helpful for the weighting system to be spelled out somewhere in the docs!) though it is talked about in slashguide.pod. Thanks, Shane |