|
From: shane <sh...@lo...> - 2007-07-19 10:33:11
|
Anyone have (or know of) a complete list of the required permissions
for a the mysql-user's privileges (which each site's virtual user
needs to use) when doing database replication w/ multiple slaves?
I ask because we ran into a bit of an issue. We added our 2nd slave.
And the slave would never 'go live'. It's isalive='Yes' in the dbs
table (as 'reader'). When we would query our slash-jabber-bot for
database-status it'd say:
lag ? bog 0.0 STOPPED! - it_read2
I had a heck of a time figuring out how to get it to the 'unstopped'.
What I ended up figuring out was that it was a permissions problem.
The site's VU's mysql user had relatively minimal permissions on the
slave. We had it setup that way because, well, it's only ever doing
select's on that server, the replication-user is doing everything else.
I started looking at permissions because of this:
mysql> select * from dbs_readerstatus order by ts DESC limit 4;
+---------------------+------+-----------+---------------
+-------------+----------------+----------------+-----------
+------------+-------------------+
| ts | dbid | was_alive | was_reachable |
was_running | slave_lag_secs | query_bog_secs | bog_rsqid |
had_weight | had_weight_adjust |
+---------------------+------+-----------+---------------
+-------------+----------------+----------------+-----------
+------------+-------------------+
| 2007-07-19 10:13:04 | 6 | yes | yes |
no | NULL | 0 | NULL |
1 | 0.1 |
| 2007-07-19 10:13:04 | 2 | yes | yes |
yes | 0 | 0 | NULL |
1 | 1 |
| 2007-07-19 10:12:59 | 6 | yes | yes |
no | NULL | 0 | NULL |
1 | 0.1 |
| 2007-07-19 10:12:59 | 2 | yes | yes |
yes | 3 | 0 | NULL |
1 | 1 |
+---------------------+------+-----------+---------------
+-------------+----------------+----------------+-----------
+------------+-------------------+
(the id=6 is the new slave). It was never ending up with
'was_running='yes''
which I knew was wrong, because I could ssh into the machine and
attach to the processes and see that it was replicating just fine.
I ended up figuring out that the VU that is accessing the slave needs
Process_priv.
ie
update user set Process_priv='Y' where User='foo';
Because it does a show full processlist, and it looks for the
processes by the 'System user'. If the VU exec'ing that command
doesn't have the priv, it won't see that, then the code will put an
odd entry in your slashd logs;
slashdLog("Process id $hr->{Id} on vu '$vu'"
. " has unknown system user state"
. " '$state'");
and you'll eventually get marked as stopped:
# Determine the two big numbers we care about: how far
# behind its slave sql thread is, and what's its worst
# repeated bogged-down query.
# First check to be sure it has a slave sql thread running.
my $slave_sql_id = $slave_sql_id{$vu};
if (!$slave_sql_id) {
# If not, its slave is stopped and we can't calculate
# its lag.
$reader_info->{$vu}{stopped} = 1;
} else {
# If so, pull the lag of that process in seconds.
my $slave_sql_lag = $process{$vu}{$slave_sql_id}{Time};
$reader_info->{$vu}{slave_lag_secs} = $slave_sql_lag;
}
(I'm pulling all the code from the balance_readers.pl task in themes/
slashcode/tasks)
I did look through the docs before I attempted this. I couldn't find
a section discussing the specifics of replication (it'd be helpful
for the weighting system to be spelled out somewhere in the docs!)
though it is talked about in slashguide.pod.
Thanks,
Shane
|