From: Andrew R. <ros...@gm...> - 2007-08-21 13:05:47
|
Hi all,

I've recently created a patch for mysqlfs. The idea of the patch is to check for the existence of an entry by key in the data_blocks table before attempting to write (query_write_one_block). The reason? To stop mysqlfs killing MySQL replication when attempting to insert a data_block with the same inode/seq as one that already exists.

Now I'm not that familiar with the mysqlfs code, and was hoping someone with more knowledge than myself could let me know if I've gone in the right direction.

http://dev.iris.ac/~ody/mysqlfs-x86_64/mysqlfs-0.4.0-rc1.patch

Many thanks,
Andrew
|
From: Michal L. <mi...@lo...> - 2007-08-21 13:25:41
|
Hi Andrew,

> I've recently created a patch for mysqlfs. The idea of the patch is
> to check for the existence of an entry by key in the data_blocks table
> before attempting to write (query_write_one_block). The reason? To
> stop mysqlfs killing MySQL replication when attempting to insert a
> data_block with the same inode/seq as one that already exists.

Interesting. What version of MySQL are you running on both master and slave? Replication should never break when you attempt to insert a record with a primary key that already exists. In fact *any* user action or query should never break MySQL replication. If it does in your case and if it's reproducible I'd consider it a MySQL server bug worth filing a bug report.

> Now I'm not that familiar with the mysqlfs code, and was hoping
> someone with more knowledge than myself could let me know if I've gone
> in the right direction.
>
> http://dev.iris.ac/~ody/mysqlfs-x86_64/mysqlfs-0.4.0-rc1.patch

Problem is that your query_inode_key_exists() and the subsequent insert are two independent transactions. I.e. even if your check returns false the inode can appear in another thread before the first thread decides to run the INSERT. You'd have to make it atomic to be safe, which in case of MyISAM tables means locking the whole table and therefore resulting in worse performance.

Anyway you shouldn't experience this problem even with the current code. I suspect MySQL. What does SHOW SLAVE STATUS\G say when it breaks?

Michal
|
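The race Michal describes between the existence check and the INSERT can be sketched roughly as follows. This is only an illustration under stated assumptions: an in-memory SQLite database stands in for MySQL, the data_blocks columns (inode, seq, data) are guessed from the "inode/seq" wording in the thread, and key_exists() and insert_block() are hypothetical stand-ins for the patch's query_inode_key_exists() and the INSERT in query_write_one_block. The two writers are interleaved by hand so that the result is deterministic.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL, and the schema is a guess.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE data_blocks ("
    " inode INTEGER NOT NULL,"
    " seq INTEGER NOT NULL,"
    " data BLOB,"
    " PRIMARY KEY (inode, seq))"
)


def key_exists(inode, seq):
    """Hypothetical pre-write lookup, run as its own statement."""
    row = conn.execute(
        "SELECT 1 FROM data_blocks WHERE inode = ? AND seq = ?", (inode, seq)
    ).fetchone()
    return row is not None


def insert_block(inode, seq, data):
    """Hypothetical write; the primary key rejects duplicates."""
    conn.execute(
        "INSERT INTO data_blocks (inode, seq, data) VALUES (?, ?, ?)",
        (inode, seq, data),
    )


# Hand-written interleaving: writers A and B both check before either inserts.
a_sees_key = key_exists(7, 0)
b_sees_key = key_exists(7, 0)

outcome = []
for name, sees_key in (("A", a_sees_key), ("B", b_sees_key)):
    if sees_key:
        outcome.append((name, "skipped"))
        continue
    try:
        insert_block(7, 0, name.encode())
        outcome.append((name, "inserted"))
    except sqlite3.IntegrityError:
        # The check said "no such key", yet the insert still collides.
        outcome.append((name, "duplicate key"))

print(outcome)  # [('A', 'inserted'), ('B', 'duplicate key')]
```

Both checks report that the key is absent, so the check alone does not prevent the second insert from colliding. Only the primary key constraint, which is enforced atomically by the database, catches it.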
From: Andrew R. <ros...@gm...> - 2007-08-21 14:05:02
|
On 21/08/07, Michal Ludvig <mi...@lo...> wrote:
> Hi Andrew,
>
> > I've recently created a patch for mysqlfs. The idea of the patch is
> > to check for the existence of an entry by key in the data_blocks table
> > before attempting to write (query_write_one_block). The reason? To
> > stop mysqlfs killing MySQL replication when attempting to insert a
> > data_block with the same inode/seq as one that already exists.
>
> Interesting. What version of MySQL are you running on both master and
> slave? Replication should never break when you attempt to insert a
> record with a primary key that already exists. In fact *any* user action
> or query should never break MySQL replication. If it does in your case
> and if it's reproducible I'd consider it a MySQL server bug worth filing
> a bug report.
>

I am currently running a chained master->master setup. It is a common MySQL replication problem with master->master setups, and crops up when the PKs are not auto incrementing.

> > Now I'm not that familiar with the mysqlfs code, and was hoping
> > someone with more knowledge than myself could let me know if I've gone
> > in the right direction.
> >
> > http://dev.iris.ac/~ody/mysqlfs-x86_64/mysqlfs-0.4.0-rc1.patch
>
> Problem is that your query_inode_key_exists() and the subsequent insert
> are two independent transactions. I.e. even if your check returns false
> the inode can appear in another thread before the first thread decides
> to run the INSERT. You'd have to make it atomic to be safe, which in
> case of MyISAM tables means locking the whole table and therefore
> resulting in worse performance.

I figured there would be a problem with threading. Have you thought about maybe providing some shared memory locking functionality? ;)

Many thanks for the help. For now I think I'll keep my fingers crossed and hope that what I have done has lowered the odds enough that I will rarely, if ever, see a problem, considering there are minimal writes going on to my mounted filesystems anyway.

Cheers,
Andrew
|
From: Michal L. <mi...@lo...> - 2007-08-21 14:38:18
|
Andrew Rose wrote:
> I am currently running a chained master->master setup. It is a common
> MySQL replication problem with master->master setups, and crops up
> when the PKs are not auto incrementing.

So it isn't a MySQL error but an improper MySQL configuration ;-)

In the case of master-master you have to ensure that each server generates different keys, rather than assume that apps will somehow take care of it. A common solution is to make one master generate the sequence 1,3,5,... and the second master 2,4,6,... If you run more than two masters you'll obviously increase the stepping to accommodate them all.

Check out this page for details:
http://dev.mysql.com/doc/refman/5.0/en/replication-auto-increment.html

> I figured there would be a problem with threading. Have you thought
> about maybe providing some shared memory locking functionality? ;)

Not in the immediate future, sorry. I don't do any active development on mysqlfs anymore.

> Many thanks for the help. For now I think I'll keep my fingers
> crossed and hope that what I have done has lowered the odds enough
> that I will rarely, if ever, see a problem, considering there are
> minimal writes going on to my mounted filesystems anyway.

I suggest you reconfigure the DB instead of crossing fingers ;-)

Michal
|
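The interleaved sequences Michal mentions (1,3,5,... on one master and 2,4,6,... on the other) follow from giving each server the same step and a different starting offset. In MySQL this is configured through the auto_increment_increment and auto_increment_offset server variables. The sketch below only simulates the arithmetic in Python; auto_increment_ids() is a hypothetical helper, and the scheme applies only to AUTO_INCREMENT columns, not to keys chosen by the application.

```python
from itertools import islice


def auto_increment_ids(offset, increment):
    """Yield offset, offset + increment, offset + 2 * increment, ..."""
    value = offset
    while True:
        yield value
        value += increment


masters = 2  # the step equals the number of masters

# Each master gets its own offset (1 and 2) and the shared step.
master1 = list(islice(auto_increment_ids(1, masters), 5))
master2 = list(islice(auto_increment_ids(2, masters), 5))

print(master1)  # [1, 3, 5, 7, 9]
print(master2)  # [2, 4, 6, 8, 10]

# The two sequences never share a value, so generated keys cannot collide.
overlap = set(master1) & set(master2)
print(overlap)  # set()
```

With three masters the step would become 3 and the offsets 1, 2 and 3, which is the "increase the stepping" case.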
From: Andrew R. <ros...@gm...> - 2007-08-21 14:50:29
|
On 21/08/07, Michal Ludvig <mi...@lo...> wrote:
> Andrew Rose wrote:
>
> > I am currently running a chained master->master setup. It is a common
> > MySQL replication problem with master->master setups, and crops up
> > when the PKs are not auto incrementing.
>
> So it isn't a MySQL error but an improper MySQL configuration ;-)

Actually it is a problem with the inode field in data_nodes being a primary key, and NOT auto_increment'ing. Not a lot you can do about that, but certainly not an "improper MySQL configuration", thank you very much.

Andrew
|
From: Andrew R. <ros...@gm...> - 2007-08-21 16:14:14
|
On 21/08/07, Michal Ludvig <mi...@lo...> wrote:
> Not in the immediate future, sorry. I don't do any active development on
> mysqlfs anymore.

So if you're not actively developing mysqlfs anymore, has anyone else picked it up, or is anyone going to, as far as you know?

> I suggest you reconfigure the DB instead of crossing fingers ;-)

Sorry about my reaction in the last post BTW, I know you were trying to be helpful. You're right though, I really shouldn't cross my fingers.

Andrew
|
From: Michal L. <mi...@lo...> - 2007-08-22 02:58:17
|
Andrew Rose wrote:
> On 21/08/07, Michal Ludvig <mi...@lo...> wrote:
>
>> Not in the immediate future, sorry. I don't do any active development on
>> mysqlfs anymore.
>
> So if you're not actively developing mysqlfs anymore, has anyone else
> picked it up, or is anyone going to, as far as you know?

As far as I know it just sits there on SourceForge and that's it. It was an interesting experience to take over the codebase and put it into shape, but as I don't use it myself I don't have much interest in spending time on it.

>> I suggest you reconfigure the DB instead of crossing fingers ;-)
>
> Sorry about my reaction in the last post BTW, I know you were trying
> to be helpful. You're right though, I really shouldn't cross my fingers.

I still wonder what scenario leads to the duplicate keys and subsequent replication failure. Do you write to the same mysqlfs on both hosts? Even if you create a file on "db1" and another file on "db2" they should be assigned different inode numbers. Perhaps if you write to a single file on both DBs the INSERT on line 558 of query.c may fail. Hmm... Try changing it to REPLACE, i.e. "REPLACE INTO data_blocks ...". But I'm afraid you may end up with data corruption if blindly replacing blocks like this. But your original solution that simply didn't do the write on detected conflict wasn't much more corruption-proof either. In both cases you'd lose a write that you expected to go through.

I personally wouldn't run it in a master-master setup; it's a bit complicated to synchronize them, not a simple "one-liner" solution.

Michal
|
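Michal's point that both approaches lose a write can be seen in a small sketch. It is an illustration only: in-memory SQLite stands in for MySQL (SQLite also accepts REPLACE INTO), the data_blocks columns are assumed, and the payloads "from db1" and "from db2" are made-up stand-ins for the two conflicting writes to the same inode/seq.

```python
import sqlite3


def fresh_table():
    """Create an empty data_blocks table (assumed schema) in memory."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE data_blocks ("
        " inode INTEGER NOT NULL, seq INTEGER NOT NULL, data BLOB,"
        " PRIMARY KEY (inode, seq))"
    )
    return conn


def stored_block(conn):
    """Return the data currently stored for inode 7, seq 0."""
    return conn.execute(
        "SELECT data FROM data_blocks WHERE inode = 7 AND seq = 0"
    ).fetchone()[0]


writes = (b"from db1", b"from db2")

# Strategy 1: REPLACE INTO. The later write silently overwrites the earlier.
replace_db = fresh_table()
for payload in writes:
    replace_db.execute(
        "REPLACE INTO data_blocks (inode, seq, data) VALUES (7, 0, ?)",
        (payload,),
    )
after_replace = stored_block(replace_db)

# Strategy 2: skip the write when the key exists. The later write is dropped.
skip_db = fresh_table()
for payload in writes:
    exists = skip_db.execute(
        "SELECT 1 FROM data_blocks WHERE inode = 7 AND seq = 0"
    ).fetchone()
    if exists is None:
        skip_db.execute(
            "INSERT INTO data_blocks (inode, seq, data) VALUES (7, 0, ?)",
            (payload,),
        )
after_skip = stored_block(skip_db)

print(after_replace)  # b'from db2'
print(after_skip)     # b'from db1'
```

Either way only one of the two payloads survives, and neither writer is told that its data was discarded.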
From: Stef B. <st...@bo...> - 2007-08-22 05:43:33
|
On Wednesday 22 August 2007 04:58:02 Michal Ludvig wrote:
> Andrew Rose wrote:
> > On 21/08/07, Michal Ludvig <mi...@lo...> wrote:
>
> I still wonder what scenario leads to the duplicate keys and subsequent
> replication failure. Do you write to the same mysqlfs on both hosts?
> Even if you create a file on "db1" and another file on "db2" they should
> be assigned different inode numbers. Perhaps if you write to a single
> file on both DBs the INSERT on line 558 of query.c may fail. Hmm...
> Try changing it to REPLACE, i.e. "REPLACE INTO data_blocks ...". But I'm
> afraid you may end up with data corruption if blindly replacing blocks
> like this. But your original solution that simply didn't do the write on
> detected conflict wasn't much more corruption-proof either. In both
> cases you'd lose a write that you expected to go through.
>
> I personally wouldn't run it in a master-master setup; it's a bit
> complicated to synchronize them, not a simple "one-liner" solution.
>
> Michal

I would like to reply to you. I'm not a programmer, but as far as I can see multithreading is also important. I wanted to use mysqlfs as a backup, but failed because of the lack of multithreading.

Stef
|
From: Michal L. <mi...@lo...> - 2007-08-22 06:18:23
|
Stef Bon wrote:
> I would like to reply to you. I'm not a programmer, but as far as I can see
> multithreading is also important.

It is, I agree. I'm happy to check and eventually accept any relevant patches fixing any problems you observe :-)

Michal
|
From: Andrew R. <ros...@gm...> - 2007-08-22 08:50:36
|
On 22/08/07, Stef Bon <st...@bo...> wrote:
> On Wednesday 22 August 2007 04:58:02 Michal Ludvig wrote:
> > Andrew Rose wrote:
> > > On 21/08/07, Michal Ludvig <mi...@lo...> wrote:
> >
> > I still wonder what scenario leads to the duplicate keys and subsequent
> > replication failure. Do you write to the same mysqlfs on both hosts?
> > Even if you create a file on "db1" and another file on "db2" they should
> > be assigned different inode numbers. Perhaps if you write to a single
> > file on both DBs the INSERT on line 558 of query.c may fail. Hmm...
> > Try changing it to REPLACE, i.e. "REPLACE INTO data_blocks ...". But I'm
> > afraid you may end up with data corruption if blindly replacing blocks
> > like this. But your original solution that simply didn't do the write on
> > detected conflict wasn't much more corruption-proof either. In both
> > cases you'd lose a write that you expected to go through.
> >
> > I personally wouldn't run it in a master-master setup; it's a bit
> > complicated to synchronize them, not a simple "one-liner" solution.
> >
> > Michal
>
> I would like to reply to you. I'm not a programmer, but as far as I can see
> multithreading is also important. I wanted to use mysqlfs as a backup, but failed because of the
> lack of multithreading.

Correct me if I'm wrong, but I was under the impression FUSE handled the threading. All that mysqlfs needs to do is maintain the MySQL connections (the pool) and keep them thread safe. The locking (inode and data_blocks), as far as I can picture it, should be done on the database level.

Does anyone have any information on how FUSE handles inode creation? The problem with multi-master inode clashing, as far as I can see, is that the clash occurs when two (or more) servers create a file and pick the same inode (for a file that they regard as local).

Andrew
|
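Andrew's picture of a thread-safe connection pool can be sketched like this. It is a minimal sketch, not how mysqlfs actually does it: it assumes request handlers may run on several threads at once, ConnectionPool is a hypothetical class, and FakeConnection is a placeholder that only records how many threads use it at the same time, where a real program would hold MySQL connections.

```python
import queue
import threading
from contextlib import contextmanager


class ConnectionPool:
    """Hypothetical pool: each connection is lent to one thread at a time."""

    def __init__(self, make_connection, size):
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(make_connection())

    @contextmanager
    def connection(self):
        conn = self._idle.get()  # blocks until a connection is free
        try:
            yield conn
        finally:
            self._idle.put(conn)  # always hand the connection back


class FakeConnection:
    """Placeholder for a real MySQL connection; counts concurrent users."""

    def __init__(self):
        self.users = 0
        self.max_users = 0
        self._guard = threading.Lock()

    def query(self):
        with self._guard:
            self.users += 1
            self.max_users = max(self.max_users, self.users)
        with self._guard:
            self.users -= 1


connections = []


def make_connection():
    conn = FakeConnection()
    connections.append(conn)
    return conn


pool = ConnectionPool(make_connection, size=3)


def worker():
    for _ in range(50):
        with pool.connection() as conn:
            conn.query()


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# No connection was ever used by two threads at the same time.
busiest = max(c.max_users for c in connections)
print(busiest)  # 1
```

A pool like this keeps each connection private to one thread for the duration of a request. It does nothing about two writers racing on the same row, which is why the locking would still have to happen in the database.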
From: Stef B. <st...@bo...> - 2007-08-24 06:43:40
|
On Wednesday 22 August 2007 10:50:32 Andrew Rose wrote:
> > I would like to reply to you. I'm not a programmer, but as far as I can see
> > multithreading is also important. I wanted to use mysqlfs as a backup, but failed because of the
> > lack of multithreading.
>
> Correct me if I'm wrong, but I was under the impression FUSE handled
> the threading. All that mysqlfs needs to do is maintain the MySQL
> connections (the pool) and keep them thread safe. The locking (inode
> and data_blocks), as far as I can picture it, should be done on the
> database level.

Now, I do not have a lot of experience with multithreading (I'm a systems engineer, not a C/C++ programmer), but when I look at the code of, for example, sshfs or fusesmb, threading is handled in the modules as well (look for pthread_mutex_lock).

>
> Does anyone have any information on how FUSE handles inode creation?
> The problem with multi-master inode clashing, as far as I can see, is
> that the clash occurs when two (or more) servers create a file and
> pick the same inode (for a file that they regard as local).

I'm sorry. I cannot help you here. Did you look into the source?

Stef Bon
|