From: Davies L. <dav...@gm...> - 2013-05-28 05:11:26
Yes, it sounds like the same problem. I will confirm it later, thank you.

Davies

On Tue, May 28, 2013 at 12:43 PM, Anand Avati <ana...@gm...> wrote:

> Sounds a lot like:
> http://thread.gmane.org/gmane.comp.file-systems.fuse.devel/12796. Can you
> confirm?
>
> Thanks,
> Avati
>
>
> On Sun, May 26, 2013 at 11:59 PM, Davies Liu <dav...@gm...> wrote:
>
>> Hi,
>>
>> I'm experiencing a problem similar to [1] and [2].
>>
>> MooseFS [3] is a distributed file system built with a FUSE client. We have
>> two machines, A and B, with MFS mounted as /mfs, then:
>> 1. A # mkdir /mfs/tmp/testdb/a/
>> 2. B # touch /mfs/tmp/testdb/a/a
>> 3. A # rm -rf /mfs/tmp/testdb/a/ && mkdir /mfs/tmp/testdb/a/
>> 4. B # touch /mfs/tmp/testdb/a/a & touch /mfs/tmp/testdb/a/a & touch /mfs/tmp/testdb/a/a
>> touch: touch: touch: cannot touch ‘/mfs/tmp/testdb/a/a’cannot touch ‘/mfs/tmp/testdb/a/a’cannot touch ‘/mfs/tmp/testdb/a/a’: No such file or directory
>> : No such file or directory: No such file or directory
>>
>> The oplog of B follows:
>>
>> # the first time touch /mfs/tmp/testdb/a/a
>> 05.27 14:40:09.151020: uid:2008 gid:2008 pid:30733 cmd:getattr (1): OK (1.0,[drwxr-xr-x:0040755,32,0,0,1369636464,1369212021,1369212021,386635])
>> 05.27 14:40:09.151044: uid:2008 gid:2008 pid:30733 cmd:lookup (1,tmp): OK (1.0,494,1.0,[drwxrwxrwx:0040777,85,0,0,1369500316,1369431933,1369431933,489])
>> 05.27 14:40:09.151073: uid:2008 gid:2008 pid:30733 cmd:lookup (494,testdb): OK (1.0,7637946,1.0,[drwxr-xr-x:0040755,3,2008,2008,1369636463,1369636466,1369636466,0])
>> 05.27 14:40:09.151089: uid:2008 gid:2008 pid:30733 cmd:lookup (7637946,a): OK (1.0,2761633,1.0,[drwxr-xr-x:0040755,2,2008,2008,1369636722,1369636722,1369636722,0])
>> 05.27 14:40:09.151125: uid:2008 gid:2008 pid:30733 cmd:lookup (7637946,a): OK (1.0,2761633,1.0,[drwxr-xr-x:0040755,2,2008,2008,1369636722,1369636722,1369636722,0])
>> ...
>>
>> // the second time touch /mfs/tmp/testdb/a/a
>>
>> 05.27 14:40:25.289747: uid:2008 gid:2008 pid:30931 cmd:getattr (1): OK (1.0,[drwxr-xr-x:0040755,32,0,0,1369636464,1369212021,1369212021,386635])
>> 05.27 14:40:25.289771: uid:2008 gid:2008 pid:30932 cmd:lookup (1,tmp): OK (1.0,494,1.0,[drwxrwxrwx:0040777,85,0,0,1369500316,1369431933,1369431933,489])
>> 05.27 14:40:25.289787: uid:2008 gid:2008 pid:30931 cmd:lookup (1,tmp): OK (1.0,494,1.0,[drwxrwxrwx:0040777,85,0,0,1369500316,1369431933,1369431933,489])
>> 05.27 14:40:25.289805: uid:2008 gid:2008 pid:30931 cmd:lookup (494,testdb): OK (1.0,7637946,1.0,[drwxr-xr-x:0040755,3,2008,2008,1369636463,1369636466,1369636466,0])
>> 05.27 14:40:25.289805: uid:2008 gid:2008 pid:30931 cmd:lookup (494,testdb): OK (1.0,7637946,1.0,[drwxr-xr-x:0040755,3,2008,2008,1369636463,1369636466,1369636466,0])
>> 5.27 14:40:25.290134: uid:2008 gid:2008 pid:30931 cmd:lookup (7637946,a): OK (1.0,7638275,1.0,[drwxr-xr-x:0040755,2,2008,2008,1369636822,1369636822,1369636822,0])
>> 05.27 14:40:25.290140: uid:2008 gid:2008 pid:30933 cmd:lookup (7637946,a): OK (1.0,7638275,1.0,[drwxr-xr-x:0040755,2,2008,2008,1369636822,1369636822,1369636822,0])
>> 05.27 14:40:25.290140: uid:2008 gid:2008 pid:30933 cmd:lookup (7637946,a): OK (1.0,7638275,1.0,[drwxr-xr-x:0040755,2,2008,2008,1369636822,1369636822,1369636822,0])
>>
>> // the concurrent lookup requests return the correct inode 7638275
>>
>> 05.27 14:40:25.290387: uid:2008 gid:2008 pid:30933 cmd:getattr (2761633): ENOENT (No such file or directory)
>> 05.27 14:40:25.290391: uid:2008 gid:2008 pid:30931 cmd:getattr (2761633): ENOENT (No such file or directory)
>> 05.27 14:40:25.290401: uid:2008 gid:2008 pid:30932 cmd:getattr (2761633): ENOENT (No such file or directory)
>>
>> // but the following getattr requests used the stale inode 2761633
>>
>> // retry
>> 05.27 14:40:25.290415: uid:2008 gid:2008 pid:30933 cmd:lookup (7637946,a): OK (1.0,7638275,1.0,[drwxr-xr-x:0040755,2,2008,2008,1369636822,1369636822,1369636822,0])
>> 05.27 14:40:25.290423: uid:2008 gid:2008 pid:30931 cmd:lookup (7637946,a): OK (1.0,7638275,1.0,[drwxr-xr-x:0040755,2,2008,2008,1369636822,1369636822,1369636822,0])
>> 05.27 14:40:25.290452: uid:2008 gid:2008 pid:30932 cmd:lookup (7637946,a): OK (1.0,7638275,1.0,[drwxr-xr-x:0040755,2,2008,2008,1369636822,1369636822,1369636822,0])
>> 05.27 14:40:25.290678: uid:2008 gid:2008 pid:30933 cmd:getattr (2761633): ENOENT (No such file or directory)
>> 05.27 14:40:25.290691: uid:2008 gid:2008 pid:30932 cmd:getattr (2761633): ENOENT (No such file or directory)
>> 05.27 14:40:25.290691: uid:2008 gid:2008 pid:30932 cmd:getattr (2761633): ENOENT (No such file or directory)
>>
>> If I touch /mfs/tmp/testdb/a/a with only one process, it succeeds
>> without problem.
>> So I think this problem may be caused by a race condition bug in the fuse
>> kernel module, in fuse_lookup or fuse_iget.
>>
>> Need your help, please.
>>
>> My environment:
>> Gentoo Linux, kernel 3.0.36-gentoo, libfuse 2.9.1-r1,
>> moosefs 1.6.20
>>
>> [1] http://comments.gmane.org/gmane.comp.file-systems.fuse.devel/9523
>> [2] http://permalink.gmane.org/gmane.comp.file-systems.fuse.devel/10724
>> [3] http://www.moosefs.org/
>>
>>
>> Best Regards,
>>
>> - Davies
>>
>> _______________________________________________
>> fuse-devel mailing list
>> fus...@li...
>> https://lists.sourceforge.net/lists/listinfo/fuse-devel
>>
>
>

--
 - Davies
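
For anyone who wants to retry step 4 of the quoted reproduction (the three concurrent touches on machine B) in a repeatable way, here is a minimal sketch. The function name `concurrent_touch` and its `workers` parameter are made up for illustration and are not part of MooseFS or libfuse. It only covers step 4, so steps 1 to 3 still have to be run by hand on A and B beforehand.

```python
import subprocess
import sys


def concurrent_touch(path, workers=3):
    """Start `workers` touch processes on `path` at once.

    Returns one (returncode, stderr_text) tuple per process, in start order.
    Hypothetical helper for illustration only.
    """
    # Launch every process before waiting on any of them, so that their
    # path lookups can overlap in the kernel, like `touch & touch & touch`.
    procs = [
        subprocess.Popen(["touch", path], stderr=subprocess.PIPE, text=True)
        for _ in range(workers)
    ]
    results = []
    for proc in procs:
        _, err = proc.communicate()
        results.append((proc.returncode, err.strip()))
    return results


if __name__ == "__main__":
    # The default path is the one used in the report; pass another as argv[1].
    target = sys.argv[1] if len(sys.argv) > 1 else "/mfs/tmp/testdb/a/a"
    for returncode, err in concurrent_touch(target):
        print("ok" if returncode == 0 else f"failed: {err}")
```

On a filesystem without the problem, every process should report success. Collecting stderr per process also avoids the interleaved "cannot touch" output shown in the report.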