From: Josef B. <jo...@re...> - 2009-11-12 20:44:14
|
Hello, There is a problem where if you have certain audit rules in place you can hang on mount of a fuse filesystem. If you follow the instructions here https://bugzilla.redhat.com/show_bug.cgi?id=493565#c44 it is easy to reproduce. The problem is after the mount request gets sent, the mounting process gets stuck going off to read xattrs to satisfy audit's curiosity, and then we get stuck because that tries to get a request, but can't because the connection is blocked. This patch fixes the problem, but I'm not entirely sold on it, it's rather quick and dirty. Basically if we haven't finished the initialization of the connection just return -EAGAIN. This fixes the problem, audit seems to be ok with getting that as an error. Thanks, Signed-off-by: Josef Bacik <jo...@re...> diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c index 6484eb7..04d92c1 100644 --- a/fs/fuse/dev.c +++ b/fs/fuse/dev.c @@ -100,6 +100,10 @@ struct fuse_req *fuse_get_req(struct fuse_conn *fc) int intr; int err; + err = -EAGAIN; + if (!fc->conn_init) + return ERR_PTR(err); + atomic_inc(&fc->num_waiting); block_sigs(&oldset); intr = wait_event_interruptible(fc->blocked_waitq, !fc->blocked); |
From: Miklos S. <mi...@sz...> - 2009-11-13 06:59:15
|
On Thu, 12 Nov 2009, Josef Bacik wrote: > Hello, > > There is a problem where if you have certain audit rules in place > you can hang on mount of a fuse filesystem. If you follow the > instructions here > > https://bugzilla.redhat.com/show_bug.cgi?id=493565#c44 > > it is easy to reproduce. The problem is after the mount request > gets sent, the mounting process gets stuck going off to read xattrs > to satisfy audit's curiosity, and then we get stuck because that > tries to get a request, but can't because the connection is blocked. > This patch fixes the problem, but I'm not entirely sold on it, it's > rather quick and dirty. Basically if we haven't finished the > initialization of the connection just return -EAGAIN. This fixes > the problem, audit seems to be ok with getting that as an error. Why not rather fix audit not to do this? We had a similar problem in SMACK in the past, and my reasoning at the time should apply here as well: calling into filesystem code before it's fully initialized (i.e. mount is finished) is not guaranteed to work for *any* filesystem. Fuse will hang, others may just see silent memory corruption. So NACK, unless there's some fundamental reason why audit can't be properly fixed. Thanks, Miklos |
From: Eric P. <ep...@re...> - 2009-11-13 13:15:03
|
On Fri, 2009-11-13 at 07:58 +0100, Miklos Szeredi wrote: > On Thu, 12 Nov 2009, Josef Bacik wrote: > > Hello, > > > > There is a problem where if you have certain audit rules in place > > you can hang on mount of a fuse filesystem. If you follow the > > instructions here > > > > https://bugzilla.redhat.com/show_bug.cgi?id=493565#c44 > > > > it is easy to reproduce. The problem is after the mount request > > gets sent, the mounting process gets stuck going off to read xattrs > > to satisfy audit's curiosity, and then we get stuck because that > > tries to get a request, but can't because the connection is blocked. > > This patch fixes the problem, but I'm not entirely sold on it, it's > > rather quick and dirty. Basically if we haven't finished the > > initialization of the connection just return -EAGAIN. This fixes > > the problem, audit seems to be ok with getting that as an error. > > Why not rather fix audit not to do this? > > We had a similar problem in SMACK in the past, and my reasoning at the > time should apply here as well: calling into filesystem code before > it's fully initialized (i.e. mount is finished) is not guaranteed to > work for *any* filesystem. Fuse will hang, others may just see silent > memory corruption. > > So NACK, unless there's some fundamental reason why audit can't be > properly fixed. I don't buy the argument that audit did anything wrong. mount S ffff88001f86bd28 0 4126 4120 0x00000080 mount S ffff88001f86bd28 0 4126 4120 0x00000080 ffff88001f86bc38 0000000000000086 0000000000000011 ffff88001f86bc08 0000000000000700 0000000000000000 0000000000000013 ffffffff815884cd ffff8800377532c8 000000000000f8e0 ffff8800377532c8 0000000000015600 Call Trace: [<ffffffffa01bec71>] fuse_get_req+0xb8/0x15b [fuse] [<ffffffffa01bf359>] fuse_getxattr+0x58/0x14f [fuse] [<ffffffff811b9ea2>] get_vfs_caps_from_disk+0x52/0xcd [<ffffffff81093809>] audit_copy_inode+0x83/0xb1 [<ffffffff81093d4f>] __audit_inode+0x1ee/0x1fd [<ffffffff81104789>] audit_inode+0x28/0x2a [<ffffffff811066b4>] do_path_lookup+0x63/0x8b [<ffffffff81107c9c>] user_path_at+0x56/0x93 [<ffffffff810ffaf0>] sys_readlinkat+0x2d/0x91 [<ffffffff810ffb6f>] sys_readlink+0x1b/0x1d [<ffffffff81011cf2>] system_call_fastpath+0x16/0x1b Userspace called readlink() and the audit system (after the vfs worked out the path) needs to collect any fcaps. I don't agree, but would be willing to accept your argument that things like SELinux and SMACK can't make filesystem calls during the mount(2) operation but clearly this isn't done during a mount call, it's done by calling readlink. The mount syscall is finished and we still can't use the FS? How is Audit supposed to be 'fixed' ? |
From: Miklos S. <mi...@sz...> - 2009-11-13 13:55:41
|
On Fri, 13 Nov 2009, Eric Paris wrote: > mount S ffff88001f86bd28 0 4126 4120 0x00000080 > mount S ffff88001f86bd28 0 4126 4120 0x00000080 > ffff88001f86bc38 0000000000000086 0000000000000011 ffff88001f86bc08 > 0000000000000700 0000000000000000 0000000000000013 ffffffff815884cd > ffff8800377532c8 000000000000f8e0 ffff8800377532c8 0000000000015600 > Call Trace: > [<ffffffffa01bec71>] fuse_get_req+0xb8/0x15b [fuse] > [<ffffffffa01bf359>] fuse_getxattr+0x58/0x14f [fuse] > [<ffffffff811b9ea2>] get_vfs_caps_from_disk+0x52/0xcd > [<ffffffff81093809>] audit_copy_inode+0x83/0xb1 > [<ffffffff81093d4f>] __audit_inode+0x1ee/0x1fd > [<ffffffff81104789>] audit_inode+0x28/0x2a > [<ffffffff811066b4>] do_path_lookup+0x63/0x8b > [<ffffffff81107c9c>] user_path_at+0x56/0x93 > [<ffffffff810ffaf0>] sys_readlinkat+0x2d/0x91 > [<ffffffff810ffb6f>] sys_readlink+0x1b/0x1d > [<ffffffff81011cf2>] system_call_fastpath+0x16/0x1b > > Userspace called readlink() and the audit system (after the vfs worked > out the path) needs to collect any fcaps. OK, this is a different problem than what I thought. Agreed, audit is not doing anything wrong here. The cause seems to be that fuse calls "mount -i -f ..." to add the filesystem entry to /etc/mtab. mount(8) then goes off and tries to canonicalize the path, calling readlink() on the mountpoint. Which is unnecessary, really. The path has already been canonicalized, but unfortunately there's no way currently to tell this to mount. In this light, the patch by Josef might be OK as a workaround, but I'll look at better ways to fix this. Josef, can you check whether the mtab is still correctly updated with your patch? Thanks, Miklos |
From: Josef B. <jo...@re...> - 2009-11-13 14:26:43
|
On Fri, Nov 13, 2009 at 02:55:24PM +0100, Miklos Szeredi wrote: > On Fri, 13 Nov 2009, Eric Paris wrote: > > mount S ffff88001f86bd28 0 4126 4120 0x00000080 > > mount S ffff88001f86bd28 0 4126 4120 0x00000080 > > ffff88001f86bc38 0000000000000086 0000000000000011 ffff88001f86bc08 > > 0000000000000700 0000000000000000 0000000000000013 ffffffff815884cd > > ffff8800377532c8 000000000000f8e0 ffff8800377532c8 0000000000015600 > > Call Trace: > > [<ffffffffa01bec71>] fuse_get_req+0xb8/0x15b [fuse] > > [<ffffffffa01bf359>] fuse_getxattr+0x58/0x14f [fuse] > > [<ffffffff811b9ea2>] get_vfs_caps_from_disk+0x52/0xcd > > [<ffffffff81093809>] audit_copy_inode+0x83/0xb1 > > [<ffffffff81093d4f>] __audit_inode+0x1ee/0x1fd > > [<ffffffff81104789>] audit_inode+0x28/0x2a > > [<ffffffff811066b4>] do_path_lookup+0x63/0x8b > > [<ffffffff81107c9c>] user_path_at+0x56/0x93 > > [<ffffffff810ffaf0>] sys_readlinkat+0x2d/0x91 > > [<ffffffff810ffb6f>] sys_readlink+0x1b/0x1d > > [<ffffffff81011cf2>] system_call_fastpath+0x16/0x1b > > > > Userspace called readlink() and the audit system (after the vfs worked > > out the path) needs to collect any fcaps. > > OK, this is a different problem than what I thought. Agreed, audit is > not doing anything wrong here. > > The cause seems to be that fuse calls "mount -i -f ..." to add the > filesystem entry to /etc/mtab. mount(8) then goes off and tries to > canonicalize the path, calling readlink() on the mountpoint. Which is > unnecessary, really. The path has already been canonicalized, but > unfortunately there's no way currently to tell this to mount. > > In this light, the patch by Josef might be OK as a workaround, but > I'll look at better ways to fix this. > > Josef, can you check whether the mtab is still correctly updated with > your patch? > Yes mtab is correctly updated with my patch. Like I said I'm open to other options, this just happened to fix it and I figured I'd send it along to help illustrate the problem hoping you had a better idea :). Thanks, Josef |
From: Josef B. <jo...@re...> - 2009-11-20 14:56:25
|
On Fri, Nov 13, 2009 at 02:55:24PM +0100, Miklos Szeredi wrote: > On Fri, 13 Nov 2009, Eric Paris wrote: > > mount S ffff88001f86bd28 0 4126 4120 0x00000080 > > mount S ffff88001f86bd28 0 4126 4120 0x00000080 > > ffff88001f86bc38 0000000000000086 0000000000000011 ffff88001f86bc08 > > 0000000000000700 0000000000000000 0000000000000013 ffffffff815884cd > > ffff8800377532c8 000000000000f8e0 ffff8800377532c8 0000000000015600 > > Call Trace: > > [<ffffffffa01bec71>] fuse_get_req+0xb8/0x15b [fuse] > > [<ffffffffa01bf359>] fuse_getxattr+0x58/0x14f [fuse] > > [<ffffffff811b9ea2>] get_vfs_caps_from_disk+0x52/0xcd > > [<ffffffff81093809>] audit_copy_inode+0x83/0xb1 > > [<ffffffff81093d4f>] __audit_inode+0x1ee/0x1fd > > [<ffffffff81104789>] audit_inode+0x28/0x2a > > [<ffffffff811066b4>] do_path_lookup+0x63/0x8b > > [<ffffffff81107c9c>] user_path_at+0x56/0x93 > > [<ffffffff810ffaf0>] sys_readlinkat+0x2d/0x91 > > [<ffffffff810ffb6f>] sys_readlink+0x1b/0x1d > > [<ffffffff81011cf2>] system_call_fastpath+0x16/0x1b > > > > Userspace called readlink() and the audit system (after the vfs worked > > out the path) needs to collect any fcaps. > > OK, this is a different problem than what I thought. Agreed, audit is > not doing anything wrong here. > > The cause seems to be that fuse calls "mount -i -f ..." to add the > filesystem entry to /etc/mtab. mount(8) then goes off and tries to > canonicalize the path, calling readlink() on the mountpoint. Which is > unnecessary, really. The path has already been canonicalized, but > unfortunately there's no way currently to tell this to mount. > > In this light, the patch by Josef might be OK as a workaround, but > I'll look at better ways to fix this. > > Josef, can you check whether the mtab is still correctly updated with > your patch? > Hey Miklos, have you come up with a better solution yet? Alot of users are hitting this in Fedora and I'd like to get a fix into place for them soon. Thanks, Josef |
From: Goswin v. B. <gos...@we...> - 2009-11-13 16:51:23
|
Eric Paris <ep...@re...> writes: > On Fri, 2009-11-13 at 07:58 +0100, Miklos Szeredi wrote: >> On Thu, 12 Nov 2009, Josef Bacik wrote: >> > Hello, >> > >> > There is a problem where if you have certain audit rules in place >> > you can hang on mount of a fuse filesystem. If you follow the >> > instructions here >> > >> > https://bugzilla.redhat.com/show_bug.cgi?id=493565#c44 >> > >> > it is easy to reproduce. The problem is after the mount request >> > gets sent, the mounting process gets stuck going off to read xattrs >> > to satisfy audit's curiosity, and then we get stuck because that >> > tries to get a request, but can't because the connection is blocked. >> > This patch fixes the problem, but I'm not entirely sold on it, it's >> > rather quick and dirty. Basically if we haven't finished the >> > initialization of the connection just return -EAGAIN. This fixes >> > the problem, audit seems to be ok with getting that as an error. >> >> Why not rather fix audit not to do this? >> >> We had a similar problem in SMACK in the past, and my reasoning at the >> time should apply here as well: calling into filesystem code before >> it's fully initialized (i.e. mount is finished) is not guaranteed to >> work for *any* filesystem. Fuse will hang, others may just see silent >> memory corruption. >> >> So NACK, unless there's some fundamental reason why audit can't be >> properly fixed. > > I don't buy the argument that audit did anything wrong. > > mount S ffff88001f86bd28 0 4126 4120 0x00000080 > mount S ffff88001f86bd28 0 4126 4120 0x00000080 > ffff88001f86bc38 0000000000000086 0000000000000011 ffff88001f86bc08 > 0000000000000700 0000000000000000 0000000000000013 ffffffff815884cd > ffff8800377532c8 000000000000f8e0 ffff8800377532c8 0000000000015600 > Call Trace: > [<ffffffffa01bec71>] fuse_get_req+0xb8/0x15b [fuse] > [<ffffffffa01bf359>] fuse_getxattr+0x58/0x14f [fuse] > [<ffffffff811b9ea2>] get_vfs_caps_from_disk+0x52/0xcd > [<ffffffff81093809>] audit_copy_inode+0x83/0xb1 > [<ffffffff81093d4f>] __audit_inode+0x1ee/0x1fd > [<ffffffff81104789>] audit_inode+0x28/0x2a > [<ffffffff811066b4>] do_path_lookup+0x63/0x8b > [<ffffffff81107c9c>] user_path_at+0x56/0x93 > [<ffffffff810ffaf0>] sys_readlinkat+0x2d/0x91 > [<ffffffff810ffb6f>] sys_readlink+0x1b/0x1d > [<ffffffff81011cf2>] system_call_fastpath+0x16/0x1b > > Userspace called readlink() and the audit system (after the vfs worked > out the path) needs to collect any fcaps. I don't agree, but would be > willing to accept your argument that things like SELinux and SMACK can't > make filesystem calls during the mount(2) operation but clearly this > isn't done during a mount call, it's done by calling readlink. The > mount syscall is finished and we still can't use the FS? How is Audit > supposed to be 'fixed' ? As reported in the URL from above there is a "mount" still running. So clearly mount has not finished. The big question is not why the readlink() blocks. That is totaly expected. The big question is why is the mount hanging. Find that out and you find the bug. Once the mount finishes the readlink will continue. MfG Goswin |
From: Josef B. <jo...@re...> - 2009-11-13 16:56:50
|
On Fri, Nov 13, 2009 at 05:51:06PM +0100, Goswin von Brederlow wrote: > Eric Paris <ep...@re...> writes: > > > On Fri, 2009-11-13 at 07:58 +0100, Miklos Szeredi wrote: > >> On Thu, 12 Nov 2009, Josef Bacik wrote: > >> > Hello, > >> > > >> > There is a problem where if you have certain audit rules in place > >> > you can hang on mount of a fuse filesystem. If you follow the > >> > instructions here > >> > > >> > https://bugzilla.redhat.com/show_bug.cgi?id=493565#c44 > >> > > >> > it is easy to reproduce. The problem is after the mount request > >> > gets sent, the mounting process gets stuck going off to read xattrs > >> > to satisfy audit's curiosity, and then we get stuck because that > >> > tries to get a request, but can't because the connection is blocked. > >> > This patch fixes the problem, but I'm not entirely sold on it, it's > >> > rather quick and dirty. Basically if we haven't finished the > >> > initialization of the connection just return -EAGAIN. This fixes > >> > the problem, audit seems to be ok with getting that as an error. > >> > >> Why not rather fix audit not to do this? > >> > >> We had a similar problem in SMACK in the past, and my reasoning at the > >> time should apply here as well: calling into filesystem code before > >> it's fully initialized (i.e. mount is finished) is not guaranteed to > >> work for *any* filesystem. Fuse will hang, others may just see silent > >> memory corruption. > >> > >> So NACK, unless there's some fundamental reason why audit can't be > >> properly fixed. > > > > I don't buy the argument that audit did anything wrong. > > > > mount S ffff88001f86bd28 0 4126 4120 0x00000080 > > mount S ffff88001f86bd28 0 4126 4120 0x00000080 > > ffff88001f86bc38 0000000000000086 0000000000000011 ffff88001f86bc08 > > 0000000000000700 0000000000000000 0000000000000013 ffffffff815884cd > > ffff8800377532c8 000000000000f8e0 ffff8800377532c8 0000000000015600 > > Call Trace: > > [<ffffffffa01bec71>] fuse_get_req+0xb8/0x15b [fuse] > > [<ffffffffa01bf359>] fuse_getxattr+0x58/0x14f [fuse] > > [<ffffffff811b9ea2>] get_vfs_caps_from_disk+0x52/0xcd > > [<ffffffff81093809>] audit_copy_inode+0x83/0xb1 > > [<ffffffff81093d4f>] __audit_inode+0x1ee/0x1fd > > [<ffffffff81104789>] audit_inode+0x28/0x2a > > [<ffffffff811066b4>] do_path_lookup+0x63/0x8b > > [<ffffffff81107c9c>] user_path_at+0x56/0x93 > > [<ffffffff810ffaf0>] sys_readlinkat+0x2d/0x91 > > [<ffffffff810ffb6f>] sys_readlink+0x1b/0x1d > > [<ffffffff81011cf2>] system_call_fastpath+0x16/0x1b > > > > Userspace called readlink() and the audit system (after the vfs worked > > out the path) needs to collect any fcaps. I don't agree, but would be > > willing to accept your argument that things like SELinux and SMACK can't > > make filesystem calls during the mount(2) operation but clearly this > > isn't done during a mount call, it's done by calling readlink. The > > mount syscall is finished and we still can't use the FS? How is Audit > > supposed to be 'fixed' ? > > As reported in the URL from above there is a "mount" still running. So > clearly mount has not finished. > Yeah because it needs to see the init request that fuse has sent it, but it can't because it's stuck waiting for readlink. We can't finish the mount because we're waiting on the fs to respond to the readlink, but the fs cannot respond to the mount because we're waiting on the readlink. Josef |
From: Miklos S. <mi...@sz...> - 2009-11-20 16:16:56
|
On Fri, 20 Nov 2009, Josef Bacik wrote: > On Fri, Nov 13, 2009 at 02:55:24PM +0100, Miklos Szeredi wrote: > > On Fri, 13 Nov 2009, Eric Paris wrote: > > > mount S ffff88001f86bd28 0 4126 4120 0x00000080 > > > mount S ffff88001f86bd28 0 4126 4120 0x00000080 > > > ffff88001f86bc38 0000000000000086 0000000000000011 ffff88001f86bc08 > > > 0000000000000700 0000000000000000 0000000000000013 ffffffff815884cd > > > ffff8800377532c8 000000000000f8e0 ffff8800377532c8 0000000000015600 > > > Call Trace: > > > [<ffffffffa01bec71>] fuse_get_req+0xb8/0x15b [fuse] > > > [<ffffffffa01bf359>] fuse_getxattr+0x58/0x14f [fuse] > > > [<ffffffff811b9ea2>] get_vfs_caps_from_disk+0x52/0xcd > > > [<ffffffff81093809>] audit_copy_inode+0x83/0xb1 > > > [<ffffffff81093d4f>] __audit_inode+0x1ee/0x1fd > > > [<ffffffff81104789>] audit_inode+0x28/0x2a > > > [<ffffffff811066b4>] do_path_lookup+0x63/0x8b > > > [<ffffffff81107c9c>] user_path_at+0x56/0x93 > > > [<ffffffff810ffaf0>] sys_readlinkat+0x2d/0x91 > > > [<ffffffff810ffb6f>] sys_readlink+0x1b/0x1d > > > [<ffffffff81011cf2>] system_call_fastpath+0x16/0x1b > > > > > > Userspace called readlink() and the audit system (after the vfs worked > > > out the path) needs to collect any fcaps. > > > > OK, this is a different problem than what I thought. Agreed, audit is > > not doing anything wrong here. > > > > The cause seems to be that fuse calls "mount -i -f ..." to add the > > filesystem entry to /etc/mtab. mount(8) then goes off and tries to > > canonicalize the path, calling readlink() on the mountpoint. Which is > > unnecessary, really. The path has already been canonicalized, but > > unfortunately there's no way currently to tell this to mount. > > > > In this light, the patch by Josef might be OK as a workaround, but > > I'll look at better ways to fix this. > > > > Josef, can you check whether the mtab is still correctly updated with > > your patch? > > > > Hey Miklos, have you come up with a better solution yet? Alot of users are > hitting this in Fedora and I'd like to get a fix into place for them soon. Well, two solutions come to mind: 1) add a "--no-canonicalize" option to mount(8) and umount(8) 2) copy the /etc/mtab manipulating code from util-linux to fuse and omit the path canonicalization I think 1) is simpler and easier to maintain, but also requires updates to util-linux-ng in addition to fuse userspace. Karel, would you accept such an option? Thanks, Miklos |
From: Karel Z. <kz...@re...> - 2009-11-20 19:28:55
|
On Fri, Nov 20, 2009 at 05:16:40PM +0100, Miklos Szeredi wrote: > On Fri, 20 Nov 2009, Josef Bacik wrote: > > On Fri, Nov 13, 2009 at 02:55:24PM +0100, Miklos Szeredi wrote: > > Hey Miklos, have you come up with a better solution yet? Alot of users are > > hitting this in Fedora and I'd like to get a fix into place for them soon. > > Well, two solutions come to mind: > > 1) add a "--no-canonicalize" option to mount(8) and umount(8) > > 2) copy the /etc/mtab manipulating code from util-linux to fuse and > omit the path canonicalization > > I think 1) is simpler and easier to maintain, but also requires > updates to util-linux-ng in addition to fuse userspace. > > Karel, would you accept such an option? Yes, I think the option seems usable in some others cases too. I'll add this option to 2.17-rc2. Karel -- Karel Zak <kz...@re...> |
From: Miklos S. <mi...@sz...> - 2009-12-16 14:10:42
|
On Fri, 20 Nov 2009, Karel Zak wrote: > Yes, I think the option seems usable in some others cases too. I'll > add this option to 2.17-rc2. Karel, thanks for adding this. One more request: could you please add the --no-canonicalize option to umount(8) as well? Also I'm wondering whether it is a good idea to allow this for user mounts/umounts, as is currently the case. Or should it be restricted to root-only? I guess allowing it shouldn't hurt, since the path that will ultimately be checked against /etc/fstab is the one after canonicalization. I'll leave the decision up to you, for my purpose it's OK if it's root-only. Thanks, Miklos |
From: Karel Z. <kz...@re...> - 2009-12-16 14:04:19
|
On Wed, Dec 16, 2009 at 02:37:39PM +0100, Miklos Szeredi wrote: > On Fri, 20 Nov 2009, Karel Zak wrote: > > Yes, I think the option seems usable in some others cases too. I'll > > add this option to 2.17-rc2. > > Karel, thanks for adding this. > > One more request: could you please add the --no-canonicalize option to > umount(8) as well? OK. > Also I'm wondering whether it is a good idea to allow this for user > mounts/umounts, as is currently the case. Or should it be restricted > to root-only? Good point. Yes, it does not make sense for normal users. Karel -- Karel Zak <kz...@re...> |