From: Patrick O. <pat...@in...> - 2016-02-18 15:08:37
On Thu, 2016-02-18 at 07:32 -0500, Mimi Zohar wrote:
> On Mon, 2016-02-15 at 11:27 +0100, Patrick Ohly wrote:
> > On Tue, 2015-09-22 at 20:10 +0200, Patrick Ohly wrote:
> > > On Tue, 2015-09-22 at 09:04 -0400, Mimi Zohar wrote:
> > > > On Tue, 2015-09-22 at 14:58 +0200, Patrick Ohly wrote:
> > > > > Would it be possible to intercept the in-kernel implementation of
> > > > > fdatasync() and trigger a hash update *and* a flushing of the xattr?
> > > >
> > > > That sounds like a good compromise, but would it be enough to resolve
> > > > the problem?
> > >
> > > I suspect that there's still a time window where a file content change
> > > has hit the disk while the corresponding xattr change has not, for
> > > example between writes and the fdatasync(). It would be much better, but
> > > probably not good enough.
> > >
> > > Primarily I wanted to raise the problem here and get opinions. My
> > > conclusion is that writing databases probably should be done where it is
> > > not in the IMA policy, even if the risk was reduced by also updating the
> > > hash on fdatasync().
> >
> > Let me come back to this.
> >
> > I noticed that even for simple files, like /etc/machine-id, it is very
> > easy to corrupt the system. /etc/machine-id contains a 32 byte unique
> > ID, written by systemd in machine_id_commit() [1]. That code does a
> > normal close(), i.e. no explicit fdatasync() and no fsync().
> >
> > [1] https://github.com/systemd/systemd/blob/master/src/core/machine-id-setup.c
> >
> > When using on-device hashing and ext4, the new file and its data are
> > stored by ext4 right away (at least in the journal), but the
> > corresponding security.ima remains cached in memory for surprisingly
> > long periods of time (minutes or longer - haven't tried to determine
> > that more precisely).
> >
> > Power off during that time and the system becomes unusable due to the
> > "permission denied" on a core system file.
> >
> > Any recommendations for addressing the problem, not just
> > for /etc/machine-id, but also in general?
>
> The first question is whether the xattr not being saved/restored is
> limited to those stored in the extra block (i_file_acl) or in the inode
> as well. Defining EXT4_XATTR_DEBUG in fs/ext4/xattr.c will show where
> the xattrs are being stored. I assume they are being stored in the
> extra block.

While testing this, I noticed that my earlier comment about the xattr
not getting flushed is wrong: it is the other way around. File *content*
isn't getting flushed to disk, whereas the modified xattr is.

That also explains why adding fsync() helps: it forces the content to
disk, so content and xattr match after a power loss.

But the underlying problem remains the same: with IMA, it is much more
important that meta data and file content remain in sync than it is
without, and that's currently not working well.

> > For machine-id, I can add an explicit "sync" shell command invocation
> > after writing the file, but that's not a general solution.
> >
> > Implementing the enhanced fdatasync() mentioned before and relying on
> > programs to call fdatasync() would help somewhat, but not all programs
> > call it. Would calling fsync() in systemd have helped?
> >
> > ext4 mount options also don't look promising. commit=nrsec flushes data
> > after 5 seconds by default, but does not seem to include xattrs.
> >
> > Journaling is already using data=ordered, so meta data should be as safe
> > as it can be, and yet it still doesn't include the modified xattr.

Given these settings, it is surprising that the data does not get
flushed to disk even after minutes. Could this be a result of running
under qemu? I kill the qemu process instead of properly shutting down
the virtual machine.

No, that's not it either. Even when resetting with "system_reset", I get
the same failure.

"data=journal" helps. It avoids the problem completely and might be the
best solution despite the impact ("disables delayed allocation and
O_DIRECT").

If anyone has suggestions for debugging the long delay until data gets
flushed, I would welcome them. I'm still seeing that even with
"data=journal", it's only that the effect is less drastic.

-- 
Best Regards, Patrick Ohly

The content of this message is my personal opinion only and although
I am an employee of Intel, the statements I make here in no way
represent Intel's position on the issue, nor am I authorized to speak
on behalf of Intel on this matter.
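
For illustration, the fsync() workaround mentioned above could look
roughly like the sketch below. It writes a buffer and calls fsync()
before close(), so that the file content is forced to disk rather than
left in the page cache. The helper name write_file_synced() is
hypothetical and this is not the systemd code. Whether this is
sufficient on an IMA-enabled system in all cases is exactly what is
being discussed in this thread.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (an assumption for illustration, not systemd code):
 * write len bytes from buf to path and force them to stable storage
 * before closing. Returns 0 on success, -1 on error with errno set.
 */
int write_file_synced(const char *path, const void *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted, retry the write */
            int saved = errno;
            close(fd);
            errno = saved;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }

    /* Flush file content (and inode metadata) to disk before close(). */
    if (fsync(fd) < 0) {
        int saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }

    return close(fd);
}
```

A caller would use it as write_file_synced(path, id, id_len). For a
newly created file, an additional fsync() on the containing directory
may be needed to make the directory entry itself durable.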