[SSI-devel] Re: CFS chard optional sync for performance with DRBD protocol C
From: John B. <joh...@hp...> - 2005-05-27 22:43:57
Roger Tsang wrote:
> Sounds like ext3 journal=writeback. Are you saying CFS is interested
> in getting the metadata committed to disk before file data? Well would
> such a thing occur? I believe it's the other way around in the
> default ext3 journaling mode or ordered mode.
>
> Roger
>

My understanding (I could be wrong, this is how it was explained to me) is that ext3 journalling only guarantees that things are consistent, not that operations won't get lost over failures. If you delete 1000 files and the system goes down, ext3 doesn't guarantee that you will see those 1000 files deleted when it comes back, because it doesn't write the log after each transaction. ext3 just guarantees that the filesystem will be consistent after the log is played back; how many files you see deleted depends on how many transactions got logged to disk before the failure.

However, if a remote CFS client deletes 1000 files, CFS must guarantee that after failover those 1000 files are deleted on the disk. Without hooks in the log itself that tell us which transactions were logged and completed, we need to force each transaction to be logged before we return status to the remote client.

Our data guarantee is similar, but not quite as rigorous: we guarantee that if a process writes a page of data, that page will be kept in memory until it is committed to disk, so there is no way a process can see the data revert after a failover. However, if the node with the process and the dirty data dies before the page is committed, we make no guarantees about what other nodes in the cluster might see.

I need to do some stuff for Bruce, so I won't have any time to answer e-mail for the rest of the day and I don't plan to be online again until Tuesday.

John
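
For readers who want the "force each transaction to be logged before we return status" idea in concrete form, here is a rough user-space analogy in C. It is only a sketch under assumptions: `durable_unlink` is a hypothetical helper, not CFS or OpenSSI code, and the real CFS server does this work inside the kernel. It relies on the usual Linux behaviour that calling fsync() on the parent directory pushes the directory-entry change to stable storage before the call returns.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of CFS or OpenSSI): remove `name` from
 * the directory `dirpath` and do not report success until the directory
 * update has been forced to disk. Returns 0 on success, -1 with errno
 * set on failure.
 */
int durable_unlink(const char *dirpath, const char *name)
{
    assert(dirpath != NULL && name != NULL);

    int dirfd = open(dirpath, O_RDONLY | O_DIRECTORY);
    if (dirfd < 0)
        return -1;

    /* Step 1: the metadata operation itself. */
    int rc = unlinkat(dirfd, name, 0);

    /* Step 2: force the directory change out before returning status.
     * Without this, the removal may sit in memory (or in an uncommitted
     * journal transaction) and be lost if the node fails. */
    if (rc == 0)
        rc = fsync(dirfd);

    /* Preserve errno from the failing call across close(). */
    int saved_errno = errno;
    close(dirfd);
    errno = saved_errno;
    return rc;
}
```

A server built this way would send its reply to the remote client only after `durable_unlink()` returns 0. That costs one synchronous flush per operation, which is the performance cost this thread is about.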