From: Nathan S. <na...@sg...> - 2005-06-10 02:15:47
On Thu, Jun 09, 2005 at 07:12:58PM +0200, Martin Schaff?ner wrote:
> Hello,
>
> a while ago I added an EVMS snapshot feature to the XFS-formatted /home volume
> of our cluster. The volume is shared via NFS. Every once in a while all nfsd
> instances are stuck in state D, and the clients are dead, as their roots are
> mounted via NFS from the same server. This usually happens during some heavy
> read/write load. The snapshots are not full, as they are quite large and
> reset every day.
>
> After searching for a while, I managed to break things deterministically by
> resetting the snapshot volume during NFS read/write activity. After switching
> the /home volume to EXT3, the problem does not occur anymore. No specific
> entry is found in the log.
>
> I am using SuSE Enterprise Server 9.0 with the very latest updates
> (kernel_default-2.6.5-7.159) and evms 2.5.2.
>
> Is this a known problem, and if so, what can be done to make this scenario work
> with XFS?

There was a snapshotting bug fixed not too long ago in SLES9; -159 sounds
a little bit old, but that may be the current released one (I'm looking
at a -172 kernel here, but that may be a snapshot of their testing kernel
rather than the last released one).

SLES9 service pack 2 will definitely have that fix, so you should be able
to grab that once it's released (I'm not sure exactly when that will be).
Other options are mainline kernels, or CVS from oss.sgi.com, as they also
contain the fix.

cheers.

--
Nathan