On Fri, 25 Mar 2005 11:02:03 -0500, Luke Schierer
<lu...@ac...> wrote:
> Attached is a simple Perl script that I can use to tank the system.
> The script uses blocking NFS file locking (a great, simple way to
> coordinate jobs across a cluster), and works fine
> on other computers. For example, if you spawn a bunch of them at once:
>
> for i in `seq 1 8 ` ; do filelocktest name_of_existing_file & done
>
> The last script will finish 8 seconds later, each script taking
> a turn holding the lock on the file for 1 second. It also
> works across multiple (non-clustermatic) machines if the
> name_of_existing_file is on a commonly NFS mounted directory.
>
> However, if you try the script on our cluster (where all
> the nodes have /home NFS mounted and /proc/sys/bproc/shell_hack
> is off):
>
> bpsh 1-30 filelocktest name_of_existing_file
>
> It does not run in 30 seconds as expected. The locks are obtained
> much more slowly than 1/sec, and after a little while the whole
> system freezes up and dumps the message that I sent earlier.
> Note that while using ~10 nodes takes
> much longer than 10 seconds, it usually succeeds after a certain
> amount of time and doesn't crash. 30 or more nodes crash pretty
> reliably.
>
> On another note, the final piece of cluster weirdness that I've
> detected is also NFS related, though not as important.
> When I read a file off a
> master NFS server drive from a node I get 50 MB/s, which
> is how fast the drive goes (Yay! The 2.4 kernel maxed out at
> ~20MB/s over NFS for a single client.) But then I read the
> same file from the master NFS server again from a different node
> now that it is cached on the server and I get only 10 MB/s.
> To make certain that I'm not nuts I read the same file over NFS from
> a non-clustermatic computer and I get 100 MB/s, the legal gigabit limit
> (Sweet!).
> Summary: NFS to clustermatic nodes is much slower if the file is
> cached in the master NFS server.
>
> It seems very odd that I'm getting these NFS problems. Shouldn't they
> be pretty much independent of the bproc changes to the kernel?
> Would having an NFS server separate from the bproc master fix things?
Yeah, that is weird. BProc doesn't touch NFS code at all and it
shouldn't get in the way of scheduling or the RPC threads or anything
like that. Are you running a lockd and/or statd on the nodes?
(node_up doesn't deal with that kind of stuff right now, which prevents
locking from working.) I would expect you to just get errors in that
case, though. Other than that, the only thing I can think to look for
would be to make sure that all the mount options and server options
are the same. It's possible that node_up won't have the same defaults
as the normal mount program.
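
The two checks above could be sketched roughly as follows. This is only an
illustration, not something node_up or BProc provides: the function names
nfs_mount_options and check_lock_daemons are hypothetical, and it assumes
that /proc/mounts is readable and that the daemons show up in a process
listing as "lockd" and "rpc.statd".

```shell
#!/bin/sh
# Hypothetical helpers for comparing NFS setups between machines.

# Print "mountpoint options" for every NFS mount in a mounts-format file
# (defaults to /proc/mounts). Options are sorted so that the output from
# two machines can be compared with diff.
nfs_mount_options() {
    mounts_file="${1:-/proc/mounts}"
    awk '$3 == "nfs" || $3 == "nfs4" { print $2, $4 }' "$mounts_file" |
    while read -r mountpoint opts; do
        sorted=$(printf '%s\n' "$opts" | tr ',' '\n' | LC_ALL=C sort | paste -s -d, -)
        printf '%s %s\n' "$mountpoint" "$sorted"
    done
}

# Read a process listing on stdin and report whether the lock daemons
# appear in it. The daemon names are assumptions about how they are
# listed on a typical Linux system.
check_lock_daemons() {
    listing=$(cat)
    for daemon in lockd rpc.statd; do
        if printf '%s\n' "$listing" | grep -q "$daemon"; then
            echo "$daemon: running"
        else
            echo "$daemon: NOT running"
        fi
    done
}
```

For example, "ps ax | check_lock_daemons" reports on the machine it runs on,
and saving the output of nfs_mount_options on a node and on a normally
mounted client gives two files to diff. Whether these tools are available
on a given node depends on how the node was set up.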
- Erik