Howdy,
I'm running ClusterNFS 3.0rc2, and the rpc.nfsd appears
to have serious memory leaks. After running for only a
few hours, the process is 32M. I've seen it up over
100M quite a few times. I restart it often to keep the
size down, but it grows rather quickly.
The machine also crashes fairly often. I don't know
for sure that ClusterNFS is to blame, but I run the
exact same OS and kernel on a bunch of other machines
with no problems. I recently moved the ClusterNFS
server to entirely new hardware, so I am reasonably
confident that the hardware is not to blame. I'm not
sure that a user space program should actually be
crashing an otherwise-stable machine, but with root
perms it's not impossible.
Has anyone else had these problems? More importantly,
is there a newer version of ClusterNFS in the works?
Thanks,
Jim Stewart
Mars Space Flight Facility
Arizona State University
Logged In: YES
user_id=9316
Try backing down to cNFS 1.0.0. I'll have to take some
time soon to go through the code in v3.0.0-rc2 to see what
might be causing this.
Logged In: NO
An old NEWS entry at
http://www.beowulf.org/pipermail/beowulf/2000-December/010710.html
says
Voila!
I have already found the memory leak in the code, and here is
the fix (it was actually not so difficult...):
--- nfsd.c.orig Thu Dec 7 17:06:58 2000
+++ nfsd.c Thu Dec 7 17:07:41 2000
@@ -579,7 +579,7 @@
 memcpy( argp, &test_argp, sizeof( test_argp ));
 return status;
 }
-
+ free (test_argp.name);
 }
 }
This entry is from the year 2000. Is it still relevant?
Logged In: YES
user_id=9316
The patch from the beowulf mailing list is for cNFS 1.0.0,
and it does fix the memory leak there.
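For anyone reading along, here is a rough sketch of the kind of leak that patch appears to close: a temporary struct owns a heap-allocated name, the success path hands it off with memcpy and returns, and the other path used to fall through without freeing it. Only test_argp, argp and name come from the patch; struct lookup_args, dup_name, free_name, try_lookup and the live_names counter are made up for illustration, and the real control flow in nfsd.c may well differ.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the argument struct; the real layout in
 * nfsd.c is not shown in this thread. */
struct lookup_args {
    char *name;   /* heap-allocated buffer */
    int   handle;
};

/* Counts outstanding name buffers so a leak would be observable. */
static int live_names = 0;

char *dup_name(const char *s)
{
    char *p = malloc(strlen(s) + 1);
    if (p != NULL) {
        strcpy(p, s);
        live_names++;
    }
    return p;
}

void free_name(char *p)
{
    if (p != NULL) {
        free(p);
        live_names--;
    }
}

/* Returns 1 and hands ownership of the name to *argp on success.
 * Returns 0 on failure, after freeing the temporary name. Leaving out
 * that free_name() call is the kind of leak the patch seems to fix. */
int try_lookup(struct lookup_args *argp, const char *name, int succeed)
{
    struct lookup_args test_argp;

    assert(argp != NULL);
    test_argp.name = dup_name(name);
    test_argp.handle = 0;
    if (test_argp.name == NULL)
        return 0;

    if (succeed) {
        /* Shallow copy: argp->name now owns the buffer. */
        memcpy(argp, &test_argp, sizeof(test_argp));
        return 1;
    }

    free_name(test_argp.name);   /* counterpart of the added free() */
    return 0;
}
```

With the free_name() call on the failure path removed, live_names would climb by one on every failed attempt, which on a busy server would look like steady process growth.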
Logged In: YES
user_id=9316
I've just checked a patch by Pavel Sakov into CVS that
should resolve some memory-leak problems. Can you test the
patch out to see if it resolves your problems?
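If it helps with testing, one way to put numbers on the growth is to sample the daemon's resident size before and after the patch. The sketch below is only an assumption about how one might do that on Linux: it reads the VmRSS line from /proc/<pid>/status, and the helper names rss_kb and watch_rss are made up here, not part of ClusterNFS.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns the resident set size of `pid` in kB, or -1 if it cannot be
 * read. Linux-specific: parses the "VmRSS:" line of /proc/<pid>/status. */
long rss_kb(pid_t pid)
{
    char path[64];
    char line[256];
    long kb = -1;
    FILE *fp;

    assert(pid > 0);
    snprintf(path, sizeof(path), "/proc/%ld/status", (long)pid);
    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;

    while (fgets(line, sizeof(line), fp) != NULL) {
        /* Non-matching lines leave kb untouched. */
        if (sscanf(line, "VmRSS: %ld kB", &kb) == 1)
            break;
    }
    fclose(fp);
    return kb;
}

/* Prints `samples` readings taken `interval` seconds apart and stores
 * the change between the first and last reading in *delta.
 * Returns 0 on success, -1 if a reading failed or samples < 1. */
int watch_rss(pid_t pid, int samples, unsigned interval, long *delta)
{
    long first = -1, last = -1;
    int i;

    if (samples < 1 || delta == NULL)
        return -1;

    for (i = 0; i < samples; i++) {
        last = rss_kb(pid);
        if (last < 0)
            return -1;
        if (i == 0)
            first = last;
        printf("sample %d: %ld kB\n", i, last);
        if (i + 1 < samples)
            sleep(interval);
    }
    *delta = last - first;
    return 0;
}
```

Calling watch_rss() with the pid of rpc.nfsd, a few dozen samples and an interval of a minute or so under the same client load should show whether the patched daemon still climbs the way the unpatched one does.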