From: Ian H. <ian...@as...> - 2008-07-16 11:21:05
2008/7/15 Darren Dale <dsd...@gm...>:
> Hi Ian,
>
> On Tuesday 15 July 2008 10:13:02 am Ian Harry wrote:
> > Thanks for helping with this problem.
> >
> > I have investigated further this issue and here is what I have found out:
> >
> > I have traced the errors themselves back to two functions in texmanager.py
> > (matplotlib.texmanager), make_dvi and make_png. Most of the errors seem to
> > mention 'Stale NFS file handles' and crop up at a variety of different
> > places throughout these functions. I guess this is because on our clusters
> > /home/[username] is not a local directory, we have seen issues before with
> > other code if a lot of nodes try to access the same directory on the NFS
> > file system simultaneously. I tried altering the __init__.py to force the
> > code to put the .matplotlib directory on filesystems local to each node.
> > Moving the .matplotlib directory to a local drive solves almost all of
> > these errors.
>
> I suggest you try backing out those changes you just described, and instead
> try setting a MPLCONFIGDIR environment variable to point somewhere on the
> local filesystem. If MPLCONFIGDIR is not defined, we use ~/.matplotlib.

Brilliant! This works perfectly and should be easy to implement on
different systems!

> > One error that remained was the one about file opening
> > fh = file(outfile)
> > I added a 'w' to this and this seemed to solve this problem, I also
> > commented out some of the verbose generating commands (specifically
> > fh.read() was causing a problem (probably expected with 'w')) within these
> > functions and the errors go away. I guess 'a' would be better but the
> > commands only seem to be called if the file doesn't exist?
>
> Out of curiosity, if you added 'a' instead of 'w', does the error go away?
> Either way, please let me know exactly what changes need to be made and I
> will commit the changes to svn.

No, using 'a' gives the same errors as using 'w' (again in fh.read()).
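
A note on why 'a' behaves like 'w' here: both modes produce a write-only handle, so any read() on it is refused whether or not the file was truncated. The sketch below is only an illustration, not code from texmanager.py. It assumes Python 3's open() rather than the file() builtin of Python 2.4 (where the same mistake surfaces as an IOError), and try_read and demo are hypothetical helper names.

```python
import io
import os
import tempfile


def try_read(path, mode):
    """Open path in the given mode and attempt fh.read().

    Returns the text read, or None when the handle is write-only.
    """
    with open(path, mode) as fh:
        try:
            return fh.read()
        except io.UnsupportedOperation:
            # 'w' and 'a' give write-only handles, so read() is refused.
            return None


def demo():
    """Try read() on the same file in append, read, and write mode."""
    fd, path = tempfile.mkstemp(suffix=".log")
    os.close(fd)
    try:
        with open(path, "w") as fh:
            fh.write("latex output\n")
        appended = try_read(path, "a")   # keeps contents, cannot read
        plain = try_read(path, "r")      # default mode, can read
        written = try_read(path, "w")    # truncates the file, cannot read
        return appended, plain, written
    finally:
        os.remove(path)
```

Only the default read mode returns the file contents. That is consistent with the report above that switching from 'w' to 'a' does not make the fh.read() error go away.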

Here are the changes I made to stop the errors that didn't seem to be due
to 'stale NFS file handle':

--snip--
[spxiwh@sugar 07:14 AM matplotlib]$ diff texmanager.py /usr/lib64/python2.4/site-packages/matplotlib/texmanager.py
248c248
< fh = file(outfile,'a')
---
> fh = file(outfile)
252,254c252
< else:
< try: verbose.report(fh.read(), 'debug')
< except: pass
---
> else: verbose.report(fh.read(), 'debug')
259,261c257,258
< else:
< try: os.remove(fname)
< except: pass
---
> else: os.remove(fname)
>
280c277
< fh = file(outfile,'a')
---
> fh = file(outfile)
285,287c282
< else:
< try: verbose.report(fh.read(), 'debug')
< except: pass
---
> else: verbose.report(fh.read(), 'debug')
289,290c284
< try: os.remove(outfile)
< except: pass
---
> os.remove(outfile)
314c308
< # else: verbose.report(fh.read(), 'debug')
---
> else: verbose.report(fh.read(), 'debug')
--snip--

Once again, thanks for the help.

Ian

> >
> > As we have a lot of users running this code a solution like this is
> > unworkable (as a lot of our users are unfamiliar with python/Linux and
> > want to run a simple command). Do you have any ideas of how we could
> > solve this issue?
>
> Please try the environment variable I mentioned and let me know what
> happens.
>
> Darren
>

--
---------------------------------------------------------------------------
Ian Harry
School of Physics & Astronomy
Queens Buildings, The Parade
Cardiff, CF24 3AA
Email: Ian...@as...
Phone: (+44) 29 208 75120
Mobile: (+44) 7890 479090
---------------------------------------------------------------------------
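
For users who only want to run a single command, the MPLCONFIGDIR suggestion from this thread can also be applied from inside a script instead of the shell. This is a minimal sketch under stated assumptions: the system temporary directory is taken to be node-local (which may not hold on every cluster), the directory naming is arbitrary, and use_local_mplconfigdir is a hypothetical helper, not part of matplotlib.

```python
import getpass
import os
import tempfile


def use_local_mplconfigdir(base=None):
    """Point MPLCONFIGDIR at a per-user directory under base.

    base defaults to the system temp directory, which is ASSUMED to be
    on node-local storage rather than on NFS.
    """
    if base is None:
        base = tempfile.gettempdir()
    try:
        user = getpass.getuser()
    except Exception:
        # Fall back to a fixed name if the user cannot be determined.
        user = "user"
    config_dir = os.path.join(base, "matplotlib-%s" % user)
    os.makedirs(config_dir, exist_ok=True)
    # The variable has to be set before matplotlib is imported.
    os.environ["MPLCONFIGDIR"] = config_dir
    return config_dir
```

A wrapper script would call use_local_mplconfigdir() first and import matplotlib only afterwards. Exporting MPLCONFIGDIR in a site-wide shell profile should achieve the same thing without touching any Python code.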