Re: [Osgmm-discuss] MatchMaker ranks
From: Mats R. <ry...@re...> - 2009-06-17 21:17:14
Peter Doherty wrote:
> What's the best way to determine why a site is getting a low rank?
> We have two CEs, and condor_grid_overview shows them like this:
>
> SBGrid-Harvard-East 0 0 0 0 0 0
> 0 952 100%
> SBGrid-Harvard-Exp 0 0 0 0 0 0
> 0 1 100%
>
>
> I had assumed that the Exp CE was getting a low rank because it's
> queue has been full for several weeks. But the queue cleared up the
> past day, yet it still only has a rank of 1. But I know jobs can run
> successfully, so why does it have a low rank?

Hi Peter,

In this case, it is both issues.

> I looked in ~osgmm/var/verification-runs/SiteName and looked through
> the error files.
> fork.err shows:
> + echo 'More than 5G of $HOME used!'

Many sites have quotas on $HOME, so the idea is to disable the site if
we are using more than 5GB of space. This test is a little bit Engage
specific so maybe we should disable it. You can do that by editing
libexec/verification-script.fork

> jm.err shows:
> ++ MANPATH=:/opt/osg-shared/wn-1.0/vdt/man
> ++ export MANPATH
> ++ . /opt/osg-shared/wn-1.0/vdt/etc/vdt-man-setup.sh
> /opt/osg-shared/wn/setup.sh: line 47: /opt/osg-shared/wn-1.0/vdt/etc/
> vdt-man-setup.sh: No such file or directory
>
>
> The second error is curious, the file is there.
>
> #ls -l /opt/osg-shared/wn-1.0/vdt/etc/vdt-man-setup.sh
> -rw-r--r-- 1 root root 51 May 13 2008 /opt/osg-shared/wn-1.0/vdt/etc/
> vdt-man-setup.sh

The file seems to exist on the head node, but not the compute nodes.
The WN install should be on a shared file system as the purpose is for
the tools to be available to the jobs.

--
Mats Rynge
Renaissance Computing Institute <http://www.renci.org>
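
For reference, a $HOME usage test of the kind that produced the fork.err message could look roughly like the sketch below. This is not the actual contents of libexec/verification-script.fork; the function name home_over_limit and the LIMIT_KB variable are made up for illustration, and only the echo message is taken from the error file quoted above.

```shell
#!/bin/sh
# Hypothetical sketch, NOT copied from verification-script.fork.
# home_over_limit DIR LIMIT_KB
# Returns 0 (true) when DIR uses more than LIMIT_KB kilobytes on disk.
home_over_limit() {
    dir=$1
    limit_kb=$2
    # du -sk prints "<kilobytes><TAB><path>"; keep only the first field.
    used_kb=$(du -sk "$dir" 2>/dev/null | awk '{print $1}')
    # Treat an unreadable or missing directory as "not over the limit".
    [ -n "$used_kb" ] && [ "$used_kb" -gt "$limit_kb" ]
}

# 5 GB expressed in kilobytes (assumed threshold, matching the message).
LIMIT_KB=$((5 * 1024 * 1024))

if home_over_limit "$HOME" "$LIMIT_KB"; then
    echo 'More than 5G of $HOME used!'
fi
```

Running the same du -sk "$HOME" by hand as the grid user on the site shows how close the account is to the threshold.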