From: Pawel S. <Paw...@bc...> - 2009-05-20 12:51:18
Hello,

I think a similar issue has been raised on the mailing list before, but it doesn't look like it was solved:
http://sourceforge.net/mailarchive/forum.php?thread_name=002701c64d08%2449efd790%24db421080%40cs.ucl.ac.uk&forum_name=gridsam-discuss

This thread also looks relevant, but I think it has already been included in the version of GridSAM I am using (2.1.4):
http://sourceforge.net/mailarchive/forum.php?thread_name=20080617145418.605536413%40soton.ac.uk&forum_name=gridsam-developer

So what is the problem? I configured GridSAM to run jobs using PBS, and I am submitting them through GridSAM's Web Services. Jobs start OK, but their status freezes at the 'active' state (the response from the getJobStatus method is below). I checked the jobs on the cluster and they had finished, so it seems that GridSAM doesn't check the job's status on the cluster properly. It worked properly when running simple POSIX programs like cat and echo, so it is something to do with PBS. I double-checked pbs.PBSJobStatusCommand in the jobmanager-pbs.xml file and it is correct.

Is there anything else that GridSAM uses to check whether a job has completed on the cluster? Maybe something is wrong with my setup, such as directories with the wrong permissions that it can't access?

I am using GridSAM 2.1.4 with OMII toolkit 3.4.4.
Thank you in advance for your help,
Pawel

------------------------

<getJobStatusResponse xmlns="http://www.icenigrid.org/service/gridsam">
  <JobStatus>
    <JobIdentifier>
      <ID>urn:gridsam:0131f86a215956de01215965031d000b</ID>
    </JobIdentifier>
    <Stage>
      <State>pending</State>
      <Description>job is being scheduled</Description>
      <Time>2009-05-19T17:02:20.701+02:00</Time>
    </Stage>
    <Stage>
      <State>staging-in</State>
      <Description>staging files...</Description>
      <Time>2009-05-19T17:02:20.803+02:00</Time>
    </Stage>
    <Stage>
      <State>staged-in</State>
      <Description>no file needs to be staged in</Description>
      <Time>2009-05-19T17:02:20.807+02:00</Time>
    </Stage>
    <Stage>
      <State>active</State>
      <Description>job is being launched through PBS</Description>
      <Time>2009-05-19T17:02:20.933+02:00</Time>
    </Stage>
    <Property name="urn:pbs:script">
#! /bin/sh
# PBS batch job script built by Globus job manager
#
#PBS -S /bin/sh
#PBS -Wx=NODESET:ONEOF:FEATURE:switch10:switch11:ecs
#PBS -o /work/pawels/blast-runs/blast1.out
#PBS -e /work/pawels/blast-runs/blast1.err
cd ${PBS_O_WORKDIR}
/local/blast-2.2.18/bin/blastall \
-i /work/pawels/blast-runs/seq/batch0.fasta -o /work/pawels/blast-runs/seq/batch0.fasta.blast -p blastp -d /net/compute-2-23.local/scratch/andersl/uniref90 -m7
touch .complete
    </Property>
    <Property name="urn:pbs:launched">true</Property>
    <Property name="urn:gridsam:principal">OU=BCCS, O=UiB, EMAILADDRESS=paw...@bc..., C=Norway, ST=Hordaland, CN=gridsam_certificate</Property>
    <Property name="urn:gridsam:pbs:jobid">2116221.hpcmaster.bccs.uib.no</Property>
  </JobStatus>
</getJobStatusResponse>
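
For anyone comparing a response like the one above against what the cluster reports, here is a minimal Python sketch that pulls out the last recorded stage and the PBS job id. It is not part of GridSAM. The function name `summarize` and the trimmed `SAMPLE` string are only illustrative, and it assumes the stages are listed in chronological order, as they are in the response above.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the getJobStatusResponse shown above.
NS = {"gs": "http://www.icenigrid.org/service/gridsam"}

# Trimmed copy of the response above (script property and some stages omitted).
SAMPLE = """<getJobStatusResponse xmlns="http://www.icenigrid.org/service/gridsam">
  <JobStatus>
    <JobIdentifier><ID>urn:gridsam:0131f86a215956de01215965031d000b</ID></JobIdentifier>
    <Stage><State>pending</State><Description>job is being scheduled</Description><Time>2009-05-19T17:02:20.701+02:00</Time></Stage>
    <Stage><State>active</State><Description>job is being launched through PBS</Description><Time>2009-05-19T17:02:20.933+02:00</Time></Stage>
    <Property name="urn:pbs:launched">true</Property>
    <Property name="urn:gridsam:pbs:jobid">2116221.hpcmaster.bccs.uib.no</Property>
  </JobStatus>
</getJobStatusResponse>"""


def summarize(response_xml):
    """Return (last_state, last_time, pbs_job_id) from a getJobStatusResponse.

    Hypothetical helper: assumes at least one <Stage> is present and that
    stages appear in chronological order.
    """
    root = ET.fromstring(response_xml)
    stages = root.findall(".//gs:Stage", NS)
    last = stages[-1]
    # Collect all <Property name="..."> values into a dictionary.
    props = {p.get("name"): (p.text or "").strip()
             for p in root.findall(".//gs:Property", NS)}
    return (last.findtext("gs:State", namespaces=NS),
            last.findtext("gs:Time", namespaces=NS),
            props.get("urn:gridsam:pbs:jobid"))


if __name__ == "__main__":
    state, when, job_id = summarize(SAMPLE)
    print(state, when, job_id)
```

The job id it returns can then be checked by hand on the cluster (for example with qstat) to see whether PBS still knows about the job while GridSAM reports 'active'.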