From: Jos K. <jo...@ne...> - 2007-05-14 10:59:49
Hi,

I experienced the same problem (I posted a bug report about it). In my
case it occurred when multiple GridSAM instances were running in one
container. I think it has to do with a known problem with the Globus CoG
Kit and its use of classloaders, according to the CoG Kit FAQ:

http://wiki.cogkit.org/index.php/FAQ#While_developing_Grid_Portals_with_Java_COG_Kit.2C_I_get_a_ClassCastException_for_BouncyCastle_classes._Why.3F

The solution in the CoG Kit FAQ worked for me, i.e. move the cryptix*.jar
and puretls.jar libraries to $OMII/shared/lib and make sure they are
removed from the GridSAM instances.

Regards,
Jos.

On Monday 14 May 2007 11:09:43 Mark HB wrote:
> Dear Vesselin,
>
> I am sending this query again as I have not yet received a reply. If you
> need any more info from me to help diagnose this error, then please ask.
>
> Thanks for your reply. After looking into this further, I think we have
> boiled it down to the exact problem. I am now certain that it is not a
> certificate problem, as I have successfully run a job on one of the grid
> nodes using my set-up on Friday. However the error below is still a
> problem. I will elaborate on my setup.
>
> I am running four instances of grid-sam. One points to the local system,
> and the other three point to respective NGS nodes (man, leeds & oesc).
> When I restart the OMII server, I am then able to submit a job to ONE of
> the NGS nodes successfully, however, the other two respond with the same
> error as before!
>
> "Description: cannot initialise working directory: Could not connect to
> FTP server on"gsiftp://grid-data.man.ac.uk/ - User globus credential is
> required but not specified in the context".
>
> It CANNOT be credentials, as I have just successfully run a job on the
> leeds node, but the exact same job fails on the man and oesc nodes.
>
> I set up the multiple instances exactly as instructed on the gridsam
> site (moving jar files etc).
>
> Have you any ideas?
> Regards
> Mark
>
> Vesselin Novov wrote:
> > Mark,
> >
> > I should have pointed out, I have never used the AHE in OMII.
> > All my tests use a directly installed/managed GridSAM instance.
> > It's worth checking with the AHE developers what exactly goes on
> > when AHE instantiates/manages a GridSAM instance with regard to any
> > security credentials.
> >
> > -Vesso
> >
> > Mark HB wrote:
> >> Vesselin,
> >>
> >> Yes, sorry about that. The error is exactly the same no matter which
> >> node I send the job to, and hence I just grabbed the first log file to
> >> hand.
> >>
> >> Cheers
> >> Mark
> >>
> >> Novov wrote:
> >>> Mark,
> >>>
> >>> I recently had this exact error, but in my case missing proxy
> >>> credentials were definitely the cause of it.
> >>>
> >>> I am also a bit confused:
> >>>
> >>> - the pasted "GridSAM state is . . ." below indicates the job was
> >>>   attempted on the 27th March and it was Not submitted because no
> >>>   connection was established with grid-compute.leeds.ac.uk.
> >>> - the gram_job_mgr_9570.log entries are from 13th March and indicate
> >>>   that job Was at least submitted.
> >>> - the catalina.out section below also indicates a job Was submitted,
> >>>   the staging-in of input files was successful but failed at
> >>>   staging-out (after execution) phase and the machine is
> >>>   grid-data.man.ac.uk.
> >>>
> >>> regards
> >>> Vesso
> >>>
> >>> Mark HB wrote:
> >>>> Dear GRIDSam list,
> >>>>
> >>>> I am attempting to run an application on the NGS using AHE/GRIDSam
> >>>> bundled with the OMII-stack. I get the following error when trying to
> >>>> run the job:
> >>>>
> >>>> GridSAM state is: failed
> >>>> Time: 2007-03-22T08:23:48.172Z
> >>>> Description: cannot initialise working directory: Could not connect
> >>>> to FTP server on"gsiftp://grid-compute.leeds.ac.uk/ - User globus
> >>>> credential is required but not specified in the context".
> >>>>
> >>>> I can assure you that I have both a GRIDSam proxy certificate and
> >>>> proxy user certificate running.
> >>>>
> >>>> The GRAM log file from the NGS machine can be found here:
> >>>> http://igrid-ext.cryst.bbk.ac.uk/gram_job_mgr_9570.log
> >>>>
> >>>> The main point in this log is "GRAM_SCRIPT_ERROR = 26"
> >>>>
> >>>> Below you will find the output produced in the catalina.log file.
> >>>>
> >>>> Has anyone come across this error before? I would appreciate any
> >>>> comments.
> >>>>
> >>>> Cheers
> >>>> Mark
> >>>>
> >>>> - principal obtained from Axis transport - EMA...@ma..., CN=igrid.cryst.bbk.ac.uk, L=EISD, OU=UCL, O=eScience, C=UK
> >>>> - principal obtained from Axis transport - EMA...@ma..., CN=igrid.cryst.bbk.ac.uk, L=EISD, OU=UCL, O=eScience, C=UK
> >>>> - state {pending} reached
> >>>> WseSourceProcessor: No destinations to route the event
> >>>> WseSourceProcessor: No destinations to route the event
> >>>> - Job ff80808211705e810111708d76680005.-6a0625b2:111705f74ad:-7fec threw a JobExecutionException:
> >>>> org.quartz.JobExecutionException
> >>>>     at org.icenigrid.gridsam.core.plugin.manager.DefaultJobManagerContext$StageTask.execute(DefaultJobManagerContext.java:525)
> >>>>     at org.quartz.core.JobRunShell.run(JobRunShell.java:191)
> >>>>     at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:516)
> >>>> WseSourceProcessor: No destinations to route the event
> >>>> - Job ff80808211705e810111708d76680005.-6a0625b2:111705f74ad:-7feb threw a JobExecutionException:
> >>>> org.quartz.JobExecutionException
> >>>>     at org.icenigrid.gridsam.core.plugin.manager.DefaultJobManagerContext$StageTask.execute(DefaultJobManagerContext.java:525)
> >>>>     at org.quartz.core.JobRunShell.run(JobRunShell.java:191)
> >>>>     at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:516)
> >>>> - principal obtained from Axis transport - EMA...@ma..., CN=igrid.cryst.bbk.ac.uk, L=EISD, OU=UCL, O=eScience, C=UK
> >>>> - principal obtained from Axis transport - EMA...@ma..., CN=igrid.cryst.bbk.ac.uk, L=EISD, OU=UCL, O=eScience, C=UK
> >>>> WseSourceProcessor: No destinations to route the event
> >>>> - Job ff80808211705e810111708d76680005.-6a0625b2:111705f74ad:-7fea threw a JobExecutionException:
> >>>> org.quartz.JobExecutionException
> >>>>     at org.icenigrid.gridsam.core.plugin.manager.DefaultJobManagerContext$StageTask.execute(DefaultJobManagerContext.java:525)
> >>>>     at org.quartz.core.JobRunShell.run(JobRunShell.java:191)
> >>>>     at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:516)
> >>>> - state {staging-in} reached
> >>>> WseSourceProcessor: No destinations to route the event
> >>>> - Proxy Credentials received : /C=UK/O=eScience/OU=Imperial/L=LeSC/CN=mark halling-brown
> >>>> - basic authentication scheme selected
> >>>> - staging (copy) file http://test:to...@ig...:18080/filestage/551218202107278041688/datPlain12
> >>>>   -> gsiftp://grid-data.man.ac.uk/551218202107278041688/datPlain12
> >>>> - basic authentication scheme selected
> >>>> - total byte write 2508
> >>>> - datPlain12 staged
> >>>> - state {staged-in} reached
> >>>> WseSourceProcessor: No destinations to route the event
> >>>> - Job ff80808211705e810111708d76680005.-6a0625b2:111705f74ad:-7fe9 threw a JobExecutionException:
> >>>> org.quartz.JobExecutionException
> >>>>     at org.icenigrid.gridsam.core.plugin.manager.DefaultJobManagerContext$StageTask.execute(DefaultJobManagerContext.java:525)
> >>>>     at org.quartz.core.JobRunShell.run(JobRunShell.java:191)
> >>>>     at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:516)
> >>>> - executing groovy script org/icenigrid/gridsam/core/plugin/connector/globus/rsl.groovy
> >>>> - executed groovy script org/icenigrid/gridsam/core/plugin/connector/globus/rsl.groovy
> >>>> WseSourceProcessor: No destinations to route the event
> >>>> - Job ff80808211705e810111708d76680005.-6a0625b2:111705f74ad:-7fe8 threw a JobExecutionException:
> >>>> org.quartz.JobExecutionException
> >>>>     at org.icenigrid.gridsam.core.plugin.manager.DefaultJobManagerContext$StageTask.execute(DefaultJobManagerContext.java:525)
> >>>>     at org.quartz.core.JobRunShell.run(JobRunShell.java:191)
> >>>>     at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:516)
> >>>> - RSL &( arguments = "-f" "datPlain12" )( directory = "551218202107278041688" )( executable = "/home/ngs0386/bin/cimmsim" )( stderr = "stderr.txt" )( stdout = "stdout.txt" )( count = "1" )( jobType = "mpi" )
> >>>> - Submitting globus job to grid-data.man.ac.uk/jobmanager-pbs
> >>>> - Globus job submitted https://grid-data.man.ac.uk:64304/25112/1174414409/
> >>>> WseSourceProcessor: No destinations to route the event
> >>>> - Job ff80808211705e810111708d76680005.-6a0625b2:111705f74ad:-7fe7 threw a JobExecutionException:
> >>>> org.quartz.JobExecutionException
> >>>>     at org.icenigrid.gridsam.core.plugin.manager.DefaultJobManagerContext$StageTask.execute(DefaultJobManagerContext.java:525)
> >>>>     at org.quartz.core.JobRunShell.run(JobRunShell.java:191)
> >>>>     at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:516)
> >>>> WseSourceProcessor: No destinations to route the event
> >>>> - globus job is active - https://grid-data.man.ac.uk:64304/25112/1174414409/
> >>>> - state {active} reached
> >>>> WseSourceProcessor: No destinations to route the event
> >>>> - Job ff80808211705e810111708d76680005.-6a0625b2:111705f74ad:-7fe6 threw a JobExecutionException:
> >>>> org.quartz.JobExecutionException
> >>>>     at org.icenigrid.gridsam.core.plugin.manager.DefaultJobManagerContext$StageTask.execute(DefaultJobManagerContext.java:525)
> >>>>     at org.quartz.core.JobRunShell.run(JobRunShell.java:191)
> >>>>     at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:516)
> >>>> - state {staging-out} reached
> >>>> WseSourceProcessor: No destinations to route the event
> >>>> - staging file gsiftp://grid-data.man.ac.uk/551218202107278041688/stdout.txt
> >>>>   -> webdav://test:to...@ig...:18080/filestage/551218202107278041688/stdout.txt
> >>>> [Fatal Error] :-1:-1: Premature end of file.
> >>>> - total byte write 4353
> >>>> [Fatal Error] :-1:-1: Premature end of file.
> >>>> - stdout.txt staged
> >>>> - staging file gsiftp://grid-data.man.ac.uk/551218202107278041688/stderr.txt
> >>>>   -> webdav://test:to...@ig...:18080/filestage/551218202107278041688/stderr.txt
> >>>> [Fatal Error] :-1:-1: Premature end of file.
> >>>> - total byte write 194
> >>>> [Fatal Error] :-1:-1: Premature end of file.
> >>>> - stderr.txt staged
> >>>> - staging file gsiftp://grid-data.man.ac.uk/551218202107278041688/_th-details.out
> >>>>   -> webdav://test:to...@ig...:18080/filestage/551218202107278041688/_th-details.out
> >>>> [Fatal Error] :-1:-1: Premature end of file.
> >>>> - total byte write 32075
> >>>> [Fatal Error] :-1:-1: Premature end of file.
> >>>> - _th-details.out staged
> >>>> - state {staged-out} reached
> >>>> WseSourceProcessor: No destinations to route the event
> >>>> - Job ff80808211705e810111708d76680005.-6a0625b2:111705f74ad:-7fe5 threw a JobExecutionException:
> >>>> org.quartz.JobExecutionException
> >>>>     at org.icenigrid.gridsam.core.plugin.manager.DefaultJobManagerContext$StageTask.execute(DefaultJobManagerContext.java:525)
> >>>>     at org.quartz.core.JobRunShell.run(JobRunShell.java:191)
> >>>>     at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:516)
> >>>> - state {done} reached
> >>>> - done 1174414390889 1174414391097 1174414406236 1174414430046 1174414449984 1174414470047 1174414482532 1174414482572
> >>>> WseSourceProcessor: No destinations to route the event
> >>>> - Job ff80808211705e810111708d76680005.-6a0625b2:111705f74ad:-7fe4 threw a JobExecutionException:
> >>>> org.quartz.JobExecutionException
> >>>>     at org.icenigrid.gridsam.core.plugin.manager.DefaultJobManagerContext$StageTask.execute(DefaultJobManagerContext.java:525)
> >>>>     at org.quartz.core.JobRunShell.run(JobRunShell.java:191)
> >>>>     at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:516)
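
For anyone wanting to script the workaround from the top of this message
(move cryptix*.jar and puretls.jar into $OMII/shared/lib and remove them
from every GridSAM instance), here is a rough Python sketch. Only the jar
names and the $OMII/shared/lib target come from the message. The function
name, the dry_run flag and the per-instance library directories you pass
in are assumptions made for illustration, and the sketch assumes every
instance ships the same versions of these jars.

```python
import shutil
from pathlib import Path

# Jar name patterns taken from the message above.
JAR_PATTERNS = ("cryptix*.jar", "puretls.jar")


def share_crypto_jars(instance_lib_dirs, shared_lib_dir, dry_run=True):
    """Move the matching jars out of each instance lib dir into one shared dir.

    Hypothetical helper. Returns a list of (action, path) pairs, where
    action is "move" (first copy found, goes to the shared dir) or
    "remove" (duplicate of a jar that is already shared).
    With dry_run=True nothing on disk is changed.
    """
    shared = Path(shared_lib_dir)
    # Names already in the shared dir, plus names we plan to put there.
    in_shared = {p.name for p in shared.glob("*.jar")} if shared.is_dir() else set()
    actions = []
    for lib_dir in map(Path, instance_lib_dirs):
        for pattern in JAR_PATTERNS:
            for jar in sorted(lib_dir.glob(pattern)):
                if jar.name in in_shared:
                    # Assumes the shared copy is the same version.
                    actions.append(("remove", jar))
                    if not dry_run:
                        jar.unlink()
                else:
                    actions.append(("move", jar))
                    in_shared.add(jar.name)
                    if not dry_run:
                        shared.mkdir(parents=True, exist_ok=True)
                        shutil.move(str(jar), str(shared / jar.name))
    return actions


# Example call (the instance paths are placeholders, adjust to your layout):
#   share_crypto_jars(["/path/to/instance1/WEB-INF/lib",
#                      "/path/to/instance2/WEB-INF/lib"],
#                     "/path/to/omii/shared/lib", dry_run=True)
```

Running it with dry_run=True first lists what would be moved or removed.
As in Mark's description, the server would presumably need a restart
before the change takes effect.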