From: Ryan D. <rp...@im...> - 2006-08-03 21:55:46
Hi,
I have belatedly realised that I am running GridSAM in fork mode, which
means it is not submitting my jobs to the cluster but is running them
all on a single machine.
This is a major problem for my MSc project, which is supposed to be
nearing completion but does not yet do the one thing it is meant to do,
i.e. submit jobs to various clusters.
Looking through the documentation, I found that I should change
jobmanager.xml to this:
<?xml version="1.0" encoding="UTF-8"?>
<module id="jobmanager.drmaa" version="1.0.0">
    <!-- dependent modules -->
    <sub-module descriptor="org/icenigrid/gridsam/resource/config/common.xml"/>
    <sub-module descriptor="org/icenigrid/gridsam/resource/config/shell.xml"/>
    <sub-module descriptor="org/icenigrid/gridsam/resource/config/drmaa.xml"/>
    <sub-module descriptor="org/icenigrid/gridsam/resource/config/embedded.xml"/>
    <sub-module descriptor="database.xml"/>
    <sub-module descriptor="authorisation.xml"/>
</module>
If I understand correctly, this should be all that is required.
However, when I start up GridSAM, it seems to complain about a missing
file:
ERROR [DrmaaDRMConnectorManager] failed to load the DRMAA library - /vol/grail/software/sge-6.0/lib/lx24-amd64/libdrmaa.so: /vol/grail/software/sge-6.0/lib/lx24-amd64/libdrmaa.so: cannot open shared object file: No such file or directory
However, this file is present on the filesystem:
bash-3.00$ ls -l /vol/grail/software/sge-6.0/lib/lx24-amd64/libdrmaa.so
-rwxr-xr-x 1 root root 1635857 May 8 09:17 /vol/grail/software/sge-6.0/lib/lx24-amd64/libdrmaa.so
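The `ls` above only shows what my login shell sees. A minimal sketch of a further check, assuming it is run as the same user and on the same host as the Tomcat process, is to ask a JVM directly whether it can see and load the file. The class name `LibCheck` and the method `describe` are made up for illustration. The sketch also loads the library by absolute path with `System.load`, whereas the stack trace in the log shows the DRMAA binding going through `System.loadLibrary`, so `java.library.path` is printed as well for comparison. `os.arch` is printed because the path points at an lx24-amd64 build of the library.

```java
import java.io.File;

// Hypothetical diagnostic, not part of GridSAM or the SGE DRMAA binding.
class LibCheck {
    // Summarise how the JVM running this code sees the given library path.
    static String describe(String path) {
        File f = new File(path);
        return "path=" + path
            + " exists=" + f.exists()
            + " readable=" + f.canRead()
            + " os.arch=" + System.getProperty("os.arch");
    }

    public static void main(String[] args) {
        // Default path copied from the error message; pass another as the first argument.
        String path = args.length > 0
            ? args[0]
            : "/vol/grail/software/sge-6.0/lib/lx24-amd64/libdrmaa.so";

        System.out.println(describe(path));
        System.out.println("java.library.path=" + System.getProperty("java.library.path"));

        try {
            // System.load needs an absolute path and throws UnsatisfiedLinkError on failure.
            System.load(path);
            System.out.println("loaded OK");
        } catch (UnsatisfiedLinkError e) {
            System.out.println("load failed: " + e.getMessage());
        }
    }
}
```

Compiled with `javac LibCheck.java` and run with `java LibCheck` using the same Java installation that Tomcat uses, this should show whether the failure can be reproduced outside GridSAM.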
I also tried on another cluster (mars) and got the same error, except
that it refers to that cluster's own copy of libdrmaa.so.
I've no idea where to go from here to sort it out.
Can anyone help?
-------------------------------------------------------------
log of the error from gridsam.log:
2006-08-03 22:28:49,147 DEBUG [DRMConnectorManager] Creating SingletonProxy for service drmaa.DRMConnectorManager
2006-08-03 22:28:49,181 DEBUG [DRMConnectorManager] Constructing core service implementation for service drmaa.DRMConnectorManager
2006-08-03 22:28:49,360 DEBUG [DrmaaDRMConnectorManager] failed to load the DRMAA library - /vol/grail/software/sge-6.0/lib/lx24-amd64/libdrmaa.so: /vol/grail/software/sge-6.0/lib/lx24-amd64/libdrmaa.so: cannot open shared object file: No such file or directory
java.lang.UnsatisfiedLinkError: /vol/grail/software/sge-6.0/lib/lx24-amd64/libdrmaa.so: /vol/grail/software/sge-6.0/lib/lx24-amd64/libdrmaa.so: cannot open shared object file: No such file or directory
    at java.lang.ClassLoader$NativeLibrary.load(Native Method)
    at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1751)
    at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1676)
    at java.lang.Runtime.loadLibrary0(Runtime.java:822)
    at java.lang.System.loadLibrary(System.java:992)
    at com.sun.grid.drmaa.SessionImpl$1.run(SessionImpl.java:58)
    at java.security.AccessController.doPrivileged(Native Method)
    at com.sun.grid.drmaa.SessionImpl.<clinit>(SessionImpl.java:56)
    at com.sun.grid.drmaa.SessionFactoryImpl.getSession(SessionFactoryImpl.java:59)
    at org.icenigrid.gridsam.core.plugin.connector.drmaa.DrmaaDRMConnectorManager.getDrmaaSession(DrmaaDRMConnectorManager.java:61)
    at org.icenigrid.gridsam.core.plugin.connector.drmaa.DrmaaDRMConnectorManager.<clinit>(DrmaaDRMConnectorManager.java:46)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:242)
    at org.apache.hivemind.impl.DefaultClassResolver.lookupClass(DefaultClassResolver.java:101)
    at org.apache.hivemind.impl.DefaultClassResolver.checkForClass(DefaultClassResolver.java:108)
    at org.apache.hivemind.impl.ModuleImpl.resolveType(ModuleImpl.java:191)
    at org.apache.hivemind.service.impl.BuilderFactoryLogic.instantiateCoreServiceInstance(BuilderFactoryLogic.java:100)
    at org.apache.hivemind.service.impl.BuilderFactoryLogic.createService(BuilderFactoryLogic.java:75)
    at org.apache.hivemind.service.impl.BuilderFactory.createCoreServiceImplementation(BuilderFactory.java:42)
    at org.apache.hivemind.impl.InvokeFactoryServiceConstructor.constructCoreServiceImplementation(InvokeFactoryServiceConstructor.java:84)
    at org.apache.hivemind.impl.servicemodel.AbstractServiceModelImpl.constructCoreServiceImplementation(AbstractServiceModelImpl.java:107)
    at org.apache.hivemind.impl.servicemodel.AbstractServiceModelImpl.constructNewServiceImplementation(AbstractServiceModelImpl.java:157)
    at org.apache.hivemind.impl.servicemodel.AbstractServiceModelImpl.constructServiceImplementation(AbstractServiceModelImpl.java:139)
    at org.apache.hivemind.impl.servicemodel.SingletonServiceModel.getActualServiceImplementation(SingletonServiceModel.java:68)
    at $DRMConnectorManager_10cd5f02336._service($DRMConnectorManager_10cd5f02336.java)
    at $DRMConnectorManager_10cd5f02336.initialise($DRMConnectorManager_10cd5f02336.java)
    at $DRMConnectorManager_10cd5f02335.initialise($DRMConnectorManager_10cd5f02335.java)
    at org.icenigrid.gridsam.core.plugin.manager.DelegatingDRMConnectorManager.initialise(DelegatingDRMConnectorManager.java:58)
    at $DRMConnectorManager_10cd5f0232c.initialise($DRMConnectorManager_10cd5f0232c.java)
    at $DRMConnectorManager_10cd5f0232b.initialise($DRMConnectorManager_10cd5f0232b.java)
    at org.icenigrid.gridsam.core.plugin.manager.DefaultJobManager.initialise(DefaultJobManager.java:150)
    at org.icenigrid.gridsam.core.plugin.manager.DefaultJobManager.sanityCheck(DefaultJobManager.java:184)
    at org.icenigrid.gridsam.core.plugin.manager.DefaultJobManager.<init>(DefaultJobManager.java:95)
    at org.icenigrid.gridsam.core.plugin.manager.DefaultJobManager.<init>(DefaultJobManager.java:84)
    at org.icenigrid.gridsam.core.plugin.manager.DefaultJobManager.<init>(DefaultJobManager.java:74)
    at org.icenigrid.gridsam.webservice.servlet.JobManagerConfigurator.contextInitialized(JobManagerConfigurator.java:46)
    at org.apache.catalina.core.StandardContext.listenerStart(StandardContext.java:3805)
    at org.apache.catalina.core.StandardContext.start(StandardContext.java:4321)
    at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:823)
    at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:807)
    at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:595)
    at org.apache.catalina.core.StandardHostDeployer.install(StandardHostDeployer.java:277)
    at org.apache.catalina.core.StandardHost.install(StandardHost.java:832)
    at org.apache.catalina.startup.HostConfig.deployDirectories(HostConfig.java:683)
    at org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:432)
    at org.apache.catalina.startup.HostConfig.start(HostConfig.java:964)
    at org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:349)
    at org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:119)
    at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1091)
    at org.apache.catalina.core.StandardHost.start(StandardHost.java:789)
    at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1083)
    at org.apache.catalina.core.StandardEngine.start(StandardEngine.java:478)
    at org.apache.catalina.core.StandardService.start(StandardService.java:476)
    at org.apache.catalina.core.StandardServer.start(StandardServer.java:2298)
    at org.apache.catalina.startup.Catalina.start(Catalina.java:556)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:585)
    at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:284)
    at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:422)
2006-08-03 22:28:49,368 ERROR [DrmaaDRMConnectorManager] failed to load the DRMAA library - /vol/grail/software/sge-6.0/lib/lx24-amd64/libdrmaa.so: /vol/grail/software/sge-6.0/lib/lx24-amd64/libdrmaa.so: cannot open shared object file: No such file or directory
2006-08-03 22:28:49,368 FATAL [DrmaaDRMConnectorManager] DRMAA DRMConnector fails to initialise. Please consult the log for advice.
2006-08-03 22:28:49,416 INFO [JobManagerConfigurator] GridSAM machinery initialised