From: nadya w. <na...@sd...> - 2015-01-23 17:34:30
Hi Charles,

Can you submit and execute jobs successfully outside of Opal using SGE submission? If not, first make sure that your SGE configuration is correct.

If yes, check how the opal-jobs/ directory is created. This directory needs to be NFS-mounted on all the nodes from the frontend (or Opal server) node of the cluster. For example,
    /opt/tomcat/webapps/opal-jobs -> /share/opal/opal-jobs
where /share/opal/opal-jobs is NFS-mounted.

Check the following in your opal.properties:
    tomcat.url - FQDN of your cluster frontend
    drmaa.queue - set to your default SGE queue
    drmaa.pe - set to the parallel environment of your SGE configuration

Are you running on EC2? A StarCluster is not the same kind of "cluster" as we have in Rocks. It is a group of VMs, whereas in Rocks we have a computing cluster. I don’t know how the SGE configuration and inter-node communication are handled there. My guess is that if SGE is working from the command line and SGE jobs are running correctly, then the Opal configuration should just follow your SGE specifics, and the settings above should take care of it.

Nadya

> On Jan 23, 2015, at 1:36 AM, Charles Grant <ce...@uw...> wrote:
>
> Hi,
>
> I'm trying to configure Opal to use the DRMAA job manager on an SGE cluster (2011.11) configured using Starcluster.
>
> We're running Tomcat 7.0.57. Our application uses the Opal Java API. I can submit jobs to the cluster using the command line, and my Opal installation works fine if I have
>
> opal.jobmanager=edu.sdsc.nbcr.opal.manager.ForkJobManager
>
> But if I use
>
> opal.jobmanager=edu.sdsc.nbcr.opal.manager.DRMAAJobManager
>
> The submission to Opal fails with the trace included below. It seems to be saying that the SOAP used to describe the job is malformed, but I'm not sure why it would work with one job manager but not another.
>
> Jan 23, 2015 9:01:59 AM org.apache.catalina.core.StandardWrapperValve invoke
> SEVERE: Servlet.service() for servlet [SubmitMemeJob] in context with path [] threw exception
> AxisFault
>  faultCode: {http://schemas.xmlsoap.org/soap/envelope/}Server.userException
>  faultSubcode:
>  faultString: java.lang.reflect.InvocationTargetException
>  faultActor:
>  faultNode:
>  faultDetail:
>     {http://xml.apache.org/axis/}hostname:master
>
> java.lang.reflect.InvocationTargetException
>     at org.apache.axis.message.SOAPFaultBuilder.createFault(SOAPFaultBuilder.java:221)
>     at org.apache.axis.message.SOAPFaultBuilder.endElement(SOAPFaultBuilder.java:128)
>     at org.apache.axis.encoding.DeserializationContext.endElement(DeserializationContext.java:1087)
>     at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.endElement(AbstractSAXParser.java:609)
>     at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanEndElement(XMLDocumentFragmentScannerImpl.java:1781)
>     at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2957)
>     at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:606)
>     at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:117)
>     at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:510)
>     at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:848)
>     at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:777)
>     at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141)
>     at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1213)
>     at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:649)
>     at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl.parse(SAXParserImpl.java:333)
>     at org.apache.axis.encoding.DeserializationContext.parse(DeserializationContext.java:227)
>     at org.apache.axis.SOAPPart.getAsSOAPEnvelope(SOAPPart.java:696)
>     at org.apache.axis.Message.getSOAPEnvelope(Message.java:424)
>     at org.apache.axis.handlers.soap.MustUnderstandChecker.invoke(MustUnderstandChecker.java:62)
>     at org.apache.axis.client.AxisClient.invoke(AxisClient.java:206)
>     at org.apache.axis.client.Call.invokeEngine(Call.java:2765)
>     at org.apache.axis.client.Call.invoke(Call.java:2748)
>     at org.apache.axis.client.Call.invoke(Call.java:2424)
>     at org.apache.axis.client.Call.invoke(Call.java:2347)
>     at org.apache.axis.client.Call.invoke(Call.java:1804)
>     at edu.sdsc.nbcr.opal.AppServicePortTypeSoapBindingStub.launchJob(AppServicePortTypeSoapBindingStub.java:624)
>     at au.edu.uq.imb.memesuite.servlet.SubmitJob.submitOpalJob(SubmitJob.java:655)
>     at au.edu.uq.imb.memesuite.servlet.SubmitJob.doPost(SubmitJob.java:699)
>     at javax.servlet.http.HttpServlet.service(HttpServlet.java:646)
>     at javax.servlet.http.HttpServlet.service(HttpServlet.java:727)
>     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:303)
>     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
>     at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)
>     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
>     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
>     at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:220)
>     at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:122)
>     at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:503)
>     at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:170)
>     at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
>     at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:950)
>     at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116)
>     at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:421)
>     at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1070)
>     at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:611)
>     at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:316)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
>     at java.lang.Thread.run(Thread.java:744)
>
> _______________________________________________
> Opaltoolkit-users mailing list
> Opa...@li...
> https://lists.sourceforge.net/lists/listinfo/opaltoolkit-users

Nadya Williams            University of California, San Diego
na...@sd...               9500 Gilman Dr. MC 0444
+1 858 534 1820 (ofc)     La Jolla, CA 92093-0444
+1 858 822 1619 (fax)     USA
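
As an illustration of the checks suggested in the reply, here is a minimal sketch in Python that reads an opal.properties-style text, reports which of the settings named in this thread are missing, and shows where the opal-jobs link resolves. The property names come from the messages above; everything else is an assumption made for the example: the host name, the `all.q` queue, and the `orte` parallel environment are placeholder guesses, the helper function names are hypothetical, and the parser handles only plain `key=value` lines rather than the full Java properties syntax. It does not verify that the directory is actually NFS-mounted on the compute nodes.

```python
import os

# Property names mentioned in this thread; the sample values are placeholders.
REQUIRED_KEYS = ("opal.jobmanager", "tomcat.url", "drmaa.queue", "drmaa.pe")

SAMPLE = """\
# Hypothetical values: substitute your own host, queue, and PE names.
opal.jobmanager=edu.sdsc.nbcr.opal.manager.DRMAAJobManager
tomcat.url=http://frontend.example.org:8080
drmaa.queue=all.q
drmaa.pe=orte
"""


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and '#' or '!' comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#!":
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def missing_keys(props):
    """Return the required keys that are absent or have an empty value."""
    return [k for k in REQUIRED_KEYS if not props.get(k)]


def describe_jobs_dir(path):
    """Report where the opal-jobs path points and whether that target exists."""
    real = os.path.realpath(path)
    return {"path": path, "resolves_to": real, "exists": os.path.isdir(real)}


if __name__ == "__main__":
    props = parse_properties(SAMPLE)
    print("missing settings:", missing_keys(props))
    # Path taken from the example in the reply; adjust to your installation.
    print(describe_jobs_dir("/opt/tomcat/webapps/opal-jobs"))
```

To check a real installation, pass the contents of your own opal.properties to `parse_properties` instead of `SAMPLE`, and run the directory check on both the frontend and a compute node to compare the results.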