From: Guillermo M. P. <gui...@si...> - 2013-03-05 11:25:43
Guillermo,

what is your xml file now? Your xml file should define a wrapper script
that will call your bwa.sh, and it should define a Fork job manager.

On Mar 4, 2013, at 7:52 AM, Guillermo Marco Puche wrote:

> Hello,
>
> I've been following your guideline.
>
> I don't get opal errors now, even though job stays on state 1:
>
> Date and time : *3/4/2013 4:45:09 PM*
> JobId : appBWA_SH13623864602351552853141
> Status code: 1
> Message: Launching executable
>
>
> Here's my opal/etc/condor.expr:
>
> universe = grid
> grid_resource = batch sge mastablasta@cacique
> output = test.out
> error = test.error
> log = test.log
> should_transfer_files = YES
> transfer_output = true
> stream_output = true
> when_to_transfer_output = ON_EXIT_OR_EVICT
> queue

It looks like you are mixing calls to condor and sge, which should not
be the case. You need to look at the resulting submit file that is
produced by opal. Opal "knows" only the vanilla and parallel universes.
I don't know how condor will treat a submit file where universe or queue
is defined multiple times. This is why I think the condor job manager
will not work for your specific case and you need to submit your condor
job via the Fork job manager.

>
> And here my bwa.sh (executable called inside bwa_sh.xml):
>
> #!/bin/bash
> #$ -V
> ### job name
> #$ -N bwa_bosco
> ### working directory
> #$ -cwd
> ### merge the outputs
> #####$ -j y
> ### select all.q
> #$ -q all.q
>
> cd /home/mastablasta
> bwa aln /home/mastablasta/ref/hg19.fa /home/mastablasta/input/HapMap_2.fastq -t 8 > /home/mastablasta/output/tmp/HapMap.right.sai

Not quite right. This is a submit file for an sge job, not for a fork
job. Your wrapper script should check for the input file (your condor
submit file) that you upload via the opal dashboard and call
condor_submit with it.

> This should work. I've tested submitting it with condor_submit and
> works well, job is queued on remote SGE cluster.

The command you use here on the command line needs to be reproduced in
your wrapper script.
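A minimal sketch of such a wrapper, not a tested Opal recipe. It assumes
that Opal stages the uploaded submit file in the job's working directory
and passes its name as the first command-line argument. The script name
condor_wrapper.sh, the function name submit_condor_file and the
CONDOR_SUBMIT override are hypothetical names used only for illustration:

```shell
#!/bin/bash
# Hypothetical wrapper for <binaryLocation>, run through the Fork job manager.
# Assumption: the uploaded condor submit file name arrives as the first argument.

# Command used for the submission; can be overridden (e.g. CONDOR_SUBMIT=echo)
# to do a dry run on a machine without condor.
CONDOR_SUBMIT="${CONDOR_SUBMIT:-condor_submit}"

submit_condor_file() {
    local submit_file="$1"

    # No file name given: nothing to submit.
    if [ -z "$submit_file" ]; then
        echo "usage: condor_wrapper.sh <condor submit file>" >&2
        return 2
    fi

    # Check for presence of the submit file, and that it is not empty.
    if [ ! -f "$submit_file" ] || [ ! -s "$submit_file" ]; then
        echo "error: submit file '$submit_file' is missing or empty" >&2
        return 1
    fi

    # Reproduce the command line that already works by hand.
    "$CONDOR_SUBMIT" "$submit_file"
}

# When called with arguments (as Opal is assumed to do), submit the file and
# pass the exit status back. Without arguments the script only defines the
# function, which keeps it easy to source and try out.
if [ "$#" -gt 0 ]; then
    submit_condor_file "$1"
    exit $?
fi
```

With something like this in place, <binaryLocation> in the xml file would
point at the wrapper rather than at bwa.sh, and the submit file itself
would keep the grid universe settings that already work with
condor_submit by hand.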
>
> But with Opal I'm getting glidein jobs in condor_q when no glidein
> universe has been specified in condor.expr and 4 jobs being spawned in
> remote SGE queue, which is very weird. But no signs of BWA process.
>
> local condor_q Opal machine:
>
> $ condor_q
> -- Submitter: brugal : <192.168.6.2:11000?sock=2009_e522_3> : brugal
>  ID      OWNER         SUBMITTED     RUN_TIME   ST PRI SIZE CMD
>   9.0    mastablasta   3/4  09:41    0+00:00:00 I  0   0.0  bwa.sh
>   9.1    mastablasta   3/4  09:41    0+00:00:00 I  0   0.0  bwa.sh

You submitted 2 jobs here via your opal interface (different job ids).

>
>  10.0    mastablasta   3/4  09:41    0+00:16:39 R  0   0.0  glidein_wrapper.sh
>  11.0    mastablasta   3/4  09:41    0+00:16:39 R  0   0.0  glidein_wrapper.sh
>  12.0    mastablasta   3/4  09:41    0+00:16:11 R  0   0.0  glidein_wrapper.sh
>  13.0    mastablasta   3/4  09:41    0+00:16:10 R  0   0.0  glidein_wrapper.sh

I can only guess here, but your condor submit file may have two queue
statements, one from your condor.expr and one that opal writes, hence
condor submits 2 jobs per 2 queue statements.

> 6 jobs; 0 completed, 0 removed, 2 idle, 4 running, 0 held, 0 suspended
>
> Here's remote SGE queue:
>
> $ qstat
> job-ID  prior    name        user         state  submit/start at      queue                    slots  ja-task-ID
> -----------------------------------------------------------------------------------------------------------------
>     61  0.55500  bl_a29aa29  mastablasta  r      03/04/2013 17:31:30  all.q@compute-0-0.local  1
>     62  0.55500  bl_f1cbb6c  mastablasta  r      03/04/2013 17:31:30  all.q@compute-0-0.local  1
>     63  0.55500  bl_1dc49f4  mastablasta  r      03/04/2013 17:32:00  all.q@compute-0-0.local  1
>     64  0.55500  bl_ced1f94  mastablasta  r      03/04/2013 17:32:00  all.q@compute-0-0.local  1

What is the output of "qstat -j 61"?

nadya

>
> On 03/01/2013 09:02 AM, Guillermo Marco Puche wrote:
>> Hello Nadya,
>>
>> Thank you for the information. This starts to make sense.
>> I had no idea on how to pass Opal my Condor config.
>>
>> I'm going to try to make this work with Grid universe. I'll report asap.
>>
>>
>> Once again, thank you very much.
>>
>> Best regards,
>> Guillermo.
>>
>> On 02/28/2013 06:32 PM, nadya williams wrote:
>>> Hi Guillermo,
>>>
>>> there are a few issues here.
>>> On Feb 28, 2013, at 12:47 AM, Guillermo Marco Puche wrote:
>>>
>>>> Hello Luca,
>>>>
>>>> I currently can't run Opal jobs with Condor job scheduler:
>>>>
>>>> Here's my basic app: bwa.xml --> http://pastebin.com/uGqvvBki
>>>> I know it has empty parameters but it's for testing purposes at
>>>> this moment. All parameters and flags are run in a shell script
>>>> invoked by condor job file.
>>> Your xml file is not correct. You are using your condor.submit file
>>> in place of an executable:
>>> <binaryLocation>/opt/web/opal_scripts/bwa/bwa.condor</binaryLocation>
>>> Instead, here you need to use your /opt/web/opal_scripts/bwa/bwa.sh
>>> and any other parameters from which opal will make the condor_submit
>>> file. Opal v 2.5 has a condor.expr.file variable in the
>>> opal.properties file:
>>> # Enable if there are server-specific condor expressions. Put expressions in the file
>>> #condor.expr.file=/opt/opal/etc/condor.expr
>>>
>>> This is a file (condor submit syntax) that needs to be used to tell
>>> opal to add extra parameters to every submit file that is generated.
>>> This allows for server-side specific variables to be added.
>>>
>>>>
>>>> I want opal to execute my bwa.condor file (condor job file) which
>>>> currently works with condor_submit command but not with Opal.
>>> This is not going to happen using your current xml file. The way
>>> condor+opal work is that opal generates the condor submit file
>>> from the parameters given in the xml file. You have a universe that
>>> we did not test with condor before.
>>> Currently, opal+condor is working with the vanilla or parallel
>>> universe only.
>>>
>>> You can use a workaround:
>>>
>>> I suggest you try to make a wrapper script and use it in
>>> <binaryLocation> in the xml file.
>>> In your xml file add properties like
>>> (1) condor submit file (untagged parameter for upload of the submit
>>> file) as
>>> <param>
>>>   <id>submitFile</id>
>>>   <paramType>FILE</paramType>
>>>   <ioType>INPUT</ioType>
>>>   <required>true</required>
>>>   <textDesc>upload a condor submit file </textDesc>
>>> </param>
>>>
>>> (2) use Fork Job Manager as
>>> <jobManagerFQCN>edu.sdsc.nbcr.opal.manager.ForkJobManager</jobManagerFQCN>
>>> (3) use <parallel>false</parallel>
>>>
>>> Please see apbs_parallel_1.3.xml in the opal distro for an example.
>>> Your wrapper script needs to understand that it is supposed to look
>>> for a submit file (check for presence)
>>> and then do the condor submission via "condor_submit yourfile"
>>>
>>> This way you will have the flexibility to create "any" submit file
>>> and use any universe and other specifics of the
>>> condor submission that we currently don't handle for simple cases.
>>>
>>> Regards,
>>> Nadya
>>>
>>>
>>>>
>>>> Here's the content of bwa.condor: http://pastebin.com/P9GNwriJ
>>>> And here the bwa.sh invoked by bwa.condor to be executed on remote
>>>> cluster: http://pastebin.com/dWzQa92E
>>>>
>>>> Best regards,
>>>> Guillermo.
>>>>
>>>>
>>>> On 02/27/2013 11:17 PM, Luca Clementi wrote:
>>>>> On Wed, Feb 27, 2013 at 4:46 AM, Guillermo Marco Puche
>>>>> <gui...@si...> wrote:
>>>>>> Hello,
>>>>>>
>>>>>> I would like to know what are the benefits/extras of using Condor
>>>>>> job.scheduler with Opal.
>>>>> The jobs you submit to Opal will be executed using Condor.
>>>>> Opal simply gives you a web service interface and then it has
>>>>> different back-ends to actually execute your jobs (condor, sge, pbs,
>>>>> etc.).
>>>>>
>>>>>> What's the difference between job scheduler and submitting jobs to Condor
>>>>>> straight with condor_submit?
>>>>> You mean the difference between using Opal vs using condor_submit?
>>>>> If you use Opal you can invoke the launchJob operation using
>>>>> web-service standards (we provide python and java client side
>>>>> libraries).
>>>>> If you use condor you have to ssh to a machine (aka have an account)
>>>>> and then you need to create a submission script and execute
>>>>> condor_submit.
>>>>>
>>>>>
>>>>> At NBCR we use Opal to submit jobs from a web portal (where you
>>>>> have 1 user, the web portal, which is in charge of running the
>>>>> different simulations, and opal takes care of creating working
>>>>> directories, staging inputs and outputs, etc.).
>>>>>
>>>>>
>>>>> Luca
>>>>
>>>>
>>>> --
>>>> *g.marco*:
>>>> Informatician at Sistemas Genómicos S.L
>>>> phone: 0034635197460
>>>> web: www.sistemasgenomicos.com
>>>> ------------------------------------------------------------------------------
>>>> Everyone hates slow websites. So do we.
>>>> Make your web apps faster with AppDynamics
>>>> Download AppDynamics Lite for free today:
>>>> http://p.sf.net/sfu/appdyn_d2d_feb
>>>> _______________________________________________
>>>> Opaltoolkit-users mailing list
>>>> Opa...@li...
>>>> https://lists.sourceforge.net/lists/listinfo/opaltoolkit-users
>>>
>>> Nadya Williams            University of California, San Diego
>>> na...@sd...               9500 Gilman Dr. MC 0446
>>> +1 858 534 1820 (ofc)     La Jolla, CA 92093-0446
>>> +1 858 822 1619 (fax)     USA

Nadya Williams            University of California, San Diego
na...@sd...               9500 Gilman Dr. MC 0446
+1 858 534 1820 (ofc)     La Jolla, CA 92093-0446
+1 858 822 1619 (fax)     USA