Re: [svtoolkit-help] Genome strip - Unrecognized sequence: 1:0-0
From: Ashish K. <as...@we...> - 2011-10-24 23:14:58
Thanks Bob.

On the parallelisation issue, could you please clarify further?

1. When -windowSize is, e.g., 3 Mb, does each window run in parallel, or is the setting only for chunking, so that we could optionally run the windows separately and join the outputs later? Is it the same for SVGenotyper's -parallelJobs option?

2. Over various runs, I've noticed that the optimal number of cores for the program on a multi-core architecture is 2. Is this correct, or can the program use more cores on an 8-core node with different settings?

Best,
Ashish.

From: Bob Handsaker [mailto:han...@br...]
Sent: 21 October 2011 16:49
To: Ashish Kumar
Cc: svt...@li...
Subject: Re: [svtoolkit-help] Genome strip - Unrecognized sequence: 1:0-0

Yes, you can use -L to process just a specific interval (or you can pass a file with a list of intervals; the file extension must be .list).

Note that the SVDiscovery queue script also does parallelization internally, based on the -windowSize parameter. For example, here are typical parameters to process the genome (or just the intervals selected with -L) in 3Mb windows for events between 100bp and 100Kb:

    -windowSize 3000000 -windowPadding 100000 -minimumSize 100 -maximumSize 100000

You can invoke the queue script without '-run' to preview the chunking.

-Bob

On 10/21/11 11:30 AM, Ashish Kumar wrote:

Hi Bob,

On the same issue, if we want to use the -L option, would it be safe to presume that we can chunk up the chromosomes? So, something like "-L 20:1250000-2500000" would be a valid option, assuming that this sequence exists in my reference genome?

Thanks,
Ashish

From: Bob Handsaker [mailto:han...@br...]
Sent: 06 October 2011 14:40
To: svt...@li...
Subject: Re: [svtoolkit-help] Genome strip - Unrecognized sequence: 1:0-0

This is because the example script uses "-L 1" (only process chromosome 1) to make it faster, but you likely don't have a sequence named "1" in your reference genome.
To process the whole genome, simply remove the -L argument.

-Bob

On 10/5/11 2:50 PM, Axel Ericsson wrote:

Hi,

I get the following error message when I run Genome strip. I have forwarded the modified shell script; I hope you can point me in the right direction.

Best regards,
Axel

INFO 14:37:52,013 QScriptManager - Compiling 2 QScripts
INFO 14:37:59,172 QScriptManager - Compilation complete
INFO 14:38:02,654 HelpFormatter - ---------------------------------------------------------
INFO 14:38:02,655 HelpFormatter - Program Name: org.broadinstitute.sting.queue.QCommandLine
INFO 14:38:02,655 HelpFormatter - Program Args: -S /seq/vsag/axel/tools/Genome_strip/svtoolkit/qscript/SVDiscovery.q -S /seq/vsag/axel/tools/Genome_strip/svtoolkit/qscript/SVQScript.q -gatk /seq/vsag/axel/tools/Genome_strip/svtoolkit/lib/gatk/GenomeAnalysisTK.jar -cp /seq/vsag/axel/tools/Genome_strip/svtoolkit/lib/SVToolkit.jar:/seq/vsag/axel/tools/Genome_strip/svtoolkit/lib/gatk/GenomeAnalysisTK.jar:/seq/vsag/axel/tools/Genome_strip/svtoolkit/lib/gatk/Queue.jar -configFile conf/genstrip_HCSMA_parameters.txt -tempDir ./tmpdir -R /seq/references/Canis_lupus_familiaris_assembly2/v0/Canis_lupus_familiaris_assembly2.fasta -genomeMaskFile /seq/vsag/hyunji/CCD/capture/genomestrip/canFam2_1_index/work/Canis_lupus_familiaris_assembly2.mask.fasta -genderMapFile data/HCSMA_gender.map -runDirectory HCSMA -md HCSMA/metadata -jobLogDir HCSMA/logs -L 1 -minimumSize 100 -maximumSize 1000000 -I /seq/vsag/axel/bamfiles/HCSMCRealignment.HCSMA_B90_Homo_1.clean.dedup.recal.bam -O HCSMA.discovery.vcf -run
INFO 14:38:02,656 HelpFormatter - Date/Time: 2011/10/05 14:38:02
INFO 14:38:02,656 HelpFormatter - ---------------------------------------------------------
INFO 14:38:02,657 HelpFormatter - ---------------------------------------------------------
INFO 14:38:02,661 QCommandLine - Scripting SVDiscovery
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR stack trace
java.lang.IllegalArgumentException: Unrecognized sequence: 1:0-0
    at org.broadinstitute.sv.queue.ComputeDiscoveryPartitions.computePartitions(ComputeDiscoveryPartitions.java:96)
    at org.broadinstitute.sv.qscript.SVQScript.computeDiscoveryPartitions(SVQScript.q:132)
    at SVDiscovery.script(SVDiscovery.q:19)
    at org.broadinstitute.sting.queue.QCommandLine$$anonfun$execute$1.apply(QCommandLine.scala:46)
    at org.broadinstitute.sting.queue.QCommandLine$$anonfun$execute$1.apply(QCommandLine.scala:43)
    at scala.collection.Iterator$class.foreach(Iterator.scala:631)
    at scala.collection.JavaConversions$JIteratorWrapper.foreach(JavaConversions.scala:549)
    at scala.collection.IterableLike$class.foreach(IterableLike.scala:79)
    at scala.collection.JavaConversions$JListWrapper.foreach(JavaConversions.scala:596)
    at org.broadinstitute.sting.queue.QCommandLine.execute(QCommandLine.scala:43)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:239)
    at org.broadinstitute.sting.queue.QCommandLine$.main(QCommandLine.scala:117)
    at org.broadinstitute.sting.queue.QCommandLine.main(QCommandLine.scala)
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR A GATK RUNTIME ERROR has occurred (version 1.0.5039M):
##### ERROR
##### ERROR Please visit to wiki to see if this is a known problem
##### ERROR If not, please post the error, with stack trace, to the GATK forum
##### ERROR Visit our wiki for extensive documentation http://www.broadinstitute.org/gsa/wiki
##### ERROR Visit our forum to view answers to commonly asked questions http://getsatisfaction.com/gsa
##### ERROR
##### ERROR MESSAGE: Unrecognized sequence: 1:0-0
##### ERROR ------------------------------------------------------------------------------------------
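
Note on the "Unrecognized sequence" error discussed in this thread: it may help to check a -L value against the sequence names in the reference before launching a run. The following is only a minimal sketch in Python and is not part of SVToolkit or GATK. The helper names (fasta_sequence_names, check_interval) and the example headers ("chr1", "chr20") are hypothetical; the actual names in any given reference have to be read from its FASTA headers. The sketch assumes intervals are written either as a bare name ("1") or as name:start-end ("20:1250000-2500000"), the two forms that appear in the thread.

```python
import re

# Assumed interval forms, taken from the thread: "1" or "20:1250000-2500000".
# Sequence names that themselves contain ':' are not handled by this sketch.
_INTERVAL = re.compile(r"^(?P<name>[^:]+)(?::(?P<start>\d+)-(?P<end>\d+))?$")


def fasta_sequence_names(lines):
    """Collect sequence names from FASTA header lines.

    The name is the text after '>' up to the first whitespace.
    """
    names = []
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.append(fields[0])
    return names


def check_interval(interval, names):
    """Return (ok, message) for a -L style interval (hypothetical helper)."""
    match = _INTERVAL.match(interval)
    if match is None:
        return False, f"cannot parse interval {interval!r}"
    name = match.group("name")
    if name not in names:
        # Offer the other common naming variant ("1" vs "chr1") as a hint.
        alt = name[3:] if name.startswith("chr") else "chr" + name
        hint = f"; did you mean {alt!r}?" if alt in names else ""
        return False, f"sequence {name!r} not found in reference{hint}"
    if match.group("start") is not None:
        if int(match.group("start")) > int(match.group("end")):
            return False, f"start is greater than end in {interval!r}"
    return True, "ok"


# Hypothetical headers standing in for a real reference FASTA.
example_fasta = [">chr1 example header", "ACGT", ">chr20", "ACGT"]
names = fasta_sequence_names(example_fasta)

print(check_interval("1", names))
print(check_interval("chr20:1250000-2500000", names))
```

With a real reference, the lines would come from the FASTA file itself (for example, by iterating over the open file). Reading only the header lines keeps the check independent of any particular index format.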