svtoolkit-help Mailing List for Structural Variation Toolkit (Page 10)
Status: Beta
Brought to you by:
bhandsaker
From: Philine F. <p.f...@un...> - 2011-07-15 15:47:42
|
Hi Bob,

Sorry about my slow reply; we had some issues with our computing cluster. I have now tried the latest interim release you suggested, but unfortunately the error message stays the same (as you can see below).

I also double-checked that my input is a sorted BAM. It was successfully run through realignment and recalibration using GATK, and SNPs and haplotypes can be called on this file with GATK. I also split the combined BAM (paired-end and mate-pair libraries) into separate BAMs, which still gives the same error.

Thanks for your help again,
Philine

##### ERROR ------------------------------------------------------------------------------------------
##### ERROR stack trace
java.lang.IllegalArgumentException: Left read of read pair fails left read test: HWI-ST143_0294:7:1:19426:37893#0 97 groupXXI 4523762 37 46M groupXXI 4529101 5385 CACTAAGTGCTTCCTCGATTTCGCCAAGATTTGTTCAGCATGGAAC 7767687877776676387776376787877768767777776886 X0:i:1 X1:i:0 MD:Z:46 RG:Z:BS25pair XG:i:0 AM:i:37 NM:i:0 SM:i:37 XM:i:0 XO:i:0 OQ:Z:IIIIIHIIIIEIIIIIHIIIIIIIIIGIHIIIIHIIGIGIIIBHII XT:A:U
    at org.broadinstitute.sv.util.ReadPair.create(ReadPair.java:135)
    at org.broadinstitute.sv.discovery.ReadPairRecordFilter.createReadPair(ReadPairRecordFilter.java:300)
    at org.broadinstitute.sv.discovery.ReadPairRecordFilter.generateReadPairs(ReadPairRecordFilter.java:221)
    at org.broadinstitute.sv.discovery.ReadPairRecordFilter.filterReadPairs(ReadPairRecordFilter.java:97)
    at org.broadinstitute.sv.discovery.DeletionDiscoveryAlgorithm.finishReadPairSelection(DeletionDiscoveryAlgorithm.java:216)
    at org.broadinstitute.sv.discovery.DeletionDiscoveryAlgorithm.runDiscovery(DeletionDiscoveryAlgorithm.java:166)
    at org.broadinstitute.sv.discovery.SVDiscoveryWalker.onTraversalDone(SVDiscoveryWalker.java:165)
    at org.broadinstitute.sv.discovery.SVDiscoveryWalker.onTraversalDone(SVDiscoveryWalker.java:44)
    at org.broadinstitute.sting.gatk.executive.Accumulator$StandardAccumulator.finishTraversal(Accumulator.java:129)
    at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:85)
    at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:236)
    at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:116)
    at org.broadinstitute.sv.main.SVCommandLine.execute(SVCommandLine.java:110)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:221)
    at org.broadinstitute.sv.main.SVCommandLine.main(SVCommandLine.java:72)
    at org.broadinstitute.sv.main.SVDiscovery.main(SVDiscovery.java:21)
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR A GATK RUNTIME ERROR has occurred (version 1.0.5718M):
##### ERROR
##### ERROR Please visit the wiki to see if this is a known problem
##### ERROR If not, please post the error, with stack trace, to the GATK forum
##### ERROR Visit our wiki for extensive documentation http://www.broadinstitute.org/gsa/wiki
##### ERROR Visit our forum to view answers to commonly asked questions http://getsatisfaction.com/gsa
##### ERROR
##### ERROR MESSAGE: Left read of read pair fails left read test: HWI-ST143_0294:7:1:19426:37893#0 97 groupXXI 4523762 37 46M groupXXI 4529101 5385 CACTAAGTGCTTCCTCGATTTCGCCAAGATTTGTTCAGCATGGAAC 7767687877776676387776376787877768767777776886 X0:i:1 X1:i:0 MD:Z:46 RG:Z:BS25pair XG:i:0 AM:i:37 NM:i:0 SM:i:37 XM:i:0 XO:i:0 OQ:Z:IIIIIHIIIIEIIIIIHIIIIIIIIIGIHIIIIHIIGIGIIIBHII XT:A:U
##### ERROR ------------------------------------------------------------------------------------------

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Dr Philine Feulner
Westfälische Wilhelms University
Institute for Evolution and Biodiversity
Evolutionary Bioinformatics Group
Hüfferstrasse 1
48149 Münster
Germany
Tel: +49 (0) 251 83 21636
Fax: +49 (0) 251 83 24668
Email: p.f...@un...
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
From: Bob H. <han...@br...> - 2011-06-30 02:15:42
|
This is a question about GATK, not about SVToolkit.
You should post this at GetSatisfaction at http://getsatisfaction.com/gsa

-Bob

On 6/29/11 1:41 AM, sunchangyue wrote:
> Hi,
> Why do I still get this error even though all reads have RG tag in sam.
>
> ##### ERROR MESSAGE: SAM/BAM file SAMFileReader{/share/data/staff/sunchy/BFC2011010/HL040/GATK/test_HL040_1_pair_NM2_header.sorted.rmdup.bam} is malformed: The input .bam file contains reads with no read group. First observed at read with name = HWI-ST298:171:81MJKABXX:6:1101:11277:2454 Users must set both the default read group using the --default_read_group <String> argument and the default platform using the --default_platform <String> argument.
>
> and here is the sam:
> @SQ SN:1 LN:249250621
> @SQ SN:2 LN:243199373
> @SQ SN:3 LN:198022430
> @SQ SN:4 LN:191154276
> @SQ SN:5 LN:180915260
> @SQ SN:6 LN:171115067
> @SQ SN:7 LN:159138663
> @SQ SN:8 LN:146364022
> @SQ SN:9 LN:141213431
> @SQ SN:10 LN:135534747
> @SQ SN:11 LN:135006516
> @SQ SN:12 LN:133851895
> @SQ SN:13 LN:115169878
> @SQ SN:14 LN:107349540
> @SQ SN:15 LN:102531392
> @SQ SN:16 LN:90354753
> @SQ SN:17 LN:81195210
> @SQ SN:18 LN:78077248
> @SQ SN:19 LN:59128983
> @SQ SN:20 LN:63025520
> @SQ SN:21 LN:48129895
> @SQ SN:22 LN:51304566
> @SQ SN:X LN:155270560
> @SQ SN:Y LN:59373566
> HWI-ST298:171:81MJKABXX:6:1101:1664:2484 83 10 42832500 37 63M = 42832371 -192 AATTATATTTAGTAAAGCTTAACAACCAATAAAAGGCTTTACCACATTCTTCGAATTTGTAAG DC<48**?0*@?**?B9*??3@GFCAE9IIIIGEEHGEHCF9B<<GHGHGHHHBHFFFFFCC@ RG:Z:FLOWCELL1-LINE1 XT:A:U NM:i:1 SM:i:37 AM:i:0 X0:i:1 X1:i:0 XM:i:1 XO:i:0 XG:i:0 MD:Z:6G56 RG:Z:READ_GROUP_1
> HWI-ST298:171:81MJKABXX:6:1101:1664:2484 163 10 42832371 36 50M = 42832500 192 AGCTTTGCCACATTCTTCACATTTGCAGGGTTTCTCTCCCGTACGAATTC @@BDFB>?B?BHHIFEGCEHF9EE><AFEGCEH9<???F*00))00?FHG RG:Z:FLOWCELL1-LINE1 XT:A:R NM:i:1 SM:i:0 AM:i:0 X0:i:5 X1:i:3 XM:i:1 XO:i:0 XG:i:0 MD:Z:43T6 RG:Z:READ_GROUP_1
> HWI-ST298:171:81MJKABXX:6:1101:2442:2401 99 13 101256483 60 49M = 101256650 230 GTAACAAAAATAAAGATGTGAGGCTGCCTGCTCTTGCCTAAAGCATGGC @@@FFFFDDDHHHBHBGGC4AFGICB;AFG>3?<D@F;CD@GH**09?@ RG:Z:FLOWCELL1-LINE1 XT:A:U NM:i:0 SM:i:37 AM:i:37 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:49 RG:Z:READ_GROUP_1
> HWI-ST298:171:81MJKABXX:6:1101:2442:2401 147 13 101256650 60 63M = 101256483 -230 GAGAAAAGCATATAGATATTCTATGTTAAAACTTCCATTCCTCATTCGATTATTTGCCCTATT HC83<GFB899FB499?00*EFGF>GB???1HEEGE9E@EIHHCGGFCCAFADD;;FFFD@@< RG:Z:FLOWCELL1-LINE1 XT:A:U NM:i:2 SM:i:37 AM:i:37 X0:i:1 X1:i:0 XM:i:2 XO:i:0 XG:i:0 MD:Z:1G17A43 RG:Z:READ_GROUP_1
> HWI-ST298:171:81MJKABXX:6:1101:2496:2427 83 20 49195046 60 65M = 49194943 -168 TCTTTTCAAAGTCCGAGAGTCAGGGTCACTCAGCCCGGAGCACGGGCCCGTTGTGGTGCACTGCA ?:5(55;3ABA:FFCBECHC?>>ACDF@IGHBF6IJJJJIGGGGHBHJIGHGFGHHHDDDFFC@@ RG:Z:FLOWCELL1-LINE1 XT:A:U NM:i:0 SM:i:37 AM:i:37 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:65
> HWI-ST298:171:81MJKABXX:6:1101:2496:2427 163 20 49194943 60 60M = 49195046 168 TTTTTCTCTTTCAGACCCAAGAAACTCGAGAGATCTTACATTTCCACTATACCACATGGC @@@FFE?DDHFFFGDEEBEE;BCE<C9CAC@F19CF**:*?****0*00?9B9B328B>D RG:Z:FLOWCELL1-LINE1 XT:A:U NM:i:0 SM:i:37 AM:i:37 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:60 RG:Z:READ_GROUP_1
>
> My command line is:
> /usr/java/jdk1.6.0_16/bin/java -jar /share/apps/GenomeAnalysisTK-1.0.5777/GenomeAnalysisTK.jar -l INFO -R /share/data/staff/sunchy/data/GATK/human_g1k_v37.fasta --DBSNP /share/data/staff/sunchy/data/GATK/dbsnp_129_b37.rod -I sorted.rmdup.bam -T CountCovariates -cov ReadGroupCovariate -cov QualityScoreCovariate -cov CycleCovariate -cov DinucCovariate -recalFile var.csv
>
> Thank you
> Cheo
>
> _______________________________________________
> svtoolkit-help mailing list
> svt...@li...
> https://lists.sourceforge.net/lists/listinfo/svtoolkit-help
|
From: sunchangyue <cha...@ho...> - 2011-06-29 05:41:10
|
Hi,

Why do I still get this error even though all reads have RG tag in sam.

##### ERROR MESSAGE: SAM/BAM file SAMFileReader{/share/data/staff/sunchy/BFC2011010/HL040/GATK/test_HL040_1_pair_NM2_header.sorted.rmdup.bam} is malformed: The input .bam file contains reads with no read group. First observed at read with name = HWI-ST298:171:81MJKABXX:6:1101:11277:2454 Users must set both the default read group using the --default_read_group <String> argument and the default platform using the --default_platform <String> argument.

and here is the sam:

@SQ SN:1 LN:249250621
@SQ SN:2 LN:243199373
@SQ SN:3 LN:198022430
@SQ SN:4 LN:191154276
@SQ SN:5 LN:180915260
@SQ SN:6 LN:171115067
@SQ SN:7 LN:159138663
@SQ SN:8 LN:146364022
@SQ SN:9 LN:141213431
@SQ SN:10 LN:135534747
@SQ SN:11 LN:135006516
@SQ SN:12 LN:133851895
@SQ SN:13 LN:115169878
@SQ SN:14 LN:107349540
@SQ SN:15 LN:102531392
@SQ SN:16 LN:90354753
@SQ SN:17 LN:81195210
@SQ SN:18 LN:78077248
@SQ SN:19 LN:59128983
@SQ SN:20 LN:63025520
@SQ SN:21 LN:48129895
@SQ SN:22 LN:51304566
@SQ SN:X LN:155270560
@SQ SN:Y LN:59373566
HWI-ST298:171:81MJKABXX:6:1101:1664:2484 83 10 42832500 37 63M = 42832371 -192 AATTATATTTAGTAAAGCTTAACAACCAATAAAAGGCTTTACCACATTCTTCGAATTTGTAAG DC<48**?0*@?**?B9*??3@GFCAE9IIIIGEEHGEHCF9B<<GHGHGHHHBHFFFFFCC@ RG:Z:FLOWCELL1-LINE1 XT:A:U NM:i:1 SM:i:37 AM:i:0 X0:i:1 X1:i:0 XM:i:1 XO:i:0 XG:i:0 MD:Z:6G56 RG:Z:READ_GROUP_1
HWI-ST298:171:81MJKABXX:6:1101:1664:2484 163 10 42832371 36 50M = 42832500 192 AGCTTTGCCACATTCTTCACATTTGCAGGGTTTCTCTCCCGTACGAATTC @@BDFB>?B?BHHIFEGCEHF9EE><AFEGCEH9<???F*00))00?FHG RG:Z:FLOWCELL1-LINE1 XT:A:R NM:i:1 SM:i:0 AM:i:0 X0:i:5 X1:i:3 XM:i:1 XO:i:0 XG:i:0 MD:Z:43T6 RG:Z:READ_GROUP_1
HWI-ST298:171:81MJKABXX:6:1101:2442:2401 99 13 101256483 60 49M = 101256650 230 GTAACAAAAATAAAGATGTGAGGCTGCCTGCTCTTGCCTAAAGCATGGC @@@FFFFDDDHHHBHBGGC4AFGICB;AFG>3?<D@F;CD@GH**09?@ RG:Z:FLOWCELL1-LINE1 XT:A:U NM:i:0 SM:i:37 AM:i:37 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:49 RG:Z:READ_GROUP_1
HWI-ST298:171:81MJKABXX:6:1101:2442:2401 147 13 101256650 60 63M = 101256483 -230 GAGAAAAGCATATAGATATTCTATGTTAAAACTTCCATTCCTCATTCGATTATTTGCCCTATT HC83<GFB899FB499?00*EFGF>GB???1HEEGE9E@EIHHCGGFCCAFADD;;FFFD@@< RG:Z:FLOWCELL1-LINE1 XT:A:U NM:i:2 SM:i:37 AM:i:37 X0:i:1 X1:i:0 XM:i:2 XO:i:0 XG:i:0 MD:Z:1G17A43 RG:Z:READ_GROUP_1
HWI-ST298:171:81MJKABXX:6:1101:2496:2427 83 20 49195046 60 65M = 49194943 -168 TCTTTTCAAAGTCCGAGAGTCAGGGTCACTCAGCCCGGAGCACGGGCCCGTTGTGGTGCACTGCA ?:5(55;3ABA:FFCBECHC?>>ACDF@IGHBF6IJJJJIGGGGHBHJIGHGFGHHHDDDFFC@@ RG:Z:FLOWCELL1-LINE1 XT:A:U NM:i:0 SM:i:37 AM:i:37 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:65
HWI-ST298:171:81MJKABXX:6:1101:2496:2427 163 20 49194943 60 60M = 49195046 168 TTTTTCTCTTTCAGACCCAAGAAACTCGAGAGATCTTACATTTCCACTATACCACATGGC @@@FFE?DDHFFFGDEEBEE;BCE<C9CAC@F19CF**:*?****0*00?9B9B328B>D RG:Z:FLOWCELL1-LINE1 XT:A:U NM:i:0 SM:i:37 AM:i:37 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:60 RG:Z:READ_GROUP_1

My command line is:

/usr/java/jdk1.6.0_16/bin/java -jar /share/apps/GenomeAnalysisTK-1.0.5777/GenomeAnalysisTK.jar -l INFO -R /share/data/staff/sunchy/data/GATK/human_g1k_v37.fasta --DBSNP /share/data/staff/sunchy/data/GATK/dbsnp_129_b37.rod -I sorted.rmdup.bam -T CountCovariates -cov ReadGroupCovariate -cov QualityScoreCovariate -cov CycleCovariate -cov DinucCovariate -recalFile var.csv

Thank you
Cheo
|
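A note on the SAM excerpt in this message: the header shown contains only @SQ lines, and most records carry two RG tags (RG:Z:FLOWCELL1-LINE1 and RG:Z:READ_GROUP_1). The thread never establishes the cause of the GATK error, so the following is only a minimal sketch of one check that may be worth running, under the assumption (from the SAM specification) that a record should carry a single RG tag whose value matches the ID of an @RG header line. The function name `check_read_groups` is hypothetical and not part of GATK or SVToolkit; it expects tab-separated SAM text, for example the output of `samtools view -h`.

```python
def check_read_groups(sam_lines):
    """Return (header_ids, problems) for an iterable of tab-separated SAM lines.

    header_ids: set of ID values found on @RG header lines.
    problems:   list of (read_name, description) for suspicious records.
    """
    header_ids = set()
    problems = []
    for line in sam_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        fields = line.split("\t")
        if line.startswith("@"):
            # Header section: collect the ID of every @RG line.
            if fields[0] == "@RG":
                for field in fields[1:]:
                    if field.startswith("ID:"):
                        header_ids.add(field[3:])
            continue
        # Alignment record: optional tags start at the 12th column.
        name = fields[0]
        rg_values = [f[5:] for f in fields[11:] if f.startswith("RG:Z:")]
        if not rg_values:
            problems.append((name, "no RG tag"))
        elif len(rg_values) > 1:
            problems.append((name, "multiple RG tags: " + ", ".join(rg_values)))
        elif rg_values[0] not in header_ids:
            problems.append(
                (name, "RG '%s' not declared in an @RG header line" % rg_values[0])
            )
    return header_ids, problems
```

Run against the records quoted above, a check like this would flag every read, either for carrying two RG tags or for naming a read group that no @RG header line declares. Whether that is what GATK objected to here is an assumption, not something confirmed in the thread.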
From: Bob H. <han...@br...> - 2011-06-22 16:33:50
|
Hi, Eric,

I apologize for not responding sooner - the mailing list was messed up and I wasn't receiving these emails.

You need to have the bwa shared library (libbwa.so) on your library path. There's a suitable library in the bwa directory. For example, the scripts in the installtest directory have a line like this:

export LD_LIBRARY_PATH=${SV_DIR}/bwa:${LD_LIBRARY_PATH}

Running ComputeGenomeMask will only work on Linux.

-Bob

On 3/28/11 6:34 PM, svt...@li... wrote:
> Subject: Java Library Path
> From: "Fritz, Eric R [AN S]" <er...@ia...>
> Date: Mon, 28 Mar 2011 17:32:48 -0500
> To: "svt...@li..." <svt...@li...>
>
> Hello,
>
> I am getting the following error when trying to run the ComputeGenomeMask portion of the SVToolkit.
>
> INFO 17:32:10,456 HelpFormatter - ----------------------------------------------------------
> INFO 17:32:10,459 HelpFormatter - Program Name: org.broadinstitute.sv.apps.ComputeGenomeMask
> INFO 17:32:10,460 HelpFormatter - Program Args: -R Chr12.fa -O Chr12.mask.fa -readLength 83
> INFO 17:32:10,460 HelpFormatter - Date/Time: 2011/03/28 17:32:10
> INFO 17:32:10,460 HelpFormatter - ----------------------------------------------------------
> INFO 17:32:10,461 HelpFormatter - ----------------------------------------------------------
> INFO 17:32:10,525 GenomeMaskAlgorithm - Initializing bwa ...
> Exception in thread "main" java.lang.UnsatisfiedLinkError: no bwa in java.library.path
>     at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1754)
>     at java.lang.Runtime.loadLibrary0(Runtime.java:823)
>     at java.lang.System.loadLibrary(System.java:1045)
>     at org.broadinstitute.sting.alignment.bwa.c.BWACAligner.<clinit>(BWACAligner.java:21)
>     at org.broadinstitute.sv.mask.GenomeMaskAlgorithm.getAligner(GenomeMaskAlgorithm.java:206)
>     at org.broadinstitute.sv.mask.GenomeMaskAlgorithm.getAlignmentCode(GenomeMaskAlgorithm.java:119)
>     at org.broadinstitute.sv.mask.GenomeMaskAlgorithm.generateMaskForSequence(GenomeMaskAlgorithm.java:93)
>     at org.broadinstitute.sv.mask.GenomeMaskAlgorithm.generateMask(GenomeMaskAlgorithm.java:75)
>     at org.broadinstitute.sv.apps.ComputeGenomeMask.run(ComputeGenomeMask.java:63)
>     at org.broadinstitute.sv.commandline.CommandLineProgram.execute(CommandLineProgram.java:33)
>     at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:239)
>     at org.broadinstitute.sv.commandline.CommandLineProgram.run(CommandLineProgram.java:23)
>     at org.broadinstitute.sv.apps.ComputeGenomeMask.main(ComputeGenomeMask.java:49)
>
> If anyone could tell me how to solve this issue that would be great.
>
> -Eric
>
> Eric Fritz
> Assistant Scientist II
> Iowa State University
> 2255G Kildee Hall
> Ames, IA 50014
|
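Before launching ComputeGenomeMask, it can help to confirm that libbwa.so really is reachable through LD_LIBRARY_PATH, since on Linux the JVM's default java.library.path is derived from that variable. The snippet below is a small sketch of such a check; the helper name `find_shared_library` is hypothetical and not part of SVToolkit, and it only looks for the file, it does not verify that the library can actually be loaded.

```python
import os


def find_shared_library(filename, search_path):
    """Return each directory in an os.pathsep-separated path that contains filename."""
    hits = []
    for directory in search_path.split(os.pathsep):
        # Skip empty entries produced by leading, trailing or doubled separators.
        if directory and os.path.isfile(os.path.join(directory, filename)):
            hits.append(directory)
    return hits


if __name__ == "__main__":
    path = os.environ.get("LD_LIBRARY_PATH", "")
    hits = find_shared_library("libbwa.so", path)
    if hits:
        print("libbwa.so found in:", ", ".join(hits))
    else:
        print("libbwa.so not found on LD_LIBRARY_PATH:", path or "(empty)")
```

If the script reports that the library is missing, exporting LD_LIBRARY_PATH as described in the reply above and re-running the check in the same shell should show the bwa directory.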
From: Bob H. <han...@br...> - 2011-06-22 16:22:13
|
Hi, Michael,

Sorry for not responding sooner - the mailing list was messed up and I wasn't receiving these emails.

In the interim since you wrote, I updated the wiki page here:
http://www.broadinstitute.org/gsa/wiki/index.php/ComputeGenomeMask

If that's not enough information, let me know.

-Bob

> Subject: SVtoolkit question
> From: "Michael Parfenov" <par...@ge...>
> Date: Thu, 3 Mar 2011 13:57:23 -0500 (EST)
> To: svt...@li...
>
> Hi,
>
> Thanks for the SV toolkit!
> Could you tell me, please, how to run the ComputeGenomeMask utility? I see
> required arguments in the wiki page, but how to invoke the utility?
>
> best regards,
> Michael.
|
From: Bob H. <han...@br...> - 2011-06-22 16:14:18
|
Hi, Gayle,

I apologize for not responding sooner - there was something messed up in the mailing list so I wasn't getting these emails.

You are correct that SVAltAlign requires a sequence for the non-reference allele. Genome STRiP currently doesn't have any tool to generate these, so you need to run tigra_sv or some other assembly tool to generate the non-reference alleles.

This is, in part, because even if the confidence intervals are zero, we don't know if there is additional non-template sequence present in the deletion allele.

-Bob

On 4/7/11 1:12 PM, svt...@li... wrote:
> Subject: Input to SVAltAlign
> From: Gayle Leen <gay...@de...>
> Date: Thu, 7 Apr 2011 16:52:33 +0000
> To: svt...@li...
>
> Hi,
>
> I'm trying to run Genome STRIP on a dataset but the output from the
> SVDiscovery module (the vcf file) can't be used as input to
> SVAltAlign. I think this is because the potential variants are tagged
> as 'imprecise', even though the confidence intervals around the ends
> are 0 for some events - consequently the alternative allele sequence
> isn't generated. Is there some way of getting around this? Or should I
> use some other algorithm such as tigra_sv to process the vcf file?
>
> If you could help me with this that would be great,
>
> Gayle
|
From: Bob H. <han...@br...> - 2011-06-22 16:10:31
|
I don't see any obvious problem with this SAM record.

Could you download the latest interim release from here:
ftp://ftp.broadinstitute.org/pub/svtoolkit/releases/interim/svtoolkit_1.04.683.tar.gz
and run with that? This will produce a little bit of additional debugging information, but mostly I'd like to see if the problem happens with more recent code.

-Bob

On 6/22/11 8:14 AM, Philine Feulner wrote:
> Hi,
>
> I am trying to run SVDiscovery on a non human data set (stickleback illumina paired end data).
> The installation test was successful and the bam files I am using are also already run through the GATK framework.
> The preprocessing of the reads also works but the SVDiscovery always fails with following error message:
> Left read of read pair fails left read test
> I attach the stack trace below.
>
> Sorry, I don't have any idea what this is suggesting and how to solve this issue.
>
> I would greatly appreciate any suggestions or help.
> Thanks in advance.
>
> Kind regards,
> Philine
>
> ##### ERROR stack trace
> java.lang.IllegalArgumentException: Left read of read pair fails left read test: HWI-ST143_0294:7:1:11533:163054#0 145 groupXXI 3556356 0 46M groupXXI 3556505 104 TAATAGACGTACCGGGAGTTTAAGGGAGAGGTGCCACGGCTGTTAA 6776767466773666768867766676766777783677768677 X0:i:4 X1:i:2 XA:Z:scaffold_27,-2047944,46M,0;scaffold_27,-2047901,46M,0;scaffold_27,-2047858,46M,0;scaffold_108,+243775,46M,1;groupXX,+949641,46M,1; MD:Z:46 RG:Z:BS27pair.2 XG:i:0 AM:i:0 NM:i:0 SM:i:0 XM:i:0 XO:i:0 MQ:i:0 OQ:Z:?DGDDGEGFGCIHHIIIIIHIBHIIIIIIIIIGIIHIIHIIIIHII XT:A:R
>     at org.broadinstitute.sv.util.ReadPair.create(ReadPair.java:135)
>     at org.broadinstitute.sv.discovery.ReadPairRecordFilter.createReadPair(ReadPairRecordFilter.java:228)
>     at org.broadinstitute.sv.discovery.ReadPairRecordFilter.generateReadPairs(ReadPairRecordFilter.java:149)
>     at org.broadinstitute.sv.discovery.ReadPairRecordFilter.filterReadPairs(ReadPairRecordFilter.java:80)
>     at org.broadinstitute.sv.discovery.DeletionDiscoveryAlgorithm.finishReadPairSelection(DeletionDiscoveryAlgorithm.java:206)
>     at org.broadinstitute.sv.discovery.DeletionDiscoveryAlgorithm.runDiscovery(DeletionDiscoveryAlgorithm.java:156)
>     at org.broadinstitute.sv.discovery.SVDiscoveryWalker.onTraversalDone(SVDiscoveryWalker.java:150)
>     at org.broadinstitute.sv.discovery.SVDiscoveryWalker.onTraversalDone(SVDiscoveryWalker.java:43)
>     at org.broadinstitute.sting.gatk.executive.Accumulator$StandardAccumulator.finishTraversal(Accumulator.java:129)
>     at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:75)
>     at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:217)
>     at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:111)
>     at org.broadinstitute.sv.main.SVCommandLine.execute(SVCommandLine.java:110)
>     at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:239)
>     at org.broadinstitute.sv.main.SVCommandLine.main(SVCommandLine.java:72)
>     at org.broadinstitute.sv.main.SVDiscovery.main(SVDiscovery.java:21)
> ##### ERROR ------------------------------------------------------------------------------------------
> ##### ERROR A GATK RUNTIME ERROR has occurred (version 1.0.5039M):
> ##### ERROR
> ##### ERROR Please visit to wiki to see if this is a known problem
> ##### ERROR If not, please post the error, with stack trace, to the GATK forum
> ##### ERROR Visit our wiki for extensive documentation http://www.broadinstitute.org/gsa/wiki
> ##### ERROR Visit our forum to view answers to commonly asked questions http://getsatisfaction.com/gsa
> ##### ERROR
> ##### ERROR MESSAGE: Left read of read pair fails left read test: HWI-ST143_0294:7:1:11533:163054#0 145 groupXXI 3556356 0 46M groupXXI 3556505 104 TAATAGACGTACCGGGAGTTTAAGGGAGAGGTGCCACGGCTGTTAA 6776767466773666768867766676766777783677768677 X0:i:4 X1:i:2 XA:Z:scaffold_27,-2047944,46M,0;scaffold_27,-2047901,46M,0;scaffold_27,-2047858,46M,0;scaffold_108,+243775,46M,1;groupXX,+949641,46M,1; MD:Z:46 RG:Z:BS27pair.2 XG:i:0 AM:i:0 NM:i:0 SM:i:0 XM:i:0 XO:i:0 MQ:i:0 OQ:Z:?DGDDGEGFGCIHHIIIIIHIBHIIIIIIIIIGIIHIIHIIIIHII XT:A:R
>
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> Dr Philine Feulner
> Westfälische Wilhelms University
> Institute for Evolution and Biodiversity
> Evolutionary Bioinformatics Group
> Hüfferstrasse 1
> 48149 Münster
> Germany
> Tel: +49 (0) 251 83 21636
> Fax: +49 (0) 251 83 24668
> Email: p.f...@un...
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> _______________________________________________
> svtoolkit-help mailing list
> svt...@li...
> https://lists.sourceforge.net/lists/listinfo/svtoolkit-help
|
From: Bob H. <han...@br...> - 2011-06-22 15:56:23
|
Hi, Agata,

This sounds like an issue with your particular operating system / environment. I don't know anything about JRockit. Perhaps you could try running Sun java instead or ask your local system admin.

-Bob

On 6/17/11 8:07 AM, Agata Wesolowska wrote:
> Hi,
>
> I am trying to run the installtest discovery.sh, but unfortunately I
> am getting an error/warning:
>
> [WARN ] You are running JRockit with limited virtual memory. This
> might lead to unexpected failures later on
> Could not create the Java virtual machine.
>
> If anyone could tell me how to solve this issue that would be great.
>
> Agata
>
> _______________________________________________
> svtoolkit-help mailing list
> svt...@li...
> https://lists.sourceforge.net/lists/listinfo/svtoolkit-help
|
From: Philine F. <p.f...@un...> - 2011-06-22 12:14:28
|
Hi,

I am trying to run SVDiscovery on a non-human data set (stickleback Illumina paired-end data). The installation test was successful and the BAM files I am using have already been run through the GATK framework. The preprocessing of the reads also works, but SVDiscovery always fails with the following error message:

Left read of read pair fails left read test

I attach the stack trace below.

Sorry, I don't have any idea what this is suggesting and how to solve this issue.

I would greatly appreciate any suggestions or help.
Thanks in advance.

Kind regards,
Philine

##### ERROR stack trace
java.lang.IllegalArgumentException: Left read of read pair fails left read test: HWI-ST143_0294:7:1:11533:163054#0 145 groupXXI 3556356 0 46M groupXXI 3556505 104 TAATAGACGTACCGGGAGTTTAAGGGAGAGGTGCCACGGCTGTTAA 6776767466773666768867766676766777783677768677 X0:i:4 X1:i:2 XA:Z:scaffold_27,-2047944,46M,0;scaffold_27,-2047901,46M,0;scaffold_27,-2047858,46M,0;scaffold_108,+243775,46M,1;groupXX,+949641,46M,1; MD:Z:46 RG:Z:BS27pair.2 XG:i:0 AM:i:0 NM:i:0 SM:i:0 XM:i:0 XO:i:0 MQ:i:0 OQ:Z:?DGDDGEGFGCIHHIIIIIHIBHIIIIIIIIIGIIHIIHIIIIHII XT:A:R
    at org.broadinstitute.sv.util.ReadPair.create(ReadPair.java:135)
    at org.broadinstitute.sv.discovery.ReadPairRecordFilter.createReadPair(ReadPairRecordFilter.java:228)
    at org.broadinstitute.sv.discovery.ReadPairRecordFilter.generateReadPairs(ReadPairRecordFilter.java:149)
    at org.broadinstitute.sv.discovery.ReadPairRecordFilter.filterReadPairs(ReadPairRecordFilter.java:80)
    at org.broadinstitute.sv.discovery.DeletionDiscoveryAlgorithm.finishReadPairSelection(DeletionDiscoveryAlgorithm.java:206)
    at org.broadinstitute.sv.discovery.DeletionDiscoveryAlgorithm.runDiscovery(DeletionDiscoveryAlgorithm.java:156)
    at org.broadinstitute.sv.discovery.SVDiscoveryWalker.onTraversalDone(SVDiscoveryWalker.java:150)
    at org.broadinstitute.sv.discovery.SVDiscoveryWalker.onTraversalDone(SVDiscoveryWalker.java:43)
    at org.broadinstitute.sting.gatk.executive.Accumulator$StandardAccumulator.finishTraversal(Accumulator.java:129)
    at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:75)
    at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:217)
    at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:111)
    at org.broadinstitute.sv.main.SVCommandLine.execute(SVCommandLine.java:110)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:239)
    at org.broadinstitute.sv.main.SVCommandLine.main(SVCommandLine.java:72)
    at org.broadinstitute.sv.main.SVDiscovery.main(SVDiscovery.java:21)
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR A GATK RUNTIME ERROR has occurred (version 1.0.5039M):
##### ERROR
##### ERROR Please visit to wiki to see if this is a known problem
##### ERROR If not, please post the error, with stack trace, to the GATK forum
##### ERROR Visit our wiki for extensive documentation http://www.broadinstitute.org/gsa/wiki
##### ERROR Visit our forum to view answers to commonly asked questions http://getsatisfaction.com/gsa
##### ERROR
##### ERROR MESSAGE: Left read of read pair fails left read test: HWI-ST143_0294:7:1:11533:163054#0 145 groupXXI 3556356 0 46M groupXXI 3556505 104 TAATAGACGTACCGGGAGTTTAAGGGAGAGGTGCCACGGCTGTTAA 6776767466773666768867766676766777783677768677 X0:i:4 X1:i:2 XA:Z:scaffold_27,-2047944,46M,0;scaffold_27,-2047901,46M,0;scaffold_27,-2047858,46M,0;scaffold_108,+243775,46M,1;groupXX,+949641,46M,1; MD:Z:46 RG:Z:BS27pair.2 XG:i:0 AM:i:0 NM:i:0 SM:i:0 XM:i:0 XO:i:0 MQ:i:0 OQ:Z:?DGDDGEGFGCIHHIIIIIHIBHIIIIIIIIIGIIHIIHIIIIHII XT:A:R

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Dr Philine Feulner
Westfälische Wilhelms University
Institute for Evolution and Biodiversity
Evolutionary Bioinformatics Group
Hüfferstrasse 1
48149 Münster
Germany
Tel: +49 (0) 251 83 21636
Fax: +49 (0) 251 83 24668
Email: p.f...@un...
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
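When reading a record like the one in this error message, it can help to decode the SAM FLAG column (145 here) and compare the strands and positions of the two ends. The sketch below uses only the bit definitions from the SAM specification; the helper names `decode_flag` and `describe_pair` are hypothetical, and this is not the test that SVToolkit's ReadPair class performs, whose logic is not described in this thread.

```python
# SAM FLAG bits as defined in the SAM specification.
SAM_FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse_strand",
    0x20: "mate_reverse_strand",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the names of all bits set in a SAM FLAG value, lowest bit first."""
    return [name for bit, name in sorted(SAM_FLAG_BITS.items()) if flag & bit]


def describe_pair(flag, pos, mate_pos):
    """Summarise which end is leftmost and the strand of each end."""
    return {
        "this_read_is_leftmost": pos <= mate_pos,
        "this_read_strand": "-" if flag & 0x10 else "+",
        "mate_strand": "-" if flag & 0x20 else "+",
    }


if __name__ == "__main__":
    # FLAG, POS and mate POS taken from the record in the error message above.
    print(decode_flag(145))
    print(describe_pair(145, 3556356, 3556505))
```

For the record above, this reports a paired, second-in-pair read on the reverse strand that is also the leftmost end, with its mate on the forward strand. Whether that orientation is what triggers the "left read test" failure is an open question in the thread, so treat the output as a diagnostic aid only.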
From: Agata W. <ag...@cb...> - 2011-06-17 12:23:32
|
Hi,

I am trying to run the installtest discovery.sh, but unfortunately I am getting an error/warning:

[WARN ] You are running JRockit with limited virtual memory. This might lead to unexpected failures later on
Could not create the Java virtual machine.

If anyone could tell me how to solve this issue that would be great.

Agata
|
From: Bob H. <han...@br...> - 2011-04-07 18:01:31
|
Hi, Gayle,

Also, if you haven't looked at it, there is a presentation here:
http://www.broadinstitute.org/gsa/wiki/index.php/Main_Page#GSA_Feb._17th_2011_next-generation_sequencing_workshop

On slide 5, I tried to give a roadmap for a "soup to nuts" pipeline along the lines of what we used for the 1000 Genomes pilot.

-Bob

On 4/7/11 12:52 PM, Gayle Leen wrote:
> Hi,
>
> I'm trying to run Genome STRIP on a dataset but the output from the
> SVDiscovery module (the vcf file) can't be used as input to
> SVAltAlign. I think this is because the potential variants are tagged
> as 'imprecise', even though the confidence intervals around the ends
> are 0 for some events - consequently the alternative allele sequence
> isn't generated. Is there some way of getting around this? Or should I
> use some other algorithm such as tigra_sv to process the vcf file?
>
> If you could help me with this that would be great,
>
> Gayle
>
> _______________________________________________
> svtoolkit-help mailing list
> svt...@li...
> https://lists.sourceforge.net/lists/listinfo/svtoolkit-help
|
From: Bob H. <han...@br...> - 2011-04-07 17:56:03
|
Hi, Gayle,

You are correct: the problem is that SVAltAlign requires base-pair-exact breakpoints and SVDiscovery does not currently produce these. If you want to genotype using breakpoints, you need to run tigra_sv or some other tool (I think at Sanger they are using velvet).

Note that a confidence interval of zero is not exactly the same as knowing the alt allele at base pair resolution. The confidence interval is actually based on the confidence in the difference in length between the ref and alt alleles. This is different than the number of deleted bases from the reference if the alt allele contains non-template sequence.

-Bob

On 4/7/11 12:52 PM, Gayle Leen wrote:
> Hi,
>
> I'm trying to run Genome STRIP on a dataset but the output from the
> SVDiscovery module (the vcf file) can't be used as input to
> SVAltAlign. I think this is because the potential variants are tagged
> as 'imprecise', even though the confidence intervals around the ends
> are 0 for some events - consequently the alternative allele sequence
> isn't generated. Is there some way of getting around this? Or should I
> use some other algorithm such as tigra_sv to process the vcf file?
>
> If you could help me with this that would be great,
>
> Gayle
>
> _______________________________________________
> svtoolkit-help mailing list
> svt...@li...
> https://lists.sourceforge.net/lists/listinfo/svtoolkit-help
|
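The distinction between "length difference" and "deleted bases" in the reply above can be illustrated with a toy example. The sequences, coordinates and the helper name `build_alt_allele` below are all made up for illustration; nothing here reflects how Genome STRiP represents alleles internally.

```python
def build_alt_allele(ref, start, end, inserted=""):
    """Delete ref[start:end] and put `inserted` (possibly empty) in its place."""
    return ref[:start] + inserted + ref[end:]


# Hypothetical 18-base reference: 4 flanking bases, 10 deletable bases, 4 flanking bases.
ref = "AAAA" + "CCCCCCCCCC" + "GGGG"

# Clean deletion of 8 bases, no extra sequence.
alt_clean = build_alt_allele(ref, 4, 12)

# Deletion of 10 bases plus 2 bases of non-template sequence at the junction.
alt_with_insert = build_alt_allele(ref, 4, 14, inserted="TT")

if __name__ == "__main__":
    for name, alt in (("clean", alt_clean), ("with insert", alt_with_insert)):
        print(name, alt, "length difference:", len(ref) - len(alt))
```

Both alternate alleles are exactly 8 bases shorter than the reference, yet their sequences differ. That is why an exact length difference alone does not give the alt allele sequence an aligner would need.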
From: Gayle L. <gay...@de...> - 2011-04-07 17:12:57
|
Hi,

I'm trying to run Genome STRIP on a dataset but the output from the SVDiscovery module (the vcf file) can't be used as input to SVAltAlign. I think this is because the potential variants are tagged as 'imprecise', even though the confidence intervals around the ends are 0 for some events - consequently the alternative allele sequence isn't generated. Is there some way of getting around this? Or should I use some other algorithm such as tigra_sv to process the vcf file?

If you could help me with this that would be great,

Gayle
|
From: Bob H. <han...@br...> - 2011-03-29 00:00:20
|
Hi, Eric, The ${SV_DIR}/bwa directory is not on your LD_LIBRARY_PATH. That is, you need to do something like: export LD_LIBRARY_PATH=${SV_DIR}/bwa:${LD_LIBRARY_PATH} If you were following the script here: http://sourceforge.net/mailarchive/forum.php?thread_name=4D700372.9060201%40broadinstitute.org&forum_name=svtoolkit-help the location of libbwa.so changed since that was posted. I will put a cleaned up script in the FAQ. Also note that the correct way to create a mask for one chromosome is to use the entire reference genome and the -sequence argument. Otherwise, you will fail to mask locations that are unique on chr12 but not unique across the whole genome. -Bob On 3/28/11 6:32 PM, Fritz, Eric R [AN S] wrote: > Hello, > > I am getting the following error when trying to run the ComputeGenomeMask portion of the SVToolkit. > > INFO 17:32:10,456 HelpFormatter - ---------------------------------------------------------- > INFO 17:32:10,459 HelpFormatter - Program Name: org.broadinstitute.sv.apps.ComputeGenomeMask > INFO 17:32:10,460 HelpFormatter - Program Args: -R Chr12.fa -O Chr12.mask.fa -readLength 83 > INFO 17:32:10,460 HelpFormatter - Date/Time: 2011/03/28 17:32:10 > INFO 17:32:10,460 HelpFormatter - ---------------------------------------------------------- > INFO 17:32:10,461 HelpFormatter - ---------------------------------------------------------- > INFO 17:32:10,525 GenomeMaskAlgorithm - Initializing bwa ... 
> Exception in thread "main" java.lang.UnsatisfiedLinkError: no bwa in java.library.path > at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1754) > at java.lang.Runtime.loadLibrary0(Runtime.java:823) > at java.lang.System.loadLibrary(System.java:1045) > at org.broadinstitute.sting.alignment.bwa.c.BWACAligner.<clinit>(BWACAligner.java:21) > at org.broadinstitute.sv.mask.GenomeMaskAlgorithm.getAligner(GenomeMaskAlgorithm.java:206) > at org.broadinstitute.sv.mask.GenomeMaskAlgorithm.getAlignmentCode(GenomeMaskAlgorithm.java:119) > at org.broadinstitute.sv.mask.GenomeMaskAlgorithm.generateMaskForSequence(GenomeMaskAlgorithm.java:93) > at org.broadinstitute.sv.mask.GenomeMaskAlgorithm.generateMask(GenomeMaskAlgorithm.java:75) > at org.broadinstitute.sv.apps.ComputeGenomeMask.run(ComputeGenomeMask.java:63) > at org.broadinstitute.sv.commandline.CommandLineProgram.execute(CommandLineProgram.java:33) > at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:239) > at org.broadinstitute.sv.commandline.CommandLineProgram.run(CommandLineProgram.java:23) > at org.broadinstitute.sv.apps.ComputeGenomeMask.main(ComputeGenomeMask.java:49) > > > If anyone could tell me how to solve this issue that would be great. > > -Eric > > > Eric Fritz > Assistant Scientist II > Iowa State University > 2255G Kildee Hall > Ames, IA 50014 |
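The library path fix above can be checked before launching Java. What follows is a minimal sketch, not part of SVToolkit: the helper name find_libbwa and the default value of SV_DIR are hypothetical, and it assumes libbwa.so sits directly in ${SV_DIR}/bwa as described in the reply.

```shell
#!/bin/bash
# Hypothetical helper (not part of SVToolkit): print the first directory on
# LD_LIBRARY_PATH that contains libbwa.so, or return 1 if none does.
find_libbwa() {
    local IFS=:
    local dir
    for dir in ${LD_LIBRARY_PATH}; do
        if [ -n "${dir}" ] && [ -e "${dir}/libbwa.so" ]; then
            echo "${dir}"
            return 0
        fi
    done
    return 1
}

# Assumption: SV_DIR points at the SVToolkit installation directory.
SV_DIR=${SV_DIR:-/path/to/svtoolkit}

# Prepend the bwa directory, avoiding a trailing colon when the path was empty.
export LD_LIBRARY_PATH="${SV_DIR}/bwa${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"

if ! find_libbwa > /dev/null; then
    echo "libbwa.so not found on LD_LIBRARY_PATH; check \${SV_DIR}/bwa" >&2
fi
```

On Linux the JVM normally seeds java.library.path from LD_LIBRARY_PATH, which is why exporting the variable resolves the UnsatisfiedLinkError. Passing -Djava.library.path to java is another common way to achieve the same thing, although that route is not discussed in this thread.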
From: Fritz, E. R [AN S] <er...@ia...> - 2011-03-28 22:34:18
|
Hello, I am getting the following error when trying to run the ComputeGenomeMask portion of the SVToolkit. INFO 17:32:10,456 HelpFormatter - ---------------------------------------------------------- INFO 17:32:10,459 HelpFormatter - Program Name: org.broadinstitute.sv.apps.ComputeGenomeMask INFO 17:32:10,460 HelpFormatter - Program Args: -R Chr12.fa -O Chr12.mask.fa -readLength 83 INFO 17:32:10,460 HelpFormatter - Date/Time: 2011/03/28 17:32:10 INFO 17:32:10,460 HelpFormatter - ---------------------------------------------------------- INFO 17:32:10,461 HelpFormatter - ---------------------------------------------------------- INFO 17:32:10,525 GenomeMaskAlgorithm - Initializing bwa ... Exception in thread "main" java.lang.UnsatisfiedLinkError: no bwa in java.library.path at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1754) at java.lang.Runtime.loadLibrary0(Runtime.java:823) at java.lang.System.loadLibrary(System.java:1045) at org.broadinstitute.sting.alignment.bwa.c.BWACAligner.<clinit>(BWACAligner.java:21) at org.broadinstitute.sv.mask.GenomeMaskAlgorithm.getAligner(GenomeMaskAlgorithm.java:206) at org.broadinstitute.sv.mask.GenomeMaskAlgorithm.getAlignmentCode(GenomeMaskAlgorithm.java:119) at org.broadinstitute.sv.mask.GenomeMaskAlgorithm.generateMaskForSequence(GenomeMaskAlgorithm.java:93) at org.broadinstitute.sv.mask.GenomeMaskAlgorithm.generateMask(GenomeMaskAlgorithm.java:75) at org.broadinstitute.sv.apps.ComputeGenomeMask.run(ComputeGenomeMask.java:63) at org.broadinstitute.sv.commandline.CommandLineProgram.execute(CommandLineProgram.java:33) at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:239) at org.broadinstitute.sv.commandline.CommandLineProgram.run(CommandLineProgram.java:23) at org.broadinstitute.sv.apps.ComputeGenomeMask.main(ComputeGenomeMask.java:49) If anyone could tell me how to solve this issue that would be great. 
-Eric Eric Fritz Assistant Scientist II Iowa State University 2255G Kildee Hall Ames, IA 50014 |
From: Bob H. <han...@br...> - 2011-03-28 19:47:53
|
Hi, Eric, There's a pretty good answer to this here: http://sourceforge.net/mailarchive/forum.php?thread_name=4D700372.9060201%40broadinstitute.org&forum_name=svtoolkit-help I also updated the documentation with an example on the wiki here: http://www.broadinstitute.org/gsa/wiki/index.php/ComputeGenomeMask SVToolkit.jar is a separate jar file from the GATK. If that doesn't get you going, let me know. -Bob On 3/28/11 3:38 PM, Fritz, Eric R [AN S] wrote: > Hello, > > On the wiki it says you can use ComputeGenomeMask to create a genome mask file. Where is this particular script located? I am unable to find it. Thanks. > > -Eric > > > Eric Fritz > Assistant Scientist II > Iowa State University > 2255G Kildee Hall > Ames, IA 50014 |
From: Fritz, E. R [AN S] <er...@ia...> - 2011-03-28 19:38:34
|
Hello, On the wiki it says you can use ComputeGenomeMask to create a genome mask file. Where is this particular script located? I am unable to find it. Thanks. -Eric Eric Fritz Assistant Scientist II Iowa State University 2255G Kildee Hall Ames, IA 50014 |
From: Bob H. <han...@br...> - 2011-03-03 21:09:15
|
> *[svtoolkit-help] SVtoolkit question
> <http://sourceforge.net/mailarchive/message.php?msg_id=27148621>*
> From: Michael Parfenov <parfenov@ge...> - 2011-03-03 20:07
> Hi,
>
> Thanks for the SV toolkit!
> Could you tell me, please, how to run the ComputeGenomeMask utility? I see
> required arguments in the wiki page, but how to invoke the utility?
>
> best regards,
> Michael.

Hi, Michael,

ComputeGenomeMask is a regular java command line program and is in SVToolkit.jar, so basically you just need to do "java -cp SVToolkit.jar:GenomeAnalysisTK.jar org.broadinstitute.sv.apps.ComputeGenomeMask".

But there are a couple of wrinkles with ComputeGenomeMask in particular.

1. You need to index the reference for bwa alignment.
2. ComputeGenomeMask uses libbwa.so, so you need that on your library path.
3. To do any reasonably large genome, you need to run it in parallel and then merge.

Here is a shell script I used to do a 75bp mask for hg19 that you can use as an example. It's designed to use the local lsf environment at the Broad and knows where to find the reference fasta locally, etc. It also assumes the reference has already been indexed with "samtools faidx". The script just runs the parallel jobs, it doesn't merge the results back together (you will need to cat together the resulting fasta files in the same order as the reference).

At some point I will likely write a queue script to automate this, but it's not a high priority right now because I'm also working on some changes to the mask file format.

Hope this helps,
-Bob

#!/bin/bash

outdir=Homo_sapiens_assembly19/75
readLength=75
reference=/seq/references/Homo_sapiens_assembly19/v1/Homo_sapiens_assembly19.fasta
export SV_DIR=/humgen/cnp04/bobh/sv/stable

# These executables must be on your path.
which java > /dev/null || exit 1
which bwa > /dev/null || exit 1

export LD_LIBRARY_PATH=${SV_DIR}/bwa/lib:${LD_LIBRARY_PATH}
classpath="${SV_DIR}/lib/SVToolkit.jar:${SV_DIR}/lib/gatk/GenomeAnalysisTK.jar"

mkdir -p ${outdir}/work
localReference=${outdir}/work/`echo ${reference} | awk -F / '{ print $NF }'`
if [ ! -e ${localReference} ]; then
    (cd ${outdir}/work && ln -s ${reference} .) || exit 1
    (cd ${outdir}/work && ln -s ${reference}.fai .) || exit 1
fi

bwa index -a bwtsw ${localReference} || exit 1

chroms=`cat ${localReference}.fai | cut -f 1`
for chr in ${chroms}; do
    bsub -o ${outdir}/work/svmask_${chr}.log \
        -R "rusage[mem=5000]" \
        java -cp ${classpath} -Xmx4g \
        org.broadinstitute.sv.apps.ComputeGenomeMask \
        -R ${localReference} \
        -O ${outdir}/work/svmask_${chr}.fasta \
        -readLength ${readLength} \
        -sequence ${chr} \
        || exit 1
done
|
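The merge step that the script leaves out (concatenating the per-chromosome fasta files in reference order) could look roughly like this. It is only a sketch under stated assumptions: the function name merge_masks is hypothetical, and the per-chromosome outputs are assumed to be named svmask_<chr>.fasta as in the script above.

```shell
#!/bin/bash
# Hypothetical helper (not part of SVToolkit): concatenate per-chromosome mask
# files in the order the sequences appear in the reference index.
#   $1 = reference .fai index (sequence name in column 1)
#   $2 = directory holding the svmask_<chr>.fasta files
#   $3 = merged output file
merge_masks() {
    local fai="$1" workdir="$2" merged="$3"
    local chr rest
    # Start from an empty output file.
    : > "${merged}" || return 1
    # Read sequence names from the index so the output order matches the reference.
    while read -r chr rest || [ -n "${chr}" ]; do
        if [ ! -s "${workdir}/svmask_${chr}.fasta" ]; then
            echo "missing or empty mask for ${chr}" >&2
            return 1
        fi
        cat "${workdir}/svmask_${chr}.fasta" >> "${merged}" || return 1
    done < "${fai}"
}
```

With the variables from the script above, a call might be: merge_masks ${localReference}.fai ${outdir}/work ${outdir}/merged.mask.fasta (the merged file name is a placeholder). Stopping at the first missing file is deliberate, so that a failed cluster job does not silently produce a truncated mask.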
From: Bob H. <han...@br...> - 2011-03-03 20:50:31
|
Hi, Mark, I apologize for the tardy response - due to a configuration problem, I wasn't receiving these emails. It sounds like you visited the web site when it was still under construction; the link to the genome mask files should be working now. -Bob *[svtoolkit-help] genome mask files <http://sourceforge.net/mailarchive/message.php?msg_id=27019976>* From: Mark Cowley <m.cowley@ga...> - 2011-02-07 06:28 Hi, I saw the 1KG SV Nature paper & I would like to run the genome STRiP algo on some cancer data that i've generated. Installation went fairly smoothly, but i'm looking here: https://www.broadinstitute.org/gsa/wiki/index.php/Genome_STRiP_Genome_Mask_Files and the link to precomputed mask files is dead & I can't see a 'ComputeGenomeMask' cmdline util in the svtoolkit dir. Can you please let me know how i can find these? cheers, Mark ----------------------------------------------------- Mark Cowley, PhD Garvan Cancer Program & Peter Wills Bioinformatics Centre Garvan Institute of Medical Research, Sydney, Australia ----------------------------------------------------- |
From: Michael P. <par...@ge...> - 2011-03-03 20:07:11
|
Hi, Thanks for the SV toolkit! Could you tell me, please, how to run the ComputeGenomeMask utility? I see required arguments in the wiki page, but how to invoke the utility? best regards, Michael. |
From: Bob H. <han...@br...> - 2011-02-16 14:40:21
|
Hi, Mingfu, I'm going to copy this to the support mailing list too. I suspect the problem is that your @RG (read group) headers do not contain a LB (library) tag. Either that or the RG tags are missing from the reads. As a result, Genome STRiP can't group the reads into libraries. Is it possible for you to reheader your bam files? -Bob On 2/15/11 11:32 PM, Mingfu Zhu wrote: > Hi Bob, > > I input a file with path of bam files (along with a list of genders). > It seems > running now. The screen log says > > Web site: http://www.broadinstitute.org/gsa/wiki/index.php/Genome_STRiP > INFO 23:05:10,551 QScriptManager - Compiling 2 QScripts > INFO 23:05:15,793 QScriptManager - Compilation complete > INFO 23:05:18,426 HelpFormatter - > --------------------------------------------------------- > INFO 23:05:18,426 HelpFormatter - Program Name: > org.broadinstitute.sting.queue.QCommandLine > INFO 23:05:18,427 HelpFormatter - Program Args: -S > /nfs/seqsata07/Mingfu/svtoolkit/qscript/SVPreprocess.q -S > /nfs/seqsata07/Mingfu/svtoolkit/qscript/SVQScript.q -gatk > /nfs/seqsata07/Mingfu/svtoolkit/lib/gatk/GenomeAnalysisTK.jar -cp > /nfs/seqsata07/Mingfu/svtoolkit/lib/SVToolkit.jar:/nfs/seqsata07/Mingfu/svtoolkit/lib/gatk/GenomeAnalysisTK.jar:/nfs/seqsata07/Mingfu/svtoolkit/lib/gatk/Queue.jar > > -configFile conf/genstrip_installtest_parameters.txt -tempDir > /nfs/seqsata07/Mingfu/svtoolkit/tmpdir -R > /nfs/seqsata07/Mingfu/svtoolkit/data/human_ref_36_50.fa -genomeMaskFile > /nfs/seqsata07/Mingfu/svtoolkit/data/Homo_sapiens_assembly18.mask.101.fasta > > -genderMapFile /nfs/seqsata07/Mingfu/svtoolkit/data/sample.gender > -runDirectory > /nfs/seqsata07/Mingfu/svtoolkit/test1 -md > /nfs/seqsata07/Mingfu/svtoolkit/test1/metadata -jobLogDir > /nfs/seqsata07/Mingfu/svtoolkit/test1/logs -I > /nfs/seqsata07/Mingfu/svtoolkit/data/sample.list -run > INFO 23:05:18,427 HelpFormatter - Date/Time: 2011/02/15 23:05:18 > INFO 23:05:18,427 HelpFormatter - > 
--------------------------------------------------------- > INFO 23:05:18,427 HelpFormatter - > --------------------------------------------------------- > INFO 23:05:18,428 QCommandLine - Scripting SVPreprocess > INFO 23:05:18,517 QCommandLine - Added 7 functions > INFO 23:05:18,518 QGraph - Generating graph. > INFO 23:05:18,541 QGraph - Running jobs. > INFO 23:05:18,604 ShellJobRunner - Starting: java -Xmx4g > -Djava.io.tmpdir=/nfs/seqsata07/Mingfu/svtoolkit/tmpdir -cp > /nfs/seqsata07/Mingfu/svtoolkit/lib/SVToolkit.jar:/nfs/seqsata07/Mingfu/svtoolkit/lib/gatk/GenomeAnalysisTK.jar:/nfs/seqsata07/Mingfu/svtoolkit/lib/gatk/Queue.jar > > org.broadinstitute.sting.gatk.CommandLineGATK -T > ComputeInsertSizeDistributions > -R /nfs/seqsata07/Mingfu/svtoolkit/data/human_ref_36_50.fa -I > /nfs/seqsata07/Mingfu/svtoolkit/data/sample.list -O > /nfs/seqsata07/Mingfu/svtoolkit/test1/metadata/isd/sample.list.hist.bin -md > > /nfs/seqsata07/Mingfu/svtoolkit/test1/metadata -createEmpty > INFO 23:05:18,604 ShellJobRunner - Output written to > /nfs/seqsata07/Mingfu/svtoolkit/test1/logs/Q-23740@sva-1.out > > > The I checked the output file Q-23740@sva-1.out. 
It says > > > INFO 23:05:22,562 HelpFormatter - > --------------------------------------------------------------------------- > > INFO 23:05:22,564 HelpFormatter - The Genome Analysis Toolkit (GATK) > v1.0.5039M, Compiled 2011/01/20 22:58:34 > INFO 23:05:22,564 HelpFormatter - Copyright (c) 2010 The Broad Institute > INFO 23:05:22,565 HelpFormatter - Please view our documentation at > http://www.broadinstitute.org/gsa/wiki > INFO 23:05:22,565 HelpFormatter - For support, please view our > support site at > http://getsatisfaction.com/gsa > INFO 23:05:22,565 HelpFormatter - Program Args: -T > ComputeInsertSizeDistributions -R /nfs/seqsata07/Mingfu/svtoolkit/data/hu > man_ref_36_50.fa -I /nfs/seqsata07/Mingfu/svtoolkit/data/sample.list -O > /nfs/seqsata07/Mingfu/svtoolkit/test1/metadata/isd/sa > mple.list.hist.bin -md /nfs/seqsata07/Mingfu/svtoolkit/test1/metadata > -createEmpty > INFO 23:05:22,565 HelpFormatter - Date/Time: 2011/02/15 23:05:22 > INFO 23:05:22,565 HelpFormatter - > --------------------------------------------------------------------------- > > INFO 23:05:22,565 HelpFormatter - > --------------------------------------------------------------------------- > > INFO 23:05:22,570 GenomeAnalysisEngine - Strictness is SILENT > Error: Cannot determine library identifier for read ERR001698.1039047 > INFO 23:05:25,706 TraversalEngine - [INITIALIZATION COMPLETE; TRAVERSAL > STARTING] > INFO 23:05:25,707 TraversalEngine - Location processed.reads > runtime > per.1M.reads completed total.runtime remaining > Error: Cannot determine library identifier for read ERR001699.2594249 > Error: Cannot determine library identifier for read ERR001699.7228308 > Error: Cannot determine library identifier for read ERR001705.1973559 > Error: Cannot determine library identifier for read ERR001705.5344781 > Error: Cannot determine library identifier for read ERR001706.368556 > Error: Cannot determine library identifier for read ERR001706.880536 > Error: Cannot determine library 
identifier for read ERR001706.5437519 > Error: Cannot determine library identifier for read ERR001710.3090891 > Error: Cannot determine library identifier for read ERR001710.4326898 > Error: Cannot determine library identifier for read ERR001712.4452569 > > > Is this expected? I killed a job earlier because it generated 100G of > such log. > Given 20 samples with 40X coverage each, how many hours do you > estimate it > takes? Can it run in cluster by some way? > > Thanks, > Mingfu > > |
From: Mark C. <m.c...@ga...> - 2011-02-07 06:28:41
|
Hi, I saw the 1KG SV Nature paper & I would like to run the genome STRiP algo on some cancer data that i've generated. Installation went fairly smoothly, but i'm looking here: https://www.broadinstitute.org/gsa/wiki/index.php/Genome_STRiP_Genome_Mask_Files and the link to precomputed mask files is dead & I can't see a 'ComputeGenomeMask' cmdline util in the svtoolkit dir. Can you please let me know how i can find these? cheers, Mark ----------------------------------------------------- Mark Cowley, PhD Garvan Cancer Program & Peter Wills Bioinformatics Centre Garvan Institute of Medical Research, Sydney, Australia ----------------------------------------------------- |