Re: [svtoolkit-help] Read pair records have different read groups
From: Bob H. <han...@br...> - 2013-01-21 14:30:45
Hi,

The root of the problem is in your data: there are two reads with the same ID (read name) but two different read groups. I don't know if this is the problem, but Genome STRiP does require read names to be unique across all of the BAMs you are analyzing together.

The read ID is: 588658
The read groups are: 561924SRR063635,998169SRR063658

-Bob

On 1/20/13 3:18 AM, Wen Yao wrote:
> Hi Bob,
> I have been using genome-strip these days. I have fixed several errors with your kind help. I encountered a new error recently. Here are the error messages:
> java -Xmx4g > -Djava.io.tmpdir=/home/wbxie/software/svtoolkit/installtest/all_tmpdir > -cp > /home/wbxie/software/svtoolkit/lib/SVToolkit.jar:/home/wbxie/software/svtoolkit/lib/gatk/GenomeAnalysisTK.jar:/home/wbxie/software/svtoolkit/lib/gatk/Queue.jar > -verbose:gc org.broadinstitute.sv.main.SVDiscovery -T SVDiscovery -R > /home/wbxie/software/svtoolkit/installtest/data/rice_all_genomes_v7.fasta > -I /home/wbxie/wild_rice/bwa_map/100.bam -I > /home/wbxie/wild_rice/bwa_map/101.bam -I > /home/wbxie/wild_rice/bwa_map/102.bam -I > /home/wbxie/wild_rice/bwa_map/103.bam > .........<truncated>..........
> -I /home/wbxie/bam_file_MSUv7/W321.bam -I > /home/wbxie/bam_file_MSUv7/W322.bam -I > /home/wbxie/bam_file_MSUv7/W323.bam -I > /home/wbxie/bam_file_MSUv7/W324.bam -I > /home/wbxie/bam_file_MSUv7/W325.bam -I > /home/wbxie/bam_file_MSUv7/W326.bam -I > /home/wbxie/bam_file_MSUv7/W327.bam -I > /home/wbxie/bam_file_MSUv7/W328.bam -I > /home/wbxie/bam_file_MSUv7/W329.bam -I > /home/wbxie/bam_file_MSUv7/W330.bam -O > /home/wbxie/software/svtoolkit/installtest/wild_rice_test2/P0008.discovery.vcf > -md wild_rice_test2/metadata -disableGATKTraversal -configFile > conf/genstrip_installtest_parameters.txt -runDirectory wild_rice_test2 > -genomeMaskFile data/rice_all_genomes_v7.mask.fasta -partitionName > P0008 -filePrefix P0008 -L chr06:1-31248787 -searchLocus > chr06:1-31248787 -searchWindow chr06:1-31248787 -searchMinimumSize 100 > -searchMaximumSize 1000000 > org.broadinstitute.sting.queue.util.JobExitException: Failed to run job. > Command line: > sh > /home/wbxie/software/svtoolkit/installtest/all_tmpdir/.exec1738692847535129737 > Exit code: 1 > Standard error contained: > [GC 524288K->72586K(2009792K), 0.7542970 secs] > INFO 15:53:17,194 HelpFormatter - > ----------------------------------------------------------------------------------- > > INFO 15:53:17,215 HelpFormatter - The Genome Analysis Toolkit (GATK) > v1.0-6121-g40e3165, Compiled 2011/09/04 20:57:29 > INFO 15:53:17,217 HelpFormatter - Copyright (c) 2010 The Broad Institute > INFO 15:53:17,218 HelpFormatter - Please view our documentation at > http://www.broadinstitute.org/gsa/wiki > INFO 15:53:17,219 HelpFormatter - For support, please view our > support site at http://getsatisfaction.com/gsa > INFO 15:53:17,221 HelpFormatter - Program Args: -T SVDiscovery -R > /home/wbxie/software/svtoolkit/installtest/data/rice_all_genomes_v7.fasta > -O > /home/wbxie/software/svtoolkit/installtest/wild_rice_test2/P0008.discovery.vcf > -md wild_rice_test2/metadata -disableGATKTraversal -configFile > 
conf/genstrip_installtest_parameters.txt -runDirectory wild_rice_test2 > -genomeMaskFile data/rice_all_genomes_v7.mask.fasta -partitionName > P0008 -filePrefix P0008 -L chr06:1-31248787 -searchLocus > chr06:1-31248787 -searchWindow chr06:1-31248787 -searchMinimumSize 100 > -searchMaximumSize 1000000 > INFO 15:53:17,223 HelpFormatter - Date/Time: 2012/11/10 15:53:17 > INFO 15:53:17,224 HelpFormatter - > ----------------------------------------------------------------------------------- > > INFO 15:53:17,225 HelpFormatter - > ----------------------------------------------------------------------------------- > > INFO 15:53:17,414 GenomeAnalysisEngine - Strictness is SILENT > INFO 15:53:19,702 SVDiscovery - Initializing SVDiscovery ... > INFO 15:53:19,764 SVDiscovery - Opening reference sequence ... > INFO 15:53:19,772 SVDiscovery - Opened reference sequence. > INFO 15:53:19,773 SVDiscovery - Opening genome mask ... > INFO 15:53:19,781 SVDiscovery - Opened genome mask. > INFO 15:53:19,782 SVDiscovery - Initializing input data set ... > [GC 596874K->78691K(2534080K), 0.8322460 secs] > INFO 15:53:27,431 SVDiscovery - Initialized data set: 713 files, 811 > read groups, 713 samples. > INFO 15:53:27,432 SVDiscovery - Opening metadata from > wild_rice_test2/metadata ... > INFO 15:53:27,437 SVDiscovery - Opened metadata. > INFO 15:53:27,444 SVDiscovery - Initializing discovery algorithm ... > INFO 15:53:27,451 SVDiscovery - Loading insert size histograms ... > [GC 1127267K->98598K(2534080K), 0.9073720 secs] > [GC 1147174K->24483K(2708864K), 0.1410060 secs] > [GC 1247843K->26795K(2637056K), 0.1760810 secs] > INFO 15:54:32,991 SVDiscovery - Loaded 713 histograms. > INFO 15:54:33,011 SVDiscovery - Discovery alt home filtering is > disabled. > INFO 15:54:38,649 SVDiscovery - Processing locus: > chr06:1-31248787:100-1000000 > INFO 15:54:38,650 SVDiscovery - Locus search window: chr06:1-31248787 > INFO 15:54:38,651 SVDiscovery - Selecting read pairs ... 
> INFO 15:54:38,665 SVDiscovery - Reading input file > /home/wbxie/wild_rice/bwa_map/100.bam ... > [GC 1250155K->122384K(2632768K), 0.3954950 secs] > [GC 1234768K->38536K(2653376K), 0.2583210 secs] > INFO 15:55:02,373 SVDiscovery - Reading input file > /home/wbxie/wild_rice/bwa_map/101.bam ... > [GC 1150920K->55080K(2641856K), 0.2100770 secs] > [GC 1168232K->62044K(2648000K), 0.2142580 secs] > [GC 1175196K->68624K(2655360K), 0.2988430 secs] > INFO 15:55:35,214 SVDiscovery - Reading input file > /home/wbxie/wild_rice/bwa_map/102.bam ... > [GC 1197840K->79312K(2658240K), 0.2794220 secs] > [GC 1208528K->86788K(2660160K), 0.1888030 secs] > INFO 15:56:05,021 SVDiscovery - Reading input file > /home/wbxie/wild_rice/bwa_map/103.bam ... > [GC 1224900K->104476K(2663296K), 0.1776500 secs] > [GC 1242588K->112492K(2664448K), 0.1901910 secs] > ........<truncated>............ > > INFO 20:37:59,770 SVDiscovery - Reading input file > /home/wbxie/bam_file_MSUv7/W329.bam ... > [GC 2485525K->1119669K(2798528K), 0.2848940 secs] > INFO 20:38:13,593 SVDiscovery - Reading input file > /home/wbxie/bam_file_MSUv7/W330.bam ... > [GC 2494325K->1125669K(2798528K), 0.0883570 secs] > [GC 437423K->181927K(1772288K), 0.0068700 secs] > #DBG: RC Cache fill chr06:1-110000 110000 713 124.640302 sec > INFO 20:51:51,800 SVDiscovery - Clustering: Generating clusters for 3 > read pairs. > INFO 20:51:51,857 SVDiscovery - Clustering: Generating clusters for 2 > read pairs. > INFO 20:51:51,863 SVDiscovery - Clustering: Generating clusters for > 27 read pairs. > INFO 20:51:51,865 SVDiscovery - Clustering: Generating clusters for 2 > read pairs. > INFO 20:51:51,902 SVDiscovery - Clustering: Generating clusters for 3 > read pairs. > INFO 20:51:51,919 SVDiscovery - Clustering: Generating clusters for 2 > read pairs. 
> INFO 20:51:51,919 SVDiscovery - Processing cluster chr06:19014-19110 > chr06:96702-96800 LR 3 > [GC 435751K->182805K(1771008K), 0.0077180 secs] > INFO 20:51:54,382 SVDiscovery - Clustering: Generating clusters for 7 > read pairs. > INFO 20:51:54,383 SVDiscovery - Clustering: LR split size 7 / 7 > maximal clique size 4 clique count 1 > ........<truncated>............ > > #DBG: RC Cache miss chr06:19217935-19684171 466237 713 93.022620 sec > INFO 17:56:41,850 SVDiscovery - Clustering: Generating clusters for 7 > read pairs. > INFO 17:56:41,850 SVDiscovery - Processing cluster > chr06:19220034-19220159 chr06:19221402-19221547 LR 8 > INFO 17:56:43,322 SVDiscovery - Clustering: Generating clusters for > 56 read pairs. > INFO 17:56:43,322 SVDiscovery - Clustering: LR split size 56 / 56 > maximal clique size 49 clique count 1 > INFO 17:56:43,323 SVDiscovery - Clustering: LR split size 7 / 56 > maximal clique size 2 clique count 4 > INFO 17:56:43,323 SVDiscovery - Clustering: LR split size 5 / 56 > maximal clique size 2 clique count 2 > INFO 17:56:43,323 SVDiscovery - Clustering: LR split size 3 / 56 > maximal clique size 2 clique count 1 > INFO 17:56:43,330 SVDiscovery - Clustering: Generating clusters for 7 > read pairs. > INFO 17:56:43,332 SVDiscovery - Clustering: Generating clusters for > 16 read pairs. > INFO 17:56:43,332 SVDiscovery - Processing cluster > chr06:19231617-19231729 chr06:19234427-19234538 LR 7 > INFO 17:56:44,579 SVDiscovery - Clustering: Generating clusters for > 47 read pairs. > INFO 17:56:44,580 SVDiscovery - Processing cluster > chr06:19241754-19242136 chr06:19242646-19243023 LR 16 > INFO 17:56:45,946 SVDiscovery - Clustering: Generating clusters for 6 > read pairs. > INFO 17:56:45,946 SVDiscovery - Processing cluster > chr06:19246397-19246741 chr06:19257339-19257677 LR 47 > INFO 17:56:48,264 SVDiscovery - Clustering: Generating clusters for > 61 read pairs. 
> INFO 17:56:48,265 SVDiscovery - Clustering: LR split size 61 / 61 > maximal clique size 24 clique count 2 > INFO 17:56:48,265 SVDiscovery - Clustering: LR split size 37 / 61 > maximal clique size 8 clique count 1 > INFO 17:56:48,265 SVDiscovery - Clustering: LR split size 29 / 61 > maximal clique size 6 clique count 1 > INFO 17:56:48,265 SVDiscovery - Clustering: LR split size 23 / 61 > maximal clique size 4 clique count 12 > INFO 17:56:48,265 SVDiscovery - Clustering: LR split size 19 / 61 > maximal clique size 4 clique count 1 > INFO 17:56:48,265 SVDiscovery - Clustering: LR split size 15 / 61 > maximal clique size 3 clique count 1 > INFO 17:56:48,266 SVDiscovery - Clustering: LR split size 12 / 61 > maximal clique size 2 clique count 5 > INFO 17:56:48,266 SVDiscovery - Clustering: LR split size 10 / 61 > maximal clique size 2 clique count 3 > INFO 17:56:48,266 SVDiscovery - Clustering: LR split size 8 / 61 > maximal clique size 2 clique count 1 > INFO 17:56:48,266 SVDiscovery - Clustering: LR split size 6 / 61 > maximal clique size 1 clique count 7 > INFO 17:56:48,266 SVDiscovery - Processing cluster > chr06:19251412-19251521 chr06:19251687-19251798 LR 6 > [GC 2142252K->1436452K(2264448K), 0.0112090 secs] > ........<truncated>............ > > [GC 1890370K->1480970K(1981440K), 0.0178220 secs] > #DBG: RC Cache fill chr06:19469143-19579142 110000 713 40.510878 sec > INFO 18:03:09,242 SVDiscovery - Clustering: Generating clusters for > 71 read pairs. > INFO 18:03:09,252 SVDiscovery - Clustering: Generating clusters for > 29 read pairs. > INFO 18:03:09,253 SVDiscovery - Processing cluster > chr06:19487235-19487648 chr06:19487750-19488179 LR 71 > [GC 1885898K->1481505K(1976960K), 0.0271590 secs] > INFO 18:03:11,443 SVDiscovery - Clustering: Generating clusters for > 2242 read pairs. 
> INFO 18:03:11,447 SVDiscovery - Processing cluster > chr06:19491405-19491795 chr06:19491950-19492233 LR 29 > INFO 18:03:14,039 SVDiscovery - Clustering: Generating clusters for > 38 read pairs. > INFO 18:03:14,039 SVDiscovery - Processing cluster > chr06:19494543-19494998 chr06:19495229-19495651 LR 2242 > INFO 18:03:21,277 SVDiscovery - Clustering: Generating clusters for > 18 read pairs. > INFO 18:03:21,278 SVDiscovery - Processing cluster > chr06:19496317-19496648 chr06:19497509-19497827 LR 38 > INFO 18:03:23,408 SVDiscovery - Clustering: Generating clusters for > 55 read pairs. > INFO 18:03:23,408 SVDiscovery - Processing cluster > chr06:19498084-19498232 chr06:19498470-19498540 LR 18 > INFO 18:03:26,501 SVDiscovery - Clustering: Generating clusters for > 26 read pairs. > INFO 18:03:26,501 SVDiscovery - Processing cluster > chr06:19498190-19498545 chr06:19499094-19499431 LR 55 > INFO 18:03:28,669 SVDiscovery - Clustering: Generating clusters for > 37 read pairs. > INFO 18:03:28,670 SVDiscovery - Clustering: LR split size 37 / 37 > maximal clique size 35 clique count 1 > INFO 18:03:28,670 SVDiscovery - Processing cluster > chr06:19498703-19499092 chr06:19555751-19556165 LR 26 > INFO 18:03:31,080 SVDiscovery - Clustering: Generating clusters for > 448 read pairs. > INFO 18:03:31,081 SVDiscovery - Processing cluster > chr06:19499200-19499561 chr06:19500107-19500495 LR 35 > INFO 18:03:33,336 SVDiscovery - Clustering: Generating clusters for 6 > read pairs. 
> INFO 18:03:33,336 SVDiscovery - Processing cluster > chr06:19503888-19504073 chr06:19504474-19504672 LR 448 > [GC 1858707K->1476197K(1971200K), 0.0162020 secs] > [Full GC 1476197K->235694K(1971200K), 0.1011760 secs] > ##### ERROR > ------------------------------------------------------------------------------------------ > ##### ERROR stack trace > java.lang.IllegalArgumentException: Read pair records have different > read groups: 588658: 561924SRR063635,998169SRR063658 > at org.broadinstitute.sv.util.ReadPair.create(ReadPair.java:120) > at > org.broadinstitute.sv.discovery.ReadPairClusteringAlgorithm.createReadPair(ReadPairClusteringAlgorithm.java:318) > at > org.broadinstitute.sv.discovery.ReadPairClusteringAlgorithm.loadReadPairs(ReadPairClusteringAlgorithm.java:572) > at > org.broadinstitute.sv.discovery.ReadPairClusteringAlgorithm.access$600(ReadPairClusteringAlgorithm.java:45) > at > org.broadinstitute.sv.discovery.ReadPairClusteringAlgorithm$ClusterIterator.advance(ReadPairClusteringAlgorithm.java:1170) > at > org.broadinstitute.sv.discovery.ReadPairClusteringAlgorithm$ClusterIterator.next(ReadPairClusteringAlgorithm.java:1123) > at > org.broadinstitute.sv.discovery.ReadPairClusteringAlgorithm$ClusterIterator.next(ReadPairClusteringAlgorithm.java:1096) > at > org.broadinstitute.sv.discovery.DeletionDiscoveryAlgorithm.processClusters(DeletionDiscoveryAlgorithm.java:337) > at > org.broadinstitute.sv.discovery.DeletionDiscoveryAlgorithm.runDiscovery(DeletionDiscoveryAlgorithm.java:192) > at > org.broadinstitute.sv.discovery.SVDiscoveryWalker.onTraversalDone(SVDiscoveryWalker.java:174) > at > org.broadinstitute.sv.discovery.SVDiscoveryWalker.onTraversalDone(SVDiscoveryWalker.java:45) > at > org.broadinstitute.sting.gatk.executive.Accumulator$StandardAccumulator.finishTraversal(Accumulator.java:129) > at > org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:76) > at > 
org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:234) > at > org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:113) > at > org.broadinstitute.sv.main.SVCommandLine.execute(SVCommandLine.java:105) > at > org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:221) > at org.broadinstitute.sv.main.SVCommandLine.main(SVCommandLine.java:67) > at org.broadinstitute.sv.main.SVDiscovery.main(SVDiscovery.java:21) > ##### ERROR > ------------------------------------------------------------------------------------------ > ##### ERROR A GATK RUNTIME ERROR has occurred (version 1.0-6121-g40e3165): > ##### ERROR > ##### ERROR Please visit the wiki to see if this is a known problem > ##### ERROR If not, please post the error, with stack trace, to the > GATK forum > ##### ERROR Visit our wiki for extensive documentation > http://www.broadinstitute.org/gsa/wiki > ##### ERROR Visit our forum to view answers to commonly asked > questions http://getsatisfaction.com/gsa > ##### ERROR > ##### ERROR MESSAGE: Read pair records have different read groups: > 588658: 561924SRR063635,998169SRR063658 > ##### ERROR > ------------------------------------------------------------------------------------------ > at org.broadinstitute.sting.queue.util.ShellJob.run(ShellJob.scala:24) > at > org.broadinstitute.sting.queue.engine.shell.ShellJobRunner.start(ShellJobRunner.scala:54) > at > org.broadinstitute.sting.queue.engine.FunctionEdge.start(FunctionEdge.scala:56) > at org.broadinstitute.sting.queue.engine.QGraph.runJobs(QGraph.scala:383) > at org.broadinstitute.sting.queue.engine.QGraph.run(QGraph.scala:123) > at > org.broadinstitute.sting.queue.QCommandLine.execute(QCommandLine.scala:111) > at > org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:221) > at > org.broadinstitute.sting.queue.QCommandLine$.main(QCommandLine.scala:57) > at 
org.broadinstitute.sting.queue.QCommandLine.main(QCommandLine.scala)
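
For anyone hitting the same error: the condition described in the reply (one read name appearing with more than one read group) can be checked before running discovery. The following is only a minimal sketch, not part of Genome STRiP. It assumes the alignments have been dumped to SAM text (for example with `samtools view`) and that the read group is carried in the standard `RG:Z:` optional tag. The function names `read_group_of` and `find_conflicting_read_names` are hypothetical helpers made up for this illustration.

```python
from collections import defaultdict


def read_group_of(sam_line):
    """Return the RG tag value of one SAM alignment line, or None if absent."""
    fields = sam_line.rstrip("\n").split("\t")
    # In SAM, the optional tags start at the 12th column (index 11).
    for tag in fields[11:]:
        if tag.startswith("RG:Z:"):
            return tag[5:]
    return None


def find_conflicting_read_names(sam_lines):
    """Map each read name seen with more than one read group to those groups.

    Hypothetical helper: sam_lines is any iterable of SAM text lines,
    possibly concatenated from several BAM files.
    """
    groups_by_name = defaultdict(set)
    for line in sam_lines:
        # Skip blank lines and SAM header lines, which start with "@".
        if not line.strip() or line.startswith("@"):
            continue
        name = line.split("\t", 1)[0]
        groups_by_name[name].add(read_group_of(line))
    # Keep only the names that were seen with two or more distinct read groups.
    return {
        name: sorted(groups, key=str)
        for name, groups in groups_by_name.items()
        if len(groups) > 1
    }
```

This keeps every read name in memory, so with hundreds of BAM files it is probably only practical on a small region at a time, or on the read name reported in the error message.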