From: Heng Li <lh...@sa...> - 2013-07-18 18:19:56
That sounds like a bwa-backtrack bug. Could you reproduce the problem with a smaller sample file?

Thanks,
Heng

On Jul 18, 2013, at 1:47 PM, Gavin Sherlock <gsh...@st...> wrote:

> Alas, as soon as it hits the first problem read, FixMateInformation dies with a SAM validation error:
>
> Exception in thread "main" net.sf.samtools.SAMFormatException: SAM validation error: ERROR: Record 1, Read name MAGNUM:3:25:5065:10022#0, Alignment start should != 0 because reference name != *.
>
> The data were originally aligned using bwa 0.7.5a
>
> I can run FixMateInformation with VALIDATION_STRINGENCY set to silent, but the underlying problem then remains in place, and the resulting file still has the same issue, which unfortunately seems to be unfixable.
>
> When I look at the data for the offending read, I see:
>
> MAGNUM:3:25:5065:10022#0 97 NC_002516.2 0 37 36M = 455 491 ACCTGGGTTGACGACTTGAGGTCGCAGTGACCCCGT IIIIIIIIIIIIIIIIHIIIIHIIIGIIGIIIIIII X0:i:1 X1:i:0 MD:Z:0 RG:Z:GSRG000012 XG:i:0 AM:i:37 NM:i:0 SM:i:37 XM:i:1 XO:i:0 MQ:i:37 XT:A:U
> MAGNUM:3:25:5065:10022#0 145 NC_002516.2 455 37 36M = 0 -491 CCGCCTTTCCAATCTTTGGGGGATATCCGTGTCCGT IHGGIIHGIDIHGEIIIHIIIIIIIIIIIIIIIIII X0:i:1 X1:i:0 MD:Z:36 RG:Z:GSRG000012 XG:i:0 AM:i:37 NM:i:0 SM:i:37 XM:i:0 XO:i:0 MQ:i:37 XT:A:U
>
> Cheers,
> Gavin
>
> On Jul 18, 2013, at 10:27 AM, Alec Wysoker wrote:
>
>> Try Picard FixMateInformation
>>
>> On Jul 18, 2013, at 12:38 PM, Gavin Sherlock <gsh...@st...> wrote:
>>
>>> Thanks - I ran CleanSam, but still get exactly the same error when I try to run CollectAlignmentSummaryMetrics on the cleaned file. I should note that when I ran CleanSam, there were a handful (~30) of errors that ended with:
>>>
>>> Alignment start should be != 0 because reference name != *.
>>>
>>> or:
>>>
>>> Mate Alignment start should be != 0 because reference name != *.
>>>
>>> The same reads were mentioned for the two error types.
>>>
>>> Cheers,
>>> Gavin
>>>
>>> On Jul 18, 2013, at 8:43 AM, Alec Wysoker wrote:
>>>
>>>> Hi Gavin,
>>>>
>>>> Running ValidateSamFile should tell you what the problem is. My guess is that you have a read that maps off the end of the reference. You could run CleanSam, as described here: http://sourceforge.net/apps/mediawiki/picard/index.php?title=Main_Page#Q:___A_Picard_program_complains_that_CIGAR_M_operator_maps_off_the_end_of_reference.__I_want_this_record_to_be_treated_as_valid_despite_the_fact_that_the_alignment_end_is_greater_than_the_length_of_the_reference_sequence.
>>>>
>>>> Setting VALIDATION_STRINGENCY=SILENT won't help in this case because CollectAlignmentSummaryMetrics tries to count mismatches.
>>>>
>>>> -Alec
>>>>
>>>> On Jul 16, 2013, at 12:18 PM, Gavin Sherlock <gsh...@st...> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> When running:
>>>>>
>>>>> java -jar /usr/local/bin/CollectMultipleMetrics.jar I=/Volumes/Data/sherlock/Pseudomonas/110218_MAGNUM_00056_FC62DJ5_L3.sf.dedup.bam O=/Volumes/Data/sherlock/Pseudomonas/110218_MAGNUM_00056_FC62DJ5_L3.sf.dedup.statistics REFERENCE_SEQUENCE=/Volumes/Data/sherlock/Pseudomonas/PAO1_NC_002516.fasta VALIDATION_STRINGENCY=SILENT PROGRAM=CollectAlignmentSummaryMetrics PROGRAM=QualityScoreDistribution PROGRAM=MeanQualityByCycle
>>>>>
>>>>> I get the following error:
>>>>>
>>>>> [Tue Jul 16 09:16:28 PDT 2013] net.sf.picard.analysis.CollectMultipleMetrics INPUT=/Volumes/Data/sherlock/Pseudomonas/110218_MAGNUM_00056_FC62DJ5_L3.sf.dedup.bam REFERENCE_SEQUENCE=/Volumes/Data/sherlock/Pseudomonas/PAO1_NC_002516.fasta OUTPUT=/Volumes/Data/sherlock/Pseudomonas/110218_MAGNUM_00056_FC62DJ5_L3.sf.dedup.statistics PROGRAM=[CollectAlignmentSummaryMetrics, CollectInsertSizeMetrics, QualityScoreDistribution, MeanQualityByCycle, CollectAlignmentSummaryMetrics, QualityScoreDistribution, MeanQualityByCycle] VALIDATION_STRINGENCY=SILENT ASSUME_SORTED=true STOP_AFTER=0 VERBOSITY=INFO QUIET=false COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false
>>>>> [Tue Jul 16 09:16:28 PDT 2013] Executing as <name_removed> on Mac OS X 10.8.4 x86_64; Java HotSpot(TM) 64-Bit Server VM 1.6.0_51-b11-457-11M4509; Picard version: 1.95(1496)
>>>>> [Tue Jul 16 09:16:28 PDT 2013] net.sf.picard.analysis.CollectMultipleMetrics done. Elapsed time: 0.00 minutes.
>>>>> Runtime.totalMemory()=85000192
>>>>> To get help, see http://picard.sourceforge.net/index.shtml#GettingHelp
>>>>> Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: -1
>>>>>   at net.sf.picard.analysis.AlignmentSummaryMetricsCollector$GroupAlignmentSummaryMetricsPerUnitMetricCollector$IndividualAlignmentSummaryMetricsCollector.collectQualityData(AlignmentSummaryMetricsCollector.java:326)
>>>>>   at net.sf.picard.analysis.AlignmentSummaryMetricsCollector$GroupAlignmentSummaryMetricsPerUnitMetricCollector$IndividualAlignmentSummaryMetricsCollector.addRecord(AlignmentSummaryMetricsCollector.java:224)
>>>>>   at net.sf.picard.analysis.AlignmentSummaryMetricsCollector$GroupAlignmentSummaryMetricsPerUnitMetricCollector.acceptRecord(AlignmentSummaryMetricsCollector.java:147)
>>>>>   at net.sf.picard.analysis.AlignmentSummaryMetricsCollector$GroupAlignmentSummaryMetricsPerUnitMetricCollector.acceptRecord(AlignmentSummaryMetricsCollector.java:122)
>>>>>   at net.sf.picard.metrics.MultiLevelCollector$AllReadsDistributor.acceptRecord(MultiLevelCollector.java:174)
>>>>>   at net.sf.picard.metrics.MultiLevelCollector.acceptRecord(MultiLevelCollector.java:277)
>>>>>   at net.sf.picard.analysis.AlignmentSummaryMetricsCollector.acceptRecord(AlignmentSummaryMetricsCollector.java:66)
>>>>>   at net.sf.picard.analysis.CollectAlignmentSummaryMetrics.acceptRead(CollectAlignmentSummaryMetrics.java:112)
>>>>>   at net.sf.picard.analysis.SinglePassSamProgram.makeItSo(SinglePassSamProgram.java:119)
>>>>>   at net.sf.picard.analysis.CollectMultipleMetrics.doWork(CollectMultipleMetrics.java:136)
>>>>>   at net.sf.picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:177)
>>>>>   at net.sf.picard.cmdline.CommandLineProgram.instanceMainWithExit(CommandLineProgram.java:119)
>>>>>   at net.sf.picard.analysis.CollectMultipleMetrics.main(CollectMultipleMetrics.java:100)
>>>>>
>>>>> Does anyone have any suggestions as to what I might be doing wrong, or how I might troubleshoot it?
>>>>>
>>>>> Many thanks,
>>>>> Gavin
>>>>> _______________________________________________
>>>>> Samtools-help mailing list
>>>>> Sam...@li...
>>>>> https://lists.sourceforge.net/lists/listinfo/samtools-help

--
The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE.
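
For anyone trying to produce the smaller sample file asked for above, the two validation messages quoted in the thread (alignment start or mate alignment start equal to 0 while the reference name is not `*`) can be checked with a few lines of Python. This is only a sketch under stated assumptions: the function name `find_zero_start_records` is hypothetical, it is not Picard's own validation code, and it reads text SAM lines (for example from `samtools view`), not BAM directly. The two sample records are the ones from the message above, trimmed to their first eleven columns.

```python
def find_zero_start_records(sam_lines):
    """Return (read_name, problem) pairs for records that name a reference
    but report position 0 for the read or for its mate."""
    problems = []
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        # SAM is tab-delimited; split() also copes with the space-flattened
        # records shown in the thread because only the first 8 columns are used.
        fields = line.split()
        if len(fields) < 11:
            continue  # not a complete alignment record
        qname, rname, pos = fields[0], fields[2], int(fields[3])
        rnext, pnext = fields[6], int(fields[7])
        if rname != "*" and pos == 0:
            problems.append((qname, "alignment start is 0"))
        # "=" in the RNEXT column means "same reference as RNAME".
        mate_ref = rname if rnext == "=" else rnext
        if mate_ref != "*" and pnext == 0:
            problems.append((qname, "mate alignment start is 0"))
    return problems


# The two records quoted in the thread, first eleven columns only.
records = [
    "MAGNUM:3:25:5065:10022#0 97 NC_002516.2 0 37 36M = 455 491 "
    "ACCTGGGTTGACGACTTGAGGTCGCAGTGACCCCGT IIIIIIIIIIIIIIIIHIIIIHIIIGIIGIIIIIII",
    "MAGNUM:3:25:5065:10022#0 145 NC_002516.2 455 37 36M = 0 -491 "
    "CCGCCTTTCCAATCTTTGGGGGATATCCGTGTCCGT IHGGIIHGIDIHGEIIIHIIIIIIIIIIIIIIIIII",
]

problems = find_zero_start_records(records)
for name, problem in problems:
    print(name, problem)
```

Collecting the read names this reports and extracting those pairs from the original FASTQ files would be one way to build a small test case, on the assumption that the same reads trigger the problem when realigned on their own.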