From: David N. <Dav...@hc...> - 2012-12-11 18:41:54
Hello Jesse,

Yes, I've seen this error before. You'll need to replace the header with one that is consistent across all BAM files. I believe there's a Picard app that does this.

The issue is that STP (SamTranscriptomeParser) generates a custom header on the fly, based on the end of the last alignment on each chromosome, so the chromosome lengths differ from file to file. For future parsings, you can have STP use a particular header with the -r argument.

-cheers, D

On 12/11/12 11:35 AM, "Jesse Rowley" <jes...@u2...> wrote:

>Hi David,
>I aligned my RNA-seq reads with
>@align -novoalign [-o SAM -r All 50 -a AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC] -g hg19EnsTransRad46Num100kMin10SplicesChrPhiXAdaptr -i *.txt.gz" > cmd.txt
>This was followed by SamTranscriptomeParser. Now I am running GATK.
>
>I get the following error with GATK:
>##### ERROR MESSAGE: Input files reads and reference have incompatible contigs: Found contigs with the same name but different lengths:
>##### ERROR contig reads = chr20 / 62940766
>##### ERROR contig reference = chr20 / 63025520.
>##### ERROR reads contigs = [chr20, chr21, chr22, chr5, chr6, chr7, chr8, chr9, chr1, chr2, chr3, chr4, chr10, chr11, chr12, chr13, chrY, chrX, chr15, chr14, chr17, chr16, chr19, chr18, chrM]
>##### ERROR reference contigs = [chr1, chr2, chr3, chr4, chr5, chr6, chr7, chr8, chr9, chr10, chr11, chr12, chr13, chr14, chr15, chr16, chr17, chr18, chr19, chr20, chr21, chr22, chrX, chrY, chrM]
>
>These are different as shown below - is it because these are changed by
>maketranscriptome, or was this transcriptome made with a different
>reference? Should I just revise my BAM header to match, or would that be
>problematic? My strategy is to use GATK in the @annot pipeline to
>annotate variants (first splitting out splice junction reads prior to
>indel realignment since GATK doesn't like those, and then remerging them
>before annotation).
>
>Fasta.fai:
>chr1 249250621 6 50 51
>chr2 243199373 254235646 50 51
>chr3 198022430 502299013 50 51
>chr4 191154276 704281898 50 51
>chr5 180915260 899259266 50 51
>chr6 171115067 1083792838 50 51
>chr7 159138663 1258330213 50 51
>chr8 146364022 1420651656 50 51
>chr9 141213431 1569942965 50 51
>chr10 135534747 1713980672 50 51
>chr11 135006516 1852226121 50 51
>chr12 133851895 1989932775 50 51
>chr13 115169878 2126461715 50 51
>chr14 107349540 2243934998 50 51
>chr15 102531392 2353431536 50 51
>chr16 90354753 2458013563 50 51
>chr17 81195210 2550175419 50 51
>chr18 78077248 2632994541 50 51
>chr19 59128983 2712633341 50 51
>chr20 63025520 2772944911 50 51
>chr21 48129895 2837230949 50 51
>chr22 51304566 2886323449 50 51
>chrX 155270560 2938654113 50 51
>chrY 59373566 3097030091 50 51
>chrM 16571 3157591135 50 51
>
>Bam header:
>@HD VN:1.0 SO:coordinate
>@SQ SN:chr20 LN:62940766 AS:AS:hgEnsTransRad46Num100kMin10SplicesChrPhiXAdaptr
>@SQ SN:chr21 LN:48129227 AS:AS:hgEnsTransRad46Num100kMin10SplicesChrPhiXAdaptr
>@SQ SN:chr22 LN:51253224 AS:AS:hgEnsTransRad46Num100kMin10SplicesChrPhiXAdaptr
>@SQ SN:chr5 LN:180910700 AS:AS:hgEnsTransRad46Num100kMin10SplicesChrPhiXAdaptr
>@SQ SN:chr6 LN:170933915 AS:AS:hgEnsTransRad46Num100kMin10SplicesChrPhiXAdaptr
>@SQ SN:chr7 LN:159102511 AS:AS:hgEnsTransRad46Num100kMin10SplicesChrPhiXAdaptr
>@SQ SN:chr8 LN:146307037 AS:AS:hgEnsTransRad46Num100kMin10SplicesChrPhiXAdaptr
>@SQ SN:chr9 LN:141156475 AS:AS:hgEnsTransRad46Num100kMin10SplicesChrPhiXAdaptr
>@SQ SN:chr1 LN:249250020 AS:AS:hgEnsTransRad46Num100kMin10SplicesChrPhiXAdaptr
>@SQ SN:chr2 LN:243198952 AS:AS:hgEnsTransRad46Num100kMin10SplicesChrPhiXAdaptr
>@SQ SN:chr3 LN:197957767 AS:AS:hgEnsTransRad46Num100kMin10SplicesChrPhiXAdaptr
>@SQ SN:chr4 LN:191041935 AS:AS:hgEnsTransRad46Num100kMin10SplicesChrPhiXAdaptr
>@SQ SN:chr10 LN:135528848 AS:AS:hgEnsTransRad46Num100kMin10SplicesChrPhiXAdaptr
>@SQ SN:chr11 LN:134900590 AS:AS:hgEnsTransRad46Num100kMin10SplicesChrPhiXAdaptr
>@SQ SN:chr12 LN:133811418 AS:AS:hgEnsTransRad46Num100kMin10SplicesChrPhiXAdaptr
>@SQ SN:chr13 LN:115117205 AS:AS:hgEnsTransRad46Num100kMin10SplicesChrPhiXAdaptr
>@SQ SN:chrY LN:59014245 AS:AS:hgEnsTransRad46Num100kMin10SplicesChrPhiXAdaptr
>@SQ SN:chrX LN:155267065 AS:AS:hgEnsTransRad46Num100kMin10SplicesChrPhiXAdaptr
>@SQ SN:chr15 LN:102530814 AS:AS:hgEnsTransRad46Num100kMin10SplicesChrPhiXAdaptr
>@SQ SN:chr14 LN:107295415 AS:AS:hgEnsTransRad46Num100kMin10SplicesChrPhiXAdaptr
>@SQ SN:chr17 LN:81198313 AS:AS:hgEnsTransRad46Num100kMin10SplicesChrPhiXAdaptr
>@SQ SN:chr16 LN:90291808 AS:AS:hgEnsTransRad46Num100kMin10SplicesChrPhiXAdaptr
>@SQ SN:chr19 LN:59126095 AS:AS:hgEnsTransRad46Num100kMin10SplicesChrPhiXAdaptr
>@SQ SN:chr18 LN:78015294 AS:AS:hgEnsTransRad46Num100kMin10SplicesChrPhiXAdaptr
>@SQ SN:chrM LN:26571 AS:AS:hgEnsTransRad46Num100kMin10SplicesChrPhiXAdaptr
>@RG ID:unknownReadGroup SM:unknownSample
>@PG ID:SamTranscriptomeParser CL: args -f 9657X1_121116_SN141_0600_AD1FYUACXX_1.sam.gz
>
>thanks for your input,
>Jesse Rowley
>Division of Pulmonary Medicine
>University of Utah School of Medicine
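
For anyone hitting the same GATK error: the length mismatch discussed in this thread can be checked ahead of time by comparing the @SQ lines of the BAM header against the reference .fai index. The following is only a minimal Python sketch, not part of STP, Picard, or GATK. It assumes the header has already been dumped to text (for example with `samtools view -H`), and the function names `parse_fai`, `parse_sq`, `find_mismatches`, and `patch_lengths` are hypothetical helpers made up for this illustration. The sample values are the chr20 and chr21 entries quoted in the thread.

```python
import re


def parse_fai(fai_text):
    """Return {contig: length} from .fai text (uses the first two columns)."""
    lengths = {}
    for line in fai_text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            lengths[fields[0]] = int(fields[1])
    return lengths


def parse_sq(header_text):
    """Return [(contig, length)] from the @SQ lines, in header order."""
    contigs = []
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Each field looks like TAG:VALUE; split only on the first colon.
        tags = dict(f.split(":", 1) for f in line.split()[1:] if ":" in f)
        contigs.append((tags["SN"], int(tags["LN"])))
    return contigs


def find_mismatches(header_text, fai_text):
    """List (contig, header_length, reference_length) where the two disagree."""
    ref = parse_fai(fai_text)
    return [(name, ln, ref.get(name))
            for name, ln in parse_sq(header_text)
            if ref.get(name) != ln]


def patch_lengths(header_text, fai_text):
    """Return header text with each @SQ LN value replaced by the .fai length.

    Contigs missing from the .fai are left untouched, and the contig
    order of the header is not changed.
    """
    ref = parse_fai(fai_text)
    out = []
    for line in header_text.splitlines():
        if line.startswith("@SQ"):
            m = re.search(r"\bSN:(\S+)", line)
            if m and m.group(1) in ref:
                new_ln = "LN:%d" % ref[m.group(1)]
                line = re.sub(r"\bLN:\d+", new_ln, line, count=1)
        out.append(line)
    return "\n".join(out) + "\n"


# Two contigs taken from the header and Fasta.fai quoted in this thread.
header = ("@HD\tVN:1.0\tSO:coordinate\n"
          "@SQ\tSN:chr20\tLN:62940766\n"
          "@SQ\tSN:chr21\tLN:48129227\n")
fai = ("chr20\t63025520\t2772944911\t50\t51\n"
       "chr21\t48129895\t2837230949\t50\t51\n")

for name, in_header, in_ref in find_mismatches(header, fai):
    print(name, "header:", in_header, "reference:", in_ref)
```

The patched text from `patch_lengths` would still have to be written back into each BAM with a header-replacement tool, and it only addresses the lengths. The header in this thread also lists contigs in a different order from the reference, which this sketch deliberately leaves alone.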