From: Somya K. <so...@st...> - 2016-07-11 21:06:47
I have lots of insertions and deletions in my output files. The sample given at http://samtools.sourceforge.net/cns0.shtml is in this format:

seq2 156 * +AG/+AG 71 252 99 11 +AG * 3 8 0
seq2 157 A A 57 0 99 10 .$.$........ 97<<<<<<<<
seq2 158 A R 18 18 99 8 GG$G..... <;;<<<<<
seq2 159 T T 8 0 99 7 A$A$..... 3:<<<<<

About this example, they say: "The line with the 3rd column a star indicates that the AG insertion is supported by 3 reads; 8 reads agree with the reference according to the raw alignment; no reads support a third allele. However, SAMtools infers a AG homozygous insertion with a high score 252 because when we realign the reads with the prior of an insertion, we found that the 8 reads mapped without gaps are due to a tandem repeat."

In my output, however, each indel line has two more columns after the three counts (the 3 8 0 in their sample). Do you know what the columns after the tab-delimited * and -G mean, in particular the last two?

Here is a sample of my output data for your reference:

seq1 3404636 * */-G 198 198 58 49 * -G 46 3 0 0 0
seq1 3978142 * */+T 76 76 58 74 * +T 73 1 0 0 0
seq1 3996202 * */+G 51 51 56 63 * +G 62 1 0 0 0

--
Stanford University Class of 2019
so...@st...
(408)888-1430
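To make the column mismatch concrete, here is a minimal Python sketch that splits an indel line (third column `*`) into named fields and keeps any surplus columns separately. The field names are my own labels, based on the quoted description (reads supporting the first allele, the second allele, and a third allele) and on my reading of the pileup consensus format; they are assumptions, not official names. The helper `parse_indel_line` is hypothetical, and the sketch deliberately does not guess what the two extra columns mean.

```python
# Assumed labels for the 13 columns of an indel line in the documented sample.
# These names are illustrative, not taken from SAMtools itself.
INDEL_FIELDS = [
    "chrom", "pos", "ref", "genotype", "consensus_qual", "snp_qual",
    "mapping_qual", "depth", "allele1", "allele2",
    "reads_allele1", "reads_allele2", "reads_other",
]

# Columns that hold integers under the assumed layout.
INT_FIELDS = (
    "pos", "consensus_qual", "snp_qual", "mapping_qual", "depth",
    "reads_allele1", "reads_allele2", "reads_other",
)


def parse_indel_line(line):
    """Return a dict for an indel line, or None for any other line."""
    cols = line.split()
    # Indel lines carry a star in the third column; skip everything else.
    if len(cols) < len(INDEL_FIELDS) or cols[2] != "*":
        return None
    record = dict(zip(INDEL_FIELDS, cols))
    for key in INT_FIELDS:
        record[key] = int(record[key])
    # Anything past column 13 is kept verbatim; its meaning is not assumed.
    record["extra"] = cols[len(INDEL_FIELDS):]
    return record


if __name__ == "__main__":
    documented = "seq2 156 * +AG/+AG 71 252 99 11 +AG * 3 8 0"
    mine = "seq1 3404636 * */-G 198 198 58 49 * -G 46 3 0 0 0"
    for text in (documented, mine):
        rec = parse_indel_line(text)
        print(rec["chrom"], rec["pos"], rec["genotype"],
              rec["reads_allele1"], rec["reads_allele2"],
              rec["reads_other"], rec["extra"])
```

Run on the two lines above, the documented sample yields an empty `extra` list, while my line yields two leftover values, which are the columns the question is about.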