#251 --trim5 bases are reported as "M"atching in CIGAR

Ian Davis

I'm using Bowtie 2.0.0-beta7 on 64-bit Ubuntu Linux. I have single-end Illumina reads with an 11 bp adapter on the 5' end of each read. I passed "--trim5 11" to Bowtie. In the output SAM file, the full (untrimmed) read sequence is reported. The 12th base (the first base of real data) is aligned to the correct location in the genome, but the SAM record has the 11 adapter bases aligned to the 11 genome bases preceding that position, even though they don't match.

This would be OK if the CIGAR string reported those 11 leading bases as hard- or soft-clipped (I'm not sure of the difference), but it doesn't: they are reported as matching ("M") the genome at those positions! To me, this is extremely surprising behavior, to say the least. It seems like a bug, but if it was intentional, I'd be very interested in the rationale. Either excluding the trimmed bases from the output altogether or modifying the CIGAR string to mark the trimmed bases as clipped seems like an appropriate fix.
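
To make the second suggestion concrete, here is a minimal sketch of what marking the trimmed bases as soft-clipped could look like for one record. It is not Bowtie 2 code: the function name `soft_clip_5prime` is hypothetical, and it assumes a forward-strand alignment whose CIGAR contains no hard-clip (H) or padding (P) operations.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
QUERY_OPS = set("MIS=X")  # operations that consume read bases
REF_OPS = set("MDN=X")    # operations that consume reference bases


def soft_clip_5prime(pos, cigar, n):
    """Return (new_pos, new_cigar) with the first n read bases soft-clipped.

    pos is the 1-based leftmost position from the SAM POS field.
    Hypothetical helper: assumes a forward-strand record without H or P ops.
    """
    if n == 0:
        return pos, cigar
    remaining = n
    rest = []
    for length_text, op in CIGAR_OP.findall(cigar):
        length = int(length_text)
        if op not in QUERY_OPS and op not in REF_OPS:
            raise ValueError("unsupported CIGAR operation: " + op)
        if remaining > 0:
            # Read-consuming ops are clipped base by base; a deletion that
            # falls inside the clipped region is dropped as a whole.
            used = min(length, remaining) if op in QUERY_OPS else length
            if op in QUERY_OPS:
                remaining -= used
            if op in REF_OPS:
                pos += used  # alignment now starts further right
            length -= used
        if length:
            rest.append((length, op))
    if remaining:
        raise ValueError("read is shorter than the requested clip")
    # An alignment cannot begin with a deletion or skipped region.
    while rest and rest[0][1] in "DN":
        pos += rest.pop(0)[0]
    clip = n
    # Merge with any soft clip that was already present.
    if rest and rest[0][1] == "S":
        clip += rest.pop(0)[0]
    return pos, str(clip) + "S" + "".join(str(l) + o for l, o in rest)
```

For a 50-base read reported as `50M` at POS 100, clipping 11 bases gives `11S39M` at POS 111, so the 12th base still lands on the same genome position while the adapter bases are no longer reported as aligned.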




  • Ben Langmead

    Ben Langmead - 2012-12-14

    Hi Ian,

    I am having trouble reproducing this. When I use --trim5, the trimmed bases are omitted from the SAM record; that is, the SEQ field contains the trimmed read sequence. Can you tell me exactly which version you're using, and perhaps provide some example data?
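
One way to check which of the two behaviors a given SAM file shows is to compare the length of SEQ, and the number of read bases the CIGAR accounts for, against the original read length. The sketch below is only an illustration: the function name `describe_trimming` is hypothetical, and the original read length has to be supplied by the caller (for example, from the input FASTQ).

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def describe_trimming(sam_line, original_length, trim5):
    """Summarize whether one SAM alignment line looks 5'-trimmed.

    Hypothetical helper: original_length is the untrimmed read length and
    trim5 is the value that was passed to --trim5.
    """
    fields = sam_line.rstrip("\n").split("\t")
    cigar, seq = fields[5], fields[9]  # SAM columns 6 (CIGAR) and 10 (SEQ)
    # M, I, S, = and X are the CIGAR operations that consume read bases.
    cigar_query_len = sum(
        int(length) for length, op in CIGAR_OP.findall(cigar) if op in "MIS=X"
    )
    return {
        "seq_len": len(seq),
        "cigar_query_len": cigar_query_len,
        "seq_is_trimmed": len(seq) == original_length - trim5,
    }
```

With a 50-base read and --trim5 11, a record whose SEQ has 39 bases matches the trimmed behavior, while a 50-base SEQ with a `50M` CIGAR matches the behavior in the original report.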


  • Ben Langmead

    Ben Langmead - 2012-12-14
    • status: open --> pending
