Menu

#7 Bias towards forward strand in msa.sh

1.0
open
nobody
None
2018-03-30
2018-03-30
No

The FindPrimers process method seems to have a bias towards mapping primers on the forward strand of the target.

For example, with the following template sequence…

>template
CTCGAGGGGCCTAGACATTGCCCTCCAGAGAGAGCACCCAACACCCTCCAGGCTTGACCGGCCAGGGTGTCCCCTTGACACCCGTCAAGCCTCACGA

…it should be possible to use the following commands to extract the sequence between the given primers:

msa.sh in=template.fa primer=GGGGCCTAGACATTGCCCTCCA out=fwd.sam
msa.sh in=template.fa primer=GACACCCTGGCCGGTCAAGCCT out=rev.sam
cutprimers.sh in=template.fa sam1=fwd.sam sam2=rev.sam include=t out=amplicon.fa

This outputs the following (primer alignments shown in uppercase):

> GGGGCCTAGACATTGCCCTCCAgagagagcacccaacaccctccaggcttgaccggccagggtgtccccttGACACCCGTCAAGCCT

The sequence GACACCCGTCAAGCCT is the best match for the reverse primer on the forward strand, but not the best match overall.

That is on the reverse strand, which would result in the following (primer alignments again in uppercase):

GGGGCCTAGACATTGCCCTCCAgagagagcacccaacaccctccAGGCTTGACCGGCCAGGGTGTC

This seems to be due to assignment of reverse strand mapping scores to the best-score variable for the forward strand. In BBMap_37.95.tar.gz, this is on line 183 of file 'bbmap/current/jgi/FindPrimers.java'.

It looks like this issue might be fixed by changing that line to:

score2=ss.score=ss.quickScore;

With the current code, none of my output SAM files contain any reads mapped to the reverse strand.

Discussion


Log in to post a comment.

MongoDB Logo MongoDB