Hi all,
I am analysing a dataset in which the first 6bp of single-end Illumina
reads behave differently from the rest of the read.
Is there a way to set the mismatch penalty for the first 6bp to zero,
while still allowing those 6bp to contribute to the final score when
they match the reference?
These 6bp at the beginning of the read behave in a binary way: they
either (a) are contiguous with the reference region that the rest of
the read aligns to, or (b) are 6 mismatching bases that are not
expected to align to the reference contiguously with the rest of the
read. Because a proportion of reads will fall into case (a), I would
like to keep these 6bp in the analysis rather than trimming them.
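To make the scoring I have in mind concrete, here is a minimal Python
sketch. It only illustrates the desired behaviour and is not the scoring
of any particular aligner: the function name score_ungapped, the match
bonus of 1 and the mismatch penalty of 3 are made-up values, and it
ignores gaps and base qualities.

```python
def score_ungapped(read, ref_window, free_prefix=6,
                   match_bonus=1, mismatch_penalty=3):
    """Score an ungapped alignment of a read against a reference window.

    Hypothetical scheme: matches earn `match_bonus` at every position,
    mismatches cost `mismatch_penalty` only after the first `free_prefix`
    bases, and mismatches inside that prefix cost nothing.
    """
    if len(read) != len(ref_window):
        raise ValueError("read and reference window must have equal length")
    score = 0
    for i, (read_base, ref_base) in enumerate(zip(read, ref_window)):
        if read_base == ref_base:
            score += match_bonus          # matches always count, prefix included
        elif i >= free_prefix:
            score -= mismatch_penalty     # mismatches are penalised only past the prefix
    return score


ref = "ACGTACGTACGTACGT"
read_a = "ACGTACGTACGTACGT"  # case (a): prefix is contiguous with the rest
read_b = "CATGCAGTACGTACGT"  # case (b): all 6 prefix bases mismatch
read_c = "ACGTACGTACTTACGT"  # one mismatch outside the prefix (position 11)

print(score_ungapped(read_a, ref))  # 16 matches
print(score_ungapped(read_b, ref))  # 10 matches, prefix mismatches are free
print(score_ungapped(read_c, ref))  # 15 matches minus one penalty
```

Under this scheme a case (a) read scores higher than a case (b) read, but
a case (b) read is not penalised for its first 6bp.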
Looking forward to hearing from you,
Cheers,
Albert.