I think the options --norc/--nofw are not correct for stranded libraries. As the option description quoted below shows, Bowtie (and subsequently TopHat and HISAT) assumes that read1/mate1 always comes from the original strand from which the cDNA or gDNA fragment was synthesized. However, this is not always true, because it depends on which library prep kit is used. For example, the TruSeq stranded RNA-Seq library prep recommended by Illumina creates a library in which read1/mate1 maps to the antisense strand, which contradicts the assumption behind Bowtie's --norc/--nofw. If I am wrong, please correct me.
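To make the library-type point concrete, here is a minimal sketch of how the transcript strand would be inferred from a SAM record depending on the protocol. The function name `transcript_strand` and the `read1_is_sense` parameter are only my illustration and are not part of Bowtie, TopHat or HISAT; only the SAM FLAG bits 0x10 (read aligned to the reverse strand) and 0x80 (second mate) are standard.

```python
def transcript_strand(flag: int, read1_is_sense: bool) -> str:
    """Infer the strand of the originating transcript from a SAM FLAG.

    read1_is_sense is a hypothetical switch describing the library:
    False for protocols such as TruSeq stranded, where read1 is antisense,
    True for protocols where read1 has the same sequence as the transcript.
    """
    is_reverse = bool(flag & 0x10)  # this read aligned to the reverse strand
    is_mate2 = bool(flag & 0x80)    # this read is the second mate
    # Mate 2 always has the opposite sense of mate 1.
    read_is_sense = read1_is_sense != is_mate2
    # A sense read aligned forward (or an antisense read aligned in reverse)
    # means the transcript lies on the '+' strand.
    return "+" if read_is_sense != is_reverse else "-"


# Read1 aligned in reverse (FLAG 83) from an antisense-read1 library:
print(transcript_strand(83, read1_is_sense=False))  # +
# The same FLAG interpreted with the "read1 is sense" assumption:
print(transcript_strand(83, read1_is_sense=True))   # -
```

The same alignment is assigned to opposite strands depending on the library assumption, which is why a fixed assumption about read1 matters here.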
The assumptions that Bowtie makes:
--fr: this means that the strands of a fragment are read from their 5' ends, so that, with respect to the Watson strand, the upstream mate/read always maps to the reference Watson/forward strand and the downstream one maps to the reference Crick strand. This is a correct assumption because the polymerase reads its template in the 3' -> 5' direction during sequencing.
--nofw: this assumes that read1/mate1 maps to the original strand, so that when this option is combined with --fr, all (read1, read2) pairs with the orientation (F,R) are discarded. This is not right!
For paired-end reads using --fr or --rf modes, --nofw and --norc apply to the forward and reverse-complement pair orientations. I.e. specifying --nofw and --fr will only find reads in the R/F orientation where mate 2 occurs upstream of mate 1 with respect to the forward reference strand.
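As I read the description above, the behaviour in --fr mode can be modelled with the small sketch below. This is only my model of the documented semantics, not Bowtie's code: the function `kept_pair_orientations` and the labels `mate1_upstream`/`mate2_upstream` are made up for illustration, and the --norc case is inferred by symmetry with the --nofw case that the description spells out.

```python
def kept_pair_orientations(nofw: bool = False, norc: bool = False) -> set:
    """Model which paired-end orientations are still reported in --fr mode.

    'mate1_upstream': mate 1 forward and upstream, mate 2 reverse-complemented
                      and downstream (the "forward" pair orientation, F/R).
    'mate2_upstream': mate 2 forward and upstream, mate 1 reverse-complemented
                      and downstream (the "reverse-complement" orientation, R/F).
    """
    kept = set()
    if not nofw:
        kept.add("mate1_upstream")
    if not norc:
        kept.add("mate2_upstream")
    return kept


# --fr --nofw: only pairs with mate 2 upstream of mate 1 remain.
print(kept_pair_orientations(nofw=True))  # {'mate2_upstream'}
```

Note that the filter is defined purely by which mate is upstream; nothing in it depends on the library protocol, which is the issue I am raising.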
typo: "cDNA of gDNA" -> "cDNA or gDNA"
Last edit: Morteza 2018-06-04