Hi Brian,
Could you please guide me on removing duplicates from paired-end data?
I want to modify this command: dedupe.sh in=X.fa out=Y.fa outd=duplicates.fa
I am aware that this does not currently support containment.
However, I would run into a lot more trouble later on if my paired reads no longer match up after deduplicating each file individually.
Hello Niranjan,
You can use Dedupe like this:
dedupe.sh in1=read1.fq in2=read2.fq out=deduped.fq outd=duplicates ac=f
The pairs will be kept together. The output file will be interleaved. You can de-interleave it like this:
reformat.sh in=deduped.fq out1=deduped1.fq out2=deduped2.fq
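If you want to confirm that the two de-interleaved files are still in sync, a rough sanity check is to compare their read counts. The sketch below is only an illustration and not part of Dedupe: count_reads and pairs_match are hypothetical helper names, and it assumes plain, uncompressed FASTQ with exactly four lines per record.

```shell
# Hypothetical helpers, not part of BBTools.
# Assumes uncompressed FASTQ with exactly 4 lines per record.

# Print the number of records in a FASTQ file.
count_reads() {
    awk 'END { print int(NR / 4) }' "$1"
}

# Succeed only if both files contain the same number of records.
pairs_match() {
    [ "$(count_reads "$1")" -eq "$(count_reads "$2")" ]
}

# Example, using the file names from the reformat.sh command:
# pairs_match deduped1.fq deduped2.fq && echo "read counts match"
```

Equal counts do not prove that every read sits next to its mate, but a mismatch is a quick sign that something went wrong.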
-Brian