Renato Oliveira

OverlapPER: Overlap Paired-End Reads

OverlapPER merges overlapping paired-end reads.


Figure 1 - Merging process for paired-end reads. (A) The OverlapPER script first finds a seed (a short sequence in one of the reads, shown in bold). (B) The reads are positioned according to the seed found and the total overlap is determined. (C) The total overlap is analyzed. If there is a hit in the alignment, the identity score is incremented. If a base is aligned to a gap, the identity score is incremented. In case of a mismatch in the alignment, if the next 5 bases (tolerance) are identical, the mismatch score is incremented; otherwise a gap insertion is repeated 4 times until the next 5 bases are identical. Nucleotides in bold represent a hit in the alignment.
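The steps in Figure 1 can be illustrated with a deliberately simplified sketch. It is not the OverlapPER implementation: it assumes the reverse read must be reverse-complemented first, takes the seed from the first 8 bases of that reverse complement (the seed length and its position are assumptions), and only counts hits and mismatches column by column. The gap insertion and 5-base tolerance described in the caption are left out. The names `Overlap`, `find_overlap` and `merge` are hypothetical.

```python
from typing import NamedTuple, Optional


class Overlap(NamedTuple):
    offset: int      # position in the forward read where the overlap starts
    length: int      # number of aligned columns in the overlap
    hits: int        # columns with identical bases
    mismatches: int  # columns with different bases


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T, N)."""
    return seq.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def find_overlap(forward: str, reverse: str, seed_len: int = 8) -> Optional[Overlap]:
    """Locate a seed, position the reads, and score the overlap (no gaps)."""
    rc = reverse_complement(reverse)
    seed = rc[:seed_len]                  # (A) seed: an assumed choice
    offset = forward.find(seed)           # first exact seed match only
    if offset == -1:
        return None
    # (B) the overlap runs from the seed position to the end of the shorter read
    length = min(len(forward) - offset, len(rc))
    # (C) count identical columns; everything else is a mismatch in this sketch
    hits = sum(a == b for a, b in zip(forward[offset:offset + length], rc[:length]))
    return Overlap(offset, length, hits, length - hits)


def merge(forward: str, reverse: str, overlap: Overlap) -> str:
    """Join the reads, keeping the forward read's bases inside the overlap."""
    rc = reverse_complement(reverse)
    return forward + rc[overlap.length:]


if __name__ == "__main__":
    fragment = "ACGTTGCAAGGCTTAACCGGATCGATTGCACGTAGCTAGGCTTACGGATC"
    fwd = fragment[:35]                        # forward read
    rev = reverse_complement(fragment[15:])    # reverse read, as sequenced
    ov = find_overlap(fwd, rev)
    print(ov)
    print(merge(fwd, rev, ov) == fragment)
```

Keeping the forward read's bases at mismatching positions is an arbitrary choice made for brevity; a real merger would typically consult base qualities.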

Requirements

  • Python 3+

How to Cite

If you use OverlapPER, please cite:

Oliveira, R. R. M., Nunes, G. L., de Lima, T. G. L., Oliveira, G., & Alves, R. (2018). PIPEBAR and OverlapPER: tools for a fast and accurate DNA barcoding analysis and paired-end assembly. BMC Bioinformatics, 19(1). doi:10.1186/s12859-018-2307-y

Usage

python overlapper.py -f <forward_reads.fastq> -r <reverse_reads.fastq> --mo <min_overlap> --ms <min_similarity>
Ex: python overlapper.py -f forward_reads.fastq -r reverse_reads.fastq --mo 15 --ms 0.9
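When many samples need merging, the same command line can be assembled from Python. The sketch below only builds the argument list from the flags shown above; the helper name `build_command` is hypothetical, and it assumes `overlapper.py` sits in the current working directory.

```python
import subprocess
import sys


def build_command(forward, reverse, min_overlap=15, min_similarity=0.9,
                  script="overlapper.py"):
    """Return the argument list for one OverlapPER run (script path is assumed)."""
    return [
        sys.executable, script,
        "-f", forward,
        "-r", reverse,
        "--mo", str(min_overlap),
        "--ms", str(min_similarity),
    ]


if __name__ == "__main__":
    cmd = build_command("forward_reads.fastq", "reverse_reads.fastq")
    print(" ".join(cmd))
    # Uncomment to actually run the tool; check=True raises if it exits non-zero.
    # subprocess.run(cmd, check=True)
```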

OverlapPER: Overlapping Paired-End Reads
Usage: python overlapper.py [options]
Options:
  -h      Show this message.
  -f      Path to forward reads.
  -r      Path to reverse reads.
  --mo    Length of the minimum overlap between the paired reads (default is 25).
  --ms    Percentage of the accepted minimum similarity in an overlap region of two paired reads (default is 0.9).
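To make the two thresholds concrete, the sketch below shows one plausible way `--mo` and `--ms` could decide whether a candidate overlap is kept. It treats `--ms` as a fraction between 0 and 1 (so 0.9 means 90%) and uses inclusive comparisons; both points are assumptions, as is the helper name `passes_thresholds`.

```python
def passes_thresholds(overlap_len, hits, min_overlap=25, min_similarity=0.9):
    """Return True if an overlap is long enough and similar enough.

    overlap_len: number of aligned columns in the overlap
    hits: number of identical columns
    The defaults mirror the documented defaults of --mo and --ms.
    """
    if overlap_len <= 0 or overlap_len < min_overlap:
        return False
    # Similarity is the fraction of identical columns (assumed definition).
    return hits / overlap_len >= min_similarity


if __name__ == "__main__":
    print(passes_thresholds(30, 28))                   # long enough, about 93% identical
    print(passes_thresholds(20, 20))                   # identical but shorter than 25
    print(passes_thresholds(20, 20, min_overlap=15))   # accepted with --mo 15
```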

Contributors

  • Renato Oliveira - renato.renison@gmail.com
  • Gisele Nunes - gilopesnunes@gmail.com
  • Ronnie Alves - ronnie.alves@itv.org
  • Claudomiro Sales - claudomiro.sales@gmail.com
  • Guilherme Oliveira - guilherme.oliveira@itv.org

Contacts

  • renato.renison@gmail.com

OverlapPER performance

**Table 1 - Results obtained by OverlapPER, PEAR, FLASH and COPE.**

Parameters: minimum overlap of 10 bp and minimum identity of 90%. Mean identity, mismatch and gap openings are shown in comparison to the reference genome.
