With 19.5 million paired-end fragments (2 x 76 bp) and D. melanogaster as the reference (120 Mbp genome), running PoPoolationTE2 has the following time requirements:
step | time [minutes] |
---|---|
mapping both reads separately (bwa bwasw) using 12 threads | 33.5 |
restoring paired end information (se2pe) | 12.5 |
generating a ppileup file | 7.3 |
subsampling the ppileup file to a uniform physical coverage | 6.3 |
identifying signatures of TE insertions | 4.3 |
estimating strand of TEs | 4.2 |
estimating population frequency of TEs | 3.4 |
pairing up signatures of TE insertions | 2.4 |
total | 74.1 |
total - without mapping (= only PoPoolationTE2) | 28.1 |
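
As a quick sanity check, the per-step times can be summed and expressed as shares of the overall runtime. The Python sketch below uses only the values from the table; the variable names are illustrative. The sums of the rounded per-step values can differ from the reported totals by a few tenths of a minute, presumably because each entry was rounded separately. The second sum covers the six steps after se2pe, which is the combination that comes closest to the reported 28.1 minutes.

```python
# Per-step wall-clock times in minutes, copied from the table above.
steps = [
    ("mapping (bwa bwasw, 12 threads)", 33.5),
    ("restoring paired-end information (se2pe)", 12.5),
    ("generating a ppileup file", 7.3),
    ("subsampling the ppileup file", 6.3),
    ("identifying signatures of TE insertions", 4.3),
    ("estimating strand of TEs", 4.2),
    ("estimating population frequency of TEs", 3.4),
    ("pairing up signatures of TE insertions", 2.4),
]

# Sum over all eight steps.
total = round(sum(minutes for _, minutes in steps), 1)

# Sum over the six steps that follow mapping and se2pe.
downstream = round(sum(minutes for _, minutes in steps[2:]), 1)

# Share of the summed runtime spent in each step.
for name, minutes in steps:
    print(f"{name}: {minutes:.1f} min ({100 * minutes / total:.1f}%)")

print(f"sum of all steps: {total} min")
print(f"sum of steps after se2pe: {downstream} min")
```

By this calculation, the two read-handling steps (mapping and se2pe) account for well over half of the runtime.
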
To measure the computing time of PoPoolationTE2 (v1.09.03), we used previously published real data (http://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1002487).
Briefly, 118 isofemale lines were established from a population of D. melanogaster collected in 2008 in northern Portugal (Povoa de Varzim). Five females from each line were combined and sequenced as a pool (Pool-Seq). The data are available from ENA (http://www.ebi.ac.uk/ena) under accession number SRA035392. Reads were mapped to the D. melanogaster reference genome (v5.31) with bwa (v0.7.4) using the bwasw algorithm.
Time measurements were performed with the following shell script: https://sourceforge.net/projects/popoolation-te2/files/publicationrelated/timemeasure-reviewer.sh/download
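
The linked script is not reproduced here. As a rough illustration of how per-step wall-clock times like those in the table could be collected, here is a minimal Python sketch. It is not the script used for the measurements above: the `time_step` helper is hypothetical, and the placeholder command (a no-op Python call) stands in for the real pipeline commands.

```python
import subprocess
import sys
import time


def time_step(label, command):
    """Run one command and return (label, elapsed wall-clock minutes)."""
    start = time.perf_counter()
    # check=True raises CalledProcessError if the command fails,
    # so a broken step is never reported as a valid timing.
    subprocess.run(command, check=True)
    elapsed_minutes = (time.perf_counter() - start) / 60.0
    return label, elapsed_minutes


# Placeholder step: replace the command list with a real pipeline call.
placeholder_steps = [
    ("placeholder step", [sys.executable, "-c", "pass"]),
]

timings = [time_step(label, command) for label, command in placeholder_steps]

for label, minutes in timings:
    print(f"{label}: {minutes:.2f} min")
```

Wall-clock time is used here because the mapping step runs with multiple threads, where CPU time and elapsed time can differ substantially.
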