
TE-Benchmark

Robert Kofler

To allow benchmarking of tools that identify TE insertions from Pool-Seq data, we provide paired-end reads for 1000 TE insertions in a region of an artificial chromosome that is devoid of repeats. The insertions have random positions, families and TE sequences. These data were used to generate the table in the main manuscript that compares the performance of PoPoolationTE, PoPoolationTE2 and TEMP.

known TE insertions
The position, family, strand and population frequency of the simulated TE insertions can be found in the following file. The insertions identified with a tool of interest should be compared to this data set.
https://sourceforge.net/projects/popoolation-te2/files/te-benchmark/statistics-freqrange001to10.txt/download

the artificial reference chromosome
https://sourceforge.net/projects/popoolation-te2/files/te-benchmark/chasis1M.fasta.zip/download

the consensus sequences of the TE insertions
https://sourceforge.net/projects/popoolation-te2/files/te-benchmark/teseq-clean-ml100noS4.fasta/download

a hierarchy of the TE insertions
https://sourceforge.net/projects/popoolation-te2/files/te-benchmark/tehier-ml100noS4.fasta/download

the simulated paired end reads
https://sourceforge.net/projects/popoolation-te2/files/te-benchmark/chi2_1.fastq.zip/download
https://sourceforge.net/projects/popoolation-te2/files/te-benchmark/chi2_2.fastq.zip/download
TE insertions identified with these reads should have exactly the position, frequency, strand and family given in the known insertions file above.
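
A comparison of this kind could be sketched as follows. This is only an illustration, not the procedure used for the manuscript: the `Insertion` record, the `compare_insertions` function and the tolerance parameters `max_distance` and `max_freq_diff` are hypothetical names introduced here, and the family names in the usage example are placeholders. Parsing of the known insertions file and of a tool's output is deliberately left out, because the file layouts are not described on this page.

```python
from typing import NamedTuple


class Insertion(NamedTuple):
    # Hypothetical record; the fields mirror the attributes named on this page.
    position: int
    family: str
    strand: str
    frequency: float


def compare_insertions(known, predicted, max_distance=0, max_freq_diff=0.0):
    """Greedily match predicted insertions to known ones, one-to-one.

    A prediction matches a known insertion when family and strand agree,
    the positions differ by at most max_distance, and the population
    frequencies differ by at most max_freq_diff. The defaults demand an
    exact match; the tolerances are an assumption for illustration.
    Returns (matched pairs, unmatched predictions, unmatched known).
    """
    unmatched_known = list(known)
    matched = []
    false_positives = []
    for pred in predicted:
        candidates = [
            k for k in unmatched_known
            if k.family == pred.family
            and k.strand == pred.strand
            and abs(k.position - pred.position) <= max_distance
            and abs(k.frequency - pred.frequency) <= max_freq_diff
        ]
        if candidates:
            # Take the closest known insertion and remove it from the pool,
            # so that each known insertion is matched at most once.
            best = min(candidates, key=lambda k: abs(k.position - pred.position))
            unmatched_known.remove(best)
            matched.append((best, pred))
        else:
            false_positives.append(pred)
    return matched, false_positives, unmatched_known


# Usage with made-up values (not taken from the benchmark files).
known = [Insertion(1000, "famA", "+", 0.5), Insertion(5000, "famB", "-", 0.1)]
predicted = [Insertion(1002, "famA", "+", 0.48), Insertion(9000, "famA", "+", 0.3)]
matched, false_pos, missed = compare_insertions(
    known, predicted, max_distance=5, max_freq_diff=0.05
)
print(len(matched), len(false_pos), len(missed))
```

With the default arguments only exact matches count, which corresponds to the requirement stated above; non-zero tolerances show how close a tool comes when it does not reproduce the exact values.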

