Tanja Magoc - 2012-08-28

FLASH: Fast Length Adjustment of SHort reads

About FLASH

FLASH (Fast Length Adjustment of SHort reads) is a fast, highly accurate tool for merging paired-end reads from fragments that are shorter than twice the read length. The resulting longer reads can significantly improve genome assemblies.
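
To make the idea of merging concrete, the following Python sketch joins a read pair by searching for the best-scoring overlap between the first read and the reverse complement of the second. It is an illustration under stated assumptions, not FLASH's actual implementation (the publication describes that). The function names reverse_complement and merge_pair, and the min_overlap and max_mismatch_ratio parameters with their values, are hypothetical choices made for this example. Base qualities are ignored, and the second read is assumed to come from the opposite strand, as is typical for paired-end libraries.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq))


def merge_pair(read1, read2, min_overlap=10, max_mismatch_ratio=0.25):
    """Merge a read pair into one longer sequence, or return None.

    Illustrative only: the parameter names and default values are
    assumptions for this sketch, not FLASH's documented behavior.
    """
    mate = reverse_complement(read2)  # put both reads on the same strand
    best = None  # (mismatch_ratio, overlap_length) of the best candidate
    longest = min(len(read1), len(mate))
    # Try every overlap length, longest first; ties keep the longer overlap.
    for overlap in range(longest, max(min_overlap, 1) - 1, -1):
        mismatches = sum(a != b for a, b in zip(read1[-overlap:], mate[:overlap]))
        ratio = mismatches / overlap
        if ratio <= max_mismatch_ratio and (best is None or ratio < best[0]):
            best = (ratio, overlap)
    if best is None:
        return None  # no acceptable overlap: leave the pair unmerged
    return read1 + mate[best[1]:]


# A 30bp fragment sequenced as two 20bp reads that overlap by 10bp.
fragment = "ACGTTGCAAGGCTTAACCGGATCGATTACG"
read1 = fragment[:20]
read2 = reverse_complement(fragment[-20:])
print(merge_pair(read1, read2) == fragment)
```

Because the fragment (30bp) is shorter than twice the read length (40bp), the two reads share bases in the middle, and the merged sequence spans the whole fragment.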

Accuracy

FLASH merges reads with high accuracy.
Accuracy on one million synthetic pairs of 100bp reads, generated from fragments whose lengths are normally distributed with a mean of 180bp and a standard deviation of 20bp:

Parameters                   No error   1% error rate   2% error rate   3% error rate   5% error rate
default parameters           99.73%     99.68%          98.43%          94.76%          77.91%
more aggressive parameters   99.73%     99.68%          99.06%          98.30%          93.65%
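
For orientation, the overlap between the two reads of a synthetic pair follows directly from the read length and the fragment length. The short Python sketch below works this out for fragments at the mean and at one and two standard deviations from it. The helper name overlap_length is hypothetical, and the calculation assumes that every fragment is at least as long as a single read.

```python
def overlap_length(read_len, fragment_len):
    """Number of bases covered by both reads of a pair (0 if they do not overlap).

    Assumes fragment_len >= read_len; shorter fragments are not handled here.
    """
    return max(0, 2 * read_len - fragment_len)


READ_LEN = 100          # read length of the synthetic pairs
MEAN_LEN, SD = 180, 20  # fragment length distribution of the synthetic pairs

for k in (-2, -1, 0, 1, 2):
    fragment_len = MEAN_LEN + k * SD
    print(fragment_len, overlap_length(READ_LEN, fragment_len))
```

At the mean fragment length of 180bp, the two 100bp reads overlap by 20bp; fragments of 200bp or longer leave no overlap to merge on.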

Accuracy on real data

Data                                                          Accuracy
47,052 pairs of 101bp long reads from Staphylococcus aureus   90.77%
18,252,400 pairs of 101bp long reads from human               91.02%

Time requirements

When run in single-threaded mode:
It takes 120 seconds to process one million pairs of 100bp reads on a server with 256GB of RAM and a six-core 2.4GHz AMD Opteron CPU.
It takes 129 seconds to process one million pairs of 100bp reads on a desktop with 2GB of RAM and a dual-core 3.00GHz Intel Xeon CPU.
Run time scales linearly with the read length and with the number of reads.
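
The linear scaling allows a rough run-time estimate for other data sets. The Python sketch below extrapolates from the single-threaded server figure above (120 seconds per million pairs of 100bp reads). The function name estimate_seconds and its parameters are hypothetical, and the result is only a back-of-the-envelope figure, since actual run time depends on the hardware and the data.

```python
def estimate_seconds(num_pairs, read_len,
                     ref_seconds=120.0, ref_pairs=1_000_000, ref_read_len=100):
    """Extrapolate run time, assuming it is linear in pairs and in read length.

    The reference values default to the single-threaded server measurement;
    they are an assumption and should be replaced by a local benchmark.
    """
    return ref_seconds * (num_pairs / ref_pairs) * (read_len / ref_read_len)


# Rough estimate for a data set the size of the human one listed above
# (18,252,400 pairs of 101bp reads): about 2212 seconds under this assumption.
print(round(estimate_seconds(18_252_400, 101)))
```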

Publication

http://bioinformatics.oxfordjournals.org/content/early/2011/09/07/bioinformatics.btr507.abstract