FLASH (Fast Length Adjustment of SHort reads) is a fast and accurate tool for merging paired-end reads from fragments that are shorter than twice the read length. The longer merged reads can significantly improve genome assemblies.
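When a fragment is shorter than twice the read length, the two reads of a pair overlap, and that overlap is what makes merging possible. The Python sketch below illustrates the idea in its most naive form. It is not FLASH's actual algorithm: the function names and the `min_overlap` and `max_mismatch_ratio` parameters are hypothetical and chosen only for illustration.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq))


def merge_pair(read1, read2, min_overlap=10, max_mismatch_ratio=0.25):
    """Naively merge a read pair; return None if no overlap qualifies.

    read2 is assumed to come from the opposite strand, so it is
    reverse-complemented before being compared with the end of read1.
    Both thresholds are illustrative assumptions.
    """
    mate = reverse_complement(read2)
    best = None  # (mismatch_ratio, overlap_length)
    longest = min(len(read1), len(mate))
    # Try every overlap length, longest first, and keep the one with the
    # lowest mismatch ratio (ties go to the longer overlap).
    for overlap in range(longest, min_overlap - 1, -1):
        mismatches = sum(a != b for a, b in zip(read1[-overlap:], mate[:overlap]))
        ratio = mismatches / overlap
        if ratio <= max_mismatch_ratio and (best is None or ratio < best[0]):
            best = (ratio, overlap)
    if best is None:
        return None
    # Keep read1 whole and append the part of the mate beyond the overlap.
    return read1 + mate[best[1]:]


# A 30-base fragment sequenced with 20-base reads gives a 10-base overlap.
fragment = "ACGTACGGTCCATGCAGTTACGTCTGGTCC"
read1 = fragment[:20]
read2 = reverse_complement(fragment[10:])
merged = merge_pair(read1, read2)
```

In this toy case `merged` equals `fragment`. A real merger also has to handle base qualities and resolve mismatches inside the overlap, which this sketch ignores.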
Accuracy
FLASH merges reads with high accuracy on both synthetic and real data.
Accuracy on one million synthetic pairs of 100 bp reads, generated from fragments whose lengths are normally distributed with a mean of 180 bp and a standard deviation of 20 bp:
| Parameters | No error | 1% error rate | 2% error rate | 3% error rate | 5% error rate |
|---|---|---|---|---|---|
| default parameters | 99.73% | 99.68% | 98.43% | 94.76% | 77.91% |
| more aggressive parameters | 99.73% | 99.68% | 99.06% | 98.30% | 93.65% |
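To get a feel for the synthetic setup, the following Python sketch works out how much overlap the stated fragment-length distribution implies for 100 bp reads. It is only arithmetic on the numbers given above. The helper names and the example thresholds are assumptions, and nothing here describes how FLASH itself computes accuracy or treats pairs that do not overlap.

```python
from statistics import NormalDist

READ_LEN = 100
# Fragment lengths as described above: mean 180 bp, standard deviation 20 bp.
FRAGMENTS = NormalDist(mu=180, sigma=20)


def expected_overlap(read_len=READ_LEN, frag=FRAGMENTS):
    """Mean overlap between the two reads: 2 * read length - mean fragment length."""
    return 2 * read_len - frag.mean


def fraction_with_overlap_at_least(min_overlap, read_len=READ_LEN, frag=FRAGMENTS):
    """Fraction of pairs whose reads overlap by at least min_overlap bases.

    overlap >= min_overlap is equivalent to
    fragment length <= 2 * read_len - min_overlap.
    Fragments shorter than one read (4 standard deviations below the
    mean here) are not treated separately.
    """
    return frag.cdf(2 * read_len - min_overlap)


mean_overlap = expected_overlap()                      # 20 bases on average
any_overlap = fraction_with_overlap_at_least(0)        # about 0.84
ten_base_overlap = fraction_with_overlap_at_least(10)  # about 0.69
```

Under these assumptions the mean overlap is 20 bases, and roughly 84% of the pairs come from fragments short enough for the reads to overlap at all.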
Accuracy on real data
| Data | Accuracy |
|---|---|
| 47,052 pairs of 101 bp reads from Staphylococcus aureus | 90.77% |
| 18,252,400 pairs of 101 bp reads from human | 91.02% |
Time requirements
When run in single-threaded mode:
It takes 120 seconds to process one million 100-bp long pairs on a server with 256GB of RAM and a six-core 2.4GHz AMD Opteron CPU.
It takes 129 seconds to process one million 100-bp long pairs on a desktop with 2GB of RAM and a dual-core Intel Xeon 3.00GHz CPU.
Running time scales linearly with both the read length and the number of reads.
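The linear scaling can be turned into a rough back-of-the-envelope estimate. The Python sketch below extrapolates from the server measurement quoted above (120 seconds for one million 100 bp pairs). The function name is hypothetical, and the result is only an approximation: actual times depend on the hardware and on the data.

```python
def estimate_seconds(num_pairs, read_len,
                     base_seconds=120.0, base_pairs=1_000_000, base_read_len=100):
    """Extrapolate single-threaded running time, assuming time is
    proportional to both the number of pairs and the read length.

    The defaults are the server benchmark quoted in the text.
    """
    return base_seconds * (num_pairs / base_pairs) * (read_len / base_read_len)


baseline = estimate_seconds(1_000_000, 100)      # 120.0 seconds, the benchmark itself
twice_pairs = estimate_seconds(2_000_000, 100)   # 240.0 seconds
longer_reads = estimate_seconds(1_000_000, 150)  # 180.0 seconds
```

Passing `base_seconds=129.0` instead gives the corresponding estimate for the desktop machine.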
Publication
http://bioinformatics.oxfordjournals.org/content/early/2011/09/07/bioinformatics.btr507.abstract