PASHA: Parallelized Short Read Assembly

PASHA is a parallel short-read assembler for large genomes based on de Bruijn graphs. By taking advantage of both shared-memory multi-core CPUs and distributed-memory compute clusters, PASHA has demonstrated its potential to perform high-quality de novo assembly of large genomes in reasonable time with modest computing resources. In our evaluation on three small real paired-end datasets, PASHA produced better assemblies than three leading assemblers (Velvet, ABySS and SOAPdenovo) at comparable genome coverage and mis-assembly rates. Moreover, PASHA was the fastest on a single CPU for all three datasets. For the human genome, PASHA achieves assembly quality competitive with ABySS and completes the assembly in about 21 hours, which is about 2.38× faster than ABySS on the same hardware configuration.
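
For readers unfamiliar with de Bruijn graph assembly, the following is a minimal, single-threaded Python sketch of the general idea, not PASHA's actual C++ implementation. The function names (`build_de_bruijn`, `walk_unambiguous_path`), the toy reads, and the choice of `k=4` are all assumptions made for illustration. The walk only checks out-degree; real assemblers also handle in-degree, reverse complements, sequencing errors, and paired-end information.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Build a toy de Bruijn graph: nodes are (k-1)-mers, edges are k-mers."""
    graph = defaultdict(set)
    for read in reads:
        # Slide a window of length k over the read to enumerate its k-mers.
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Each k-mer links its (k-1)-mer prefix to its (k-1)-mer suffix.
            graph[kmer[:-1]].add(kmer[1:])
    # Sorted successor lists make the traversal deterministic.
    return {node: sorted(succs) for node, succs in graph.items()}


def walk_unambiguous_path(graph, start):
    """Extend a contig from `start` while each node has exactly one successor."""
    contig = start
    node = start
    visited = {start}
    while len(graph.get(node, [])) == 1:
        nxt = graph[node][0]
        if nxt in visited:  # stop on a cycle instead of looping forever
            break
        contig += nxt[-1]   # each step adds one new base
        visited.add(nxt)
        node = nxt
    return contig


# Hypothetical overlapping reads sampled from the sequence ATGGCGTGCAAT.
reads = ["ATGGCGT", "GCGTGCA", "GTGCAAT"]
graph = build_de_bruijn(reads, k=4)
contig = walk_unambiguous_path(graph, "ATG")
print(contig)
```

In this toy case the three reads overlap without repeats, so the walk from `ATG` recovers the full 12-base sequence. Parallel assemblers such as the one described above distribute the k-mer generation and graph construction steps across cores or cluster nodes; this sketch does not attempt that.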

License

Apache License V2.0, GNU General Public License version 2.0 (GPLv2)

Additional Project Details

Operating Systems

BSD, Linux

User Interface

Console/Terminal

Programming Language

C++

Related Categories

C++ Bio-Informatics Software

Registered

2012-07-05
