Metassembler: merging and optimizing de novo genome assemblies 
September 29, 2015
-------------------------------------------------------------

Alejandro Hernandez Wences and Michael C Schatz

Simons Center for Quantitative Biology
Cold Spring Harbor Laboratory
Cold Spring Harbor, NY


Typically de novo genome sequencing projects generate multiple assemblies of the
same sample using different software tools and/or different parameters of the
same tool. Instead of discarding the extra assemblies, Metassembler merges them
into the top assembly using mate-pair information and whole-genome alignments,
in order to generate a single superior assembly. The final assembly combines
the locally best sequence from the input assemblies throughout the genome.

Please cite:

Metassembler: merging and optimizing de novo genome assemblies
Wences AH, Schatz MC (2015) Genome Biology 16:207. doi:10.1186/s13059-015-0764-4
http://www.genomebiology.com/2015/16/1/207


INSTALLATION:
-------------

Metassembler requires the following external programs to be installed:
1) MUMmer whole genome alignment package
2) bowtie2
3) samtools
4) python 2.7

The argparse python module must also be installed: 
https://pypi.python.org/pypi/argparse

For general instructions on installing python packages in standard and 
non-standard locations please refer to: http://docs.python.org/2/install/

If these requirements are met, then on Unix-like systems type 'make' in
the 'Metassembler/' root directory.
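
As a quick pre-flight check before running 'make', the requirements listed
above can be verified from the shell. The following is a minimal sketch and is
not part of the Metassembler distribution: the helper name missing_tools is
hypothetical, and it assumes that the MUMmer package is detectable through its
'nucmer' executable and that the Python 2.7 interpreter is invoked as 'python'.
Adjust the names if your installation differs.

```shell
#!/bin/sh
# Hypothetical helper: print each named program that is NOT found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Assumed executable names: nucmer (MUMmer), bowtie2, samtools, python.
missing=$(missing_tools nucmer bowtie2 samtools python)
if [ -n "$missing" ]; then
    echo "Not found on PATH:" $missing
else
    echo "All required programs were found."
fi

# Check that the argparse module can be imported (only if python exists).
if command -v python >/dev/null 2>&1; then
    python -c 'import argparse' 2>/dev/null \
        && echo "argparse: ok" \
        || echo "argparse: missing"
fi
```

The script only reports what it finds; it does not install anything and does
not check program versions.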


MANUAL:
-------

Details on how to use the wrapper 'metassemble' are given in the MANUAL file.

A more detailed manual is also available here:
https://sourceforge.net/projects/metassembler/files/Metassemble_manual.pdf/download


SAMPLE DATA:
------------

A sample dataset is provided for testing the installation and for becoming
familiar with Metassembler. It consists of two alternative assemblies, A.fa
and B.fa, generated from the first ~250kb of the Staphylococcus aureus genome
with some simulated differences.

There are two ways to run Metassembler. The easiest is to use the wrapper
'metassemble', which takes a configuration file as input.

In Sample/meta1 run:
     ./Metassemble_script.sh

This will create a configuration file and run metassemble for A.fa and B.fa.
A directory MergeMetassemble/ will be created. It will contain all the
information used in the metassembly process as well as the final results.
The general layout of the output directory and a description of the important
files it contains can be found in MANUAL. In particular, you will find a
description of the *.metassem file, which contains instructions on how the
final metassembly sequence is constructed. In this sample data we expect
the metassembly sequence to be composed of assembly A sequence revised
with assembly B insertions.
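
To confirm that the wrapper run produced output, one can look for the
*.metassem files described above. The sketch below is illustrative only: the
helper name list_metassem is hypothetical, and it assumes that
MergeMetassemble/ is created in the directory where the script was run. It
searches recursively because the exact location of the files inside the output
directory is documented in MANUAL rather than here.

```shell
#!/bin/sh
# Hypothetical helper: list every *.metassem file under an output directory.
list_metassem() {
    outdir=$1
    if [ ! -d "$outdir" ]; then
        echo "No such directory: $outdir" >&2
        return 1
    fi
    find "$outdir" -type f -name '*.metassem' | sort
}

# Assumed location: run from Sample/meta1 after ./Metassemble_script.sh
list_metassem MergeMetassemble || echo "Run ./Metassemble_script.sh first."
```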


The other way to perform the metassembly is to run all the processes
step by step.

In Sample/meta1 run:
     ./Step_by_Step_script.sh

This will run each of the processes in turn, including the computation of the
CE-statistic for the starting assemblies and the whole genome alignment
using the nucmer program from the MUMmer package.

The resulting metassembly should be a single contig with deletions in 
assembly A corrected using sequence from assembly B.
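
One simple way to check the single-contig expectation is to count the FASTA
header lines in the final sequence file. This is a minimal sketch: the helper
name count_contigs is hypothetical, and the path to the final metassembly
FASTA file is deliberately left as an argument, since the output layout is
described in MANUAL.

```shell
#!/bin/sh
# Hypothetical helper: count sequences (lines starting with '>') in a FASTA file.
count_contigs() {
    grep -c '^>' "$1" || true
}

# Usage: pass the path of the final metassembly FASTA file as the first argument.
if [ -n "${1:-}" ]; then
    n=$(count_contigs "$1")
    echo "$1 contains $n sequence(s)"
fi
```

For the sample data, a count of 1 would be consistent with the expected
single-contig result; it does not by itself confirm that the deletions in
assembly A were corrected.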


Special Thanks:
Paul Baranay and Scott Emrich