Marc Strous

MetaWatt Binner version 3.5

This version introduces the following improvements:

  • Fixed a bug in differential coverage binning that decreased the weight of the seed contig in bins. The fix leads to improved binning.
  • The optimization and polishing modules now respect differential coverages while merging bins and moving contigs, respectively. This leads to improved binning.


[Getting Started] (including installation, dependencies)

The pipeline has a modular architecture and can easily be customized to include or exclude the features that are analysed. Help topics for version 3.1:

For the algorithm and software please cite:

*Strous M, Kraft B, Bisdorf R, Tegetmeyer H (2012) The binning of metagenomic contigs for microbial physiology of mixed cultures. Frontiers in Microbial Physiology and Metabolism 3:410. doi: 10.3389/fmicb.2012.00410.

For questions, problems and comments, please use the forum!


The Metawatt binner was developed by Marc Strous with the help of Regina Bisdorf. The support of the Max Planck Society, the European Research Council and the Bundesland Nordrhein-Westfalen is gratefully acknowledged. Many thanks also to Halina Tegetmeyer, Beate Kraft, Harald Gruber-Vodicka, Xiaoli Dong and Manuel Kleiner for providing valuable suggestions and metagenomes for testing. Many thanks also to Lizbeth Sayavedra for help in porting Metawatt to iOS.

(C) Marc Strous, Calgary 2015

MetaWatt Binner version 3.4

This version introduces the following improvements:

  • MetaWatt now also runs on Windows. Typically, you would still compute contig properties on a Unix server (because of the dependencies and computational requirements), but the exploration and binning modules can be run on a Windows machine.

  • Fewer and less arbitrary parameters, resulting in improved binning.

  • Fixing of minor bugs.

  • Addition of a coverage distribution of bin contigs.

Metawatt Binner version 3.3

This version introduces the following minor improvements:

  • Much faster loading of projects

  • Slightly faster binning and much faster optimization of bins

  • More responsiveness in user interface

  • (Optional) fetching of reference genomes with wget (enables database updates with a proxy server).

  • Fixing of minor bugs, mainly related to shortlisting of bins

Metawatt Binner version 3.2

This version introduces the following major changes:

  • Database and taxonomy file creation and maintenance/updating is now fully automated

  • With a new project folder layout, the project is created fully automatically

  • Drastic reduction of disk space required. In addition, Metawatt now also handles gzipped read files

  • Integration of gene-modules into main pipeline. Metawatt now produces a phylogenetic tree including all reasonable bins, based on concatenated protein alignments of conserved genes

  • Multiple HMM profile files can be added to a project

  • Many minor fixes and improvements to the graphical user interface

  • Major code cleanup

  • Complete support for command line, fully automated binning on large servers
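
As an illustration of the gzipped read file support listed above, here is a minimal Python sketch of opening a read file transparently, whether or not it is gzip-compressed. This is not MetaWatt's own code (MetaWatt is written in Java). The function names `open_reads` and `count_fastq_reads` are hypothetical, and detecting compression by the gzip magic bytes is an assumption about one reasonable approach.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def open_reads(path):
    """Open a read file as text, decompressing on the fly if it is gzipped."""
    with open(path, "rb") as handle:
        is_gzipped = handle.read(2) == GZIP_MAGIC
    if is_gzipped:
        return gzip.open(path, "rt")
    return open(path, "rt")


def count_fastq_reads(path):
    """Count records in a FASTQ file, assuming four lines per record."""
    with open_reads(path) as handle:
        return sum(1 for _ in handle) // 4
```

Checking the file content rather than the file extension means a renamed or extension-less file is still read correctly.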


Metawatt Binner version 3.1

This version introduces the following major changes:

  • Better binning because of improved interaction and arbitration between tetranucleotide and coverage binning

  • Binning is saved in SIBCI format enabling export to other software and import of binning results from other programs into MetaWatt.

  • Any set of Pfam profiles can now be used to assess completeness or other bin properties, not just the predefined set supplied with MetaWatt.

  • You can now also mouse over bin contours enabling more precise editing.

  • Several minor improvements and fixes.

Metawatt Binner version 3.0

This version introduces the following major changes:

  • Differential coverage based binning is now integrated into the binning pipeline. The implementation is effective and fast. For example, combined tetranucleotide and coverage binning of a >30 Mb metagenome is completed within 30 seconds.

  • Read mapping information that links contigs is also used during binning.

  • A bin optimizer module has been added that destroys bins with poor quality and merges oversplit bins. Decisions are based on phylogenetic consistency, single copy conserved gene complementarity, GC content, coverage, coding density and degree of connectedness based on read mapping.

  • IMM binning has been removed entirely.

  • Metawatt now generally produces good bins without user supervision, even with standard settings, so implementation of command line usage has been improved to enable automated, command line binning.

  • Changes to the user interface to improve optimizing of the binning.
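
For readers unfamiliar with tetranucleotide binning, the following Python sketch shows how a tetranucleotide frequency profile of a contig can be computed. It is an illustration only and does not reproduce how MetaWatt scores or combines tetranucleotide and coverage information. The function name `tetranucleotide_frequencies` is hypothetical, and skipping words that contain ambiguous bases is an assumption.

```python
from collections import Counter
from itertools import product

# All 256 possible words of length four over the DNA alphabet.
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetranucleotide_frequencies(sequence):
    """Return relative frequencies of overlapping 4-mers in a sequence.

    Words containing anything other than A, C, G or T (for example N)
    are ignored, so the frequencies sum to 1 over the valid words.
    """
    seq = sequence.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    valid_total = sum(counts[t] for t in TETRAMERS)
    if valid_total == 0:
        return {t: 0.0 for t in TETRAMERS}
    return {t: counts[t] / valid_total for t in TETRAMERS}
```

Contigs from the same genome tend to have similar profiles, which is what makes such a vector usable as a binning signal alongside coverage.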

Metawatt Binner version 2.3

This version adopts a number of changes "under the hood" in the way the blastx and discovery-of-conserved-genes results are processed. Also, a small error that crippled version 2.2 was fixed.

Metawatt Binner version 2.2

This version introduces a number of improvements that dramatically speed up many steps. All modules should now complete in minutes, even for large datasets.

  • Adopted diamond for ultrafast and sensitive classification of contigs (replaced blastn and blastp classification).

  • Added the option to set BBMap's "fast=t" parameter for faster mapping of reads to contigs.

  • Created a dedicated pfam database for detection of conserved genes, speeding up this module.

  • Two changes to the GC versus coverage plot: added the option to plot coding density versus coverage for detection of Eukaryotes, and increased the maximum scale of coverage to 400x. At present, the GC versus coverage plot still has some issues with the display of scale bars. These will be fixed eventually, but dragging the scale bars or setting the zoom already works around them.

  • The stdbuf command is now only used when running the glimmer programs (IMM binning), because it is not available on many platforms and complicated installation for many users.

Metawatt Binner version 2.1

  • Fixed a bug that caused an empty GC/coverage plot in some locales.

  • Added an additional slider to the GC versus coverage plot to enable downsampling of contigs for plotting (useful for extremely large metagenomes).

  • Updated to RDP classifier version 2.7.

  • Updated to usearch version 7.0.

  • Added an option to customize the way external commands are called (Mac OS users: replace "stdbuf -o0" with " ").
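
The GC content and the downsampling used for plotting can be sketched as follows. This is a hedged Python illustration, not MetaWatt's implementation: the function names `gc_content` and `downsample` are hypothetical, and uniform random sampling with a fixed seed is an assumption about how a downsampling slider might behave.

```python
import random


def gc_content(sequence):
    """Return the fraction of G and C among the unambiguous bases."""
    seq = sequence.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / acgt


def downsample(contigs, fraction, seed=0):
    """Return a uniform random subset holding about `fraction` of the contigs.

    A fixed seed keeps the plotted subset stable between redraws.
    """
    fraction = min(max(fraction, 0.0), 1.0)
    k = round(len(contigs) * fraction)
    return random.Random(seed).sample(list(contigs), k)
```

Downsampling only affects which points are drawn. The binning itself would still use all contigs.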

What's new in version 2.0?

The Metawatt Binner is a graphical Java program for the binning of metagenomic contigs and evaluation of the binning results. Version 2.0 has the following new features:

  • Taxonomic classification performed with blastn and/or blastp.

  • Additional assessment of bin quality by analysis of the number of conserved single copy genes and by counting the number of transfer RNA genes.

  • Analysis of the genetic code used in each bin.

  • Use of multiple raw sequencing read files to compute coverage for each bin in each assembly for each readset.

  • Read mapping results are used to compute connections between contigs based on paired end and/or single end reads and these connections are used to "polish" binning results by rebinning unambiguously connected contigs to the correct bins. The polishing algorithm also makes use of unambiguous taxonomic classifications of contigs.

  • Redesigned coverage versus GC content plot also allows plotting coverage versus coverage for different readsets, trimming bins to "blobs" and creation of new bins from "blobs".

  • Detection and analysis of 16S rDNA genes, including fast construction of phylogenetic trees.

  • More responsive user interface.

  • Long contigs are properly binned (sometimes long contigs, >100 kb, were ignored in previous versions).

  • Console version to enable running the pipeline separately (e.g. on a computing cluster or for benchmarking) without the user interface.

What's new in version 1.7?

  • Contig names are now listed in contig table instead of a number
  • Taxa with fewer than 25,000 bp of contig data are no longer displayed in the taxon list panel
  • Improved layout of the GC/coverage plot and bin focus panel
  • Bin shortlist is now shared by all samples of the project

What's new in version 1.6?

  • The bin editor, logbook and contig table have been integrated into the main window.
  • More navigation features have been added to rapidly browse the binning results.
  • Much faster computation of tetranucleotides.
  • Much faster and more robust creation of the taxonomy file
  • Faster creation of the blast database
  • Taxonomic distributions are visualized more clearly and interactively
  • N50 values are now computed correctly
  • Automatic creation of "negative" models for IMM binning
  • Known issue: Sequencing coverages above 100x are problematic in the GC/coverage plot
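
Since the N50 statistic is mentioned above, here is a minimal Python sketch of the standard definition: the contig length L such that contigs of length L or longer together contain at least half of all assembled bases. The function name `n50` is hypothetical, and this is not taken from MetaWatt's source.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (0 if empty)."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        # The first contig that pushes the running sum to half of the
        # total assembly length defines the N50.
        if running >= half_total:
            return length
    return 0
```

For example, for contig lengths 2, 2, 2, 3, 3, 4, 8 and 8 (32 bases in total), the two longest contigs already cover 16 bases, so the N50 is 8.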

What's new in version 1.5?

  • Bug fix (Use of IMM models still did not work properly after porting to iOS)

What's new in version 1.4?

  • Bug fix (Build IMM models no longer worked in 1.3)
  • Bug fix (N50 was not calculated correctly)
  • Bug fix (With multithreaded blasting in 1.2, logging was broken)

What's new in version 1.3?

  • Metawatt now also runs on iOS.
  • Improvements to the layout of the user interface.
  • Simplified user interface by removing several options for annotation - they were no longer necessary after the addition of the "-evalue 1e-3" parameter for blast in version 1.2.
  • Added the option to filter contigs by length in the editor (suggestion of Harald Gruber-Vodicka).

What's new in version 1.2?

  • Blast is now run with the option "-evalue 1e-3", leading to much higher annotation speeds.
  • Blast and simple-score are now run on multiple processors, leading to higher speeds.
  • Minor bugs/inconsistencies fixed.


Wiki: Explanations of files generated
Wiki: Getting Started
Wiki: Getting started with the user interface
Wiki: How the binning is done
Wiki: Installation and dependencies
Wiki: Pipeline modules
Wiki: Strategy hints
Wiki: User interface