The Metawatt 3.2 pipeline is modular. From the user interface, the entire pipeline can be run at once, or individual modules can be run separately. Modules can also be set to "skipped", or run with "force" to overwrite previous results. If "force" is not applied, existing result files are reused rather than recreated. When the pipeline runs interactively, progress is shown with progress bars, and a logbook is displayed next to the pipeline under the "pipeline" tab in the user interface. The logbook is also saved in the file "metawatt-logbook.txt".
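The "skipped" and "force" behaviour described above can be illustrated with a minimal sketch. This is not Metawatt's own code: the function name `should_run` and its parameters are hypothetical, and the precedence of "skipped" over "force" is an assumption made only for this illustration.

```python
from pathlib import Path


def should_run(output: Path, skipped: bool = False, force: bool = False) -> bool:
    """Decide whether a module needs to run (hypothetical helper, not Metawatt's API)."""
    if skipped:
        # Module is set to "skipped": do not run it.
        # Assumption: "skipped" takes precedence over "force".
        return False
    if force:
        # "force": run again and overwrite previous results.
        return True
    # No "force": reuse previous result files if they already exist.
    return not output.exists()
```

Under these assumptions, a module whose output already exists is run again only when "force" is set.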
For each project the pipeline consists of the following modules:
[Update databases] - Creates and updates the databases used for taxonomic classification of contigs and bins, for bin completeness assessment, and for phylogenetic tree construction.
[Compute coverage and percent GC] - Computes coverage, based on assembler output, as well as coding density and percent GC.
[Classify by diamond blastx] - Classifies contigs based on a blastx search against a protein database.
[Predict transfer RNAs] - Predicts and counts transfer RNA genes for each contig.
[Six Frame PFAM] - Performs an HMM scan for conserved genes in all six reading frames of each contig, then counts conserved single copy genes and predicts the genetic code used for each bin.
[Map reads to contigs] - Maps filtered sequencing reads to the assembly.
[Compute tetranucleotide frequencies] - Computes tetranucleotide frequencies for each contig.
[Bin with tetranucleotides] - Bins contigs based on tetranucleotide frequencies, differential coverage, and read mappings that link contigs.
[Optimize bins] - Splits and merges bins based on taxonomic profiles, conserved single copy genes, and related criteria.
[Polish bins] - Uses mapping results and taxonomic classifications to rebin contigs that are unambiguously assigned or connected by read mappings.
[Make bin shortlist] - Handles user actions to create a list of shortlisted bins.
[Calculate bin phylogeny] - Creates a phylogenetic tree of binned genomes based on concatenated alignments of conserved single copy genes.
[Export bins] - Exports bins as FASTA files, along with .csv files of bin properties and bin taxonomic distribution, and phylogenetic trees in .svg format.
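Two of the per-contig statistics listed above, percent GC and tetranucleotide frequencies, are simple to compute. The following is a minimal sketch, not Metawatt's implementation: the function names are hypothetical, and the handling of ambiguous bases (ignored here) and of reverse complements (not merged here) are assumptions that may differ from what the pipeline does.

```python
from collections import Counter
from itertools import product


def percent_gc(seq: str) -> float:
    """Percent G+C among unambiguous bases (A, C, G, T) of a contig."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0  # no unambiguous bases: report 0 rather than divide by zero
    return 100.0 * (seq.count("G") + seq.count("C")) / acgt


def tetranucleotide_frequencies(seq: str) -> dict[str, float]:
    """Relative frequency of each of the 256 tetranucleotides in a contig."""
    seq = seq.upper()
    # Count every overlapping window of length 4.
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    kmers = ["".join(p) for p in product("ACGT", repeat=4)]
    # Windows containing ambiguous bases (e.g. N) are excluded from the total.
    total = sum(counts[k] for k in kmers)
    return {k: (counts[k] / total if total else 0.0) for k in kmers}
```

Each contig then yields one GC value and a 256-element frequency vector, and vectors of this kind are what a composition-based binning step can compare between contigs.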