Strategy hints

Marc Strous

Getting started

  1. Create a new project.
  2. Prepare your taxonomy file and your DIAMOND and conserved genes databases (see [Installation and dependencies]).
  3. Add your assembled contigs and readsets in the toolbar. If you want to detect 16S rDNA genes, also add the ssu_rna HMM profile.
  4. Navigate to the "Pipeline" tab and configure each module using its submenu. You may initially set the modules [Optimize bins] and [Polish bins] to "skipped" so that you can inspect the raw binning results first. Otherwise, these modules may destroy bins or reduce bin quality without you noticing.
  5. IF YOU DO NOT HAVE MULTIPLE READSETS, DISABLE COVERAGE BASED BINNING IN THE [Bin with tetranucleotides] MODULE: Set the parameter "Relative weight of coverage binning" to zero.
  6. Save your project.
  7. Run the pipeline.
  8. Save your project.
  9. Explore the results in the "Binning" tab. To identify good bins, inspect the taxonomic profile and the number of conserved single-copy genes. Look at the position of the bins on the GC/coverage plot. Each good bin should be visible as a single, cohesive "cloud" of contigs. When bins are close together on the plot, you may expect some cross-binning. When a bin has a wide GC distribution, e.g. because of short contigs, you may expect some oversplitting. Both of these problems may be fixed by running [Optimize bins] and [Polish bins].
  10. If you detect cross-binning between two or more bins, inspect the binning report at the bottom of the center-lower panel. If there are many conflicts between tetranucleotides and differential coverage ("ambiguous" > 10%), it is best to increase the thresholds in [Bin with tetranucleotides].
  11. For some bins differential coverage will yield better results, for others tetranucleotides. To explore this, you can run [Bin with tetranucleotides] twice, once with "Relative weight of coverage binning" set to 0 and once with it set to 1. To then combine the results, you can shortlist good bins obtained with either approach. In the shortlist, each contig will only be present in a single bin.
  12. If automated fixing of binning problems does not work for your data, you can manually combine and split bins after you have added them to the shortlist. For splitting bins, you can use the features of the contour plot (right click on contours for editing options). Metawatt creates a contour plot of the density distribution of the contigs on the GC/coverage plot, so the contour plot (GC/coverage panel) usually shows bins as "blobs" of contigs with similar coverage and GC content. When you move the mouse over a contour in the plot, the taxonomic profile of the contigs enclosed by that contour is shown. If you have multiple readsets, you can also plot sequencing coverage of one readset against sequencing coverage of another readset. In some cases you can bin apart closely related populations because they have different coverages in different readsets.
  13. Save your project after editing.
  14. Use the "Genes" part of the pipeline and the genes panel to detect the 16S sequences associated with the individual bins. The Gene modules only appear after you have added the HMM profile to the project (located next to the readsets in the toolbar). What you hope to find is that every good bin can be linked to a 16S rDNA sequence. Note that these 16S rDNA genes are often not binned correctly with tetranucleotides because of their atypical nucleotide composition. Differential coverage, however, should not have a problem with 16S sequences.
  15. You can create editable trees of 16S rDNA genes detected and also add your own fasta file of gene sequences to be included in the tree. A distance matrix shows the distances between the detected genes and between reference sequences you provide.
  16. Explore the many export options: All graphics can be exported as .svg to create (almost) publication-ready figures and both automatically generated bins and shortlisted bins can be exported as fasta files...
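
The steps above rely on two per-contig quantities: GC content (one axis of the GC/coverage plot) and tetranucleotide frequencies. As a rough illustration of what these quantities are, here is a minimal Python sketch. It is not Metawatt's implementation; the function names, the handling of ambiguous bases, and the example contigs are all assumptions made for this sketch.

```python
from collections import Counter
from itertools import product


def gc_content(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T)."""
    counts = Counter(seq.upper())
    acgt = sum(counts[base] for base in "ACGT")
    if acgt == 0:
        return 0.0
    return (counts["G"] + counts["C"]) / acgt


def tetranucleotide_frequencies(seq):
    """Relative frequency of each of the 256 possible 4-mers.

    A window of length 4 slides over the sequence one base at a time.
    Windows containing anything other than A, C, G or T are skipped
    (an assumption of this sketch).
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=4)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - 3):
        kmer = seq[i:i + 4]
        if kmer in counts:
            counts[kmer] += 1
            total += 1
    if total == 0:
        return {kmer: 0.0 for kmer in kmers}
    return {kmer: count / total for kmer, count in counts.items()}


# Hypothetical toy contigs, only to show the calls.
contigs = {
    "contig_1": "ATGCGCGCTTAAGGCC",
    "contig_2": "ATATATATTTAAATAT",
}
for name, seq in contigs.items():
    freqs = tetranucleotide_frequencies(seq)
    print(name, round(gc_content(seq), 2), round(freqs["ATAT"], 2))
```

Short contigs contain few 4-mer windows, so their frequency profiles and GC values are noisy. This is consistent with the remark in step 9 that short contigs can widen the GC distribution of a bin.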

Related

Wiki: Bin with tetranucleotides
Wiki: Home
Wiki: Installation and dependencies
Wiki: Optimize bins
Wiki: Polish bins
